Questions about distillation algorithms

#25
by min123456 - opened

As I understand it, the Qwen-related distilled models are all distilled with the DMD2 algorithm, and the video-generation distilled models are trained with an improved version of Self-Forcing. Is this correct?
