Controllable emotional/voice-acting TTS (now with v1.1)
Interact with a multimodal chatbot using text and images
OmniGen2: Unified Image Understanding and Generation
Expressive Zero-shot TTS
Scalable and Versatile 3D Generation from images