DC-AR Collection [ICCV 2025] DC-AR: Efficient Masked Autoregressive Image Generation with Deep Compression Hybrid Tokenizer • 2 items • Updated 10 days ago • 1
Gauss Gym Datasets Collection Datasets used for the Gauss Gym photorealistic simulator • 4 items • Updated 11 days ago • 5
X-Streamer: Unified Human World Modeling with Audiovisual Interaction Paper • 2509.21574 • Published Sep 25 • 7
ARC-Encoders Collection Pretrained ARC-Encoders and a fine-tuning dataset: context compression for unmodified LLMs. • 6 items • Updated 5 days ago • 3
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published 22 days ago • 133
AndesVL Collection AndesVL is a suite of mobile-optimized Multimodal Large Language Models (MLLMs) with 0.6B to 4B parameters. • 8 items • Updated 14 days ago • 10
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 22 days ago • 454
EditScore: Unlocking Online RL for Image Editing via High-Fidelity Reward Modeling Paper • 2509.23909 • Published about 1 month ago • 29
EliGen: Entity-Level Controlled Image Generation with Regional Attention Paper • 2501.01097 • Published Jan 2 • 2
RDT 2 Collection RDT 2, the sequel to RDT-1B, is the first foundation model that achieves zero-shot deployment on unseen embodiments for simple open-vocabulary tasks. • 4 items • Updated Sep 26 • 15
Orpheus Music Transformer Collection All HF Spaces demos for the Orpheus Music Transformer • 8 items • Updated Jul 9 • 9