UniVideo: Unified Understanding, Generation, and Editing for Videos Paper • 2510.08377 • Published 20 days ago • 67
EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing Paper • 2509.26346 • Published 29 days ago • 18
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion Paper • 2509.01215 • Published Sep 1 • 50
CLiFT: Compressive Light-Field Tokens for Compute-Efficient and Adaptive Neural Rendering Paper • 2507.08776 • Published Jul 11 • 54
BlenderFusion: 3D-Grounded Visual Editing and Generative Compositing Paper • 2506.17450 • Published Jun 20 • 63
Ego-R1: Chain-of-Tool-Thought for Ultra-Long Egocentric Video Reasoning Paper • 2506.13654 • Published Jun 16 • 43
Pixel Reasoner: Incentivizing Pixel-Space Reasoning with Curiosity-Driven Reinforcement Learning Paper • 2505.15966 • Published May 21 • 53