UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation Paper • 2510.18701 • Published 11 days ago • 66
PICABench: How Far Are We from Physically Realistic Image Editing? Paper • 2510.17681 • Published 12 days ago • 61
Factuality Matters: When Image Generation and Editing Meet Structured Visuals Paper • 2510.05091 • Published 26 days ago • 18
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published 25 days ago • 52
NovelSeek: When Agent Becomes the Scientist -- Building Closed-Loop System from Hypothesis to Verification Paper • 2505.16938 • Published May 22 • 120
Unified Multimodal Chain-of-Thought Reward Model through Reinforcement Fine-Tuning Paper • 2505.03318 • Published May 6 • 93
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models Paper • 2504.10479 • Published Apr 14 • 298
Adversarial Data Collection: Human-Collaborative Perturbations for Efficient and Robust Robotic Imitation Learning Paper • 2503.11646 • Published Mar 14 • 35
MM-Eureka: Exploring Visual Aha Moment with Rule-based Large-scale Reinforcement Learning Paper • 2503.07365 • Published Mar 10 • 61
Running 43 43 Open LMM Reasoning Leaderboard 🥇 A Leaderboard that demonstrates LMM reasoning capabilities
POINTS1.5: Building a Vision-Language Model towards Real World Applications Paper • 2412.08443 • Published Dec 11, 2024 • 38
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline Paper • 2411.12814 • Published Nov 19, 2024 • 25
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI Paper • 2411.14522 • Published Nov 21, 2024 • 39
SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation Paper • 2411.14525 • Published Nov 21, 2024 • 21
SegBook: A Simple Baseline and Cookbook for Volumetric Medical Image Segmentation Paper • 2411.14525 • Published Nov 21, 2024 • 21
Interactive Medical Image Segmentation: A Benchmark Dataset and Baseline Paper • 2411.12814 • Published Nov 19, 2024 • 25
GMAI-VL & GMAI-VL-5.5M: A Large Vision-Language Model and A Comprehensive Multimodal Dataset Towards General Medical AI Paper • 2411.14522 • Published Nov 21, 2024 • 39
POINTS: Improving Your Vision-language Model with Affordable Strategies Paper • 2409.04828 • Published Sep 7, 2024 • 24
GMAI-MMBench: A Comprehensive Multimodal Evaluation Benchmark Towards General Medical AI Paper • 2408.03361 • Published Aug 6, 2024 • 85