arxiv:2505.11493
Yusu Qian
YusuQian
AI & ML interests
Multimodal LLM research
Recent Activity
upvoted a paper 6 days ago
PRISM-Bench: A Benchmark of Puzzle-Based Visual Tasks with CoT Error Detection
upvoted a paper 12 days ago
Pico-Banana-400K: A Large-Scale Dataset for Text-Guided Image Editing
upvoted a paper 15 days ago
OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding