MMPersuade: A Dataset and Evaluation Framework for Multimodal Persuasion • Paper • 2510.22768 • Published 8 days ago • 6
UNIDOC-BENCH: A Unified Benchmark for Document-Centric Multimodal RAG • Paper • 2510.03663 • Published about 1 month ago • 15
GUI-KV: Efficient GUI Agents via KV Cache with Spatio-Temporal Awareness • Paper • 2510.00536 • Published Oct 1 • 6
VL-Cogito: Progressive Curriculum Reinforcement Learning for Advanced Multimodal Reasoning • Paper • 2507.22607 • Published Jul 30 • 46
Why Vision Language Models Struggle with Visual Arithmetic? Towards Enhanced Chart and Geometry Understanding • Paper • 2502.11492 • Published Feb 17 • 2
CRMArena: Understanding the Capacity of LLM Agents to Perform Professional CRM Tasks in Realistic Environments • Paper • 2411.02305 • Published Nov 4, 2024 • 1
CRMArena-Pro: Holistic Assessment of LLM Agents Across Diverse Business Scenarios and Interactions • Paper • 2505.18878 • Published May 24 • 2
Do LVLMs Understand Charts? Analyzing and Correcting Factual Errors in Chart Captioning • Paper • 2312.10160 • Published Dec 15, 2023 • 2
LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models • Paper • 2308.16137 • Published Aug 30, 2023 • 40