All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages Paper • 2411.16508 • Published Nov 25, 2024 • 12
Time Travel: A Comprehensive Benchmark to Evaluate LMMs on Historical and Cultural Artifacts Paper • 2502.14865 • Published Feb 20, 2025 • 1
LLM Post-Training: A Deep Dive into Reasoning Large Language Models Paper • 2502.21321 • Published Feb 28, 2025 • 1
DriveLMM-o1: A Step-by-Step Reasoning Dataset and Large Multimodal Model for Driving Scenario Understanding Paper • 2503.10621 • Published Mar 13, 2025
Fann or Flop: A Multigenre, Multiera Benchmark for Arabic Poetry Understanding in LLMs Paper • 2505.18152 • Published May 23, 2025 • 1
How Good are Foundation Models in Step-by-Step Embodied Reasoning? Paper • 2509.15293 • Published Sep 18, 2025
Beyond Simple Edits: Composed Video Retrieval with Dense Modifications Paper • 2508.14039 • Published Aug 19, 2025
LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs Paper • 2501.06186 • Published Jan 10, 2025 • 65
MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT Paper • 2402.16840 • Published Feb 26, 2024 • 26