olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models Paper • 2502.18443 • Published Feb 25 • 5
Attention Is All You Need for KV Cache in Diffusion LLMs Paper • 2510.14973 • Published 13 days ago • 36
Reasoning with Sampling: Your Base Model is Smarter Than You Think Paper • 2510.14901 • Published 13 days ago • 41
Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models Paper • 2510.10964 • Published 16 days ago • 2
Learning to Grasp Anything by Playing with Random Toys Paper • 2510.12866 • Published 15 days ago • 5
VLA-0: Building State-of-the-Art VLAs with Zero Modification Paper • 2510.13054 • Published 15 days ago • 9
Ctrl-World: A Controllable Generative World Model for Robot Manipulation Paper • 2510.10125 • Published 18 days ago • 1
The Alignment Waltz: Jointly Training Agents to Collaborate for Safety Paper • 2510.08240 • Published 20 days ago • 41
Demystifying Reinforcement Learning in Agentic Reasoning Paper • 2510.11701 • Published 16 days ago • 31
Which Heads Matter for Reasoning? RL-Guided KV Cache Compression Paper • 2510.08525 • Published 20 days ago • 22
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 16 days ago • 168
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published 20 days ago • 32
StreamingVLM: Real-Time Understanding for Infinite Video Streams Paper • 2510.09608 • Published 19 days ago • 49
ArcMemo: Abstract Reasoning Composition with Lifelong LLM Memory Paper • 2509.04439 • Published Sep 4 • 1
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense Paper • 2510.07242 • Published 21 days ago • 30
LightMem: Lightweight and Efficient Memory-Augmented Generation Paper • 2510.18866 • Published 8 days ago • 105