Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published 11 days ago • 101
Efficient Long-context Language Model Training by Core Attention Disaggregation Paper • 2510.18121 • Published 12 days ago • 116
Reasoning with Sampling: Your Base Model is Smarter Than You Think Paper • 2510.14901 • Published 17 days ago • 44
DeepAgent: A General Reasoning Agent with Scalable Toolsets Paper • 2510.21618 • Published 9 days ago • 91
D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI Paper • 2510.05684 • Published 26 days ago • 134
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 27 days ago • 462
Lumina-DiMOO: An Omni Diffusion Large Language Model for Multi-Modal Generation and Understanding Paper • 2510.06308 • Published 26 days ago • 52
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use Paper • 2510.05592 • Published 26 days ago • 94
Where LLM Agents Fail and How They can Learn From Failures Paper • 2509.25370 • Published Sep 29 • 11