A Survey of Data Agents: Emerging Paradigm or Overstated Hype? Paper • 2510.23587 • Published 7 days ago • 64
Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning Paper • 2510.19338 • Published 12 days ago • 101
Fine-Tuning Large Language Models on Quantum Optimization Problems for Circuit Generation Paper • 2504.11109 • Published Apr 15 • 2
ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory Paper • 2509.25140 • Published Sep 29 • 11
Cache-to-Cache: Direct Semantic Communication Between Large Language Models Paper • 2510.03215 • Published about 1 month ago • 94
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 28 days ago • 462
DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published Sep 29 • 136
Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant Paper • 2508.20907 • Published Aug 28 • 1
QUASAR: Quantum Assembly Code Generation Using Tool-Augmented LLMs via Agentic RL Paper • 2510.00967 • Published Oct 1 • 11 • 2
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework Paper • 2405.11143 • Published May 20, 2024 • 41