Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models (AI at Meta), submitted by derenlei
The Alignment Waltz: Jointly Training Agents to Collaborate for Safety (AI at Meta), submitted by jackzhang
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense (AI at Meta), submitted by Kylin-ll
OneFlow: Concurrent Mixed-Modal and Interleaved Generation with Edit Flows (AI at Meta), submitted by nielsr
CWM: An Open-Weights LLM for Research on Code Generation with World Models (AI at Meta), submitted by jacobkahn
TruthRL: Incentivizing Truthful LLMs via Reinforcement Learning (AI at Meta), submitted by weizhepei