- Front-Loading Reasoning: The Synergy between Pretraining and Post-Training Data • arXiv:2510.03264 • Published Sep 26, 2025
- Self-Imagine: Effective Unimodal Reasoning with Multimodal Models using Self-Imagination • arXiv:2401.08025 • Published Jan 16, 2024
- NVIDIA Nemotron Nano 2: An Accurate and Efficient Hybrid Mamba-Transformer Reasoning Model • arXiv:2508.14444 • Published Aug 20, 2025
- NEMOTRON-CROSSTHINK: Scaling Self-Learning beyond Math Reasoning • arXiv:2504.13941 • Published Apr 15, 2025
- VISREAS: Complex Visual Reasoning with Unanswerable Questions • arXiv:2403.10534 • Published Feb 23, 2024
- MIND: Math Informed syNthetic Dialogues for Pretraining LLMs • arXiv:2410.12881 • Published Oct 15, 2024
- Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models • arXiv:2504.03624 • Published Apr 4, 2025
- Difference-Masking: Choosing What to Mask in Continued Pretraining • arXiv:2305.14577 • Published May 23, 2023