None defined yet.
SPICE: Self-Play In Corpus Environments Improves Reasoning
Beyond Reasoning Gains: Mitigating General Capabilities Forgetting in Large Reasoning Models