- spiral-rl/Spiral-Qwen3-4B • Text Generation • 4B • Updated • 23 • 4
- spiral-rl/Spiral-DeepSeek-R1-Distill-Qwen-7B • Text Generation • 8B • Updated • 5 • 2
- spiral-rl/Spiral-Kuhn-Poker-Qwen3-32B-SFT • Viewer • Updated • 25.5k • 20
- SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning • Paper • 2506.24119 • Published • 50