Generalizable Reward Models
			
	
- Ray2333/GRM-llama3-8B-sftreg • Text Classification • 8B • Updated • 6 • 5
- Ray2333/GRM-llama3-8B-distill • Text Classification • 8B • Updated • 686 • 6
- Ray2333/GRM-Gemma-2B-sftreg • Text Classification • 3B • Updated • 35 • 3
- Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs • Paper • 2406.10216 • Published • 2







