Yihao Quan PRO
0x33B
AI & ML interests
None yet
Recent Activity
upvoted a paper 17 days ago
SRUM: Fine-Grained Self-Rewarding for Unified Multimodal Models
						
upvoted a paper about 1 month ago
EPO: Entropy-regularized Policy Optimization for LLM Agents Reinforcement Learning
Organizations
None yet