Yanjun Zhao
yanjunzhao97
AI & ML interests
None yet
Recent Activity
upvoted a paper 17 days ago
RiskPO: Risk-based Policy Optimization via Verifiable Reward for LLM Post-Training
upvoted a paper 20 days ago
Demystifying Reinforcement Learning in Agentic Reasoning
upvoted a paper 26 days ago
TaTToo: Tool-Grounded Thinking PRM for Test-Time Scaling in Tabular Reasoning
Organizations
None yet