Submitted by XUANMINGZHANG | 3 | Generalization or Memorization: Dynamic Decoding for Mode Steering | Stanford NLP | 1
Submitted by simonycl | 17 | Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity | Stanford NLP | 374 | 3
Submitted by fangwu97 | 136 | DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search | Stanford NLP | 3