Error Typing for Smarter Rewards: Improving Process Reward Models with Error-Aware Hierarchical Supervision Paper • 2505.19706 • Published May 26, 2025 • 3
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning Paper • 2412.11974 • Published Dec 16, 2024 • 9
Ferret: Faster and Effective Automated Red Teaming with Reward-Based Scoring Technique Paper • 2408.10701 • Published Aug 20, 2024 • 12
DELLA-Merging: Reducing Interference in Model Merging through Magnitude-Based Sampling Paper • 2406.11617 • Published Jun 17, 2024 • 8