Bo Liu

Benjamin-eecs

https://benjamin-eecs.github.io/

AI & ML interests

Reinforcement Learning, Reasoning, Machine Learning Systems

Recent Activity

authored a paper 16 days ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

authored a paper 19 days ago

Agent Learning via Early Experience

upvoted a paper 16 days ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published 20 days ago • 32

upvoted a paper 19 days ago

Large Reasoning Models Learn Better Alignment from Flawed Thinking

Paper • 2510.00938 • Published 28 days ago • 57

upvoted a paper 20 days ago

Agent Learning via Early Experience

Paper • 2510.08558 • Published 20 days ago • 252

upvoted 2 papers 28 days ago

GEM: A Gym for Agentic LLMs

Paper • 2510.01051 • Published 28 days ago • 86

Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play

Paper • 2509.25541 • Published 30 days ago • 137

upvoted a paper 29 days ago

The Era of Real-World Human Interaction: RL from User Conversations

Paper • 2509.25137 • Published 30 days ago • 18

upvoted a paper about 2 months ago

Bootstrapping Task Spaces for Self-Improvement

Paper • 2509.04575 • Published Sep 4 • 5

upvoted a collection about 2 months ago

LLaVA-Critic-R1

Collection

6 items • Updated Sep 3 • 2

upvoted a paper about 2 months ago

LLaVA-Critic-R1: Your Critic Model is Secretly a Strong Policy Model

Paper • 2509.00676 • Published Aug 31 • 83

upvoted 2 papers 4 months ago

The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements

Paper • 2506.22419 • Published Jun 27 • 14

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Paper • 2506.24119 • Published Jun 30 • 50

upvoted a paper 6 months ago

Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models

Paper • 2505.10554 • Published May 15 • 120

upvoted a paper 7 months ago

TextArena

Paper • 2504.11442 • Published Apr 15 • 29

upvoted a paper 8 months ago

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 83

upvoted a paper 11 months ago

Natural Language Reinforcement Learning

Paper • 2411.14251 • Published Nov 21, 2024 • 31

upvoted 2 papers over 1 year ago

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Paper • 2405.04434 • Published May 7, 2024 • 21

DeepSeek-VL: Towards Real-World Vision-Language Understanding

Paper • 2403.05525 • Published Mar 8, 2024 • 46

upvoted a paper almost 2 years ago

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5, 2024 • 48
