AI2 Adapt Dev

community

AI & ML interests

Open science can (maybe) save the world

DongfuJiang

authored 2 papers 21 days ago

Critique-Coder: Enhancing Coder Models by Critique Reinforcement Learning

Paper • 2509.22824 • Published Sep 26 • 20

VideoScore2: Think before You Score in Generative Video Evaluation

Paper • 2509.22799 • Published Sep 26 • 24

DongfuJiang

authored a paper about 2 months ago

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1 • 71

valpy

authored 4 papers 3 months ago

2 OLMo 2 Furious

Paper • 2501.00656 • Published Dec 31, 2024 • 22

IssueBench: Millions of Realistic Prompts for Measuring Issue Bias in LLM Writing Assistance

Paper • 2502.08395 • Published Feb 12

RewardBench 2: Advancing Reward Model Evaluation

Paper • 2506.01937 • Published Jun 2 • 7

Generalizing Verifiable Instruction Following

Paper • 2507.02833 • Published Jul 3 • 1

saumyamalik

authored 3 papers 5 months ago

QuRating: Selecting High-Quality Data for Training Language Models

Paper • 2402.09739 • Published Feb 15, 2024 • 4

Lost in the Logic: An Evaluation of Large Language Models' Reasoning Capabilities on LSAT Logic Games

Paper • 2409.19012 • Published Sep 23, 2024

2 OLMo 2 Furious

Paper • 2501.00656 • Published Dec 31, 2024 • 22

ljvmiranda921

authored a paper 5 months ago

R3: Robust Rubric-Agnostic Reward Models

Paper • 2505.13388 • Published May 19 • 11

DongfuJiang

authored 3 papers 5 months ago

StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs

Paper • 2505.20139 • Published May 26 • 19

QuickVideo: Real-Time Long Video Understanding with System Algorithm Co-Design

Paper • 2505.16175 • Published May 22 • 41

General-Reasoner: Advancing LLM Reasoning Across All Domains

Paper • 2505.14652 • Published May 20 • 23

PrasannSinghal

authored a paper 5 months ago

ChartMuseum: Testing Visual Reasoning Capabilities of Large Vision-Language Models

Paper • 2505.13444 • Published May 19 • 16

Muennighoff

authored 2 papers 6 months ago

Crosslingual Reasoning through Test-Time Scaling

Paper • 2505.05408 • Published May 8 • 8

ReasonIR: Training Retrievers for Reasoning Tasks

Paper • 2504.20595 • Published Apr 29 • 53

natolambert

authored a paper 6 months ago

Reinforcement Learning from Human Feedback

Paper • 2504.12501 • Published Apr 16 • 4

akshitab

authored 2 papers 6 months ago

Establishing Task Scaling Laws via Compute-Efficient Model Ladders

Paper • 2412.04403 • Published Dec 5, 2024 • 3

2 OLMo 2 Furious

Paper • 2501.00656 • Published Dec 31, 2024 • 22