AlphaSue

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted an article about 2 months ago

Article

DABStep: Data Agent Benchmark for Multi-step Reasoning

Feb 4

• 116

upvoted an article 2 months ago

Article

The 4 Things Qwen-3's Chat Template Teaches Us

Apr 30

• 75

upvoted a paper 3 months ago

GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models

Paper • 2508.06471 • Published Aug 8 • 186

upvoted an article 3 months ago

Article

Open-source DeepResearch – Freeing our search agents

Feb 4

• 1.31k

upvoted a collection 5 months ago

Whisper

Collection

OpenAI Whisper speech recognition models in MLX format • 48 items • Updated Oct 1, 2024 • 58

upvoted an article 5 months ago

Article

Vision Language Models (Better, Faster, Stronger)

May 12

• 555

upvoted a collection 6 months ago

ProX Refining Models

Collection

Adapted small language models used to generate data refining programs • 5 items • Updated Oct 10, 2024 • 5

upvoted 2 papers 6 months ago

How Instruction and Reasoning Data shape Post-Training: Data Quality through the Lens of Layer-wise Gradients

Paper • 2504.10766 • Published Apr 14 • 40

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14 • 84

upvoted a paper 7 months ago

Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published Mar 26 • 56

upvoted an article 7 months ago

Article

Open R1: Update #3

Mar 11

• 295

upvoted a paper 7 months ago

Modifying Large Language Model Post-Training for Diverse Creative Writing

Paper • 2503.17126 • Published Mar 21 • 36

upvoted a paper 8 months ago

Organize the Web: Constructing Domains Enhances Pre-Training Data Curation

Paper • 2502.10341 • Published Feb 14 • 3

upvoted an article 9 months ago

Article

Mixture of Experts Explained

Dec 11, 2023

• 945

upvoted a collection 10 months ago

Papers I've read

Collection

16 items • Updated Jan 12 • 6

upvoted a paper 12 months ago

JudgeBench: A Benchmark for Evaluating LLM-based Judges

Paper • 2410.12784 • Published Oct 16, 2024 • 48

upvoted an article over 1 year ago

Article

Large-scale Near-deduplication Behind BigCode

May 16, 2023

• 35
