4 626

Shaobai Jiang

shaobaij

AI & ML interests

None yet

Organizations

None yet

upvoted a paper 1 day ago

olmOCR: Unlocking Trillions of Tokens in PDFs with Vision Language Models

Paper • 2502.18443 • Published Feb 25 • 5

upvoted 9 papers 3 days ago

Attention Is All You Need for KV Cache in Diffusion LLMs

Paper • 2510.14973 • Published 13 days ago • 36

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published 13 days ago • 41

Not All Bits Are Equal: Scale-Dependent Memory Optimization Strategies for Reasoning Models

Paper • 2510.10964 • Published 16 days ago • 2

LLMs Can Get "Brain Rot"!

Paper • 2510.13928 • Published 14 days ago • 21

Tensor Logic: The Language of AI

Paper • 2510.12269 • Published 15 days ago • 6

BitNet Distillation

Paper • 2510.13998 • Published 14 days ago • 51

Learning to Grasp Anything by Playing with Random Toys

Paper • 2510.12866 • Published 15 days ago • 5

VLA-0: Building State-of-the-Art VLAs with Zero Modification

Paper • 2510.13054 • Published 15 days ago • 9

Ctrl-World: A Controllable Generative World Model for Robot Manipulation

Paper • 2510.10125 • Published 18 days ago • 1

upvoted 10 papers 4 days ago

The Alignment Waltz: Jointly Training Agents to Collaborate for Safety

Paper • 2510.08240 • Published 20 days ago • 41

Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published 16 days ago • 31

Dr.LLM: Dynamic Layer Routing in LLMs

Paper • 2510.12773 • Published 15 days ago • 31

Which Heads Matter for Reasoning? RL-Guided KV Cache Compression

Paper • 2510.08525 • Published 20 days ago • 22

QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs

Paper • 2510.11696 • Published 16 days ago • 168

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published 20 days ago • 32

StreamingVLM: Real-Time Understanding for Infinite Video Streams

Paper • 2510.09608 • Published 19 days ago • 49

ArcMemo: Abstract Reasoning Composition with Lifelong LLM Memory

Paper • 2509.04439 • Published Sep 4 • 1

Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense

Paper • 2510.07242 • Published 21 days ago • 30

LightMem: Lightweight and Efficient Memory-Augmented Generation

Paper • 2510.18866 • Published 8 days ago • 105