Collections including paper arxiv:2506.07927

BrokenMath: A Benchmark for Sycophancy in Theorem Proving with LLMs
Paper • 2510.04721 • Published 23 days ago

FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models
Paper • 2505.02735 • Published May 5 • 33

PolyMath: Evaluating Mathematical Reasoning in Multilingual Contexts
Paper • 2504.18428 • Published Apr 25

MathConstruct: Challenging LLM Reasoning with Constructive Proofs
Paper • 2502.10197 • Published Feb 14

about 10 hours ago

lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated May 8 • 263 • 96

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published Mar 12 • 36

AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published May 30 • 97

QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published May 23 • 88

Transformers Can Do Arithmetic with the Right Embeddings
Paper • 2405.17399 • Published May 27, 2024 • 54

Solving Inequality Proofs with Large Language Models
Paper • 2506.07927 • Published Jun 9 • 20

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper • 2507.00432 • Published Jul 1 • 79

CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization
Paper • 2507.06181 • Published Jul 8 • 43

Solving Inequality Proofs with Large Language Models
Paper • 2506.07927 • Published Jun 9 • 20

Mathesis: Towards Formal Theorem Proving from Natural Languages
Paper • 2506.07047 • Published Jun 8 • 5

Pre-trained Large Language Models Learn Hidden Markov Models In-context
Paper • 2506.07298 • Published Jun 8 • 26

Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning
Paper • 2507.00432 • Published Jul 1 • 79

MathCoder-VL: Bridging Vision and Code for Enhanced Multimodal Mathematical Reasoning
Paper • 2505.10557 • Published May 15 • 47

AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
Paper • 2505.16400 • Published May 22 • 34

PhyX: Does Your Model Have the "Wits" for Physical Reasoning?
Paper • 2505.15929 • Published May 21 • 49

VideoMathQA: Benchmarking Mathematical Reasoning via Multimodal Understanding in Videos
Paper • 2506.05349 • Published Jun 5 • 24
