OpenEvals

community

AI & ML interests

LLM evaluation

Recent Activity

SaylorTwift updated a Space about 3 hours ago
OpenEvals/open_benchmark_index
SaylorTwift updated a Space about 4 hours ago
OpenEvals/evals
SaylorTwift published a Space about 4 hours ago
OpenEvals/evals

OpenEvals's collections (5)

Research collaborations
A small overview of our research collabs through the years
Archived Open LLM Leaderboard (2024-2025)
This leaderboard has been evaluating LLMs since June 2024 on IFEval, MuSR, GPQA, MATH, BBH, and MMLU-Pro