metadata
title: MCP-Bench Leaderboard
emoji: π
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
short_description: Leaderboard for MCP-Bench
tags:
- benchmark
- leaderboard
- llm
- mcp
- evaluation
- performance
- tool-use
- agents
MCP-Bench Leaderboard
A modern, interactive web application that displays performance metrics for large language models (LLMs) evaluated on MCP-Bench.
Features
- Interactive Leaderboard: Sort by any metric column
- Real-time Search: Filter models by name
- Responsive Design: Optimized for desktop and mobile
- Visual Indicators: Color-coded performance levels and progress bars
- Modern UI: Clean, professional Material Design interface
- Dark Mode Support: Automatic dark/light theme detection
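The sorting and search behavior listed above could be implemented along the following lines. This is a minimal sketch, not the code in script.js: the helper names `sortModels` and `filterModels` are hypothetical, and the `name` and `overall_score` fields are taken from the data.json example in this README.

```javascript
// Hypothetical helpers; script.js may be organized differently.

// Return a new array sorted by one metric column, without mutating the input.
function sortModels(models, key, descending = true) {
  const direction = descending ? -1 : 1;
  return [...models].sort((a, b) => {
    const x = a[key];
    const y = b[key];
    // Rows with a missing or non-numeric metric always sort last.
    const xMissing = typeof x !== "number" || Number.isNaN(x);
    const yMissing = typeof y !== "number" || Number.isNaN(y);
    if (xMissing && yMissing) return 0;
    if (xMissing) return 1;
    if (yMissing) return -1;
    return direction * (x - y);
  });
}

// Case-insensitive substring match on the model name.
function filterModels(models, query) {
  const needle = query.trim().toLowerCase();
  if (needle === "") return models;
  return models.filter((m) => String(m.name).toLowerCase().includes(needle));
}
```

Copying the array before sorting keeps the original order available, which makes it simple to reset the table or to re-apply a search filter after the sort column changes.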
Metrics Displayed
The leaderboard reports the following metrics for each model:
- Overall Score: Combined performance metric
- Valid Tool Schema: Percentage of valid tool schemas
- Compliance: Rule compliance percentage
- Task Success: Task completion success rate
- Schema Understanding: Understanding of tool schemas
- Task Completion: Task completion effectiveness
- Tool Usage: Tool utilization efficiency
- Planning Effectiveness: Planning and execution quality
Quick Start
Local Development
- Clone this repository
- Open index.html in your web browser
- Or serve using a local HTTP server:
# Using Python
python -m http.server 8000
# Using Node.js
npx serve .
# Using PHP
php -S localhost:8000
Hugging Face Spaces Deployment
This project is optimized for deployment on Hugging Face Spaces:
- Create a new Space on Hugging Face
- Choose Gradio as the SDK
- Upload all files to your Space
- Rename requirements-hf.txt to requirements.txt
- Your Space will automatically build and deploy
The app.py file provides Gradio integration for Hugging Face Spaces compatibility.
Project Structure
mcp-bench-leaderboard/
├── index.html            # Main HTML page
├── style.css             # Responsive CSS styling
├── script.js             # Interactive JavaScript functionality
├── data.json             # Leaderboard data
├── app.py                # Gradio app for HF Spaces
├── requirements-hf.txt   # Dependencies for HF deployment
└── README.md             # Documentation
Customization
Update Data
Modify data.json to add new models or update scores:
{
  "lastUpdated": "2025-09-05",
  "models": [
    {
      "name": "your-model-name",
      "overall_score": 0.750,
      "valid_tool_schema": 99.5,
      "compliance": 98.2,
      // ... other metrics
    }
  ]
}
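Before committing an edited data.json, a quick sanity check can catch typos. The sketch below assumes, based only on the example values above, that `overall_score` is a fraction between 0 and 1 and that `valid_tool_schema` and `compliance` are percentages between 0 and 100. The `validateLeaderboard` function and the `RANGES` table are hypothetical, and the remaining metric fields are not checked because their key names are not shown here.

```javascript
// Hypothetical validator; the ranges are assumptions inferred from the example values.
const RANGES = {
  overall_score: [0, 1],
  valid_tool_schema: [0, 100],
  compliance: [0, 100],
};

// Returns a list of human-readable problems; an empty list means nothing was flagged.
function validateLeaderboard(data) {
  if (!data || !Array.isArray(data.models)) {
    return ['"models" must be an array'];
  }
  const problems = [];
  if (!/^\d{4}-\d{2}-\d{2}$/.test(String(data.lastUpdated))) {
    problems.push('"lastUpdated" should look like YYYY-MM-DD');
  }
  const seen = new Set();
  data.models.forEach((model, i) => {
    const entry = model || {};
    const name = typeof entry.name === "string" ? entry.name.trim() : "";
    const label = name || `models[${i}]`;
    if (!name) problems.push(`${label}: missing "name"`);
    else if (seen.has(name)) problems.push(`${label}: duplicate name`);
    seen.add(name);
    for (const [key, [min, max]] of Object.entries(RANGES)) {
      const value = entry[key];
      if (typeof value !== "number" || !Number.isFinite(value)) {
        problems.push(`${label}: "${key}" must be a number`);
      } else if (value < min || value > max) {
        problems.push(`${label}: "${key}"=${value} is outside ${min}-${max}`);
      }
    }
  });
  return problems;
}
```

The function takes the already parsed object, for example the result of `JSON.parse` on the file contents. Note that `JSON.parse` rejects `//` comments and trailing commas, so the placeholder comment in the example above has to be replaced with real fields in the actual file.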
Styling
Edit style.css to customize:
- Colors and themes
- Layout and spacing
- Responsive breakpoints
- Animation effects
Functionality
Extend script.js to add:
- New sorting algorithms
- Additional filtering options
- Export functionality
- Chart visualizations
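As one illustration of the export functionality mentioned above, the sketch below serializes leaderboard rows to CSV text. The `toCsv` helper is hypothetical and is not part of script.js as described here; the column names passed to it are expected to match keys in data.json.

```javascript
// Hypothetical helper: serialize leaderboard rows to CSV text.
function toCsv(models, columns) {
  // Quote a cell if it contains a comma, a double quote, or a line break,
  // doubling any embedded quotes (the convention described in RFC 4180).
  const escapeCell = (value) => {
    const text = value === undefined || value === null ? "" : String(value);
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const header = columns.map(escapeCell).join(",");
  const rows = models.map((m) => columns.map((c) => escapeCell(m[c])).join(","));
  return [header, ...rows].join("\r\n");
}
```

In a browser, the resulting string could be wrapped in a `Blob` with type `text/csv` and offered for download through a temporary link created with `URL.createObjectURL`.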
Browser Support
- Chrome 60+
- Firefox 55+
- Safari 12+
- Edge 79+
Mobile Compatibility
The application is fully responsive and optimized for:
- Tablets (768px - 1024px)
- Mobile phones (320px - 767px)
- Large screens (1200px+)
Technical Details
- Pure Frontend: No backend dependencies
- Vanilla JavaScript: No frameworks required
- Modern CSS: Flexbox, Grid, CSS Variables
- Progressive Enhancement: Works without JavaScript
- SEO Friendly: Semantic HTML structure
Performance
- Lightweight (~50KB total)
- Fast loading times
- Optimized images and assets
- Efficient DOM updates
- Smooth animations
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Test across browsers
- Submit a pull request
License
This project is open source and available under the MIT License.
Acknowledgments
- Data sourced from MCP Benchmark Results
- Icons from Font Awesome
- Fonts from Google Fonts
- Hosted on Hugging Face Spaces
Last updated: September 2025