---
title: MCP-Bench Leaderboard
emoji: π
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
short_description: Leaderboard for MCP-Bench
tags:
  - benchmark
  - leaderboard
  - llm
  - mcp
  - evaluation
  - performance
  - tool-use
  - agents
---

# MCP-Bench Leaderboard

An interactive web application that displays performance metrics for large language models (LLMs) evaluated on MCP-Bench.

## Features

- **Interactive Leaderboard**: Sort by any metric column
- **Real-time Search**: Filter models by name
- **Responsive Design**: Optimized for desktop and mobile
- **Visual Indicators**: Color-coded performance levels and progress bars
- **Modern UI**: Clean, professional Material Design interface
- **Dark Mode Support**: Automatic dark/light theme detection
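
The sorting and search behavior listed above can be sketched roughly as follows. This is a minimal illustration and not the project's actual `script.js`: the helper names `filterModels` and `sortModels` and the sample entries are hypothetical, and only the `name` and `overall_score` fields are taken from `data.json`.

```javascript
// Hypothetical helpers; the real script.js may be organized differently.

// Keep only models whose name contains the query (case-insensitive).
function filterModels(models, query) {
  const needle = query.trim().toLowerCase();
  if (needle === "") return models.slice();
  return models.filter((m) => m.name.toLowerCase().includes(needle));
}

// Return a new array sorted by a numeric metric; descending by default.
function sortModels(models, metric, descending = true) {
  const sign = descending ? -1 : 1;
  return models.slice().sort((a, b) => sign * (a[metric] - b[metric]));
}

// Made-up sample rows, for illustration only.
const sample = [
  { name: "model-a", overall_score: 0.61 },
  { name: "model-b", overall_score: 0.75 },
  { name: "other-c", overall_score: 0.68 },
];

const visible = sortModels(filterModels(sample, "model"), "overall_score");
console.log(visible.map((m) => m.name)); // [ 'model-b', 'model-a' ]
```

Both helpers return new arrays instead of mutating their input, so the full list loaded from `data.json` stays available when the search box is cleared.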

## Metrics Displayed

The leaderboard shows the following performance metrics:

- **Overall Score**: Combined performance metric
- **Valid Tool Schema**: Percentage of valid tool schemas
- **Compliance**: Rule compliance percentage
- **Task Success**: Task completion success rate
- **Schema Understanding**: Understanding of tool schemas
- **Task Completion**: Task completion effectiveness
- **Tool Usage**: Tool utilization efficiency
- **Planning Effectiveness**: Planning and execution quality

## Quick Start

### Local Development

1. Clone this repository
2. Open `index.html` in your web browser
3. Or serve using a local HTTP server:

```bash
# Using Python
python -m http.server 8000

# Using Node.js
npx serve .

# Using PHP
php -S localhost:8000
```

### Hugging Face Spaces Deployment

This project is optimized for deployment on Hugging Face Spaces:

1. Create a new Space on [Hugging Face](https://huggingface.co/spaces)
2. Choose **Gradio** as the SDK
3. Upload all files to your Space
4. Rename `requirements-hf.txt` to `requirements.txt`
5. Your Space will automatically build and deploy

The `app.py` file provides Gradio integration for Hugging Face Spaces compatibility.

## Project Structure

```
mcp-bench-leaderboard/
├── index.html           # Main HTML page
├── style.css            # Responsive CSS styling
├── script.js            # Interactive JavaScript functionality
├── data.json            # Leaderboard data
├── app.py               # Gradio app for HF Spaces
├── requirements-hf.txt  # Dependencies for HF deployment
└── README.md            # Documentation
```

## Customization

### Update Data

Modify `data.json` to add new models or update scores:

```json
{
  "lastUpdated": "2025-09-05",
  "models": [
    {
      "name": "your-model-name",
      "overall_score": 0.750,
      "valid_tool_schema": 99.5,
      "compliance": 98.2,
      // ... other metrics
    }
  ]
}
```
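
Strict JSON does not allow comments, so the `// ... other metrics` line in the snippet above is a placeholder to replace with real fields before saving. A small check like the one below can catch malformed entries before they reach the page. It is only a sketch: the function name `validateModel` is hypothetical, and the value ranges (0 to 1 for `overall_score`, 0 to 100 for the percentage fields) are assumptions inferred from the example values, not a documented schema.

```javascript
// Assumed ranges: overall_score in [0, 1]; the percentage fields in [0, 100].
const RANGES = {
  overall_score: [0, 1],
  valid_tool_schema: [0, 100],
  compliance: [0, 100],
};

// Return a list of human-readable problems; an empty list means the entry passed.
function validateModel(entry) {
  const problems = [];
  if (typeof entry.name !== "string" || entry.name.trim() === "") {
    problems.push("name must be a non-empty string");
  }
  for (const [field, [min, max]] of Object.entries(RANGES)) {
    const value = entry[field];
    if (typeof value !== "number" || Number.isNaN(value)) {
      problems.push(`${field} must be a number`);
    } else if (value < min || value > max) {
      problems.push(`${field} must be between ${min} and ${max}`);
    }
  }
  return problems;
}

console.log(validateModel({
  name: "your-model-name",
  overall_score: 0.75,
  valid_tool_schema: 99.5,
  compliance: 98.2,
})); // []
```

Extending `RANGES` with the remaining metric keys would cover the full entry once their names and scales are confirmed against `data.json`.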

### Styling

Edit `style.css` to customize:

- Colors and themes
- Layout and spacing
- Responsive breakpoints
- Animation effects

### Functionality

Extend `script.js` to add:

- New sorting algorithms
- Additional filtering options
- Export functionality
- Chart visualizations
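
As one example of such an extension, export functionality could start from a CSV serializer like the sketch below. The function names `csvCell` and `toCsv` and the sample row are hypothetical and not part of the existing `script.js`; quoting follows the usual CSV convention of wrapping cells that contain commas, quotes, or line breaks.

```javascript
// Quote a cell when it contains a comma, a double quote, or a line break.
function csvCell(value) {
  const text = String(value ?? "");
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Build CSV text: a header row followed by one row per model.
function toCsv(models, columns) {
  const header = columns.map(csvCell).join(",");
  const rows = models.map((m) => columns.map((c) => csvCell(m[c])).join(","));
  return [header, ...rows].join("\r\n");
}

// Made-up sample row, for illustration only.
const csv = toCsv(
  [{ name: "model, large", overall_score: 0.75 }],
  ["name", "overall_score"]
);
console.log(csv);
```

In the browser, the resulting string could then be wrapped in a `Blob` and offered as a download link; that wiring is left out here because it depends on the page markup.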

## Browser Support

- Chrome 60+
- Firefox 55+
- Safari 12+
- Edge 79+

## Mobile Compatibility

The application is fully responsive and optimized for:

- Mobile phones (320px - 767px)
- Tablets (768px - 1024px)
- Large screens (1200px+)

## Technical Details

- **Pure Frontend**: No backend dependencies
- **Vanilla JavaScript**: No frameworks required
- **Modern CSS**: Flexbox, Grid, CSS Variables
- **Progressive Enhancement**: Works without JavaScript
- **SEO Friendly**: Semantic HTML structure

## Performance

- Lightweight (~50 KB total)
- Fast loading times
- Optimized images and assets
- Efficient DOM updates
- Smooth animations

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test across browsers
5. Submit a pull request

## License

This project is open source and available under the MIT License.

## Acknowledgments

- Data sourced from MCP Benchmark Results
- Icons from Font Awesome
- Fonts from Google Fonts
- Hosted on Hugging Face Spaces

---

*Last updated: September 2025*