---
title: MCP-Bench Leaderboard
emoji: 🏆
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
short_description: Leaderboard for MCP-Bench
tags:
- benchmark
- leaderboard
- llm
- mcp
- evaluation
- performance
- tool-use
- agents
---
# MCP-Bench Leaderboard
A modern, interactive web application that displays performance metrics for large language models (LLMs) evaluated on MCP-Bench.
## 🏆 Features
- **Interactive Leaderboard**: Sort by any metric column
- **Real-time Search**: Filter models by name
- **Responsive Design**: Optimized for desktop and mobile
- **Visual Indicators**: Color-coded performance levels and progress bars
- **Modern UI**: Clean, professional Material Design interface
- **Dark Mode Support**: Automatic dark/light theme detection
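The sorting and search features above could be implemented along these lines. This is a minimal sketch, not the code in `script.js`: the function names `sortModels` and `filterByName` are hypothetical, and the field names follow the `data.json` example in this README.

```javascript
// Hypothetical helpers; the real script.js may be organized differently.

// Return a new array sorted by a numeric metric column (descending by default).
// Entries that lack the metric are treated as the lowest possible value.
function sortModels(models, key, descending = true) {
  const value = (m) => (typeof m[key] === "number" ? m[key] : -Infinity);
  const sorted = [...models].sort((a, b) => {
    const va = value(a);
    const vb = value(b);
    if (va === vb) return 0;
    return va < vb ? -1 : 1;
  });
  return descending ? sorted.reverse() : sorted;
}

// Case-insensitive substring match on the model name.
// An empty query returns the list unchanged.
function filterByName(models, query) {
  const needle = query.trim().toLowerCase();
  if (needle === "") return models;
  return models.filter((m) => String(m.name).toLowerCase().includes(needle));
}
```

Both helpers leave the input array untouched, so the table can be re-rendered from the original data whenever the sort column or search text changes.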
## 📊 Metrics Displayed
The leaderboard reports the following metrics for each model:
- **Overall Score**: Combined performance metric
- **Valid Tool Schema**: Percentage of valid tool schemas
- **Compliance**: Rule compliance percentage
- **Task Success**: Task completion success rate
- **Schema Understanding**: Understanding of tool schemas
- **Task Completion**: Task completion effectiveness
- **Tool Usage**: Tool utilization efficiency
- **Planning Effectiveness**: Planning and execution quality
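Percentage metrics such as these lend themselves to the color-coded levels and progress bars listed under Features. The sketch below shows one way to derive them; the thresholds (90 and 70) and the names `performanceLevel` and `progressWidth` are assumptions for illustration, not values taken from the leaderboard.

```javascript
// Hypothetical thresholds; the levels the leaderboard actually uses
// are not documented in this README.
function performanceLevel(percent) {
  if (typeof percent !== "number" || Number.isNaN(percent)) return "unknown";
  if (percent >= 90) return "high";
  if (percent >= 70) return "medium";
  return "low";
}

// Clamp to [0, 100] so the result is always a valid CSS width.
function progressWidth(percent) {
  const clamped = Math.min(100, Math.max(0, percent));
  return `${clamped.toFixed(1)}%`;
}
```

The returned level could be used as a CSS class name (for example `level-high`), keeping the colors themselves in `style.css`.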
## 🚀 Quick Start
### Local Development
1. Clone this repository
2. Open `index.html` in your web browser
3. Or serve using a local HTTP server:
```bash
# Using Python
python -m http.server 8000
# Using Node.js
npx serve .
# Using PHP
php -S localhost:8000
```
### Hugging Face Spaces Deployment
This project is optimized for deployment on Hugging Face Spaces:
1. Create a new Space on [Hugging Face](https://huggingface.co/spaces)
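2. Choose **Gradio** as the SDK to run `app.py` (note that this README's front matter declares `sdk: static`, which serves `index.html` directly)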
3. Upload all files to your Space
4. Rename `requirements-hf.txt` to `requirements.txt`
5. Your Space will automatically build and deploy
The `app.py` file provides Gradio integration for Hugging Face Spaces compatibility.
## 📁 Project Structure
```
mcp-bench-leaderboard/
├── index.html            # Main HTML page
├── style.css             # Responsive CSS styling
├── script.js             # Interactive JavaScript functionality
├── data.json             # Leaderboard data
├── app.py                # Gradio app for HF Spaces
├── requirements-hf.txt   # Dependencies for HF deployment
└── README.md             # Documentation
```
## 🎨 Customization
### Update Data
Modify `data.json` to add new models or update scores:
```json
{
  "lastUpdated": "2025-09-05",
  "models": [
    {
      "name": "your-model-name",
      "overall_score": 0.750,
      "valid_tool_schema": 99.5,
      "compliance": 98.2,
      // ... other metrics
    }
  ]
}
```
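Before publishing an edited `data.json`, it can help to sanity-check each entry. The following is a minimal sketch under stated assumptions: `overall_score` is assumed to lie in [0, 1] and the percentage metrics in [0, 100], based only on the example values above, and `validateModel` and `validateData` are hypothetical names. Only the fields shown in the example are checked.

```javascript
// Assumed ranges, inferred from the example values in this README:
// overall_score in [0, 1]; percentage metrics in [0, 100].
function validateModel(model) {
  if (model === null || typeof model !== "object") {
    return ["entry must be an object"];
  }
  const errors = [];
  const inRange = (v, lo, hi) => typeof v === "number" && v >= lo && v <= hi;
  if (typeof model.name !== "string" || model.name.trim() === "") {
    errors.push("name must be a non-empty string");
  }
  if (!inRange(model.overall_score, 0, 1)) {
    errors.push("overall_score must be a number in [0, 1]");
  }
  for (const key of ["valid_tool_schema", "compliance"]) {
    if (!inRange(model[key], 0, 100)) {
      errors.push(`${key} must be a number in [0, 100]`);
    }
  }
  return errors;
}

// Validate the whole file; returns a flat list of messages (empty when valid).
function validateData(data) {
  if (!data || !Array.isArray(data.models)) return ["models must be an array"];
  return data.models.flatMap((m, i) =>
    validateModel(m).map((e) => `models[${i}]: ${e}`)
  );
}
```

Note that `JSON.parse` rejects comments and trailing commas, so the `// ... other metrics` placeholder in the example must be replaced with real fields before the file is loaded.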
### Styling
Edit `style.css` to customize:
- Colors and themes
- Layout and spacing
- Responsive breakpoints
- Animation effects
### Functionality
Extend `script.js` to add:
- New sorting algorithms
- Additional filtering options
- Export functionality
- Chart visualizations
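As an illustration of the export option, the sketch below converts the model list to CSV text. The name `toCsv` and the choice of CSV as the export format are assumptions; nothing like this is confirmed to exist in `script.js`.

```javascript
// Hypothetical export helper: serialize selected columns as CSV.
// Values containing commas, quotes, or line breaks are quoted,
// with embedded quotes doubled; missing values become empty cells.
function toCsv(models, columns) {
  const escape = (value) => {
    const text = value === undefined || value === null ? "" : String(value);
    return /[",\n\r]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const header = columns.map(escape).join(",");
  const rows = models.map((m) => columns.map((c) => escape(m[c])).join(","));
  return [header, ...rows].join("\n");
}
```

In the browser, the resulting string could be wrapped in a `Blob` and offered for download through a temporary link created with `URL.createObjectURL`.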
## 🌐 Browser Support
- Chrome 60+
- Firefox 55+
- Safari 12+
- Edge 79+
## 📱 Mobile Compatibility
The application is fully responsive and optimized for:
- Tablets (768px - 1024px)
- Mobile phones (320px - 767px)
- Large screens (1200px+)
## 🔧 Technical Details
- **Pure Frontend**: No backend dependencies
- **Vanilla JavaScript**: No frameworks required
- **Modern CSS**: Flexbox, Grid, CSS Variables
- **Progressive Enhancement**: Works without JavaScript
- **SEO Friendly**: Semantic HTML structure
## 📈 Performance
- Lightweight (~50KB total)
- Fast loading times
- Optimized images and assets
- Efficient DOM updates
- Smooth animations
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Test across browsers
5. Submit a pull request
## 📄 License
This project is open source and available under the MIT License.
## 🙏 Acknowledgments
- Data sourced from MCP Benchmark Results
- Icons from Font Awesome
- Fonts from Google Fonts
- Hosted on Hugging Face Spaces
---
*Last updated: September 2025*