title: MCP-Bench Leaderboard
emoji: 🏆
colorFrom: blue
colorTo: indigo
sdk: static
pinned: false
short_description: Leaderboard for MCP-Bench
tags:
  - benchmark
  - leaderboard
  - llm
  - mcp
  - evaluation
  - performance
  - tool-use
  - agents

MCP-Bench Leaderboard

A modern, interactive web application that displays performance metrics for large language models (LLMs) evaluated on MCP-Bench.

🏆 Features

Interactive Leaderboard: Sort by any metric column
Real-time Search: Filter models by name
Responsive Design: Optimized for desktop and mobile
Visual Indicators: Color-coded performance levels and progress bars
Modern UI: Clean, professional Material Design interface
Dark Mode Support: Automatic dark/light theme detection
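The sorting and search features above can be sketched as two pure functions. This is only an illustration, not the code in script.js: the function names `sortModels` and `filterModels` and the sample entries are hypothetical, and the field names are taken from the data.json example under Customization.

```javascript
// Hypothetical helpers; the real script.js may be organized differently.

// Return a new array sorted by the given metric key.
// Entries without a numeric value for the key sort last in either direction.
function sortModels(models, key, descending = true) {
  const isNumber = (v) => typeof v === "number" && !Number.isNaN(v);
  return [...models].sort((a, b) => {
    const aOk = isNumber(a[key]);
    const bOk = isNumber(b[key]);
    if (!aOk && !bOk) return 0;
    if (!aOk) return 1;
    if (!bOk) return -1;
    return descending ? b[key] - a[key] : a[key] - b[key];
  });
}

// Case-insensitive substring match on the model name.
function filterModels(models, query) {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [...models];
  return models.filter((m) => String(m.name).toLowerCase().includes(needle));
}

// Made-up sample rows, for illustration only.
const sample = [
  { name: "model-a", overall_score: 0.61 },
  { name: "model-b", overall_score: 0.75 },
  { name: "other-c" },
];
const ranked = sortModels(filterModels(sample, "MODEL"), "overall_score");
console.log(ranked.map((m) => m.name));
```

Both functions return new arrays instead of mutating their input, so the original data can be re-filtered or re-sorted on every keystroke or header click.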

📊 Metrics Displayed

The leaderboard shows comprehensive performance metrics:

Overall Score: Combined performance metric
Valid Tool Schema: Percentage of valid tool schemas
Compliance: Rule compliance percentage
Task Success: Task completion success rate
Schema Understanding: Understanding of tool schemas
Task Completion: Task completion effectiveness
Tool Usage: Tool utilization efficiency
Planning Effectiveness: Planning and execution quality
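A rough sketch of how a metric value might be turned into a label, a progress-bar width, and a color-coded level. It assumes, based only on the data.json example under Customization, that `overall_score` is a 0-1 fraction and the other metrics are 0-100 percentages. The thresholds and the names `toPercent`, `performanceLevel`, and `describeMetric` are hypothetical.

```javascript
// Assumption: overall_score is a 0-1 fraction; other metrics are 0-100 percentages.
// Adjust FRACTION_KEYS if the real data uses different scales.
const FRACTION_KEYS = new Set(["overall_score"]);

// Convert a raw metric value to a percentage clamped to the 0-100 range.
function toPercent(key, value) {
  const pct = FRACTION_KEYS.has(key) ? value * 100 : value;
  return Math.min(100, Math.max(0, pct));
}

// Hypothetical thresholds for the color-coded performance levels.
function performanceLevel(pct) {
  if (pct >= 80) return "high";
  if (pct >= 50) return "medium";
  return "low";
}

// Build what a table cell needs: label text, bar width, and a level name
// that a stylesheet could map to a color.
function describeMetric(key, value) {
  const pct = toPercent(key, value);
  return {
    label: FRACTION_KEYS.has(key) ? value.toFixed(3) : `${value.toFixed(1)}%`,
    barWidth: `${pct.toFixed(1)}%`,
    level: performanceLevel(pct),
  };
}

console.log(describeMetric("overall_score", 0.75));
console.log(describeMetric("compliance", 98.2));
```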

🚀 Quick Start

Local Development

Clone this repository
Open index.html in your web browser
Or serve using a local HTTP server:

# Using Python
python -m http.server 8000

# Using Node.js
npx serve .

# Using PHP
php -S localhost:8000

Hugging Face Spaces Deployment

This project is optimized for deployment on Hugging Face Spaces:

Create a new Space on Hugging Face
Choose Gradio as the SDK
Upload all files to your Space
Rename requirements-hf.txt to requirements.txt
Your Space will automatically build and deploy

The app.py file provides Gradio integration for Hugging Face Spaces compatibility.

📁 Project Structure

mcp-bench-leaderboard/
├── index.html          # Main HTML page
├── style.css           # Responsive CSS styling
├── script.js           # Interactive JavaScript functionality
├── data.json           # Leaderboard data
├── app.py              # Gradio app for HF Spaces
├── requirements-hf.txt # Dependencies for HF deployment
└── README.md           # Documentation

🎨 Customization

Update Data

Modify data.json to add new models or update scores:

{
  "lastUpdated": "2025-09-05",
  "models": [
    {
      "name": "your-model-name",
      "overall_score": 0.750,
      "valid_tool_schema": 99.5,
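      "compliance": 98.2
    }
  ]
}

The remaining metric fields are omitted from this example. JSON does not allow comments or trailing commas, so keep placeholders such as "// ..." out of the real file.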
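Before publishing an updated data.json, a small check can catch malformed entries. The sketch below is hypothetical and not part of the project: `validateLeaderboard` checks only the fields visible in the example, and it assumes every field other than `name` is a number.

```javascript
// Hypothetical validator: returns a list of problems, empty when the text looks fine.
// Assumption: every model field other than "name" holds a finite number.
function validateLeaderboard(jsonText) {
  let data;
  try {
    data = JSON.parse(jsonText); // strict JSON: comments and trailing commas are rejected
  } catch (err) {
    return [`invalid JSON: ${err.message}`];
  }
  if (data === null || typeof data !== "object") {
    return ["top level must be an object"];
  }
  const errors = [];
  if (typeof data.lastUpdated !== "string") {
    errors.push("lastUpdated must be a string");
  }
  if (!Array.isArray(data.models)) {
    errors.push("models must be an array");
    return errors;
  }
  data.models.forEach((model, i) => {
    if (model === null || typeof model !== "object") {
      errors.push(`models[${i}] must be an object`);
      return;
    }
    if (typeof model.name !== "string" || model.name === "") {
      errors.push(`models[${i}].name must be a non-empty string`);
    }
    for (const [key, value] of Object.entries(model)) {
      if (key !== "name" && !Number.isFinite(value)) {
        errors.push(`models[${i}].${key} must be a number`);
      }
    }
  });
  return errors;
}

const exampleText = JSON.stringify({
  lastUpdated: "2025-09-05",
  models: [{ name: "your-model-name", overall_score: 0.75, compliance: 98.2 }],
});
console.log(validateLeaderboard(exampleText));
```

Returning a list of messages instead of throwing on the first problem lets a contributor fix every issue in one pass.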

Styling

Edit style.css to customize:

Colors and themes
Layout and spacing
Responsive breakpoints
Animation effects

Functionality

Extend script.js to add:

New sorting algorithms
Additional filtering options
Export functionality
Chart visualizations
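As one example of such an extension, export functionality could start from a CSV serializer like the sketch below. The names `csvField` and `modelsToCsv` are hypothetical and the sample row is made up. Quoting follows the common RFC 4180 convention.

```javascript
// Quote a field when it contains a comma, double quote, or line break;
// embedded double quotes are doubled. Missing values become empty fields.
function csvField(value) {
  const text = value === undefined || value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

// Serialize models to CSV text with a header row.
// The columns array fixes both the header and the field order.
function modelsToCsv(models, columns) {
  const header = columns.map(csvField).join(",");
  const rows = models.map((m) => columns.map((c) => csvField(m[c])).join(","));
  return [header, ...rows].join("\r\n");
}

// Made-up row, for illustration only.
const csv = modelsToCsv(
  [{ name: 'demo, "quoted"', overall_score: 0.75 }],
  ["name", "overall_score", "compliance"]
);
console.log(csv);
```

In the browser, the resulting string could be wrapped in a `Blob` and offered through a temporary download link created with `URL.createObjectURL`.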

🌐 Browser Support

Chrome 60+
Firefox 55+
Safari 12+
Edge 79+

📱 Mobile Compatibility

The application is fully responsive and optimized for:

Tablets (768px - 1024px)
Mobile phones (320px - 767px)
Large screens (1200px+)

🔧 Technical Details

Pure Frontend: No backend dependencies
Vanilla JavaScript: No frameworks required
Modern CSS: Flexbox, Grid, CSS Variables
Progressive Enhancement: Works without JavaScript
SEO Friendly: Semantic HTML structure

📈 Performance

Lightweight (~50 KB total)
Fast loading times
Optimized images and assets
Efficient DOM updates
Smooth animations

🤝 Contributing

Fork the repository
Create a feature branch
Make your changes
Test across browsers
Submit a pull request

📄 License

This project is open source and available under the MIT License.

🙏 Acknowledgments

Data sourced from MCP Benchmark Results
Icons from Font Awesome
Fonts from Google Fonts
Hosted on Hugging Face Spaces

Last updated: September 2025