---
title: FutureBench Leaderboard
emoji: 🔮
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
---
# FutureBench Leaderboard App

A minimal Gradio application for viewing FutureBench prediction data. This app downloads datasets from HuggingFace on startup and provides a web interface to explore the data.

## Features

- **Data Summary**: View dataset statistics and information
- **Sample Data**: Browse sample prediction records
- **About**: Learn about the FutureBench system
- **Auto-refresh**: Download latest data on startup
- **Date Range Slider**: Filter the leaderboard by a custom date span
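The date-range filter can be pictured as a plain inclusive filter over dated records. The sketch below is only an illustration under assumptions: the `date` field name, the ISO `YYYY-MM-DD` format, and the `filter_by_date_range` helper are hypothetical and not taken from `app.py`.

```python
from datetime import date


def filter_by_date_range(records, start, end, date_field="date"):
    """Return the records whose date lies within [start, end], inclusive."""
    if start > end:
        start, end = end, start  # tolerate a reversed slider selection
    kept = []
    for record in records:
        # Assumed schema: each record stores an ISO "YYYY-MM-DD" string.
        day = date.fromisoformat(record[date_field])
        if start <= day <= end:
            kept.append(record)
    return kept


sample = [
    {"model": "model-a", "date": "2025-01-05"},
    {"model": "model-b", "date": "2025-02-10"},
    {"model": "model-c", "date": "2025-03-15"},
]
in_range = filter_by_date_range(sample, date(2025, 2, 1), date(2025, 3, 31))
print([r["model"] for r in in_range])  # ['model-b', 'model-c']
```

In a Gradio app, the two slider endpoints would be converted to `date` objects and passed as `start` and `end`.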
## Setup

1. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

2. (Optional) Set your HuggingFace token for private repositories:

   ```bash
   export HF_TOKEN=your_token_here
   ```
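On the Python side, the optional token can be read from the environment and treated as absent when it is unset or blank. This is a minimal sketch; the `get_hf_token` helper is a hypothetical name, and the actual `app.py` may read the variable differently.

```python
import os


def get_hf_token():
    """Return the value of HF_TOKEN, or None when it is unset or blank."""
    token = os.environ.get("HF_TOKEN", "").strip()
    return token or None
```

Returning `None` for a missing token lets public repositories be accessed anonymously without a special case at the call site.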
## Running the App

Launch the Gradio application:

```bash
python app.py
```

The app will:

1. Download datasets from HuggingFace repositories on startup
2. Process the data and create summaries
3. Launch a web interface at `http://localhost:7860`
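The three startup steps might be wired together roughly as follows. This is a sketch under assumptions, not the contents of `app.py`: the `summarize`, `build_interface`, and `main` names are hypothetical, the download step is stubbed with inline records, and the record fields are made up for illustration.

```python
def summarize(records):
    """Step 2: reduce a list of dict records to a small summary."""
    columns = sorted({key for record in records for key in record})
    return {"rows": len(records), "columns": columns}


def build_interface(summary):
    """Step 3: wrap the summary in a minimal Gradio UI."""
    import gradio as gr  # imported lazily so summarize() works without Gradio

    with gr.Blocks(title="FutureBench Leaderboard") as demo:
        gr.Markdown("# FutureBench Leaderboard")
        gr.JSON(value=summary, label="Data Summary")
    return demo


def main():
    # Step 1 (download) is stubbed here with hypothetical inline records.
    records = [
        {"question": "q1", "prediction": 0.7},
        {"question": "q2", "prediction": 0.2},
    ]
    build_interface(summarize(records)).launch(server_port=7860)
```

Calling `main()` would start the interface on port 7860, matching the address listed above.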
## Data Sources

The app downloads data from these HuggingFace repositories:

- `futurebench/requests` - Evaluation queue
- `futurebench/results` - Evaluation results
- `futurebench/data` - Main prediction dataset
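One way to fetch these repositories is `snapshot_download` from the `huggingface_hub` package. The sketch below assumes that all three are dataset repositories and that a local `hf_cache/` folder is an acceptable destination; the `REPOS` mapping and the `local_dir_for` and `download_all` helpers are hypothetical names.

```python
from pathlib import Path

# Repository IDs as listed above; treating them as dataset repos is an assumption.
REPOS = {
    "requests": "futurebench/requests",
    "results": "futurebench/results",
    "data": "futurebench/data",
}


def local_dir_for(repo_id, cache_root="hf_cache"):
    """Map a repo ID to a local folder (hypothetical layout)."""
    return Path(cache_root) / repo_id.replace("/", "__")


def download_all(token=None, cache_root="hf_cache"):
    """Download a snapshot of every repository and return the local paths."""
    from huggingface_hub import snapshot_download  # lazy import

    paths = {}
    for name, repo_id in REPOS.items():
        paths[name] = snapshot_download(
            repo_id=repo_id,
            repo_type="dataset",
            local_dir=local_dir_for(repo_id, cache_root),
            token=token,  # None is fine for public repositories
        )
    return paths
```

Keeping each repository in its own folder makes it easy to delete and re-download a single source when refreshing data.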
## Structure

- `app.py` - Main Gradio application
- `process_data/` - Data processing utilities
- `requirements.txt` - Python dependencies
- `README.md` - This file
## Next Steps

This is a minimal version focusing on data download and display. Future enhancements will include:

- Full leaderboard with model rankings
- Interactive filtering and sorting
- Detailed performance metrics
- Model comparison tools