| # Submitting to the leaderboard | |
| NAVSIM comes with an [official leaderboard](https://huggingface.co/spaces/AGC2024-P/e2e-driving-navsim) on HuggingFace. The leaderboard prevents ambiguity in metric definitions between different projects, as all evaluation is performed on the server with the official evaluation script. | |
| After the NAVSIM challenge 2024, we have re-opened the leaderboard with the `navtest` split. With 12k samples, the `navtest` split contains a larger set for comprehensive evaluations. In this guide, we describe how to create a valid submission and what rules apply for the new leaderboard. | |
| ### Rules | |
| - **Open-source code and models**: | |
| - We will periodically (~every 6 months) be removing all entries on the leaderboard which **do not provide associated open-source training and inference code with the corresponding pre-trained checkpoints**. Even if removed for not having this information, an entry can be resubmitted once the code needed for reproducibility is made publicly available. | |
| - Code must be provided by setting the `TEAM_NAME` variable of the submission file as `"<a href=Link/to/repository>Method name</a>"`. Note that this can also be edited on the leaderboard for an existing submission, if the repo is created (or updated) after the initial submission. | |
| - **Multi-seed submissions**: | |
| - Driving policies often differ significantly in performance when re-trained with different network initialization seeds. | |
| - Therefore, the leaderboard now supports (1) regular single-seed submissions and (2) multi-seed submission, which we **strongly encourage** (with a minimum of 3 training seeds). | |
| - The maximum, mean and standard deviations of our evaluation metrics will be displayed for multi-seed submissions. | |
| ### Regular navtest submission | |
| To submit to a leaderboard you need to create a pickle file that contains a trajectory for each test scenario. NAVSIM provides a script to create such a pickle file. | |
| Have a look at `run_create_submission_pickle.sh`: this file creates the pickle file for the ConstantVelocity agent. You can run it for your own agent by replacing the `agent` override. Follow the [submission instructions on huggingface](https://huggingface.co/spaces/AGC2024-P/e2e-driving-navsim) to upload your submission. | |
| **Note that you have to set the variables `TEAM_NAME`, `AUTHORS`, `EMAIL`, `INSTITUTION`, and `COUNTRY` in `run_create_submission_pickle.sh` to generate a valid submission file** | |
| You should be able to obtain the same evaluation results as on the server by running the evaluation locally. | |
| To do so, use the override `train_test_split=navtest` when executing the script to run the PDM scoring. | |
| ### Multi-seed navtest submission | |
| For a multi-seed submission, you first have to create individual agents, i.e. trained on different seeds. Consequently, you can merge your entries to a single submission file with the `run_merge_submission_pickles.sh` bash script. Please set the override `train_test_split=navtest` to ensure all individual entries contain trajectories for the evaluation. | |