Update README.md
README.md
CHANGED
@@ -13,8 +13,9 @@ license: apache-2.0
 
 # Evalica
 
-
-
-
+
+[Evalica](https://github.com/dustalov/evalica) is an easy-to-use tool that transforms pairwise comparisons (*aka* side-by-side) into a meaningful ranking of items.
+
+- Ustalov, D. [Reliable, Reproducible, and Really Fast Leaderboards with Evalica](https://aclanthology.org/2025.coling-demos.6). 2025. Proceedings of the 31st International Conference on Computational Linguistics: System Demonstrations. 46–53. arXiv: [2412.11314 [cs.CL]](https://arxiv.org/abs/2412.11314).
 
 Chatbot Arena dataset `chatbot_arena_20240814.csv` was derived from the [clean_battle_20240814_public.json](https://storage.googleapis.com/arena_external_data/public/clean_battle_20240814_public.json) dataset available from <https://lmarena.ai/>.
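
To make "pairwise comparisons to a ranking" concrete, here is a minimal pure-Python sketch of one common approach, a Bradley-Terry fit by iterative updates. It is an illustration only and does not use or reproduce Evalica's API; the function name `bradley_terry`, the toy items `A`, `B`, `C`, and the `(winner, loser)` input shape are all assumptions made for this example, and ties are ignored.

```python
from collections import defaultdict


def bradley_terry(comparisons, iterations=100):
    """Estimate Bradley-Terry scores from (winner, loser) pairs.

    Hypothetical helper for illustration; this is NOT Evalica's API.
    Assumes every item wins at least once, so no score collapses to zero.
    """
    wins = defaultdict(float)         # total wins per item
    pair_counts = defaultdict(float)  # comparisons per unordered pair
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1.0
        pair_counts[frozenset((winner, loser))] += 1.0
        items.update((winner, loser))

    # Start from uniform scores and apply the classic fixed-point update:
    # p_i <- wins_i / sum_j n_ij / (p_i + p_j), then renormalize.
    scores = {item: 1.0 / len(items) for item in items}
    for _ in range(iterations):
        updated = {}
        for i in items:
            denom = 0.0
            for j in items:
                if i == j:
                    continue
                n_ij = pair_counts.get(frozenset((i, j)), 0.0)
                if n_ij:
                    denom += n_ij / (scores[i] + scores[j])
            updated[i] = wins[i] / denom if denom else scores[i]
        total = sum(updated.values())
        scores = {i: s / total for i, s in updated.items()}
    return scores


# Toy data (made up): each pair meets four times, the first item wins 3-1.
comparisons = (
    [("A", "B")] * 3 + [("B", "A")]
    + [("B", "C")] * 3 + [("C", "B")]
    + [("A", "C")] * 3 + [("C", "A")]
)

scores = bradley_terry(comparisons)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

On this balanced toy input the ranking follows the win counts. Real battle logs such as the Chatbot Arena CSV would first need their columns mapped to `(winner, loser)` pairs, which is not shown here because the column layout is not given in this change.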