RetailGPT Evaluator - AxionX Digital
Purpose: Evaluate and compare multiple retail QA models on the same dataset.
Includes
evaluate.py: runs metrics across multiple models
leaderboard.py: aggregates results into a ranking
app.py: Streamlit UI with leaderboard and live model chat
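
As a rough illustration of the evaluate-then-rank flow described above, the following minimal sketch scores several models on the same question/answer pairs and sorts them into a leaderboard. It is not the code in evaluate.py or leaderboard.py: the function names (`exact_match`, `evaluate_models`, `build_leaderboard`), the exact-match metric, and the toy dataset and models are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: a "model" is any callable mapping a question to an answer.
Model = Callable[[str], str]


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the case- and whitespace-normalized strings match, else 0.0."""
    return float(prediction.strip().lower() == reference.strip().lower())


def evaluate_models(
    models: Dict[str, Model], dataset: List[Tuple[str, str]]
) -> Dict[str, float]:
    """Score every model on the same (question, reference) pairs."""
    scores: Dict[str, float] = {}
    for name, model in models.items():
        total = sum(exact_match(model(q), ref) for q, ref in dataset)
        # Mean exact-match score; an empty dataset yields 0.0.
        scores[name] = total / len(dataset) if dataset else 0.0
    return scores


def build_leaderboard(scores: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort by descending score; ties are broken alphabetically by model name."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Toy data and stand-in models, invented purely for illustration.
    dataset = [
        ("What is the return window?", "30 days"),
        ("Do you ship abroad?", "Yes"),
    ]
    models = {
        "model_a": lambda q: "30 days" if "return" in q else "No",
        "model_b": lambda q: "30 days" if "return" in q else "yes",
    }
    leaderboard = build_leaderboard(evaluate_models(models, dataset))
    for rank, (name, score) in enumerate(leaderboard, start=1):
        print(rank, name, score)
```

Evaluating every model on one shared dataset keeps the scores directly comparable, which is what makes a single ranking meaningful. A real setup would likely swap the exact-match metric for whatever metrics the project actually reports.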
Usage
!python retailgpt_evaluator/dataset_loader.py
!python retailgpt_evaluator/evaluate.py