Updated execution guide
Signed-off-by: Jonathan Bnayahu <bnayahu@il.ibm.com>
- src/about.py +2 -1
src/about.py
CHANGED
@@ -89,7 +89,8 @@ To reproduce our results, here is the commands you can run:
 
 ```
 pip install unitxt[bluebench]
-unitxt-evaluate --tasks "benchmarks.bluebench" --model cross_provider --model_args "model_name
+unitxt-evaluate --tasks "benchmarks.bluebench" --model cross_provider --model_args "model_name=MODEL_TO_EVALUATE_IN_LITELLM_FORMAT,max_tokens=1024" --output_path ./results/bluebench --log_samples --trust_remote_code --batch_size 8
+unitxt-summarize ./results/bluebench
 ```
 """
 
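
For readers who want to script the two commands added in this change, here is a minimal sketch that assembles them as argument lists for a given model name. The helper name `build_bluebench_commands` and its `output_path` parameter are hypothetical and not part of unitxt or this repository. Every flag and value is copied from the added lines of the diff. The sketch only builds and prints the commands; it does not run them.

```python
import shlex


def build_bluebench_commands(model_name, output_path="./results/bluebench"):
    """Return (evaluate, summarize) argument lists mirroring the updated guide.

    Hypothetical helper for illustration only: the flags are taken verbatim
    from the diff, and nothing here calls into unitxt itself.
    """
    evaluate = [
        "unitxt-evaluate",
        "--tasks", "benchmarks.bluebench",
        "--model", "cross_provider",
        # model_name is expected in LiteLLM format, per the guide's placeholder
        "--model_args", f"model_name={model_name},max_tokens=1024",
        "--output_path", output_path,
        "--log_samples",
        "--trust_remote_code",
        "--batch_size", "8",
    ]
    # The summarize step reads from the directory the evaluate step wrote to
    summarize = ["unitxt-summarize", output_path]
    return evaluate, summarize


if __name__ == "__main__":
    evaluate, summarize = build_bluebench_commands(
        "MODEL_TO_EVALUATE_IN_LITELLM_FORMAT"
    )
    # shlex.join produces a shell-safe, copy-pasteable command line
    print(shlex.join(evaluate))
    print(shlex.join(summarize))
```

Keeping the commands as lists avoids shell-quoting mistakes if they are later passed to `subprocess.run`, and sharing one `output_path` value keeps the summarize step pointed at the directory the evaluate step wrote to.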