Update README.md
README.md CHANGED
@@ -333,12 +333,12 @@ Run the benchmarks under `vllm` root folder:
 
 ### baseline
 ```Shell
-
+vllm bench latency --input-len 256 --output-len 256 --model microsoft/Phi-4-mini-instruct --batch-size 1
 ```
 
 ### FP8
 ```Shell
-VLLM_DISABLE_COMPILE_CACHE=1
+VLLM_DISABLE_COMPILE_CACHE=1 vllm bench latency --input-len 256 --output-len 256 --model pytorch/Phi-4-mini-instruct-FP8 --batch-size 1
 ```
 
 </details>
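
The two commands added in this change differ only in the model name and the `VLLM_DISABLE_COMPILE_CACHE=1` prefix, so they could be driven from one small wrapper. The following is a minimal sketch, not part of the README: the `run_bench` function and the `DRY_RUN` switch are hypothetical names introduced here, and the script only prints the commands unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
# Sketch: print (or run) the baseline and FP8 latency benchmarks from this change.
set -euo pipefail

# DRY_RUN is a hypothetical switch for this sketch, not a vLLM setting.
# It defaults to 1, so the script only prints the commands it would run.
DRY_RUN="${DRY_RUN:-1}"

# run_bench MODEL [ENV_ASSIGNMENT...]
# Any extra arguments are passed to `env` as environment assignments,
# for example VLLM_DISABLE_COMPILE_CACHE=1.
run_bench() {
  local model="$1"
  shift
  # Flags are copied verbatim from the commands in the diff.
  local cmd=(env "$@" vllm bench latency
             --input-len 256 --output-len 256
             --model "$model" --batch-size 1)
  if [ "$DRY_RUN" = "1" ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

run_bench microsoft/Phi-4-mini-instruct
run_bench pytorch/Phi-4-mini-instruct-FP8 VLLM_DISABLE_COMPILE_CACHE=1
```

Running it with `DRY_RUN=0` executes both benchmarks in sequence, assuming `vllm` is installed and on `PATH`.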