Update README.md
README.md CHANGED

@@ -98,7 +98,7 @@ You can refer to the content in [Tencent-Hunyuan-Large](https://github.com/Tence
 
 ### Inference Performance
 
-This section presents the efficiency test results of deploying various models
+This section presents the efficiency test results of deploying various models using vLLM, including inference speed (tokens/s) under different batch sizes.
 
 | Inference Framework | Model | Number of GPUs (series 1) | input_length | batch=1 | batch=4 |
 |------|------------|-------------------------|-------------------------|---------------------|----------------------|
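Tokens/s figures like the ones this table reports are typically obtained by timing a batched `generate` call and dividing the number of newly generated tokens by the elapsed wall-clock time. A minimal sketch of that arithmetic follows; `generate_fn` is a hypothetical stand-in for the inference backend (the table's real runs use vLLM, which needs a GPU, so a dummy backend is used here for illustration):

```python
import time


def tokens_per_second(total_new_tokens: int, elapsed_s: float) -> float:
    """Throughput in tokens/s: generated tokens divided by wall-clock time."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return total_new_tokens / elapsed_s


def benchmark(generate_fn, prompts):
    """Time one batched generate call and return its throughput.

    `generate_fn` is a hypothetical wrapper around an inference backend
    (e.g. vLLM's LLM.generate); here it must return the total number of
    new tokens produced for the batch of prompts.
    """
    start = time.perf_counter()
    total_new_tokens = generate_fn(prompts)
    elapsed = time.perf_counter() - start
    return tokens_per_second(total_new_tokens, elapsed)


if __name__ == "__main__":
    # Dummy backend: pretend each prompt yields 128 new tokens.
    fake_backend = lambda prompts: 128 * len(prompts)
    for batch in (1, 4):  # mirrors the batch=1 / batch=4 table columns
        rate = benchmark(fake_backend, ["hello"] * batch)
        print(f"batch={batch}: {rate:.0f} tokens/s")
```

With a real backend, larger batches usually raise aggregate tokens/s because the GPU is better utilized, which is why the table reports batch=1 and batch=4 separately.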