Update README.md
README.md
@@ -96,6 +96,10 @@ Note: The following benchmarks are evaluated by TRT-LLM-backend

You can refer to the content in [Tencent-Hunyuan-Large](https://github.com/Tencent/Tencent-Hunyuan-Large) to get started quickly. For training and inference, you can use the code provided in this GitHub repository.

+#### Inference Framework
+- This open-source release offers two inference backend options tailored for the Hunyuan-7B model: the popular [vLLM-backend](https://github.com/quinnrong94/vllm/tree/dev_hunyuan) and the TensorRT-LLM Backend. In this release, we are initially open-sourcing the vLLM solution, with plans to release the TRT-LLM solution in the near future.
+
+
### Inference Performance

This section presents the efficiency test results of deploying various models using vLLM, including inference speed (tokens/s) under different batch sizes.
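
The section above points readers at the vLLM backend and reports inference speed in tokens/s under different batch sizes. The sketch below connects the two: it loads the model with vLLM's offline `LLM` API and estimates generation throughput at a few batch sizes. It is a minimal illustration under stated assumptions, not this repository's benchmark harness: the model path `tencent/Hunyuan-7B-Instruct` is a placeholder, and it assumes the forked vLLM linked above is installed.

```python
# Minimal sketch: run Hunyuan-7B offline with vLLM and estimate
# generation throughput (tokens/s) at several batch sizes.
# Assumptions: the forked vLLM above is installed, and the model
# path below is a placeholder -- substitute your actual checkpoint.
import time

from vllm import LLM, SamplingParams

llm = LLM(
    model="tencent/Hunyuan-7B-Instruct",  # placeholder path (assumption)
    trust_remote_code=True,               # Hunyuan models ship custom modeling code
)
sampling = SamplingParams(temperature=0.7, top_p=0.95, max_tokens=256)

for batch_size in (1, 4, 16):
    prompts = ["Summarize the benefits of batched inference."] * batch_size
    start = time.perf_counter()
    outputs = llm.generate(prompts, sampling)
    elapsed = time.perf_counter() - start
    # Count generated tokens across the batch to get aggregate tokens/s.
    generated = sum(len(out.outputs[0].token_ids) for out in outputs)
    print(f"batch={batch_size:2d}  {generated / elapsed:8.1f} tokens/s")
```

A real measurement would warm the engine up before timing, average over several runs, and pin settings such as `tensor_parallel_size` to match the deployment.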