Commit 8abcad7 · committed by jiaxin · Parent: b86f414

update README

Files changed:
- docs/sglang_deploy_guide.md +26 -6
- docs/sglang_deploy_guide_cn.md +19 -4
- docs/vllm_deploy_guide.md +1 -1
- docs/vllm_deploy_guide_cn.md +1 -1
docs/sglang_deploy_guide.md
CHANGED
@@ -35,13 +35,30 @@ It is recommended to use a virtual environment (such as **venv**, **conda**, or
 We recommend installing SGLang in a fresh Python environment. Since it has not been released yet, you need to manually build it from the source code:
 
 ```bash
-git clone https://github.com/sgl-project/sglang.git
+git clone -b v0.5.4.post3 https://github.com/sgl-project/sglang.git
 cd sglang
-
+
+# Install the python packages
+pip install --upgrade pip
+pip install -e "python"
 ```
 
 Run the following command to start the SGLang server. SGLang will automatically download and cache the MiniMax-M2 model from Hugging Face.
 
+4-GPU deployment command:
+
+```bash
+python -m sglang.launch_server \
+--model-path MiniMaxAI/MiniMax-M2 \
+--tp-size 4 \
+--tool-call-parser minimax-m2 \
+--reasoning-parser minimax-append-think \
+--host 0.0.0.0 \
+--trust-remote-code \
+--port 8000 \
+--mem-fraction-static 0.7
+```
+
 8-GPU deployment command:
 
 ```bash
@@ -50,15 +67,18 @@ python -m sglang.launch_server \
 --tp-size 8 \
 --ep-size 8 \
 --tool-call-parser minimax-m2 \
---reasoning-parser minimax \
+--reasoning-parser minimax-append-think \
+--host 0.0.0.0 \
 --trust-remote-code \
 --port 8000 \
 --mem-fraction-static 0.7
 ```
 
+
+
 ## Testing Deployment
 
-After startup, you can test the
+After startup, you can test the SGLang OpenAI-compatible API with the following command:
 
 ```bash
 curl http://localhost:8000/v1/chat/completions \
@@ -84,13 +104,13 @@ export HF_ENDPOINT=https://hf-mirror.com
 
 ### MiniMax-M2 model is not currently supported
 
-This
+This SGLang version is outdated. Please upgrade to the latest version.
 
 ## Getting Support
 
 If you encounter any issues while deploying the MiniMax model:
 
-- Contact our technical support team through official channels such as email at [
+- Contact our technical support team through official channels such as email at [model@minimax.io](mailto:model@minimax.io)
 
 - Submit an issue on our [GitHub](https://github.com/MiniMax-AI) repository
 
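The "Testing Deployment" hunk above is cut off at the hunk boundary, so the guide's full curl example is not visible here. As a rough sketch, a complete smoke test against the OpenAI-compatible endpoint started above could look like this (the JSON payload is illustrative, not the guide's exact body):

```bash
# Minimal smoke test of the SGLang OpenAI-compatible endpoint
# (illustrative payload; the model name mirrors the --model-path above)
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "MiniMaxAI/MiniMax-M2",
    "messages": [{"role": "user", "content": "Hello, who are you?"}]
  }'
```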
docs/sglang_deploy_guide_cn.md
CHANGED
@@ -1,6 +1,6 @@
 # MiniMax M2 Model SGLang Deployment Guide
 
-We recommend using [SGLang](https://github.com/sgl-project/sglang) to deploy the [MiniMax-M2](https://huggingface.co/MiniMaxAI/MiniMax-M2) model.
+We recommend using [SGLang](https://github.com/sgl-project/sglang) to deploy the [MiniMax-M2](https://huggingface.co/MiniMaxAI/MiniMax-M2) model. SGLang is a high-performance inference engine with excellent serving throughput, efficient and intelligent memory management, strong batch-request handling, and deeply optimized low-level performance. We recommend reviewing SGLang's official documentation to check hardware compatibility before deployment.
 
 ## Models Covered by This Document
 
@@ -41,6 +41,20 @@ uv pip install ./python --torch-backend=auto
 
 Run the following command to start the SGLang server. SGLang will automatically download and cache the MiniMax-M2 model from Hugging Face.
 
+4-GPU deployment command:
+
+```bash
+python -m sglang.launch_server \
+--model-path MiniMaxAI/MiniMax-M2 \
+--tp-size 4 \
+--tool-call-parser minimax-m2 \
+--reasoning-parser minimax-append-think \
+--host 0.0.0.0 \
+--trust-remote-code \
+--port 8000 \
+--mem-fraction-static 0.7
+```
+
 8-GPU deployment command:
 
 ```bash
@@ -49,7 +63,8 @@ python -m sglang.launch_server \
 --tp-size 8 \
 --ep-size 8 \
 --tool-call-parser minimax-m2 \
---reasoning-parser minimax \
+--reasoning-parser minimax-append-think \
+--host 0.0.0.0 \
 --trust-remote-code \
 --port 8000 \
 --mem-fraction-static 0.7
@@ -83,13 +98,13 @@ export HF_ENDPOINT=https://hf-mirror.com
 
 ### MiniMax-M2 model is not currently supported
 
-This
+This SGLang version is outdated. Please upgrade to the latest version.
 
 ## Getting Support
 
 If you encounter any issues while deploying the MiniMax model:
 
-- Contact via email at [
+- Contact our technical support team through official channels such as email at [model@minimax.io](mailto:model@minimax.io)
 
 - Submit an issue on our [GitHub](https://github.com/MiniMax-AI) repository
 We will continue to optimize the model deployment experience; feedback is welcome!
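Both guides' troubleshooting sections point at an outdated SGLang build when the model is reported as unsupported. A quick way to confirm what is installed before upgrading, assuming a pip-based environment (querying `sglang.__version__` directly is an assumption about the package layout):

```bash
# Show the installed SGLang version via pip metadata
pip show sglang | grep -i '^version'

# Or ask the package itself (assumes sglang exposes __version__)
python -c "import sglang; print(sglang.__version__)"
```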
docs/vllm_deploy_guide.md
CHANGED
@@ -87,7 +87,7 @@ This vLLM version is outdated. Please upgrade to the latest version.
 
 If you encounter any issues while deploying the MiniMax model:
 
-- Contact our technical support team through official channels such as email at [
+- Contact our technical support team through official channels such as email at [model@minimax.io](mailto:model@minimax.io)
 
 - Submit an issue on our [GitHub](https://github.com/MiniMax-AI) repository
 
docs/vllm_deploy_guide_cn.md
CHANGED
@@ -86,7 +86,7 @@ export HF_ENDPOINT=https://hf-mirror.com
 
 If you encounter any issues while deploying the MiniMax model:
 
-- Contact via email at [
+- Contact our technical support team through official channels such as email at [model@minimax.io](mailto:model@minimax.io)
 
 - Submit an issue on our [GitHub](https://github.com/MiniMax-AI) repository
 We will continue to optimize the model deployment experience; feedback is welcome!