Update README.md
README.md
CHANGED
@@ -15,9 +15,8 @@ language:
 pipeline_tag: text-generation
 ---
 
-<img src="https://cdn-uploads.huggingface.co/production/uploads/645ded34a45b4182d7f5c385/EgsjPDWd37LjAtamiICxk.png" width="
+<img src="https://cdn-uploads.huggingface.co/production/uploads/645ded34a45b4182d7f5c385/EgsjPDWd37LjAtamiICxk.png" width="480" height="480" alt="image/png">
 
-
 
 ### Disclaimer
 This model is a base model which received aggressive pruning and knowledge distillation. To make it usable for your individual application it must we finetuned.
@@ -66,7 +65,8 @@ Up to 40 % parameter reduction (24 B → 15 B) delivers 2× lower TTFT
 | Tokens / s | 579 | **812** | +40% |
 
 
-
+
+<img src="https://cdn-uploads.huggingface.co/production/uploads/645ded34a45b4182d7f5c385/4rDhaeC-1GMj6KWbB27f9.png" width="300" height="300" alt="image/png">
 
 ### Training scalability (distillation run, MI300A cluster)
 