Update app.py
app.py
CHANGED
@@ -555,12 +555,12 @@ with gr.Blocks(css=css) as demo:
 - **GemliteUIntXWeightOnly**: uintx gemlite quantization (default to 4 bit only for now)
 - **Int8WeightOnly**: 8-bit weight-only quantization
 - **Int8DynamicActivationInt8Weight**: 8-bit quantization for both weights and activations
-- **Float8WeightOnly**: float8
-- **Float8DynamicActivationFloat8Weight**: float8
+- **Float8WeightOnly**: float8 weight-only quantization
+- **Float8DynamicActivationFloat8Weight**: float8 quantization for both weights and activations
 - **autoquant**: automatic quantization (uses the best quantization method for the model)
 
 ### Group Size
-- Only applicable for
+- Only applicable for Int4WeightOnly and GemliteUIntXWeightOnly quantization
 - Default value is 128
 - Affects the granularity of quantization
 
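The "Group Size" notes in this change say the value affects the granularity of quantization. As a rough illustration of what that can mean, here is a minimal plain-Python sketch of group-wise quantization in which every group of weights shares one scale. It is not the library's implementation: the function names `quantize_groupwise` and `dequantize_groupwise` are hypothetical, and the symmetric max-abs scaling scheme is an assumption chosen for simplicity.

```python
# Illustrative sketch only: group-wise symmetric quantization in plain Python.
# The helper names and the max-abs scaling scheme are assumptions made for
# this example; they are not taken from app.py or from any library.

def quantize_groupwise(weights, group_size=128, bits=4):
    """Quantize a flat list of floats; each group of `group_size` shares a scale."""
    qmax = 2 ** (bits - 1) - 1  # largest positive code, e.g. 7 for signed 4-bit
    scales, qvalues = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        scale = max_abs / qmax if max_abs > 0 else 1.0  # avoid dividing by zero
        scales.append(scale)
        # Round to the nearest integer code and clamp to the signed range.
        qvalues.extend(max(-qmax - 1, min(qmax, round(w / scale))) for w in group)
    return qvalues, scales


def dequantize_groupwise(qvalues, scales, group_size=128):
    """Map integer codes back to floats using the scale of their group."""
    return [q * scales[i // group_size] for i, q in enumerate(qvalues)]


# Synthetic weights: the first half is much smaller in magnitude than the second.
weights = [((i * 37) % 101 - 50) / 50.0 * (0.01 if i < 128 else 1.0)
           for i in range(256)]

for gs in (256, 128, 32):
    q, s = quantize_groupwise(weights, group_size=gs)
    restored = dequantize_groupwise(q, s, group_size=gs)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print(f"group_size={gs}: {len(s)} scales, max abs error {max_err:.4f}")
```

A smaller group size stores more scales (more metadata) but lets each scale follow the local weight range more closely; a larger group size does the opposite.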