ronantakizawa committed (verified)
Commit 876a368 · 1 Parent(s): 710c556

Upload AWQ 4-bit quantized Molmo-7B-D (~5.2GB, 63.0% reduction)

Files changed (1): README.md (+3, -3)
README.md CHANGED
@@ -36,7 +36,7 @@ This is a 4-bit AWQ quantized version of [allenai/Molmo-7B-D-0924](https://huggi
  - **Architecture:** Molmo (Qwen2-7B decoder + OpenAI CLIP vision encoder)
  - **Quantization Method:** AWQ (Activation-aware Weight Quantization)
  - **Quantization Scheme:** W4A16 (4-bit weights, 16-bit activations)
- - **Calibration Dataset:** Flickr30k (32 samples)
+ - **Calibration Dataset:** Flickr30k (128 samples)

  ## Size Comparison

@@ -150,8 +150,8 @@ Molmo-7B-D is part of the Molmo family of open vision-language models developed

  - **Method:** AWQ (Activation-aware Weight Quantization)
  - **Independent Pipeline:** Used with BasicPipeline for layer-by-layer quantization
- - **Calibration:** 32 Flickr30k image-text pairs
- - **Max Sequence Length:** 1024 tokens
+ - **Calibration:** 128 Flickr30k image-text pairs
+ - **Max Sequence Length:** 2048 tokens
  - **Why AWQ**: Activation-aware quantization preserves important weights

  ## Limitations
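
For context, the revised settings in the two hunks above (128 Flickr30k calibration samples, 2048-token max sequence length) correspond to a quantization run roughly like the sketch below. This is a minimal illustration only, assuming an llm-compressor-style `oneshot` + `AWQModifier` workflow; the repository's actual script, its image-text preprocessing for Molmo, and the BasicPipeline wiring mentioned in the README are not part of this commit, so the dataset id, column names, and output path here are placeholders.

```python
# Minimal sketch of an AWQ W4A16 run matching the updated README values
# (128 calibration samples, 2048-token max sequence length). NOT the
# repository's actual script: it assumes the llm-compressor oneshot API,
# calibrates on text-only Flickr30k captions instead of full image-text
# pairs, and the dataset id / column names are assumptions.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

from llmcompressor import oneshot
from llmcompressor.modifiers.awq import AWQModifier

MODEL_ID = "allenai/Molmo-7B-D-0924"
NUM_SAMPLES = 128   # calibration sample count from the README
MAX_SEQ_LEN = 2048  # max sequence length from the README

model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, torch_dtype="auto", trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)

# Assumed Flickr30k mirror; the first caption of each sample stands in for
# the image-text pairs used in the real calibration run.
ds = load_dataset("nlphuji/flickr30k", split=f"test[:{NUM_SAMPLES}]")
ds = ds.map(
    lambda s: tokenizer(
        s["caption"][0],
        truncation=True,
        max_length=MAX_SEQ_LEN,
        add_special_tokens=False,
    ),
    remove_columns=ds.column_names,
)

# W4A16: 4-bit weights, 16-bit activations; keep the output head unquantized.
recipe = AWQModifier(targets=["Linear"], ignore=["lm_head"], scheme="W4A16")

oneshot(
    model=model,
    dataset=ds,
    recipe=recipe,
    max_seq_length=MAX_SEQ_LEN,
    num_calibration_samples=NUM_SAMPLES,
    output_dir="molmo-7b-d-awq",  # placeholder output path
)
```

The substance of the commit is just the two README hunks: raising the calibration set from 32 to 128 samples and the max sequence length from 1024 to 2048 tokens gives AWQ's activation statistics broader coverage during scale search, which is what the updated model card now records.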