Qwen3-8B-NVFP4

An NVFP4-quantized version of Qwen/Qwen3-8B, produced with llmcompressor.
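A checkpoint like this is typically produced with llmcompressor's one-shot flow. The sketch below uses the settings listed in the Notes (NVFP4 scheme on Linear layers, lm_head excluded, 512 calibration samples, max sequence length 2048); the calibration dataset name is an assumption, since the card does not state which one was used.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from llmcompressor import oneshot
from llmcompressor.modifiers.quantization import QuantizationModifier

MODEL_ID = "Qwen/Qwen3-8B"

model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype="auto")
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)

# NVFP4 on all Linear layers; lm_head stays in full precision.
recipe = QuantizationModifier(targets="Linear", scheme="NVFP4", ignore=["lm_head"])

oneshot(
    model=model,
    dataset="ultrachat_200k",      # assumption: actual calibration set not stated on the card
    recipe=recipe,
    max_seq_length=2048,           # from the Notes
    num_calibration_samples=512,   # from the Notes
)

model.save_pretrained("Qwen3-8B-NVFP4")
tokenizer.save_pretrained("Qwen3-8B-NVFP4")
```

This requires a GPU and downloads the full-precision model, so treat it as a reproduction sketch rather than something to run as-is.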

Notes

  • Quantization scheme: NVFP4 (linear layers, lm_head excluded)
  • Calibration samples: 512
  • Max sequence length during calibration: 2048
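To make the scheme concrete: NVFP4 stores weights as FP4 (E2M1) values with a per-16-element block scale (FP8 E4M3 in the real format). The pure-Python sketch below emulates the rounding for a single block; it is an illustration of the numerics, not llmcompressor's implementation, and it keeps the block scale as a plain float for clarity.

```python
# Representable magnitudes of FP4 E2M1, the element format used by NVFP4.
E2M1 = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK = 16  # NVFP4 micro-scaling block size

def quantize_block(block):
    """Quantize one block: choose a scale so the largest magnitude maps to
    6.0 (the max E2M1 value), then round every element to the nearest
    representable FP4 value, keeping its sign."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return [0.0] * len(block), 0.0
    scale = amax / 6.0
    q = []
    for v in block:
        mag = min(E2M1, key=lambda m: abs(abs(v) / scale - m))
        q.append(mag if v >= 0 else -mag)
    return q, scale

weights = [0.01 * i - 0.07 for i in range(BLOCK)]
q, s = quantize_block(weights)
dequant = [v * s for v in q]
```

Dequantization is just `q[i] * scale`; the largest-magnitude element in each block round-trips exactly, and the rounding error of the rest is bounded by half the widest E2M1 gap times the scale.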
Safetensors

  • Model size: 5B params
  • Tensor types: F32, BF16, F8_E4M3, U8

Model tree for llmat/Qwen3-8B-NVFP4

  • Base model: Qwen/Qwen3-8B-Base
  • Finetuned: Qwen/Qwen3-8B
  • Quantized: this model