High-quality FP4 models trained with quantization-aware training (QAT), for use with the fp_quant vLLM/Transformers integration on NVIDIA Blackwell GPUs. See https://arxiv.org/abs/2509.23202
Papers
Bridging the Gap Between Promise and Performance for Microscaling FP4 Quantization
The Geometry of LLM Quantization: GPTQ as Babai's Nearest Plane Algorithm
Models (144)
ISTA-DASLab/Llama-3.2-1B-Instruct-W4A4-mxfp4-rtn-identity-transform-sft-fp_quant • 55
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-gptq-hadamard-transform-fake_quant
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-gptq-identity-transform • 17B
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-rtn-hadamard-transform • 17B
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-rtn-identity-transform • 17B
ISTA-DASLab/Qwen3-30B-A3B-Instruct-2507-W4A4-mxfp4-gptq-hadamard-transform • 17B
ISTA-DASLab/Llama-3.2-1B-Instruct-W4A4-nvfp4-gptq-identity-transform-sft-fp_quant • 14
ISTA-DASLab/Llama-3.2-1B-Instruct-W4A4-nvfp4-gptq-hadamard-transform-sft-fp_quant • 24
ISTA-DASLab/Llama-3.2-1B-Instruct-W4A4-mxfp4-rtn-identity-transform • 0.8B • 20
ISTA-DASLab/NVIDIA-Nemotron-Nano-9B-v2-W4A4-nvfp4-gptq-identity-transform-actorder • 7B • 15
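The repository names listed here appear to follow a regular pattern: base model, W4A4 (weight and activation bit widths), FP4 format (mxfp4 or nvfp4), quantization method (rtn or gptq), transform (identity or hadamard), and optional trailing tags such as sft, fp_quant, fake_quant, or actorder. The sketch below splits a name into those fields. The pattern is inferred only from the names shown on this page and is not a documented convention; `QuantModelName` and `parse_model_name` are hypothetical helper names introduced for illustration.

```python
import re
from typing import NamedTuple, Optional, Tuple


class QuantModelName(NamedTuple):
    """Fields inferred from a repository name (hypothetical helper type)."""
    org: str
    base_model: str
    weight_bits: int
    act_bits: int
    fp4_format: str            # "mxfp4" or "nvfp4"
    method: str                # "rtn" or "gptq"
    transform: str             # "identity" or "hadamard"
    suffixes: Tuple[str, ...]  # trailing tags, e.g. ("sft", "fp_quant")


# Assumption: every name looks like
#   <org>/<base>-W<w>A<a>-<format>-<method>-<transform>-transform[-tag...]
# This is inferred from the listing above, not from any official spec.
_PATTERN = re.compile(
    r"^(?P<org>[^/]+)/(?P<base>.+)-W(?P<w>\d+)A(?P<a>\d+)"
    r"-(?P<fmt>mxfp4|nvfp4)-(?P<method>rtn|gptq)"
    r"-(?P<transform>identity|hadamard)-transform(?P<rest>(?:-.+)?)$"
)


def parse_model_name(repo_id: str) -> Optional[QuantModelName]:
    """Return the parsed fields, or None if the name does not fit the pattern."""
    m = _PATTERN.match(repo_id)
    if m is None:
        return None
    rest = m.group("rest")
    # Trailing tags are hyphen-separated; tags themselves use underscores.
    suffixes = tuple(rest[1:].split("-")) if rest else ()
    return QuantModelName(
        org=m.group("org"),
        base_model=m.group("base"),
        weight_bits=int(m.group("w")),
        act_bits=int(m.group("a")),
        fp4_format=m.group("fmt"),
        method=m.group("method"),
        transform=m.group("transform"),
        suffixes=suffixes,
    )
```

A helper like this can be used to filter the listing, for example to keep only nvfp4 checkpoints or only those tagged fp_quant. Names that do not fit the assumed pattern return None rather than raising, so other repositories in the organization pass through harmlessly.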