# ik_llama.cpp quantizations of DeepSeek-V3-0324
Quantized using ik_llama.cpp build 3788 (4622fadc).

NOTE: These quants MUST be run with the llama.cpp fork [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp); the IQ*_K and row-interleaved `_R4`/`_R8` quant types are not supported by mainline llama.cpp.

Credits to @ubergarm, whose DeepSeek quant recipes these quants are based on.
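As a minimal sketch of how to build and serve these files: the paths, context size, and thread count below are illustrative placeholders, and the `-mla`/`-fa`/`-fmoe`/`-ot` flags follow ubergarm's published DeepSeek examples rather than settings tested with these particular files. Check `llama-server --help` in your build before relying on them.

```bash
# Build ik_llama.cpp (drop -DGGML_CUDA=ON for a CPU-only build).
git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j

# Serve one of the quants. -mla 3 selects an MLA attention mode, -fa enables
# flash attention, -fmoe fuses MoE ops, and -ot "exps=CPU" keeps the routed
# experts in system RAM while -ngl 99 offloads the remaining layers to GPU.
./build/bin/llama-server \
    -m /path/to/DeepSeek-V3-0324-IQ4_KT.gguf \
    -c 32768 -t 32 \
    -fa -mla 3 -fmoe \
    -ngl 99 -ot "exps=CPU" \
    --host 127.0.0.1 --port 8080
```

On a CPU-only machine you can drop `-ngl` and `-ot`; the `_R4`/`_R8` files are already stored in the row-interleaved layout that ik_llama.cpp's CPU kernels expect.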
| name | file size | quant type | bpw |
|---|---|---|---|
| DeepSeek-V3-0324-IQ4_KT | 322.355 GiB | IQ4_KT (97.5%) / Q8_0 (2.5%) | 4.127 |
| DeepSeek-V3-0324-IQ4_XS_R8 | 340.764 GiB | IQ4_XS_R8 (97.5%) / Q8_0 (2.5%) | 4.362 |
| DeepSeek-V3-0324-D-IQ4_KS_R4 | 366.762 GiB | IQ4_KS_R4 (65%) / IQ5_KS_R4 (32.5%) / Q8_0 (2.5%) | 4.695 |
| DeepSeek-V3-0324-D-Q4_K_R4 | 412.131 GiB | Q4_K_R4 (65%) / Q6_K_R4 (32.5%) / Q8_0 (2.5%) | 5.276 |
| DeepSeek-V3-0324-Q8_0_R8 | 664.295 GiB | Q8_0_R8 (100%) | 8.504 |
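The bpw (bits per weight) column follows directly from the file sizes: total size in bits divided by DeepSeek-V3's roughly 671B parameters (parameter count from the upstream DeepSeek-V3 card). As a sanity check on the IQ4_KT row: 322.355 GiB × 2³⁰ bytes/GiB × 8 bits/byte ÷ 671×10⁹ ≈ 4.127 bits per weight, matching the table.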
Base model: [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324)