Are the F16 weights upcasted MXFP4? -- Why no `gpt-oss-20b-MXFP4.gguf`?
Follow up question to https://huggingface.co/unsloth/gpt-oss-20b-GGUF/discussions/14 and https://huggingface.co/unsloth/gpt-oss-20b-GGUF/discussions/7:
Are the F16 weights maybe just upcasted MXFP4 ones? And if not, why is bartowski recommending gpt-oss-20b-MXFP4.gguf (12.1 GB):
> Use this one:
> gpt-oss-20b-MXFP4.gguf
>
> The reason is, the FFN (feed forward networks) of gpt-oss do not behave nicely when quantized to anything other than MXFP4, so they are kept at that level for everything.
And why is everyone in https://github.com/ggml-org/llama.cpp/discussions/15396 also testing only gpt-oss-20b-MXFP4.gguf, and why, as just another example, does lmstudio-community only offer gpt-oss-20b-MXFP4.gguf?
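One way to check this yourself (my own sketch, not something from the linked discussions) is to list the per-tensor quantization types stored in each file with gguf-py, the Python package that ships with llama.cpp (`pip install gguf`; a recent enough version is needed for it to know about MXFP4). The filenames below are hypothetical local paths to the downloaded GGUFs:

```python
# Sketch: compare per-tensor quantization types between two GGUF files.
# Requires: pip install gguf  (gguf-py, the Python package shipped with llama.cpp)
from collections import Counter

from gguf import GGUFReader

# Hypothetical local filenames; point these at the files you downloaded.
PATHS = ["gpt-oss-20b-F16.gguf", "gpt-oss-20b-MXFP4.gguf"]

for path in PATHS:
    reader = GGUFReader(path)
    # Overall count of tensors per quantization type (F16, BF16, MXFP4, ...).
    overall = Counter(t.tensor_type.name for t in reader.tensors)
    # Types used by the FFN / expert tensors specifically (names contain "ffn").
    ffn_types = {t.tensor_type.name for t in reader.tensors if "ffn" in t.name}
    print(path)
    for type_name, count in overall.most_common():
        print(f"  {type_name}: {count} tensors")
    print(f"  FFN tensor types: {sorted(ffn_types)}")
```

If the F16 file reports its FFN tensors as F16/BF16, they are stored at higher precision rather than kept in MXFP4; if they come back as MXFP4, the F16 file is essentially the MXFP4 FFN plus higher-precision non-FFN tensors.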
Yes, I think using only the MXFP4.gguf is the way to go with gpt-oss. Unsloth's GGUFs aren't really applicable to this model AFAIK.
I think they made all their GGUFs for completeness' sake anyway. And perhaps the quantizations below Q4 also have value for people without enough VRAM. But if you can run Q4, it only makes sense to use the standard *-MXFP4.gguf.