[pinned] GGUF uploaded now + chat template fixes! · #2 opened 3 months ago by shimmyshimmer
Why is Q8 only 13 GiB? · #36 opened 4 days ago by zvwgvx
Why does GGUF conversion add a third linear weight to the MoE FFN? · #35 opened 8 days ago by eturok-weizmann
Are the F16 weights upcast from MXFP4? Why is there no `gpt-oss-20b-MXFP4.gguf`? · #34 opened 10 days ago by rtzurtz
TEMPLATE for Ollama Modelfile? · #29 opened about 2 months ago by Ray9821
Internvl3_5-gptoss-20b issue · #28 opened 2 months ago by wsbagnsv1
[solved] Setting up high reasoning mode · #27 opened 2 months ago by Maria99934
Problems with the FP32 model · #25 opened 2 months ago by YardWeasel
Feature request: Disable reasoning · #22 opened 3 months ago by SomAnon
Speed differences between quants · #21 opened 3 months ago by leonardlin
New chat template fixes as of Aug 8, 2025 · #19 opened 3 months ago by shimmyshimmer
Ollama load error · #17 opened 3 months ago by kwangtek
Failed to use with vLLM · #16 opened 3 months ago by chengorange1
Failed to read tensor info · #15 opened 3 months ago by valid-name1
Error installing model · #13 opened 3 months ago by nototon
Absurd sizes · #12 opened 3 months ago by ZeroWw
Error with llama-cpp-python · #11 opened 3 months ago by divyanshu-k
Is the BF16 GGUF any different from the F16 one (speed/accuracy)? · #10 opened 3 months ago by CHNtentes
Tool calling broken · #5 opened 3 months ago by AekDevDev
Wow, amazing response time · #1 opened 3 months ago by AlexPradas