[pinned] GGUF uploaded now + chat template fixes! · #2 opened 3 months ago by shimmyshimmer
Why is Q8 only 13 GiB? · #36 opened 4 days ago by zvwgvx
Why does GGUF conversion add a third linear weight to the MoE FFN? · #35 opened 8 days ago by eturok-weizmann
Are the F16 weights upcast from MXFP4? Why is there no `gpt-oss-20b-MXFP4.gguf`? · #34 opened 10 days ago by rtzurtz
TEMPLATE for Ollama Modelfile? · #29 opened about 2 months ago by Ray9821
Internvl3_5-gptoss-20b issue · #28 opened 2 months ago by wsbagnsv1
[solved] Setting up high reasoning mode · #27 opened 2 months ago by Maria99934
Problems with the FP32 model · #25 opened 2 months ago by YardWeasel
Feature request: Disable reasoning · #22 opened 3 months ago by SomAnon
Speed differences between quants · #21 opened 3 months ago by leonardlin
New chat template fixes as of Aug 8, 2025 · #19 opened 3 months ago by shimmyshimmer
Ollama load error · #17 opened 3 months ago by kwangtek
Failed to use with vLLM · #16 opened 3 months ago by chengorange1
Failed to read tensor info · #15 opened 3 months ago by valid-name1
Error installing model · #13 opened 3 months ago by nototon
Absurd sizes · #12 opened 3 months ago by ZeroWw
Error with llama-cpp-python · #11 opened 3 months ago by divyanshu-k
Is the BF16 GGUF any different from the F16 one (speed/accuracy)? · #10 opened 3 months ago by CHNtentes
Tool calling broken · #5 opened 3 months ago by AekDevDev
Wow, amazing response time · #1 opened 3 months ago by AlexPradas