High-quality, Apple-Silicon–optimized **MLX** builds, tools, and evals.
### gpt-oss-120b (MLX)

| Repo | Bits/GS | Footprint | Notes |
|---|---:|---:|---|
| [halley-ai/gpt-oss-120b-MLX-8bit-gs32](https://huggingface.co/halley-ai/gpt-oss-120b-MLX-8bit-gs32) | Q8 / 32 | ~63.42 GB | Reference int8; stable and simple to use. |
| [halley-ai/gpt-oss-120b-MLX-bf16](https://huggingface.co/halley-ai/gpt-oss-120b-MLX-bf16) | bf16 | ~65.28 GB | Non-quantized reference for evaluation/ground truth. |
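
Any of these builds loads directly with [mlx-lm](https://github.com/ml-explore/mlx-lm) on Apple Silicon. A minimal sketch, assuming `mlx-lm` is installed (`pip install mlx-lm`); the prompt and token budget are arbitrary:

```python
# Minimal mlx-lm usage sketch (Apple Silicon only). load() fetches the
# weights from the Hugging Face Hub on first use; repo id from the table above.
from mlx_lm import load, generate

model, tokenizer = load("halley-ai/gpt-oss-120b-MLX-8bit-gs32")

# For chat-tuned models you would normally wrap the prompt with
# tokenizer.apply_chat_template(...) first; skipped here for brevity.
print(generate(model, tokenizer,
               prompt="Explain group-wise quantization in one sentence.",
               max_tokens=128))
```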

### Qwen3-Next-80B-A3B-Instruct (MLX)

| Repo | Bits/GS | Footprint | Notes |
|---|---:|---:|---|
| [halley-ai/Qwen3-Next-80B-A3B-Instruct-MLX-6bit-gs64](https://huggingface.co/halley-ai/Qwen3-Next-80B-A3B-Instruct-MLX-6bit-gs64) | Q6 / 64 | ~64.92 GB | Quality pick; matched bf16 on our PPL run (5.14). |
| [halley-ai/Qwen3-Next-80B-A3B-Instruct-MLX-5bit-gs32](https://huggingface.co/halley-ai/Qwen3-Next-80B-A3B-Instruct-MLX-5bit-gs32) | Q5 / 32 | ~59.86 GB | Balanced; near-par PPL (5.20) and strong deterministic math. |
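
In the Bits/GS column, GS is the quantization group size: weights are quantized in groups of 32 or 64, each group carrying its own scale. A hedged sketch of how a build like this is typically produced with mlx-lm's converter; the flags below are illustrative, not the exact recipe behind these repos:

```python
# Illustrative sketch only: producing a 6-bit, group-size-64 MLX quant
# with mlx-lm's converter. Not the exact recipe used for the repos above.
from mlx_lm import convert

convert(
    "Qwen/Qwen3-Next-80B-A3B-Instruct",                 # upstream bf16 weights
    mlx_path="Qwen3-Next-80B-A3B-Instruct-MLX-6bit-gs64",
    quantize=True,
    q_bits=6,         # bits per weight
    q_group_size=64,  # weights per scale/zero-point group
)
```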

Perplexity is reported with our fast preset on WikiText-2 (raw, test); see the repository docs for exact commands.
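
The exact preset lives in the repository docs; the following is only a minimal sketch of this kind of measurement, with an illustrative window size and a hypothetical local copy of the test split (`wikitext2_test.txt`):

```python
# Rough perplexity sketch over a text file with an MLX model. The window
# size and file path are illustrative, not the repo's actual "fast preset".
import math
import mlx.core as mx
from mlx_lm import load

model, tokenizer = load("halley-ai/Qwen3-Next-80B-A3B-Instruct-MLX-6bit-gs64")

text = open("wikitext2_test.txt").read()  # hypothetical local WikiText-2 copy
ids = tokenizer.encode(text)

window = 2048  # illustrative context length
nll, count = 0.0, 0
for start in range(0, len(ids) - 1, window):
    chunk = mx.array([ids[start : start + window + 1]])
    inputs, targets = chunk[:, :-1], chunk[:, 1:]
    logits = model(inputs)  # [1, T, vocab]
    # log-probability of each target token under the model
    logprobs = mx.take_along_axis(
        logits - mx.logsumexp(logits, axis=-1, keepdims=True),
        targets[..., None], axis=-1,
    )
    nll -= logprobs.sum().item()
    count += targets.size

print(f"perplexity ~ {math.exp(nll / count):.2f}")
```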
**Format:** MLX (not GGUF). For Linux/Windows or non-MLX stacks, use a GGUF build with llama.cpp.
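
For the non-MLX route, a hedged sketch using the `llama-cpp-python` bindings; the GGUF repo id and file pattern below are placeholders, since the halley-ai repos above ship MLX weights only:

```python
# Placeholder GGUF route for Linux/Windows via llama.cpp's Python bindings
# (pip install llama-cpp-python). Repo id and filename are hypothetical.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="someone/gpt-oss-120b-GGUF",  # hypothetical GGUF repo, not ours
    filename="*Q8_0.gguf",                # glob matching one quant file
    n_ctx=4096,
)
print(llm("Explain MoE routing briefly.", max_tokens=128)["choices"][0]["text"])
```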