GGUF quantized and bug fixed version of phi4
review
- bug fixed for: "ResponseError: llama runner process has terminated: GGML_ASSERT(hparams.n_swa > 0) failed"
 - define the architecture (from none) to llama; all works right away
 
run the model
use any gguf connector to interact with gguf file(s), i.e., connector
reference
- base model: microsoft/phi-4
 - bug fixed following the guide written by unsloth
 - tool used for quantization: cutter
 
citation
appendices: model summary and quality (written by microsoft)
model summary
| Developers | Microsoft Research | 
| Description | phi-4 is a state-of-the-art open model built upon a blend of synthetic datasets, data from filtered public domain websites, and acquired academic books and Q&A datasets. The goal of this approach was to ensure that small capable models were trained with data focused on high quality and advanced reasoning.phi-4 underwent a rigorous enhancement and alignment process, incorporating both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures | 
| Architecture | 14B parameters, dense decoder-only Transformer model | 
| Inputs | Text, best suited for prompts in the chat format | 
| Context length | 16K tokens | 
| GPUs | 1920 H100-80G | 
| Training time | 21 days | 
| Training data | 9.8T tokens | 
| Outputs | Generated text in response to input | 
| Dates | October 2024 – November 2024 | 
| Status | Static model trained on an offline dataset with cutoff dates of June 2024 and earlier for publicly available data | 
| Release date | December 12, 2024 | 
| License | MIT | 
model quality
to understand the capabilities, we (here refer to microsoft side) compare phi-4 with a set of models over OpenAI’s SimpleEval benchmark; at the high-level overview of the model quality on representative benchmarks; for the table below, higher numbers indicate better performance: 
| Category | Benchmark | phi-4 (14B) | phi-3 (14B) | Qwen 2.5 (14B instruct) | GPT-4o-mini | Llama-3.3 (70B instruct) | Qwen 2.5 (72B instruct) | GPT-4o | 
|---|---|---|---|---|---|---|---|---|
| Popular Aggregated Benchmark | MMLU | 84.8 | 77.9 | 79.9 | 81.8 | 86.3 | 85.3 | 88.1 | 
| Science | GPQA | 56.1 | 31.2 | 42.9 | 40.9 | 49.1 | 49.0 | 50.6 | 
| Math | MGSM MATH  | 
80.6 80.4  | 
53.5 44.6  | 
79.6 75.6  | 
86.5 73.0  | 
89.1 66.3*  | 
87.3 80.0  | 
90.4 74.6  | 
| Code Generation | HumanEval | 82.6 | 67.8 | 72.1 | 86.2 | 78.9* | 80.4 | 90.6 | 
| Factual Knowledge | SimpleQA | 3.0 | 7.6 | 5.4 | 9.9 | 20.9 | 10.2 | 39.4 | 
| Reasoning | DROP | 75.5 | 68.3 | 85.5 | 79.3 | 90.2 | 76.7 | 80.9 | 
* these scores are lower than those reported by Meta, perhaps because simple-evals has a strict formatting requirement that Llama models have particular trouble following.
- Downloads last month
 - 274
 
							Hardware compatibility
						Log In
								
								to view the estimation
2-bit
3-bit
4-bit
5-bit
6-bit
8-bit
16-bit
Model tree for calcuis/phi4
Base model
microsoft/phi-4-gguf