ikaganacar committed (verified)
Commit 922f73b · 1 Parent(s): d2800aa

Upload folder using huggingface_hub

.gitattributes CHANGED
@@ -33,3 +33,9 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ kubernetes-ai-IQ3_M.gguf filter=lfs diff=lfs merge=lfs -text
+ kubernetes-ai-Q3_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ kubernetes-ai-Q4_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ kubernetes-ai-Q4_K_S.gguf filter=lfs diff=lfs merge=lfs -text
+ kubernetes-ai-Q5_K_M.gguf filter=lfs diff=lfs merge=lfs -text
+ kubernetes-ai.gguf filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -1,3 +1,171 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ language:
+ - tr
+ - en
+ library_name: gguf
+ tags:
+ - kubernetes
+ - devops
+ - quantized
+ - gguf
+ - gemma3
+ - llama-cpp
+ - ollama
+ base_model: aciklab/kubernetes-ai
+ model_type: gemma3
+ quantized_by: aciklab
+ ---
+
+ # Kubernetes AI - GGUF Quantized Models
+
+ Fine-tuned Gemma 3 12B model specialized for answering Kubernetes questions in Turkish, quantized to GGUF format for efficient local inference.
+
+ ## Model Description
+
+ This repository contains GGUF-quantized versions of the Kubernetes AI model, optimized for consumer hardware with no GPU required. The model was produced by fine-tuning LoRA adapters on unsloth/gemma-3-12b-it-qat-bnb-4bit, merging them into the base model, and converting the result to GGUF for llama.cpp compatibility.
+
+ **Primary Purpose:** Answer Kubernetes-related questions in Turkish on local machines.
+
+ ## Available Models
+
+ | Model | Size | Download |
+ |-------|------|----------|
+ | **Unquantized** | 22.0 GB | [kubernetes-ai.gguf](https://huggingface.co/aciklab/kubernetes-ai-GGUF/resolve/main/kubernetes-ai.gguf) |
+ | **Q5_K_M** | 8.4 GB | [kubernetes-ai-Q5_K_M.gguf](https://huggingface.co/aciklab/kubernetes-ai-GGUF/resolve/main/kubernetes-ai-Q5_K_M.gguf) |
+ | **Q4_K_M** | 7.3 GB | [kubernetes-ai-Q4_K_M.gguf](https://huggingface.co/aciklab/kubernetes-ai-GGUF/resolve/main/kubernetes-ai-Q4_K_M.gguf) |
+ | **Q4_K_S** | 6.9 GB | [kubernetes-ai-Q4_K_S.gguf](https://huggingface.co/aciklab/kubernetes-ai-GGUF/resolve/main/kubernetes-ai-Q4_K_S.gguf) |
+ | **Q3_K_M** | 6.0 GB | [kubernetes-ai-Q3_K_M.gguf](https://huggingface.co/aciklab/kubernetes-ai-GGUF/resolve/main/kubernetes-ai-Q3_K_M.gguf) |
+ | **IQ3_M** | 5.6 GB | [kubernetes-ai-IQ3_M.gguf](https://huggingface.co/aciklab/kubernetes-ai-GGUF/resolve/main/kubernetes-ai-IQ3_M.gguf) |
+
+ **Recommended:** Q4_K_M for the best balance of quality and size, or IQ3_M for low-end systems.
+
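+ As an alternative to downloading by URL, a single file can be fetched with the Hugging Face CLI; a minimal sketch, assuming a recent `huggingface_hub` release that ships the `huggingface-cli download` command:
+
+ ```bash
+ # Install the CLI (bundled with huggingface_hub)
+ pip install -U "huggingface_hub[cli]"
+
+ # Fetch one quantization from this repo into the current directory
+ huggingface-cli download aciklab/kubernetes-ai-GGUF kubernetes-ai-Q4_K_M.gguf --local-dir .
+ ```
+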
+ ## Quick Start
+
+ ### Using Ollama (Recommended)
+
+ Ollama provides the easiest way to run GGUF models locally.
+
+ #### 1. Install Ollama
+
+ ```bash
+ # Linux
+ curl -fsSL https://ollama.com/install.sh | sh
+
+ # macOS
+ brew install ollama
+
+ # Windows - download from https://ollama.com/download
+ ```
+
+ #### 2. Download Model
+
+ ```bash
+ # Download your preferred quantization
+ wget https://huggingface.co/aciklab/kubernetes-ai-GGUF/resolve/main/kubernetes-ai-Q4_K_M.gguf
+ ```
+
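+ Optionally, verify the file against the SHA-256 recorded in this repository's Git LFS pointer (the hash below is the committed `oid` for kubernetes-ai-Q4_K_M.gguf):
+
+ ```bash
+ # Note the two spaces between hash and filename, as sha256sum -c expects
+ echo "6e98e81e0ca18a6ab88cd29ed4b1c5fc6bea3087d62d5c47e467740b70f1d646  kubernetes-ai-Q4_K_M.gguf" | sha256sum -c
+ ```
+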
+ #### 3. Create Modelfile
+
+ ```bash
+ cat > Modelfile << 'EOF'
+ FROM <path-to-model>/kubernetes-ai-Q4_K_M.gguf
+
+ TEMPLATE """{{ if .System }}<start_of_turn>system
+ {{ .System }}<end_of_turn>
+ {{ end }}{{ if .Prompt }}<start_of_turn>user
+ {{ .Prompt }}<end_of_turn>
+ {{ end }}<start_of_turn>model
+ {{ .Response }}<end_of_turn>
+ """
+
+ # Model parameters
+ PARAMETER temperature 1.0
+ PARAMETER top_p 0.95
+ PARAMETER top_k 64
+ PARAMETER repeat_penalty 1.05
+ PARAMETER stop "<start_of_turn>"
+ PARAMETER stop "<end_of_turn>"
+
+ # System prompt (Turkish): "You are an AI assistant specialized in Kubernetes.
+ # You answer Kubernetes-related questions in Turkish."
+ SYSTEM """Sen Kubernetes konusunda uzmanlaşmış bir yapay zeka asistanısın. Kubernetes ile ilgili soruları Türkçe olarak yanıtlıyorsun."""
+ EOF
+ ```
+
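+ Once the model has been created in step 4, the stored template and parameters can be double-checked; `ollama show --modelfile` prints the Modelfile as Ollama recorded it:
+
+ ```bash
+ # Run after step 4's `ollama create`
+ ollama show kubernetes-ai --modelfile
+ ```
+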
+ #### 4. Create and Run Model
+
+ ```bash
+ # Create the model
+ ollama create kubernetes-ai -f Modelfile
+
+ # Run an interactive chat
+ ollama run kubernetes-ai
+
+ # Example query (Turkish: "How do I create a deployment with 3 replicas in Kubernetes?")
+ ollama run kubernetes-ai "Kubernetes'te 3 replikaya sahip bir deployment nasıl oluştururum?"
+ ```
+
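+ ### Using llama.cpp
+
+ Because these are plain GGUF files, they also run directly under llama.cpp without Ollama. A minimal sketch, assuming a recent llama.cpp build where the CLI binary is named `llama-cli`; the sampling flags mirror the Modelfile parameters above:
+
+ ```bash
+ # One-shot prompt on CPU; with a GPU build, add -ngl <N> to offload N layers
+ llama-cli -m kubernetes-ai-Q4_K_M.gguf \
+   --temp 1.0 --top-p 0.95 --top-k 64 --repeat-penalty 1.05 \
+   -p "Kubernetes'te 3 replikaya sahip bir deployment nasıl oluştururum?"
+ ```
+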
+ ## Training Details
+
+ This model is based on the [aciklab/kubernetes-ai](https://huggingface.co/aciklab/kubernetes-ai) LoRA adapters:
+
+ - **Base Model:** unsloth/gemma-3-12b-it-qat-bnb-4bit
+ - **Training Method:** LoRA (Low-Rank Adaptation)
+ - **LoRA Rank:** 8
+ - **Target Modules:** q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+ - **Training Dataset:** ~157,210 examples from Kubernetes docs, Stack Overflow, and DevOps datasets
+ - **Training Time:** 28 hours on an NVIDIA RTX 5070 12GB
+ - **Max Sequence Length:** 1024 tokens
+
+ ### Training Dataset Summary
+
+ | Dataset Category | Count | Description |
+ |-----------------|-------|-------------|
+ | **Kubernetes Official Docs** | 8,910 | Concepts, kubectl, setup, tasks, tutorials |
+ | **Stack Overflow** | 52,000 | Community Kubernetes Q&A |
+ | **DevOps Datasets** | 62,500 | General DevOps and Kubernetes content |
+ | **Configurations & CLI** | 36,800 | Kubernetes configs, kubectl examples, operators |
+ | **Total** | **~157,210** | Comprehensive Kubernetes knowledge base |
+
+ ## Quantization Details
+
+ All models were quantized using llama.cpp with importance-matrix optimization (a representative command sequence is sketched below):
+
+ - **Source:** LoRA adapters merged into the base model
+ - **Quantization Tool:** llama.cpp (latest at the time of conversion)
+ - **Method:** K-quant and IQ-quant mixtures
+ - **Optimization:** Importance matrix for better quality at low bit widths
+
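+ For reference, the sketch below shows roughly how such files are produced. It is illustrative, not the authors' exact pipeline: it assumes a current llama.cpp build (binary names have changed across releases) and uses a placeholder calibration file `calibration.txt`:
+
+ ```bash
+ # Build an importance matrix from calibration text (placeholder file name)
+ llama-imatrix -m kubernetes-ai.gguf -f calibration.txt -o imatrix.dat
+
+ # Quantize the full-precision GGUF to Q4_K_M guided by the importance matrix
+ llama-quantize --imatrix imatrix.dat kubernetes-ai.gguf kubernetes-ai-Q4_K_M.gguf Q4_K_M
+ ```
+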
+ ### Quantization Quality
+
+ - **Q5_K_M:** Highest-quality quantization in this repository, at the cost of extra size
+ - **Q4_K_M:** Best balance - recommended for most users
+ - **Q4_K_S:** Slightly smaller with minimal quality loss
+ - **Q3_K_M:** Good for memory-constrained systems
+ - **IQ3_M:** Advanced 3-bit quantization for laptops
+ - **Unquantized:** Original F16/F32 precision
+
+ ## Hardware Requirements
+
+ ### Minimum
+ - **CPU:** 4+ cores
+ - **RAM:** 8GB (for the IQ3_M/Q3_K_M quantizations)
+ - **Storage:** 6-8GB free space
+ - **GPU:** Not required (CPU inference)
+
+ ### Recommended
+ - **CPU:** 8+ cores
+ - **RAM:** 16GB (for the Q4_K_M/Q4_K_S quantizations)
+ - **Storage:** 10GB free space
+ - **GPU:** Optional; can accelerate inference (see the sketch below)
+
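+ If a GPU is available, the usual speedup is layer offload. With Ollama this is set per model through the `num_gpu` parameter; a hedged sketch, reusing the Quick Start Modelfile (32 layers is an arbitrary starting point to tune against available VRAM):
+
+ ```bash
+ # Offload up to 32 layers to the GPU, then rebuild the model
+ echo "PARAMETER num_gpu 32" >> Modelfile
+ ollama create kubernetes-ai -f Modelfile
+ ```
+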
+ ## License
+
+ This model is released under the **MIT License** and is free to use in commercial and open-source projects.
+
+ ## Contact
+
+ **Produced by:** HAVELSAN/Açıklab
+
+ For questions or feedback, please open an issue on the model repository.
+
+ ---
+
+ **Note:** These GGUF files are ready for immediate use; no additional model loading or merging is required.
kubernetes-ai-IQ3_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a45b303ce2df1759e4a14a19e36e5b19aa7e74d65993aca4552d76bed4ff73f5
+ size 5655722848
kubernetes-ai-Q3_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d7434df3989ddcf73df53f67bac6f2f36f852766f88a1aaa0ea903a02917914
+ size 6008818528
kubernetes-ai-Q4_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6e98e81e0ca18a6ab88cd29ed4b1c5fc6bea3087d62d5c47e467740b70f1d646
+ size 7300778848
kubernetes-ai-Q4_K_S.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7019613469aed3e16ee1c1038b083346cc9a2deff18e6c7c12d7c9cf3b2e2559
+ size 6935333728
kubernetes-ai-Q5_K_M.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3132ce98cf9e7d2588e842ea9e5921a6d519b5c811714f14565f5142ca237af5
+ size 8445037408
kubernetes-ai.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:06ba014380d640c3b23dfc7d1fffdef82538d577c408f5d85e1c3af4f3191d5f
+ size 22352350048