prithivMLmods committed
Commit acd4926 · verified · 1 Parent(s): a9a6cdf

Update README.md

Files changed (1): README.md +87 -1

README.md CHANGED
tags:
  - moe
  - code
  - text-generation-inference
---

# Tureis-Qwen3\_QWQ-4B-Exp

> **Tureis-Qwen3\_QWQ-4B-Exp** is a fine-tuned variant of the **Qwen3-4B** architecture, trained specifically on **QWQ synthetic datasets** to strengthen **precise mathematical and logical reasoning**. This experimental model achieves high accuracy on structured reasoning tasks while remaining lightweight, making it well suited to technical, educational, and symbolic-computation applications.

## Key Features

1. **Precision Reasoning with the QWQ Dataset**
   Tailored for high-fidelity symbolic reasoning, step-by-step math problem solving, and logic tasks, thanks to specialized fine-tuning on QWQ synthetic data.

2. **Lightweight Code Understanding**
   Interprets, generates, and corrects code in Python, C++, and other languages, optimized for concise, logic-based tasks.

3. **Structured Output Formatting**
   Produces well-organized responses in Markdown, JSON, LaTeX, and tabular formats suitable for notebooks, documentation, and data-centric workflows (a JSON-output sketch follows the quickstart below).

4. **Instruction-Following Accuracy**
   Tuned to follow multi-step user instructions consistently across tasks and sessions, improving reliability in educational and factual domains.

5. **Multilingual Capabilities**
   Supports reasoning and generation in more than 20 languages for global accessibility and technical translation use cases.

6. **Efficient 4B Architecture**
   Based on Qwen3-4B, offering a strong tradeoff between performance and compute requirements, suitable for mid-tier GPUs or scaled inference (a quantized-loading sketch also follows the quickstart).

## Quickstart with Transformers

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "prithivMLmods/Tureis-Qwen3_QWQ-4B-Exp"

# Load with automatic dtype selection and device placement
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "If 5(x - 2) = 3x + 4, solve for x step-by-step."

messages = [
    {"role": "system", "content": "You are a precise reasoning assistant trained on QWQ datasets."},
    {"role": "user", "content": prompt}
]

# Render the chat history into a single prompt string
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Drop the prompt tokens so only the newly generated text remains
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```
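For the quickstart prompt above, the expected result is x = 7 (5x - 10 = 3x + 4, so 2x = 14 and x = 7), which makes a handy sanity check.

To exercise the structured-output claim in feature 3, the hedged sketch below reuses the `model` and `tokenizer` from the quickstart, asks for strict JSON, and parses the reply. The system prompt and JSON schema here are illustrative assumptions, not part of the original card.

```python
import json

# Ask for machine-readable output (prompt wording is an assumption)
messages = [
    {"role": "system", "content": "Answer only with valid JSON."},
    {"role": "user", "content": 'Solve 5(x - 2) = 3x + 4 and reply as {"steps": [...], "answer": <number>}.'},
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

output_ids = model.generate(**inputs, max_new_tokens=256)[0]
reply = tokenizer.decode(output_ids[inputs.input_ids.shape[1]:], skip_special_tokens=True)

# The model is not guaranteed to emit strict JSON, so parse defensively
try:
    parsed = json.loads(reply)  # expect something like {"steps": [...], "answer": 7}
except json.JSONDecodeError:
    parsed = None
print(parsed if parsed is not None else reply)
```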
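Feature 6 highlights mid-tier GPU deployment. The following is a minimal sketch of 4-bit quantized loading, assuming the optional `bitsandbytes` package is installed; quantized inference for this checkpoint is not covered by the original card.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "prithivMLmods/Tureis-Qwen3_QWQ-4B-Exp"

# 4-bit NF4 weights with bfloat16 compute to cut memory on smaller GPUs
quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=quant_config,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

Generation then works exactly as in the quickstart above.
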
## Intended Use

* Step-by-step math and logic problem solving
* Code snippet generation and explanation
* Technical and structured documentation
* JSON/Markdown/tabular output generation
* Education tools and auto-tutoring in STEM
* Multilingual reasoning and Q\&A systems

## Limitations

* Limited creativity for fiction or open-domain chat
* Smaller context window than larger models
* Sensitive to prompt formatting in complex queries
* May still produce errors on adversarial reasoning prompts

## References

1. [Qwen2.5 Technical Report](https://arxiv.org/pdf/2412.15115)
2. [YaRN: Efficient Context Window Extension of Large Language Models](https://arxiv.org/pdf/2309.00071)