---
license: apache-2.0
tags:
- llm
- tinyllama
- function-calling
- cpu-optimized
- low-resource
---
# TinyLlama Function Calling (CPU Optimized)
This is a CPU-optimized version of TinyLlama that has been fine-tuned for function calling.
## Model Details
- **Base Model**: TinyLlama-1.1B-Chat-v1.0
- **Parameters**: 1.1 billion
- **Fine-tuning Method**: LoRA (Low-Rank Adaptation)
- **Training Data**: Function calling examples from Glaive Function Calling v2 dataset
- **Optimization**: Merged LoRA weights, converted to float32 for CPU deployment
## Key Features
1. **Function Calling Capabilities**: The model can identify when functions should be called and generate appropriate function call syntax
2. **CPU Optimized**: Ready to run efficiently on low-end hardware without GPUs
3. **Lightweight**: Only 1.1B parameters, making it suitable for older hardware
4. **Low Resource Requirements**: Requires only 4-6 GB RAM for loading
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
# Load the model
model = AutoModelForCausalLM.from_pretrained("tinyllama-function-calling-cpu-optimized")
tokenizer = AutoTokenizer.from_pretrained("tinyllama-function-calling-cpu-optimized")
# Example prompt for function calling
prompt = """### Instruction:
Given the available functions and the user query, determine which function(s) to call and with what arguments.
Available functions:
{
  "name": "get_exchange_rate",
  "description": "Get the exchange rate between two currencies",
  "parameters": {
    "type": "object",
    "properties": {
      "base_currency": {
        "type": "string",
        "description": "The currency to convert from"
      },
      "target_currency": {
        "type": "string",
        "description": "The currency to convert to"
      }
    },
    "required": [
      "base_currency",
      "target_currency"
    ]
  }
}
User query: What is the exchange rate from USD to EUR?
### Response:"""
# Tokenize and generate response
inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=150,
        do_sample=True,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
    )
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```
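Once the model has generated text, the function call still has to be extracted from the raw completion. A minimal sketch, assuming the model emits a JSON function call after the `### Response:` marker (the exact output format depends on the fine-tuning data, so the raw string below is a hypothetical example):

```python
import json

# Hypothetical raw completion; the actual output format depends on
# how the fine-tuning data formatted function calls.
raw = """### Instruction:
...prompt text...
### Response:
{"name": "get_exchange_rate", "arguments": {"base_currency": "USD", "target_currency": "EUR"}}"""

# Keep only the text after the "### Response:" marker.
response_text = raw.split("### Response:")[-1].strip()

# Parse the function call into a dict.
call = json.loads(response_text)
print(call["name"], call["arguments"])
```

In practice you would also want to validate the parsed arguments against the declared function schema before dispatching the call.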
## Performance on Low-End Hardware
The CPU-optimized model requires approximately:
- 4-6 GB RAM for loading
- 2-4 CPU cores for inference
- No GPU required
This makes it suitable for:
- Older laptops (2018 and newer)
- Low-end desktops
- Edge devices with ARM processors
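The 4-6 GB figure is consistent with back-of-envelope arithmetic: 1.1 billion parameters at 4 bytes each (float32) is roughly 4.4 GB for the weights alone, before tokenizer, KV-cache, and activation overhead:

```python
# Back-of-envelope RAM estimate for float32 weights.
num_params = 1.1e9       # 1.1 billion parameters
bytes_per_param = 4      # float32 = 4 bytes
weights_gb = num_params * bytes_per_param / 1e9
print(f"{weights_gb:.1f} GB")  # ~4.4 GB for the weights alone
```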
## Training Process
The model was fine-tuned using LoRA (Low-Rank Adaptation) on the Glaive Function Calling v2 dataset. Only a subset of 50 examples was used for demonstration purposes.
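LoRA keeps the base weights frozen and trains two small low-rank matrices beside each adapted weight, which is why fine-tuning a 1.1B model is feasible on modest hardware. For a d x k weight matrix with rank r, the trainable parameter count is r * (d + k). A quick illustration (the rank and dimensions below are hypothetical; the actual LoRA configuration for this model is not stated):

```python
# LoRA adds A (d x r) and B (r x k) beside a frozen d x k weight matrix,
# so trainable parameters per adapted matrix = r * (d + k).
def lora_trainable_params(d: int, k: int, r: int) -> int:
    return r * (d + k)

# Hypothetical example: a 2048 x 2048 attention projection at rank 8.
d = k = 2048
r = 8
full = d * k                              # parameters in the frozen matrix
lora = lora_trainable_params(d, k, r)     # parameters LoRA actually trains
print(lora, f"({100 * lora / full:.2f}% of the full matrix)")
# 32768 (0.78% of the full matrix)
```

After training, the low-rank matrices can be merged back into the base weights (as was done here) so that inference incurs no extra cost.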
## License
This model is licensed under the Apache 2.0 license.