Successmove / tinyllama-function-calling-cpu-optimized


	---
	license: apache-2.0
	tags:
	- llm
	- tinyllama
	- function-calling
	- cpu-optimized
	- low-resource
	---

	# TinyLlama Function Calling (CPU Optimized)

	This is TinyLlama-1.1B-Chat-v1.0 fine-tuned for function calling and prepared for CPU inference: the LoRA weights are merged into the base model and stored in float32.

	## Model Details

	- Base Model: TinyLlama-1.1B-Chat-v1.0
	- Parameters: 1.1 billion
	- Fine-tuning Method: LoRA (Low-Rank Adaptation)
	- Training Data: Function calling examples from Glaive Function Calling v2 dataset
	- Optimization: Merged LoRA weights, converted to float32 for CPU deployment
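
Merging LoRA weights folds the low-rank update into each adapted weight matrix, so inference needs no separate adapter. The sketch below illustrates the standard LoRA merge, `W' = W + (alpha / r) * B A`, on a tiny matrix in plain Python. The matrix values, the rank `r`, and `alpha` are made up for illustration; the rank and scaling actually used for this model are not stated in this card.

```python
# Illustrative LoRA merge on a tiny matrix (all values are hypothetical).

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(w, lora_b, lora_a, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight matrix."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


# Base weight W (2x2) with a rank-1 adapter: B is 2x1, A is 1x2.
W = [[1.0, 0.0], [0.0, 1.0]]
B = [[0.5], [1.0]]
A = [[2.0, 4.0]]

merged = merge_lora(W, B, A, alpha=2, r=1)
print(merged)
```

After the merge, the model is an ordinary set of dense weights, which is why it loads with plain `AutoModelForCausalLM` and no adapter library.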

	## Key Features

	1. Function Calling Capabilities: The model identifies when a function should be called and generates the corresponding call with its arguments
	2. CPU Optimized: Runs on low-end hardware without a GPU
	3. Lightweight: Only 1.1B parameters, making it suitable for older hardware
	4. Low Resource Requirements: Needs about 4-6 GB of RAM to load (the float32 weights alone take roughly 4.4 GB)
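
The 4-6 GB figure is consistent with a back-of-the-envelope estimate: float32 stores each parameter in 4 bytes, so the weights alone take roughly 4.4 GB, before activations and runtime overhead. A minimal sketch of that arithmetic, assuming exactly 1.1 billion parameters (the helper name `weight_memory_gb` is made up for illustration):

```python
# Rough weight-memory estimate; ignores activations, KV cache, and runtime overhead.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(num_params, dtype="float32"):
    """Return the approximate size of the weights in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


fp32_gb = weight_memory_gb(1.1e9)             # float32, the dtype this model ships in
fp16_gb = weight_memory_gb(1.1e9, "float16")  # shown for comparison only
print(f"float32: {fp32_gb:.1f} GB, float16: {fp16_gb:.1f} GB")
```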

	## Usage

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	# Hub repo id; use a local directory path instead if the files are already downloaded
	model_id = "Successmove/tinyllama-function-calling-cpu-optimized"

	# Load the model and tokenizer (runs on CPU by default)
	model = AutoModelForCausalLM.from_pretrained(model_id)
	tokenizer = AutoTokenizer.from_pretrained(model_id)
	model.eval()

	# Example prompt for function calling
	prompt = """### Instruction:
	Given the available functions and the user query, determine which function(s) to call and with what arguments.

	Available functions:
	{
	    "name": "get_exchange_rate",
	    "description": "Get the exchange rate between two currencies",
	    "parameters": {
	        "type": "object",
	        "properties": {
	            "base_currency": {
	                "type": "string",
	                "description": "The currency to convert from"
	            },
	            "target_currency": {
	                "type": "string",
	                "description": "The currency to convert to"
	            }
	        },
	        "required": [
	            "base_currency",
	            "target_currency"
	        ]
	    }
	}

	User query: What is the exchange rate from USD to EUR?

	### Response:"""

	# Tokenize the prompt (inputs longer than 512 tokens are truncated)
	inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)

	# Generate without tracking gradients; sampling makes the output vary between runs
	with torch.no_grad():
	    outputs = model.generate(
	        **inputs,
	        max_new_tokens=150,
	        do_sample=True,
	        temperature=0.7,
	        top_k=50,
	        top_p=0.95
	    )

	# The decoded text repeats the prompt; the model's answer follows "### Response:"
	response = tokenizer.decode(outputs[0], skip_special_tokens=True)
	print(response)
	```
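
The prompt in the usage example follows a fixed template, so it can be assembled from a list of function schemas and a user query instead of being written by hand. The sketch below does that and also shows one way to pull a function call out of the generated text. The helper names `build_prompt` and `extract_function_call` are hypothetical, the sample output string is invented to exercise the parser, and the parser assumes the model replies with a JSON object containing a `name` key. The exact output format of this model is not documented in this card, so adjust the parsing to what the model actually emits.

```python
import json

INSTRUCTION = (
    "Given the available functions and the user query, "
    "determine which function(s) to call and with what arguments."
)


def build_prompt(functions, query):
    """Assemble the instruction-style prompt used in the usage example."""
    schemas = "\n".join(json.dumps(fn, indent=4) for fn in functions)
    return (
        f"### Instruction:\n{INSTRUCTION}\n\n"
        f"Available functions:\n{schemas}\n\n"
        f"User query: {query}\n\n"
        "### Response:"
    )


def extract_function_call(generated_text):
    """Return the first JSON object with a "name" key after the response marker, or None."""
    # The decoded output repeats the prompt, so only inspect the text after the marker.
    response = generated_text.split("### Response:", 1)[-1]
    decoder = json.JSONDecoder()
    start = response.find("{")
    while start != -1:
        try:
            obj, _ = decoder.raw_decode(response, start)
        except json.JSONDecodeError:
            obj = None
        if isinstance(obj, dict) and "name" in obj:
            return obj
        start = response.find("{", start + 1)
    return None


functions = [
    {
        "name": "get_exchange_rate",
        "description": "Get the exchange rate between two currencies",
        "parameters": {
            "type": "object",
            "properties": {
                "base_currency": {"type": "string"},
                "target_currency": {"type": "string"},
            },
            "required": ["base_currency", "target_currency"],
        },
    }
]
prompt = build_prompt(functions, "What is the exchange rate from USD to EUR?")

# Hypothetical model output (assumed format), used only to exercise the parser.
sample_output = prompt + (
    ' {"name": "get_exchange_rate", '
    '"arguments": {"base_currency": "USD", "target_currency": "EUR"}}'
)
call = extract_function_call(sample_output)
print(call)
```

Returning `None` when no call is found lets the caller fall back to treating the output as a plain-text answer.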

	## Performance on Low-End Hardware

	The CPU-optimized model requires approximately:
	- 4-6 GB RAM for loading
	- 2-4 CPU cores for inference
	- No GPU required

	This makes it suitable for:
	- Older laptops (2018 and newer)
	- Low-end desktops
	- Edge devices with ARM processors

	## Training Process

	The model was fine-tuned with LoRA (Low-Rank Adaptation) on the Glaive Function Calling v2 dataset. Only a 50-example subset was used, for demonstration purposes.

	## License

	This model is licensed under the Apache 2.0 license.