---
license: mit
---

Model Architecture

Base Model: Llama 3 (70 billion parameters)

Quantization: 4-bit integer quantization for memory and computational efficiency

Framework: Fine-tuned with PyTorch, leveraging Hugging Face Transformers

PIM Optimization: Optimized for processing-in-memory (PIM) hardware, which computes directly where data resides, reducing data-movement latency and increasing throughput
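A minimal sketch of how the 4-bit quantization described above is typically configured with Hugging Face Transformers and bitsandbytes. The checkpoint id and the NF4 quantization type are assumptions for illustration; this card does not specify them.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# 4-bit quantization config (assumed settings, not confirmed by this card)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # store weights in 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 (assumption)
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bf16
)

model_id = "meta-llama/Meta-Llama-3-70B"    # assumed base checkpoint id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",                      # shard across available devices
)
```

Loading a 70B checkpoint this way still requires substantial GPU memory; 4-bit storage roughly quarters the weight footprint relative to fp16.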

Intended Use

Primary Use Cases:

Large-scale text generation

Summarization

Question answering

Conversational AI

Text classification

Research Focus:

This model is specifically designed for research and industrial applications that require efficient handling of large language models with constrained hardware resources.
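For the use cases listed above, a hedged sketch of inference through the Transformers `pipeline` API; the checkpoint id and prompt are illustrative assumptions.

```python
from transformers import pipeline

# Build a text-generation pipeline (checkpoint id is an assumption)
generator = pipeline(
    "text-generation",
    model="meta-llama/Meta-Llama-3-70B",
    device_map="auto",
)

# Example prompt for a summarization-style use case
out = generator(
    "Summarize: Processing-in-memory moves compute into DRAM.",
    max_new_tokens=64,
)
print(out[0]["generated_text"])
```

The same pipeline object can serve question answering or conversational prompts by changing the input text.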
