sovthpaw committed on
Commit d04908e · verified
1 Parent(s): a1f9f22

Upload hf_model_readme.md with huggingface_hub

Files changed (1)
1. hf_model_readme.md +101 -0
hf_model_readme.md ADDED
@@ -0,0 +1,101 @@
---
license: apache-2.0
base_model: unsloth/Qwen2.5-Omni-3B
tags:
- multimodal
- text
- image
- audio
- video
- senter
- omnimodal
- 4b
- 128k
- uncensored
pipeline_tag: text-generation
---

# 🎭 Senter-Omni

**Multimodal AI Assistant with Cross-Modal Embeddings**

![Senter Banner](senter-banner.png)

## 🌟 Overview

Senter-Omni is a 4B-parameter multimodal AI assistant that understands and reasons across text, images, audio, and video simultaneously. It is built on Qwen2.5-Omni with an extended 128K context and released under the Apache 2.0 license.

## ✨ Key Features

- **🎯 ONE MODEL, ALL MODALITIES** - Single model for text, image, audio, and video
- **⚡ TRUE STREAMING** - Real-time token generation (~0.234s time-to-first-token)
- **🔓 OPEN & UNCENSORED** - Apache 2.0 licensed with unrestricted responses
- **🧠 128K CONTEXT** - Extended RoPE scaling for massive documents
- **💾 MEMORY EFFICIENT** - 4-bit quantized model for consumer GPUs
- **🔍 CROSS-MODAL EMBEDDINGS** - Unified 1024D space for all modalities

## 🚀 Quick Start

```python
from omni import OmniClient

# Initialize Senter-Omni
client = OmniClient()

# Multimodal chat
response = client.chat([
    {"role": "user", "content": [
        {"type": "image", "image": "photo.jpg"},
        {"type": "text", "text": "What do you see?"}
    ]}
])

# Cross-modal embeddings
embedding = client.embed("any content", modality="auto")
```
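
Because every modality is projected into the same 1024-dimensional space, vectors returned by `embed` can be compared directly across modalities. The snippet below is a minimal sketch of that idea, not documented API usage: it assumes `client.embed` returns a 1024-dimensional vector and accepts explicit `modality` values such as `"text"` and `"image"` alongside `"auto"`, and the image file name is a placeholder.

```python
import numpy as np

# Embed a caption and an image into the shared space.
# ("dog_photo.jpg" is a placeholder; explicit modality values are assumptions.)
text_vec = np.asarray(client.embed("a dog catching a frisbee", modality="text"), dtype=np.float32)
image_vec = np.asarray(client.embed("dog_photo.jpg", modality="image"), dtype=np.float32)

# Cosine similarity: values near 1.0 suggest the caption matches the image.
cosine = float(
    np.dot(text_vec, image_vec)
    / (np.linalg.norm(text_vec) * np.linalg.norm(image_vec))
)
print(f"text-image similarity: {cosine:.3f}")
```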

## 📊 Model Specifications

- **Parameters**: 4B (quantized to 4-bit)
- **Context Length**: 128K tokens (RoPE scaled)
- **Memory Usage**: ~8GB VRAM
- **Modalities**: Text, Image, Audio, Video
- **License**: Apache 2.0
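
If you prefer loading the raw weights with Hugging Face `transformers` instead of the `OmniClient` wrapper, a typical 4-bit setup looks like the sketch below. Treat it as an illustrative pattern only: the repository id, the use of `AutoModelForCausalLM`, and the NF4 settings are assumptions and may need adjusting for this multimodal checkpoint.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Assumed repo id; substitute the actual Hugging Face model id.
model_id = "SouthpawIN/senter-omni"

# Standard 4-bit NF4 quantization, in line with the ~8GB VRAM figure above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # spread layers across available GPU memory
)
```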

## 🔗 Links

- **GitHub Repository**: https://github.com/SouthpawIN/senter-omni
- **Training Dataset**: https://huggingface.co/datasets/SouthpawIN/senter-omni-data
- **Demo Script**: Run `python senter_omni_demo.py` in the GitHub repo

## 🎯 Performance

- **Time to First Token**: ~0.234s
- **Text Generation**: 2-5 seconds
- **Image Analysis**: 3-6 seconds
- **Audio Processing**: 4-8 seconds
- **Multimodal Chat**: 5-10 seconds
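
Time to first token is the delay between sending a request and receiving the first streamed token. A rough way to measure it is sketched below; the `stream=True` flag and token-by-token iteration are hypothetical, since the streaming interface of `OmniClient` is not documented here.

```python
import time

start = time.perf_counter()
first_token_time = None

# Hypothetical streaming call: assumes chat(..., stream=True) yields tokens incrementally.
for token in client.chat(
    [{"role": "user", "content": [{"type": "text", "text": "Hello!"}]}],
    stream=True,
):
    if first_token_time is None:
        first_token_time = time.perf_counter()
    print(token, end="", flush=True)

print(f"\ntime to first token: {first_token_time - start:.3f}s")
```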

## 🛠️ Installation

```bash
git clone https://github.com/SouthpawIN/senter-omni.git
cd senter-omni
pip install -r requirements.txt
python senter_omni_demo.py
```

## 📝 Citation

```bibtex
@misc{senter-omni,
  title={Senter-Omni: Multimodal AI Assistant with Cross-Modal Embeddings},
  author={Chris at Alignment Lab AI},
  year={2024},
  url={https://github.com/SouthpawIN/senter-omni}
}
```

---

**Built with ❤️ by Chris at Alignment Lab AI**