Multimodal - MLX - a NexaAI Collection

Multimodal - MLX

updated Sep 24

Language models that take vision and/or audio input, hand-picked by the Nexa team.

NexaAI/gemma-3n-E4B-it-4bit-MLX

Image-Text-to-Text • Updated Jul 22 • 59 • 1
NexaAI/Qwen2.5-VL-7B-Instruct-4bit-MLX

Image-Text-to-Text • 2B • Updated Jul 22 • 17
NexaAI/SmolVLM-500M-Instruct-8bit-MLX

Image-Text-to-Text • 0.7B • Updated Jul 22 • 16
NexaAI/SmolVLM-Instruct-8bit-MLX

Image-Text-to-Text • 0.7B • Updated Jul 22 • 12
NexaAI/gemma-3-4b-it-8bit-MLX

Image-Text-to-Text • 2B • Updated Jul 22 • 19 • 1
NexaAI/gemma-3n-E2B-it-4bit-MLX

Image-Text-to-Text • 2B • Updated Jul 22 • 15 • 1
NexaAI/Kokoro-82M-bf16-MLX

Text-to-Speech • Updated Aug 7 • 77 • 2
NexaAI/parakeet-tdt-0.6b-v2-MLX

Automatic Speech Recognition • Updated Aug 7 • 35 • 2
NexaAI/whisper-large-v3-turbo-MLX

Automatic Speech Recognition • Updated Aug 7 • 203 • 2
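
The repository names in this list appear to follow a pattern: an organization prefix, a model name, an optional precision tag (`4bit`, `8bit`, `bf16`), and an `MLX` suffix. The sketch below groups the listed repositories by that tag. The pattern is an assumption inferred from the names above, not a documented convention, and `RepoInfo`, `parse_repo_id`, and `group_by_precision` are hypothetical helpers written for this illustration.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class RepoInfo(NamedTuple):
    org: str
    name: str
    precision: Optional[str]  # "4bit", "8bit", "bf16", or None if no tag is present


# Repository ids copied verbatim from the collection listing.
REPO_IDS = [
    "NexaAI/gemma-3n-E4B-it-4bit-MLX",
    "NexaAI/Qwen2.5-VL-7B-Instruct-4bit-MLX",
    "NexaAI/SmolVLM-500M-Instruct-8bit-MLX",
    "NexaAI/SmolVLM-Instruct-8bit-MLX",
    "NexaAI/gemma-3-4b-it-8bit-MLX",
    "NexaAI/gemma-3n-E2B-it-4bit-MLX",
    "NexaAI/Kokoro-82M-bf16-MLX",
    "NexaAI/parakeet-tdt-0.6b-v2-MLX",
    "NexaAI/whisper-large-v3-turbo-MLX",
]

# Assumed naming pattern: "...-<precision>-MLX" at the end of the repo name.
_PRECISION = re.compile(r"-(\d+bit|bf16)-MLX$")


def parse_repo_id(repo_id: str) -> RepoInfo:
    """Split 'org/name' and extract the precision tag when the name carries one."""
    org, name = repo_id.split("/", 1)
    match = _PRECISION.search(name)
    return RepoInfo(org, name, match.group(1) if match else None)


def group_by_precision(repo_ids: List[str]) -> Dict[Optional[str], List[str]]:
    """Map each precision tag (or None) to the model names that carry it."""
    groups: Dict[Optional[str], List[str]] = {}
    for repo_id in repo_ids:
        info = parse_repo_id(repo_id)
        groups.setdefault(info.precision, []).append(info.name)
    return groups


if __name__ == "__main__":
    for precision, names in group_by_precision(REPO_IDS).items():
        print(precision or "untagged", "->", ", ".join(names))
```

Names without a recognizable tag, such as the two speech recognition models, fall into the `None` group rather than being guessed at.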