Latest SOTA models supported on Qualcomm NPU.
			
	
	AI & ML interests
On Device AI Deployment and Research
			Organization Card
		
Welcome to Nexa AI org on HuggingFace!
Nexa AI is an on-device AI deployment and research company. We build optimized foundation models and an on-device inference framework that runs any model on any device, across any backend, within minutes. Our mission is to make on-device AI friction-free and production-ready.
On this page you'll find:
- Our own trained checkpoints
- Hand-picked community models in GGUF or MLX formats, ready to run on nexa-sdk
Resources
- Download nexaSDK: get up and running with local models in minutes
- Discord Community
- Slack Community
Nexa AI infra to support Qwen3VL running on GPU/NPU/CPU
- NexaAI/Qwen3-VL-4B-Instruct-GGUF • Image-Text-to-Text • 4B • 25.4k downloads • 25 likes
- NexaAI/Qwen3-VL-4B-Thinking-GGUF • Image-Text-to-Text • 4B • 7.57k downloads • 6 likes
- NexaAI/Qwen3-VL-8B-Instruct-GGUF • Image-Text-to-Text • 8B • 29.3k downloads • 19 likes
- NexaAI/Qwen3-VL-8B-Thinking-GGUF • Image-Text-to-Text • 8B • 14.8k downloads • 13 likes
Spaces (5)
- Nexa Omni Demo: Generate text from audio input (Running • 64 likes)
- Omnivlm Dpo Demo: Ask questions about images and get detailed answers (Running • 79 likes)
- Open LLM Leaderboard for domains: Ranking for Open-sourced LLMs in different domains (Running on CPU Upgrade • 30 likes)
- Nexa AI GGUF Convertor: Submit a model for quantization and receive an email notification (Running on CPU Upgrade • 37 likes)
Models (84)
- NexaAI/Qwen3-8B-NPU
- NexaAI/Granite-4.0-h-350M-NPU • 22 downloads • 1 like
- NexaAI/Qwen3-VL-2B-Instruct-GGUF • 2B • 5.92k downloads • 15 likes
- NexaAI/Qwen3-VL-2B-Thinking-GGUF • 2B • 2.09k downloads • 12 likes
- NexaAI/Qwen3-VL-8B-Thinking-GGUF • Image-Text-to-Text • 8B • 14.8k downloads • 13 likes
- NexaAI/Qwen3-VL-8B-Instruct-GGUF • Image-Text-to-Text • 8B • 29.3k downloads • 19 likes
- NexaAI/Qwen3-VL-4B-Thinking-GGUF • Image-Text-to-Text • 4B • 7.57k downloads • 6 likes
- NexaAI/Qwen3-VL-4B-Instruct-GGUF • Image-Text-to-Text • 4B • 25.4k downloads • 25 likes
- NexaAI/LFM2-1.2B-npu • 10 downloads • 1 like
- NexaAI/Qwen3-VL-4B-Instruct-NPU • Image-Text-to-Text • 0.4B • 1.73k downloads