- Beyond Language Models: Byte Models are Digital World Simulators
  Paper • 2402.19155 • Published • 53
- Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models
  Paper • 2402.19427 • Published • 56
- VisionLLaMA: A Unified LLaMA Interface for Vision Tasks
  Paper • 2403.00522 • Published • 46
- Resonance RoPE: Improving Context Length Generalization of Large Language Models
  Paper • 2403.00071 • Published • 24
Mei dianwen
mdw123