Kimi-Linear-A3B Collection Moonshot's experimental MoE model with Kimi Delta Attention • 3 items • Updated about 15 hours ago • 6
NVIDIA Nemotron V2 Collection Open, Production-ready Enterprise Models. NVIDIA Open Model License. • 9 items • Updated 4 days ago • 74
Less is More: Recursive Reasoning with Tiny Networks Paper • 2510.04871 • Published 26 days ago • 461
cwm Collection Collection for Code World Model, an agentic coding model from FAIR. • 3 items • Updated Sep 24 • 16
MobileLLM Collection Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (ICML 2024) https://arxiv.org/abs/2402.14905 • 46 items • Updated Sep 10 • 131
Granite 3.3 Language Models Collection Our latest language models, licensed under Apache 2.0. • 4 items • Updated 1 day ago • 44
MobileLLM-R1 Collection MobileLLM-R1, a series of sub-billion parameter reasoning models • 7 items • Updated 19 days ago • 21