# cognitive-reasoners/configs/micro_smollm2_360m.yml
run-title: micro-smollm2-360m
model: micro-smollm2-360m
base-model: HuggingFaceTB/SmolLM2-360M
tokenizer: HuggingFaceTB/SmolLM2-360M-Instruct
num-experts: 4
top-k-experts: 1
jitter-noise: 0
use-router: True
mask-input: True
max-length: 8192
gradient-checkpointing: False
trainable:
- model