Was the training done with FP8 or BF16?

#14
opened by mindkrypted

As the title asks: if the training was done in BF16, could we expect a release of those weights? Quantizing from the higher-precision checkpoint should give better results.

Thanks,

MiniMax org

M2 was trained with FP8.
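One way to check this yourself is to read the dtypes recorded in a checkpoint's `.safetensors` header: the format is an 8-byte little-endian length followed by a JSON table mapping tensor names to `dtype`, `shape`, and `data_offsets`. The sketch below is an assumption-laden illustration, not MiniMax's tooling; it builds a tiny fake checkpoint (hypothetical names `w1`/`scale`, file `demo.safetensors`) purely so the header parser has something to run against.

```python
import json
import struct
from collections import Counter

def dtype_histogram(path):
    """Count tensor dtypes by parsing a .safetensors header
    (8-byte little-endian u64 header length, then a JSON table)."""
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len))
    # "__metadata__" is an optional non-tensor entry in the header.
    return Counter(v["dtype"] for k, v in header.items() if k != "__metadata__")

def write_minimal(path, entries):
    """Write a minimal, zero-filled safetensors file for demonstration.
    `entries` maps tensor name -> (dtype string, shape)."""
    byte_sizes = {"F8_E4M3": 1, "BF16": 2, "F32": 4}
    header, offset = {}, 0
    for name, (dtype, shape) in entries.items():
        n = byte_sizes[dtype]
        for dim in shape:
            n *= dim
        header[name] = {"dtype": dtype, "shape": shape,
                        "data_offsets": [offset, offset + n]}
        offset += n
    blob = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(blob)) + blob + b"\x00" * offset)

# Hypothetical FP8 checkpoint with an FP32 scale tensor:
write_minimal("demo.safetensors",
              {"w1": ("F8_E4M3", [4, 4]), "scale": ("F32", [4])})
print(dtype_histogram("demo.safetensors"))
```

Run against a real download, `dtype_histogram` would show whether the released weights are stored as `F8_E4M3` or `BF16` without loading any tensors into memory.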
