
FeiYuan

FeYuan

AI & ML interests

Multilingual & Code Intelligence

Recent Activity

liked a dataset 1 day ago

internlm/InteractScience

updated a collection 1 day ago

JanusCoder


Posts 2

Meet LLaMAX2! A lightweight pipeline: SFT on Qwen3-Instruct models without catastrophic forgetting!
✨Highlights:
🔹 SOTA Translation: State-of-the-art translation performance across both the high- and low-resource languages seen in training.
🔹 Lightweight Pipeline: Engineered for efficiency, our pipeline uses minimal parallel data and applies layer-selective tuning to a powerful instruct model.
🔹 Strong Reasoning Capabilities: Exhibits reasoning abilities that are competitive with top-tier models like Qwen3-Instruct.
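
For readers unfamiliar with the term, layer-selective tuning generally means updating only a chosen subset of a model's layers and freezing the rest. The snippet below is a minimal, illustrative sketch of that idea, not the LLaMAX2 implementation: the post does not say which layers are tuned, so the `tuned_layers` choice, the helper names, and the `model.layers.<index>.` parameter naming (common in Hugging Face causal LMs) are all assumptions.

```python
import re

# Assumed naming scheme: decoder blocks appear as "model.layers.<index>." in
# parameter names, as in many Hugging Face causal LMs. This is an assumption,
# not something stated in the post.
_LAYER_RE = re.compile(r"(?:^|\.)layers\.(\d+)\.")


def layer_index(param_name):
    """Return the decoder-block index in a parameter name, or None."""
    match = _LAYER_RE.search(param_name)
    return int(match.group(1)) if match else None


def trainable_mask(param_names, tuned_layers):
    """Map each parameter name to True (train) or False (freeze)."""
    tuned = set(tuned_layers)
    return {name: layer_index(name) in tuned for name in param_names}


# Hypothetical parameter names for a tiny 4-block model.
names = [
    "model.embed_tokens.weight",
    "model.layers.0.self_attn.q_proj.weight",
    "model.layers.1.mlp.up_proj.weight",
    "model.layers.2.mlp.down_proj.weight",
    "model.layers.3.self_attn.o_proj.weight",
    "lm_head.weight",
]

# Hypothetical choice: tune only the first and last blocks.
mask = trainable_mask(names, tuned_layers=[0, 3])
for name, train in mask.items():
    print(f"{'train ' if train else 'freeze'}  {name}")
```

With a PyTorch model, such a mask could be applied by setting `param.requires_grad = mask[name]` for each `(name, param)` pair from `model.named_parameters()`, so the optimizer only updates the selected blocks.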

You are welcome to try our models. More details:
🎉 Paper: LLaMAX2: Your Translation-Enhanced Model also Performs Well in Reasoning (2510.09189)
🎉 Code: https://github.com/CONE-MT/LLaMAX2.0
🎉 Model: LLaMAX/llamax20-68ad1c154fcf2623b75a068c

Hi everyone, I'm excited to introduce our latest work, LLaMAX. 😁😁😁
LLaMAX is a powerful language model created specifically for multilingual scenarios. Built upon Meta's LLaMA series models, LLaMAX undergoes extensive training across more than 100 languages.

Remarkably, it enhances its multilingual capabilities without compromising its generalization ability, surpassing existing LLMs.

✨Highlights:

🎈 LLaMAX supports the 102 languages covered by Flores-101, and its performance in translating between low-resource languages far surpasses other decoder-only LLMs.

🎈 Even for languages not covered in Flores-200, LLaMAX still shows significant improvements in translation performance.

🎈 By performing simple SFT on English task data, LLaMAX demonstrates impressive multilingual transfer abilities in downstream tasks.

🎈 In our paper, we discuss effective methods for enhancing the multilingual capabilities of LLMs during the continued training phase.

We welcome you to use our model and provide feedback.

More Details:

🎉 Code: https://github.com/CONE-MT/LLaMAX/

🎉 Model: https://huggingface.co/LLaMAX/


Papers 13

models 0

None public yet

datasets 0

None public yet