Indonesian T5 Language Models
Indonesian T5 models pre-trained with nanoT5 and fine-tuned on IndoNLG tasks. GitHub: https://github.com/LazarusNLP/IndoT5/
IndoNanoT5 Base is an Indonesian sequence-to-sequence language model based on the T5 architecture. We pre-trained it on the Indonesian subset of the open-source uonlp/CulturaX corpus. On a held-out subset of the corpus, the model achieved an evaluation loss of 2.082, corresponding to a perplexity of about 8.02.
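The reported perplexity follows directly from the evaluation loss, since perplexity is the exponential of the cross-entropy loss:

```python
import math

# Perplexity is exp(cross-entropy loss): exp(2.082) ≈ 8.02
print(round(math.exp(2.082), 2))  # 8.02
```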
This model was trained using the nanoT5 PyTorch framework. All training was done on an NVIDIA H100 GPU. LazarusNLP/IndoNanoT5-base is released under the Apache 2.0 license.
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_checkpoint = "LazarusNLP/IndoNanoT5-base"
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
```
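Building on the snippet above, here is a minimal generation sketch. The Indonesian input is illustrative only, and it assumes the tokenizer defines T5-style sentinel tokens such as `<extra_id_0>`; as a span-corruption pre-trained checkpoint, the base model typically needs task-specific fine-tuning (e.g., on IndoNLG) before its outputs are useful:

```python
# Assumes `tokenizer` and `model` from the snippet above.
# "Ibu kota Indonesia adalah <extra_id_0>." = "The capital of Indonesia is <extra_id_0>."
# Illustrative input only; the pre-trained checkpoint is not
# instruction-tuned, so expect rough outputs without fine-tuning.
text = "Ibu kota Indonesia adalah <extra_id_0>."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```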
Around 4B tokens from the uonlp/CulturaX corpus were used during pre-training.
The following hyperparameters were used during training:

- total_steps: 65536
- input_length: 512
- batch_size: 128
- grad_acc: 1
- base_lr: 5e-3
- optimizer: AdamWScaled with betas=(0.9, 0.999) and epsilon=1e-08
- weight_decay: 0.0
- lr_scheduler: cosine
- warmup_steps: 10000
- final_cosine: 1e-5
- grad_clip: 1.0
- precision: bf16

We would like to acknowledge nanoT5 for inspiring this project.
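For intuition, here is a rough sketch of the learning-rate schedule these settings imply, assuming linear warmup followed by cosine decay down to `final_cosine`; nanoT5's exact implementation may differ in its details:

```python
import math

def lr_at(step, base_lr=5e-3, warmup=10_000, total=65_536, final=1e-5):
    """Approximate LR: linear warmup to base_lr, then cosine decay to `final`."""
    if step < warmup:
        return base_lr * step / warmup  # linear warmup phase
    progress = (step - warmup) / (total - warmup)  # 0.0 -> 1.0 over decay phase
    return final + 0.5 * (base_lr - final) * (1.0 + math.cos(math.pi * progress))

for s in (0, 5_000, 10_000, 40_000, 65_536):
    print(f"step {s:>6}: lr = {lr_at(s):.2e}")
```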
BhinnekaLM is developed with love by: