onlplab/alephbert-base

aseker00 committed on Mar 14, 2021

Commit 5cf1afd · 1 Parent(s): 9e947bd

Update README.md

Files changed (1)

README.md +2 -0

@@ -38,6 +38,8 @@ alephbert.eval()
 Trained on a DGX machine (8 V100 GPUs) using the standard huggingface training procedure.
+Since the larger part of our training data is based on tweets we decided to start by optimizing using Masked Language Model loss only.
 To optimize training time we split the data into 4 sections based on max number of tokens:
 1. num tokens < 32 (70M sentences)