lightblue/openorca_stx


ptrdvn committed on Sep 13, 2023

Commit c2fc322 · 1 Parent(s): 26a395b

Update README.md

Files changed (1)

README.md +1 -1

@@ -15,7 +15,7 @@ We trained on equal samples of the following three datasets:
 * [TyDiQA (Ja)](https://huggingface.co/datasets/khalidalt/tydiqa-goldp)
 * [XLSUM (Ja)](https://huggingface.co/datasets/csebuetnlp/xlsum)
-which resulted in a dataset of 13167 samples total.
+which resulted in a dataset of 13,167 samples total.
 These three datasets were chosen as they represent three distinct fine-tuning tasks (Text simplification, question answering, and text summarization, respectively) which we hypothesize can help to improve the language models suitability for dealing with Japanese data.
 These three datasets make up the model name: STX.