Updated to do list
README.md
CHANGED
@@ -8,6 +8,19 @@ The main goals of this project are:
 2. Release the top performing models for further research and enhancement
 3. Release all of the preprocessing and postprocessing scripts and findings for future research.
 
+## TO DO LIST:
+- [x] Team members met and the following was discussed:
+  - A data preparation script that mixes CORD-19 and PubMed is ready.
+  - Agreed to finalize the training scripts by 9pm PDT 7/9/2021.
+  - Tokenizer is now trained.
+- [ ] Set up the pretraining script
+- [ ] Prepare the finetuning tasks inspired by [T5 Trivia Colab](https://colab.research.google.com/github/google-research/text-to-text-transfer-transformer/blob/master/notebooks/t5-trivia.ipynb)
+  - Which datasets do we want to use?
+    - [Covid-QA](https://huggingface.co/datasets/covid_qa_deepset) (Maybe as test set?)
+    - [Trivia](https://huggingface.co/datasets/covid_qa_deepset)
+    - [CDC-QA](https://www.cdc.gov/coronavirus/2019-ncov/faq.html) (We can scrape quickly using Beautiful Soup or something)
+    - [More Medical Datasets](https://aclanthology.org/2020.findings-emnlp.289.pdf) (See the dataset section for inspiration)
+
 ## 1. Model
 
 We will be using the T5 model.
@@ -35,4 +48,4 @@ We can make use of :
 
 ## 4. Additional Reading
 
-- [How Much Knowledge Can You Pack Into the Parameters of a Language Model?](https://arxiv.org/pdf/2002.08910.pdf)
+- [How Much Knowledge Can You Pack Into the Parameters of a Language Model?](https://arxiv.org/pdf/2002.08910.pdf)
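The open to-do item on finetuning tasks involves casting question/answer pairs into T5's text-to-text form. One way this might look is sketched below. It is a minimal illustration under stated assumptions, not the project's script: the helper names `normalize_text` and `to_text_to_text`, the `question: ` prefix, the normalization rules, and the sample record are all hypothetical.

```python
import re


def normalize_text(text: str) -> str:
    """Collapse runs of whitespace, trim, and lowercase (an assumed normalization)."""
    return re.sub(r"\s+", " ", text).strip().lower()


def to_text_to_text(question: str, answer: str, prefix: str = "question: ") -> dict:
    """Build a closed-book QA example: the model sees only the question.

    The prefix is a hypothetical task marker; any consistent string would do.
    """
    return {
        "inputs": prefix + normalize_text(question),
        "targets": normalize_text(answer),
    }


# Hypothetical record, written by hand for illustration only.
example = to_text_to_text("What  is COVID-19 caused by?", "SARS-CoV-2")
print(example)
```

Keeping the conversion in one small function means the same formatting could be applied to every candidate dataset in the list, so that a held-out test set is scored on inputs shaped exactly like the training inputs.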