daekeun-ml/Phi-4-multimodal-finetune-ko-speech

daekeun-ml committed on Mar 10

Commit 99edef4 · verified · 1 Parent(s): 6accd21

Update README.md

Files changed (1)

README.md +1 -1


@@ -60,7 +60,7 @@ This is a fine-tuned model for Korean speech-to-text translation, from [microsof
 Total 35K samples. Each sample is a pair of Korean speech and its transcription. Dataset was sampled 16kHz.
-The model was trained on a single A100 80GB GPU for 1 epoch with a batch size of 16 using the `sample_finetune_speech.py` script from [microsoft/Phi-4-multimodal-instruct](https://huggingface.co/microsoft/Phi-4-multimodal-instruct)
+The model was trained on a single A100 80GB GPU for 4 epochs with a batch size of 16 using the `sample_finetune_speech.py` script from [microsoft/Phi-4-multimodal-instruct](https://huggingface.co/microsoft/Phi-4-multimodal-instruct)
 Note that this model is just a PoC/experimental purpose, and not intended to be used in production. More high-quality data, tuning, ablation studies, and experiments are needed.