Mastering-Python-HF/nvidia_tts_en_fastpitch_multispeaker

Mastering-Python-HF committed on Jul 11, 2023

Commit 41c25c6 · 1 Parent(s): 02dd47f

Update README.md

Files changed (1)

README.md +14 -1

@@ -51,6 +51,19 @@ model = HifiGanModel.restore_from(restore_path=path)
 ```
 import soundfile as sf
 parsed = spec_generator.parse("You can type your sentence here to get nemo to produce speech.")
+"""
+speaker id:
+    92     Cori Samuel
+    6097   Phil Benson
+    9017   John Van Stan
+    6670   Mike Pelton
+    6671   Tony Oliva
+    8051   Maria Kasper
+    9136   Helen Taylor
+    11614  Sylviamb
+    11697  Celine Major
+    12787  LikeManyWaters
+"""
 spectrogram = spec_generator.generate_spectrogram(tokens=parsed,speaker=92)
 audio = model.convert_spectrogram_to_audio(spec=spectrogram)
 sf.write("speech.wav", audio.to('cpu').detach().numpy()[0], 44100)
@@ -75,7 +88,7 @@ FastPitch multispeaker is a fully-parallel text-to-speech model based on FastSpe
 ## Training
-The NeMo toolkit [3] was used for training the models for 1000 epochs. These model are trained with this [example script](https://github.com/NVIDIA/NeMo/blob/main/examples/tts/fastpitch.py) and this [base config](https://github.com/NVIDIA/NeMo/blob/main/examples/tts/conf/fastpitch_align_v1.05.yaml).
+The NeMo toolkit [3] was used for training the models for 1000 epochs.
 ## Datasets
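
The speaker table added in this commit sits inside a string literal, so code cannot query it. Below is a minimal sketch of the same table as data, assuming a caller wants to pick a voice by reader name. The names `SPEAKERS` and `speaker_id_for` are hypothetical helpers for illustration and are not part of NeMo or of the README.

```python
# Speaker IDs and reader names copied from the table added to README.md.
# SPEAKERS and speaker_id_for are hypothetical names, not a NeMo API.
SPEAKERS = {
    92: "Cori Samuel",
    6097: "Phil Benson",
    9017: "John Van Stan",
    6670: "Mike Pelton",
    6671: "Tony Oliva",
    8051: "Maria Kasper",
    9136: "Helen Taylor",
    11614: "Sylviamb",
    11697: "Celine Major",
    12787: "LikeManyWaters",
}


def speaker_id_for(name: str) -> int:
    """Return the speaker ID for a reader name (case-insensitive match)."""
    wanted = name.strip().lower()
    for speaker_id, speaker_name in SPEAKERS.items():
        if speaker_name.lower() == wanted:
            return speaker_id
    raise KeyError(f"unknown speaker: {name!r}")
```

The returned ID would then be passed as the `speaker=` argument of `spec_generator.generate_spectrogram`, in place of the hard-coded `92` in the README snippet.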