Whisper Medium - Karthik Avinash

This model is a fine-tuned version of openai/whisper-medium on the Common Voice 11.0 dataset. It achieves the following results on the evaluation set:

Loss: 0.4340
WER: 39.4737 (word error rate, in percent)
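
For readers unfamiliar with the metric, here is a minimal sketch of how a word error rate on the percent scale used above can be computed. It is not the evaluation code behind this card: the function name `word_error_rate` is hypothetical, and the card does not say what text normalization (casing, punctuation, diacritics) was applied before scoring, so none is applied here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER in percent: word-level edit distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution and one deletion against a four-word reference.
    print(word_error_rate("a b c d", "a x c"))
```

Because the score is normalized by the reference length, a small evaluation set makes each single word error move the result by a large step.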

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 2
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
training_steps: 800
mixed_precision_training: Native AMP
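
The sketch below works through two consequences of these settings: the effective batch size, and the learning rate at a given step under a linear schedule with warmup. It is an illustration in plain Python, not the training script. The function names are hypothetical, a single device is assumed (which is what 2 x 16 = 32 implies), and the schedule formula follows the usual linear warmup then linear decay to zero.

```python
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 16
WARMUP_STEPS = 500
TRAINING_STEPS = 800


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int,
              peak: float = LEARNING_RATE,
              warmup: int = WARMUP_STEPS,
              total: int = TRAINING_STEPS) -> float:
    """Learning rate at `step`: linear ramp to `peak`, then linear decay to 0."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 250, 500, 650, 800):
        print(step, linear_lr(step))
```

With 500 warmup steps out of 800 total, the learning rate only reaches its 1e-05 peak at step 500 and then decays over the remaining 300 steps, so most of the run is spent below the nominal rate.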

Training results

Training Loss	Epoch	Step	Validation Loss	Wer
1.0267	0.0166	20	1.1833	47.3684
0.7014	0.0333	40	0.9320	40.7895
0.5484	0.0499	60	0.6066	50.0
0.3202	0.0665	80	0.5057	56.5789
0.3791	0.0832	100	0.4702	47.3684
0.3701	0.0998	120	0.4606	46.0526
0.3584	0.1164	140	0.4618	47.3684
0.3459	0.1330	160	0.4809	51.3158
0.2758	0.1497	180	0.4729	52.6316
0.3636	0.1663	200	0.4597	48.6842
0.3649	0.1829	220	0.4475	43.4211
0.325	0.1996	240	0.4642	43.4211
0.3052	0.2162	260	0.4800	51.3158
0.1836	0.2328	280	0.4854	46.0526
0.2539	0.2495	300	0.4735	55.2632
0.3174	0.2661	320	0.4748	44.7368
0.3184	0.2827	340	0.4545	44.7368
0.2216	0.2994	360	0.4711	39.4737
0.2849	0.3160	380	0.4219	36.8421
0.2108	0.3326	400	0.4382	39.4737
0.2431	0.3493	420	0.4622	35.5263
0.2776	0.3659	440	0.4265	42.1053
0.3011	0.3825	460	0.4400	35.5263
0.2659	0.3991	480	0.5303	46.0526
0.3692	0.4158	500	0.4142	38.1579
0.3166	0.4324	520	0.4278	38.1579
0.2855	0.4490	540	0.4518	38.1579
0.2286	0.4657	560	0.4679	48.6842
0.2136	0.4823	580	0.4749	40.7895
0.2503	0.4989	600	0.4740	34.2105
0.1904	0.5156	620	0.4547	39.4737
0.376	0.5322	640	0.4272	40.7895
0.24	0.5488	660	0.4594	40.7895
0.2928	0.5655	680	0.4498	40.7895
0.2473	0.5821	700	0.4432	43.4211
0.5217	0.5987	720	0.4481	40.7895
0.1973	0.6154	740	0.4381	43.4211
0.272	0.6320	760	0.4407	39.4737
0.2364	0.6486	780	0.4345	40.7895
0.194	0.6652	800	0.4340	39.4737
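
The reported results correspond to the final checkpoint at step 800, which is not the best row in the table by either metric: the lowest WER (34.2105) occurs at step 600 and the lowest validation loss (0.4142) at step 500. The sketch below shows one way to pick a checkpoint from such a log. The helper `best_by` is hypothetical, and `ROWS` holds only a few rows copied from the table above, not the full log.

```python
# A few rows copied from the training results table (not the full log).
ROWS = [
    {"step": 380, "val_loss": 0.4219, "wer": 36.8421},
    {"step": 500, "val_loss": 0.4142, "wer": 38.1579},
    {"step": 600, "val_loss": 0.4740, "wer": 34.2105},
    {"step": 800, "val_loss": 0.4340, "wer": 39.4737},
]


def best_by(rows, metric):
    """Return the row with the smallest value of `metric` (lower is better)."""
    return min(rows, key=lambda row: row[metric])


if __name__ == "__main__":
    print("best WER:", best_by(ROWS, "wer"))
    print("best validation loss:", best_by(ROWS, "val_loss"))
```

The two metrics disagree on which checkpoint is best, and WER swings by several points between neighbouring evaluations, so a choice based on a single evaluation should be treated with caution.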

Framework versions

Transformers 4.43.0.dev0
PyTorch 2.4.0+cu124
Datasets 2.20.0
Tokenizers 0.19.1

Safetensors

Model size: 0.8B params
Tensor type: F32

Model tree for KarthikAvinash/whisper-medium-arabic-suite-II

Base model: openai/whisper-medium

Evaluation results

WER on Common Voice 11.0 (test set, self-reported): 39.474