Whisper ONNX Model - openai/whisper-small.en

This is an ONNX conversion of the openai/whisper-small.en model, optimized for faster inference in production environments.

Model Details

  • Original Model: openai/whisper-small.en
  • Converted to: ONNX format using Optimum
  • Task: Automatic Speech Recognition (ASR)
  • Language: English
  • Framework: ONNX Runtime

Usage

from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import WhisperProcessor

# Load the ONNX model directly (no runtime PyTorch-to-ONNX conversion needed)
model = ORTModelForSpeechSeq2Seq.from_pretrained("mutisya/whisper-small-en-onnx")
processor = WhisperProcessor.from_pretrained("mutisya/whisper-small-en-onnx")

# Use for transcription
# ... (same API as original model)

Performance Benefits

  • Faster Loading: No runtime conversion from PyTorch to ONNX
  • Reduced Memory: Optimized model structure
  • Container Startup: Significantly faster Docker container initialization
  • Production Ready: Pre-optimized for inference

Original Model Info

This ONNX conversion preserves the accuracy and capabilities of the original openai/whisper-small.en PyTorch model while offering better inference performance for production deployments.
