Whisper ONNX Model - openai/whisper-small.en

This is an ONNX conversion of the openai/whisper-small.en model, optimized for faster inference in production environments.

Model Details

  • Original Model: openai/whisper-small.en
  • Converted to: ONNX format using Optimum
  • Task: Automatic Speech Recognition (ASR)
  • Language: English
  • Framework: ONNX Runtime

Usage

from optimum.onnxruntime import ORTModelForSpeechSeq2Seq
from transformers import WhisperProcessor

# Load the ONNX model directly (no runtime PyTorch-to-ONNX conversion needed)
model = ORTModelForSpeechSeq2Seq.from_pretrained("mutisya/whisper-small-en-onnx")
processor = WhisperProcessor.from_pretrained("mutisya/whisper-small-en-onnx")

# Use for transcription
# ... (same API as original model)

Performance Benefits

  • Faster Loading: No runtime conversion from PyTorch to ONNX
  • Reduced Memory: Optimized model structure
  • Container Startup: Significantly faster Docker container initialization
  • Production Ready: Pre-optimized for inference

Original Model Info

This ONNX conversion preserves the accuracy and capabilities of the original openai/whisper-small.en PyTorch model while offering better inference performance for production deployments.
