A newer version of the Gradio SDK is available:
5.49.1
metadata
license: mit
tags:
- gradio
- omni-api
- multimodal
- chat-interface
- pdf-processing
- image-processing
- audio-processing
- llm
- api-client
- chatbot
- text-generation
- document-analysis
- ocr
- transcription
widget:
- src: https://api.modelharbor.com
Omni API Gradio UI
This is a Gradio-based user interface for the Omni API that supports multimodal interactions with various file types including text, PDF documents, images, and audio files.
Model Description
The Omni API Gradio UI provides an easy-to-use web interface for interacting with the Omni API, which supports advanced multimodal AI capabilities. Users can send text prompts along with various file types and receive intelligent responses.
Supported Models
The interface supports several state-of-the-art models:
- typhoon-ocr-preview
- openai/gpt-5
- meta-llama/llama-4-maverick
- qwen/qwen3-vl-235b-a22b-instruct
- gemini/gemini-2.5-pro
- gemini/gemini-2.5-flash
Features
- Multimodal Support: Process text, PDFs, images, and audio files in a single interface
- File Ordering: Upload multiple files in a specific order for precise control
- Configurable Models: Switch between different AI models for different tasks
- Real-time Responses: Get immediate feedback from the API
- Customizable Parameters: Adjust max tokens and other settings
Intended Uses & Limitations
Intended Uses
- Document analysis and summarization
- Image OCR and analysis
- Audio transcription and analysis
- Multimodal chat applications
- Content extraction from various file formats
Limitations
- Requires access to the Omni API
- Dependent on network connectivity
- File size limitations based on API constraints
- Some models may require API keys
How to Use
- Configure the API base URL (defaults to https://api.modelharbor.com)
- Select your preferred model from the dropdown
- Enter your text message in the input box
- Upload files (PDF, images, or audio) as needed
- Click "Send Request" to interact with the API
- View the response in the output panel
Supported File Types
- PDFs: Document processing and analysis
- Images: JPG, PNG, GIF, BMP, WEBP for OCR and visual analysis
- Audio: MP3, WAV, M4A, FLAC, OGG for transcription
Technical Details
Frameworks and Libraries
- Gradio 4.0+
- Python 3.8+
- Requests library for API communication
Installation
# Install dependencies
uv sync
# Run the application
uv run python app.py
Development Mode
# Run with auto-reload for development
uv run python dev.py
Citation
If you use this interface in your work, please cite:
@misc{omni_api_gradio_ui,
title={Omni API Gradio UI},
author={ModelHarbor Team},
year={2025},
howpublished={\url{https://github.com/your-username/omni-api-gradio-ui}}
}