# Omni API Gradio UI
A Gradio-based user interface for the Omni API that supports text, PDF, image, and audio file processing.
## Features
- Text input for chat messages
- Multiple file upload support (PDF, images, audio)
- Configurable API base URL
- Real-time response display
- File ordering for multi-modal requests
## Installation
```bash
# Install dependencies
uv sync
# Run the application
uv run python app.py
```
### Development Mode (with auto-reload)
For development, run `dev.py` instead of `app.py`; it restarts the app automatically when files change:
```bash
uv run python dev.py
```
`dev.py` watches Python, Markdown, and TOML files and restarts the Gradio app whenever one of them is modified.
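
The source of `dev.py` is not reproduced in this README, so the following is only a minimal sketch of how change detection for those three file types could work, using modification times and the standard library. The names `WATCHED_SUFFIXES`, `snapshot`, and `changed_paths` are hypothetical and are not taken from `dev.py`.

```python
import os

# File types the README says are monitored (assumed to map to these suffixes).
WATCHED_SUFFIXES = (".py", ".md", ".toml")


def snapshot(root="."):
    """Map each watched file under root to its last-modified time (ns)."""
    mtimes = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(WATCHED_SUFFIXES):
                path = os.path.join(dirpath, name)
                try:
                    mtimes[path] = os.stat(path).st_mtime_ns
                except OSError:
                    pass  # file vanished between listing and stat
    return mtimes


def changed_paths(before, after):
    """Return paths added, removed, or modified between two snapshots."""
    all_paths = before.keys() | after.keys()
    return sorted(p for p in all_paths if before.get(p) != after.get(p))
```

A reloader built on this would take a snapshot at a fixed interval and, whenever `changed_paths` returns a non-empty list, stop the running app process and start it again.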
## Usage
1. Configure the API base URL (defaults to `https://api-omni.modelharbor.com`)
2. Enter your text message
3. Upload files in the desired order (optional)
4. Click "Send Request" to interact with the API
5. View the response in the right panel
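
Step 1 amounts to falling back to the default URL when the field is left empty. A minimal sketch of that fallback, assuming a hypothetical helper `resolve_base_url` (not necessarily how `app.py` does it) and assuming a trailing slash should be dropped:

```python
DEFAULT_BASE_URL = "https://api-omni.modelharbor.com"  # default from step 1


def resolve_base_url(user_value=None):
    """Return the user-supplied base URL, or the documented default if blank."""
    value = (user_value or "").strip()
    if not value:
        return DEFAULT_BASE_URL
    # Assumption: normalize by removing any trailing slash.
    return value.rstrip("/")
```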
## Supported File Types
- **PDFs**: Document processing
- **Images**: JPG, PNG, GIF, BMP, WEBP
- **Audio**: MP3, WAV, M4A, FLAC, OGG
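
The list above can be expressed as an extension lookup, for example to check a file before uploading it. This is a sketch only: `SUPPORTED` and `classify` are hypothetical names, the `.jpeg` suffix is an added assumption, and the app itself may validate files differently.

```python
from pathlib import Path

# Extensions taken from the list above; ".jpeg" is an added assumption.
SUPPORTED = {
    "pdf": {".pdf"},
    "image": {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"},
    "audio": {".mp3", ".wav", ".m4a", ".flac", ".ogg"},
}


def classify(filename):
    """Return "pdf", "image", or "audio" for a supported file, else None."""
    suffix = Path(filename).suffix.lower()  # case-insensitive match
    for kind, suffixes in SUPPORTED.items():
        if suffix in suffixes:
            return kind
    return None
```

For instance, `classify("scan.PDF")` returns `"pdf"`, while `classify("notes.txt")` returns `None`.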