	# Omni API Gradio UI

	A Gradio-based user interface for the Omni API that supports text, PDF, image, and audio file processing.

	## Features

	- Text input for chat messages
	- Multiple file upload support (PDF, images, audio)
	- Configurable API base URL
	- Real-time response display
	- File ordering for multi-modal requests

	## Installation

	```bash
	# Install dependencies
	uv sync

	# Run the application
	uv run python app.py
	```

	### Development Mode (with auto-reload)

	For development, use the auto-reload script, which restarts the app whenever a watched file changes:

	```bash
	uv run python dev.py
	```

	The script watches Python, Markdown, and TOML files and restarts the Gradio app when any of them is modified.
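
The contents of `dev.py` are not shown in this README, so the following is only a minimal sketch of how change detection for those three file types could work, using polling and the standard library. The names `WATCHED_SUFFIXES`, `snapshot`, and `changed_files` are hypothetical and are not taken from the project.

```python
from pathlib import Path

# Assumption: the watched file types map to these suffixes.
WATCHED_SUFFIXES = {".py", ".md", ".toml"}


def snapshot(root):
    """Map every watched file under root to its modification time (ns)."""
    return {
        path: path.stat().st_mtime_ns
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in WATCHED_SUFFIXES
    }


def changed_files(before, after):
    """Return paths added, removed, or modified between two snapshots."""
    return sorted(
        path
        for path in before.keys() | after.keys()
        if before.get(path) != after.get(path)
    )
```

A reload loop would take a snapshot, sleep briefly, take another, and restart the app process whenever `changed_files` returns a non-empty list.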

	## Usage

	1. Configure the API base URL (defaults to https://api-omni.modelharbor.com)
	2. Enter your text message
	3. Upload files in the desired order (optional)
	4. Click "Send Request" to interact with the API
	5. View the response in the right panel
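
Step 1 can be illustrated with a small helper that falls back to the default URL when the field is left empty. This is a sketch under assumptions: `resolve_base_url` is a hypothetical name, and the validation rules are illustrative and may differ from what `app.py` actually does.

```python
from urllib.parse import urlsplit

# Default taken from the usage steps above.
DEFAULT_BASE_URL = "https://api-omni.modelharbor.com"


def resolve_base_url(user_value=None):
    """Return a cleaned base URL, using the default when the input is empty."""
    candidate = (user_value or "").strip() or DEFAULT_BASE_URL
    parts = urlsplit(candidate)
    # Assumption: only absolute HTTP(S) URLs are accepted.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not a valid HTTP(S) base URL: {candidate!r}")
    # Drop trailing slashes so endpoint paths can be appended consistently.
    return candidate.rstrip("/")
```

For example, `resolve_base_url("http://localhost:8000/")` yields `http://localhost:8000`, which is convenient when pointing the UI at a local server.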

	## Supported File Types

	- PDFs: Document processing
	- Images: JPG, PNG, GIF, BMP, WEBP
	- Audio: MP3, WAV, M4A, FLAC, OGG
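
The list above can be expressed as a lookup from file extension to category. The sketch below is illustrative only: `SUPPORTED_TYPES`, `classify`, and `classify_in_order` are hypothetical names, and the app may detect file types differently (for example by MIME type).

```python
from pathlib import Path

# Extensions follow the list above; ".jpeg" is an assumed alias for JPG.
SUPPORTED_TYPES = {
    "pdf": {".pdf"},
    "image": {".jpg", ".jpeg", ".png", ".gif", ".bmp", ".webp"},
    "audio": {".mp3", ".wav", ".m4a", ".flac", ".ogg"},
}


def classify(filename):
    """Return 'pdf', 'image', or 'audio', or None for unsupported files."""
    suffix = Path(filename).suffix.lower()
    for kind, suffixes in SUPPORTED_TYPES.items():
        if suffix in suffixes:
            return kind
    return None


def classify_in_order(filenames):
    """Pair each file with its category, preserving the upload order."""
    result = []
    for name in filenames:
        kind = classify(name)
        if kind is None:
            raise ValueError(f"unsupported file type: {name}")
        result.append((name, kind))
    return result
```

Keeping the input order intact matters here because the order of uploaded files is meant to carry through to the multi-modal request.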