# OllamaSpace Technical Specifications

## Project Overview

OllamaSpace is a web-based chat application that serves as a frontend for interacting with Ollama language models. It provides a real-time chat interface through which users can converse with AI models in a web browser.
## Architecture

### Backend

- **Framework**: FastAPI (Python)
- **API Gateway**: Acts as a proxy between the frontend and the Ollama API
- **Streaming**: Supports real-time streaming of model responses
- **Default Model**: qwen3:4b

### Frontend

- **Technology**: Plain HTML/CSS/JavaScript (no frameworks)
- **Interface**: Simple chat interface with message history
- **Interaction**: Real-time message streaming with typing indicators
- **Styling**: Clean, minimal design with distinct user/bot message styling
## Components

### main.py

- **Framework**: FastAPI
- **Authentication**: Bearer token authentication using HTTPBearer
- **Endpoints**:
  - `GET /` - Redirects to `/chat`
  - `GET /chat` - Serves the chat HTML page
  - `POST /chat_api` - Forwards requests to Ollama (requires authentication)
- **Functionality**:
  - Proxies requests to the local Ollama API (http://localhost:11434)
  - Streams model responses back to the frontend
  - Handles error cases and input validation
  - Auto-generates a secure API key if none is provided via environment variable
### chat.html

- **Template**: HTML structure for the chat interface, including API key management
- **Layout**:
  - Header with API key input and save button
  - Chat window with message history
  - Message input field
  - Send button
- **Static Assets**: Links to the CSS and JavaScript files
### static/script.js

- **Features**:
  - Real-time message streaming from the API
  - Message display in chat format
  - Enter key support for sending messages
  - Stream parsing of newline-delimited JSON responses
  - API key management with localStorage persistence
  - API key input UI with save functionality
- **API Communication**:
  - Includes the API key in the Authorization header as a Bearer token
  - POSTs to the `/chat_api` endpoint
  - Receives streaming responses and displays them incrementally
  - Handles error cases gracefully
### static/style.css

- **Design**: Minimal, clean chat interface with an API key management section
- **Styling**:
  - Distinct colors for user vs. bot messages
  - Responsive layout
  - API key section in the header with input field and save button
  - Auto-scrolling to the latest message
## Deployment

### Dockerfile

- **Base Image**: ollama/ollama
- **Environment**: Sets up the Ollama server and the FastAPI gateway
- **Port Configuration**: Listens on port 7860 (the Hugging Face Spaces default)
- **Model Setup**: Downloads the specified model during the build
- **Dependencies**: Installs Python, FastAPI, and related libraries
### start.sh

- **Initialization Sequence**:
  1. Starts the Ollama server in the background
  2. Health-checks the Ollama server
  3. Starts the FastAPI gateway on port 7860
- **Error Handling**: Waits for Ollama to become ready before starting the gateway
- **API Key**: If auto-generated, the API key is printed to the console logs during startup
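The wait-for-ready step can be expressed in Python as follows. This is an illustrative sketch of what start.sh does in shell; the function name, endpoint choice, and timeout are assumptions.

```python
import time
import urllib.request
from urllib.error import URLError


def wait_for_ollama(url: str = "http://localhost:11434", timeout_s: int = 60) -> bool:
    """Poll the Ollama root endpoint until it answers, or give up."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                if resp.status == 200:  # Ollama answers "Ollama is running"
                    return True
        except (URLError, OSError):
            pass  # Server not up yet; retry after a short pause.
        time.sleep(1)
    return False
```

Only once this returns True does it make sense to launch the gateway, since proxied requests would otherwise fail immediately.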
## Configuration

### Environment Variables

- `OLLAMA_HOST`: 0.0.0.0 (allows external connections)
- `OLLAMA_ORIGINS`: `*` (allows CORS requests from any origin)
- `OLLAMA_MODEL`: qwen3:4b (default model, can be overridden)
- `OLLAMA_API_KEY`: (optional) secure API key, auto-generated if not provided
### Default Model

- **Model**: qwen3:4b
- **Fallback**: If no model is specified in the request, qwen3:4b is used

### API Key Management

- **Generation**: If the OLLAMA_API_KEY environment variable is not set, a cryptographically secure random key is generated at startup
- **Access**: The generated key is printed to the application logs during startup
- **Frontend Storage**: The key is stored in the browser's localStorage after being entered once
- **Authentication**: All API requests require a valid Bearer token in the Authorization header
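The generation step likely reduces to a few lines of Python; this is a sketch, with the variable name and log wording being assumptions rather than the exact main.py code:

```python
import os
import secrets

# Honor an operator-provided key; otherwise generate a fresh
# URL-safe key and log it once so it can be copied into the UI.
api_key = os.environ.get("OLLAMA_API_KEY")
if not api_key:
    api_key = secrets.token_urlsafe(32)
    print(f"Generated API key (copy this from the logs): {api_key}")
```

`secrets.token_urlsafe(32)` draws 32 random bytes from the OS CSPRNG and base64url-encodes them, yielding a 43-character key.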
## API Specification

### `/chat_api` Endpoint

- **Method**: POST
- **Authentication**: Requires a Bearer token in the Authorization header
- **Content-Type**: application/json
- **Request Headers**:
  - `Authorization`: Bearer {your_api_key}
  - `Content-Type`: application/json
- **Request Body**:

```json
{
  "model": "string (optional, defaults to qwen3:4b)",
  "prompt": "string (required)"
}
```

- **Response**: Streaming response with incremental model output
- **Error Handling**:
  - Returns 401 for a missing or invalid API key
  - Returns 400 for a missing prompt
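A call against this endpoint could be sketched as follows. This is hypothetical client code, assuming the gateway runs locally on port 7860 and streams newline-delimited JSON objects with a `response` field, as the Ollama generate API does:

```python
import json
import urllib.request
from typing import Iterator


def stream_chat(prompt: str, api_key: str,
                url: str = "http://localhost:7860/chat_api",
                model: str = "qwen3:4b") -> Iterator[str]:
    """Yield incremental response text from the /chat_api endpoint."""
    req = urllib.request.Request(
        url,
        data=json.dumps({"prompt": prompt, "model": model}).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:  # one JSON object per line
            chunk = json.loads(line)
            if text := chunk.get("response"):
                yield text


# Example usage (requires a running gateway and a valid key):
# for piece in stream_chat("Hello!", "your-api-key"):
#     print(piece, end="", flush=True)
```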
### Data Flow

1. The frontend sends the user's message to `/chat_api`
2. The backend forwards the request to the local Ollama API
3. Ollama processes the request with the specified model
4. The response is streamed back to the frontend in real time
5. The frontend displays the response incrementally as it arrives
## Security Considerations

- **API Key Authentication**: Required for all API access via Bearer token authentication
- **Secure Key Generation**: The API key is auto-generated with a cryptographically secure random generator (`secrets.token_urlsafe(32)`)
- **Configurable Keys**: The API key can be set via the OLLAMA_API_KEY environment variable or auto-generated
- **Storage**: The client-side API key is stored in the browser's localStorage
- **CORS**: Enabled for all origins (a potential security concern in production)
- **Input Validation**: Validates the presence of the prompt parameter
- **Local API**: Communicates with Ollama over localhost only
- **Key Exposure**: The auto-generated API key is printed to the console logs during startup (secure these logs in production)
## Performance Features

- **Streaming**: Real-time response streaming for a better user experience
- **Client-side Display**: Incremental message display as responses arrive
- **Efficient Communication**: Uses streaming HTTP responses to minimize perceived latency

## Security Features

- **Authentication**: Bearer token authentication for all API endpoints
- **Key Generation**: Cryptographically secure random API key generation using the `secrets` module
- **Key Storage**: API key stored in browser localStorage (entered via the UI)
- **Transport Security**: API key transmitted via the Authorization header (use HTTPS in production)
## Technologies Used

- **Backend**: Python, FastAPI
- **Frontend**: HTML5, CSS3, JavaScript (ES6+)
- **Containerization**: Docker
- **AI Model**: Ollama with qwen3:4b by default
- **Web Server**: Uvicorn ASGI server
## File Structure

```
OllamaSpace/
├── main.py       (FastAPI application)
├── chat.html     (Chat interface)
├── start.sh      (Container startup script)
├── Dockerfile    (Container configuration)
├── README.md     (Project description)
└── static/
    ├── script.js (Frontend JavaScript)
    └── style.css (Frontend styling)
```
## Build Process

1. The container is built with Ollama and the Python dependencies
2. The model specified by the OLLAMA_MODEL environment variable is pre-pulled
3. The application files are copied into the container
4. The FastAPI dependencies are installed
5. The container starts both the Ollama server and the FastAPI gateway
## Deployment Target

- **Platform**: Designed for Hugging Face Spaces
- **Port**: 7860 (standard for Hugging Face Spaces)
- **Runtime**: Docker container
- **Model Serving**: Ollama behind a FastAPI gateway