# OllamaSpace Technical Specifications
## Project Overview
OllamaSpace is a web-based chat application that serves as a frontend interface for interacting with Ollama language models. The application provides a real-time chat interface where users can communicate with AI models through a web browser.
## Architecture
### Backend
- **Framework**: FastAPI (Python)
- **API Gateway**: Acts as a proxy between the frontend and the Ollama API
- **Streaming**: Supports real-time streaming of model responses
- **Default Model**: qwen3:4b
### Frontend
- **Technology**: Pure HTML/CSS/JavaScript (no frameworks)
- **Interface**: Simple chat interface with message history
- **Interaction**: Real-time message streaming with typing indicators
- **Styling**: Clean, minimal design with distinct user/bot message styling
## Components
### main.py
- **Framework**: FastAPI
- **Authentication**: Implements Bearer token authentication using HTTPBearer
- **Endpoints**:
- `GET /` - Redirects to `/chat`
- `GET /chat` - Serves the chat HTML page
- `POST /chat_api` - API endpoint that forwards requests to Ollama (requires authentication)
- **Functionality**:
- Proxies requests to the local Ollama API (http://localhost:11434)
- Streams model responses back to the frontend
- Handles error cases and validation
- Auto-generates a secure API key when none is provided via the environment (sketched below)
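A minimal sketch of this proxy pattern, assuming an `httpx`-based relay to Ollama's `/api/generate` endpoint; names and structure are illustrative, not the actual main.py source:

```python
import os
import secrets

import httpx
from fastapi import Depends, FastAPI, HTTPException
from fastapi.responses import StreamingResponse
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
security = HTTPBearer()

# Fall back to a generated key when OLLAMA_API_KEY is unset
# (assumption: the real main.py may structure this differently).
API_KEY = os.environ.get("OLLAMA_API_KEY") or secrets.token_urlsafe(32)
OLLAMA_URL = "http://localhost:11434/api/generate"


def verify_key(creds: HTTPAuthorizationCredentials = Depends(security)) -> None:
    # Constant-time comparison of the Bearer token against the configured key.
    if not secrets.compare_digest(creds.credentials, API_KEY):
        raise HTTPException(status_code=401, detail="Invalid API key")


@app.post("/chat_api")
async def chat_api(body: dict, _: None = Depends(verify_key)) -> StreamingResponse:
    if not body.get("prompt"):
        raise HTTPException(status_code=400, detail="Missing prompt")
    payload = {
        "model": body.get("model") or os.environ.get("OLLAMA_MODEL", "qwen3:4b"),
        "prompt": body["prompt"],
    }

    async def relay():
        # Forward the request to the local Ollama API and pass its
        # streamed chunks through to the browser unchanged.
        async with httpx.AsyncClient(timeout=None) as client:
            async with client.stream("POST", OLLAMA_URL, json=payload) as resp:
                async for chunk in resp.aiter_bytes():
                    yield chunk

    return StreamingResponse(relay(), media_type="application/x-ndjson")
```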
### chat.html
- **Template**: HTML structure for the chat interface with API key management
- **Layout**:
- Header with API key input and save button
- Chat window area with message history
- Message input field
- Send button
- **Static Assets**: Links to CSS and JavaScript files
### static/script.js
- **Features**:
- Real-time message streaming from the API
- Message display in chat format
- Enter key support for sending messages
- Stream parsing to handle JSON responses
- API key management with localStorage persistence
- API key input UI with save functionality
- **API Communication**:
- Includes API key in Authorization header as Bearer token
- POSTs to `/chat_api` endpoint
- Receives streaming responses and displays incrementally
- Handles error cases gracefully
### static/style.css
- **Design**: Minimal, clean chat interface with API key management section
- **Styling**:
- Distinct colors for user vs. bot messages
- Responsive layout
- API key section in header with input field and save button
- Auto-scrolling to latest messages
## Deployment
### Dockerfile
- **Base Image**: ollama/ollama
- **Environment**: Sets up Ollama server and FastAPI gateway
- **Port Configuration**: Listens on port 7860 (Hugging Face Spaces default)
- **Model Setup**: Downloads the specified model during the build process
- **Dependencies**: Installs Python, FastAPI, and related libraries
### start.sh
- **Initialization Sequence**:
1. Starts Ollama server in background
2. Health-checks the Ollama server
3. Starts FastAPI gateway on port 7860
- **Error Handling**: Waits for Ollama to be ready before starting the gateway (readiness check illustrated below)
- **API Key**: If auto-generated, the API key will be displayed in the console logs during startup
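start.sh itself is a shell script; the Python sketch below only illustrates the readiness check it performs, assuming Ollama's root endpoint (which answers 200 once the server is up):

```python
import time
import urllib.request


def wait_for_ollama(url: str = "http://localhost:11434", timeout: float = 60.0) -> None:
    """Block until the Ollama server answers, or raise after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url) as resp:
                if resp.status == 200:  # Ollama replies "Ollama is running"
                    return
        except OSError:
            pass  # not listening yet; retry shortly
        time.sleep(1)
    raise RuntimeError("Ollama server did not become ready in time")
```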
## Configuration
### Environment Variables
- `OLLAMA_HOST`: 0.0.0.0 (allows external connections)
- `OLLAMA_ORIGINS`: '*' (allows CORS requests)
- `OLLAMA_MODEL`: qwen3:4b (default model, can be overridden)
- `OLLAMA_API_KEY`: (optional) Secure API key (auto-generated if not provided)
### Default Model
- **Model**: qwen3:4b
- **Fallback**: If no model is specified in the request, qwen3:4b is used
### API Key Management
- **Generation**: If no OLLAMA_API_KEY environment variable is set, a cryptographically secure random key is generated at startup (sketched after this list)
- **Access**: Generated API key is displayed in the application logs during startup
- **Frontend Storage**: API key is stored in browser's localStorage after being entered once
- **Authentication**: All API requests require a valid Bearer token in the Authorization header
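A hedged sketch of that generate-and-log behavior; the exact wording of the log line is an assumption:

```python
import os
import secrets

api_key = os.environ.get("OLLAMA_API_KEY")
if not api_key:
    # 32 bytes of randomness from the OS CSPRNG, URL-safe encoded.
    api_key = secrets.token_urlsafe(32)
    # Surfaced once at startup so the operator can paste it into the UI.
    print(f"Generated API key: {api_key}")
```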
## API Specification
### `/chat_api` Endpoint
- **Method**: POST
- **Authentication**: Requires Bearer token in Authorization header
- **Content-Type**: application/json
- **Request Headers**:
- `Authorization`: Bearer {your_api_key}
- `Content-Type`: application/json
- **Request Body**:
```json
{
"model": "string (optional, defaults to qwen3:4b)",
"prompt": "string (required)"
}
```
- **Response**: Streaming response with incremental model output (client sketch below)
- **Error Handling**:
- Returns 401 for an invalid API key
- Returns 400 for a missing prompt
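For illustration, a minimal Python client consuming this endpoint; the `response`/`done` field names follow Ollama's streaming generate format, and the key and host values are placeholders:

```python
import json

import requests

resp = requests.post(
    "http://localhost:7860/chat_api",
    headers={"Authorization": "Bearer YOUR_API_KEY"},  # placeholder key
    json={"prompt": "Why is the sky blue?"},
    stream=True,
)
resp.raise_for_status()

# The body arrives as newline-delimited JSON; print each fragment as it lands.
for line in resp.iter_lines():
    if not line:
        continue
    chunk = json.loads(line)
    print(chunk.get("response", ""), end="", flush=True)
    if chunk.get("done"):
        break
print()
```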
### Data Flow
1. Frontend sends user message to `/chat_api`
2. Backend forwards request to local Ollama API
3. Ollama processes request with specified model
4. Response is streamed back to frontend in real-time
5. Frontend displays response incrementally as it arrives
## Security Considerations
- **API Key Authentication**: Required for all API access using Bearer token authentication
- **Secure Key Generation**: API key is auto-generated using a cryptographically secure random generator (`secrets.token_urlsafe(32)`)
- **Configurable Keys**: API key can be set via environment variable (OLLAMA_API_KEY) or auto-generated
- **Storage**: Client-side API key stored in browser's localStorage
- **CORS**: Enabled for all origins (potential security concern in production)
- **Input Validation**: Validates presence of prompt parameter
- **Local API**: Communicates with Ollama through localhost only
- **Key Exposure**: Auto-generated API key is displayed in console logs during startup (should be secured in production)
## Performance Features
- **Streaming**: Real-time response streaming for better UX
- **Client-side Display**: Incremental message display as responses arrive
- **Efficient Communication**: Uses streaming HTTP responses to minimize latency
## Security Features
- **Authentication**: Bearer token authentication for all API endpoints
- **Key Generation**: Cryptographically secure random API key generation using the `secrets` module
- **Key Storage**: API key stored in browser localStorage (with option to enter via UI)
- **Transport Security**: API key is transmitted via the Authorization header, so HTTPS should be used in production (see the sketch below)
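If the gateway were ever exposed directly rather than behind the Hugging Face Spaces proxy (which already terminates TLS), Uvicorn can serve HTTPS itself. A minimal sketch; the certificate paths are placeholders:

```python
import uvicorn

uvicorn.run(
    "main:app",
    host="0.0.0.0",
    port=7860,
    ssl_keyfile="/path/to/key.pem",    # hypothetical certificate paths
    ssl_certfile="/path/to/cert.pem",
)
```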
## Technologies Used
- **Backend**: Python, FastAPI
- **Frontend**: HTML5, CSS3, JavaScript (ES6+)
- **Containerization**: Docker
- **AI Model**: Ollama with qwen3:4b by default
- **Web Server**: Uvicorn ASGI server
## File Structure
```
OllamaSpace/
├── main.py (FastAPI application)
├── chat.html (Chat interface)
├── start.sh (Container startup script)
├── Dockerfile (Container configuration)
├── README.md (Project description)
└── static/
    ├── script.js (Frontend JavaScript)
    └── style.css (Frontend styling)
```
## Build Process
1. Container built with Ollama and Python dependencies
2. Model specified by OLLAMA_MODEL environment variable is pre-pulled
3. Application files are copied into container
4. FastAPI dependencies are installed
5. Container starts with Ollama server and FastAPI gateway
## Deployment Target
- **Platform**: Designed for Hugging Face Spaces
- **Port**: 7860 (standard for Hugging Face Spaces)
- **Runtime**: Docker container
- **Model Serving**: Ollama with FastAPI gateway