
OllamaSpace Technical Specifications

Project Overview

OllamaSpace is a web-based chat application that serves as a frontend interface for interacting with Ollama language models. The application provides a real-time chat interface where users can communicate with AI models through a web browser.

Architecture

Backend

  • Framework: FastAPI (Python)
  • API Gateway: Acts as a proxy between the frontend and Ollama API
  • Streaming: Supports real-time streaming of model responses
  • Default Model: qwen3:4b

Frontend

  • Technology: Pure HTML/CSS/JavaScript (no frameworks)
  • Interface: Simple chat interface with message history
  • Interaction: Real-time message streaming with typing indicators
  • Styling: Clean, minimal design with distinct user/bot message styling

Components

main.py

  • Framework: FastAPI
  • Authentication: Implements Bearer token authentication using HTTPBearer
  • Endpoints:
    • GET / - Redirects to /chat
    • GET /chat - Serves the chat HTML page
    • POST /chat_api - API endpoint that forwards requests to Ollama (requires authentication)
  • Functionality:
    • Proxies requests to local Ollama API (http://localhost:11434)
    • Streams model responses back to the frontend
    • Handles error cases and validation
    • Auto-generates secure API key if not provided via environment variable
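
The key-handling logic described above can be sketched in plain Python. This is a minimal, stdlib-only illustration, not the actual main.py: the real application parses the header via FastAPI's HTTPBearer, and the helper name `verify_bearer` is illustrative.

```python
import hmac
import os
import secrets

# Load the API key from the environment, or auto-generate a secure one,
# exactly as the spec describes (secrets.token_urlsafe(32)).
API_KEY = os.environ.get("OLLAMA_API_KEY") or secrets.token_urlsafe(32)


def verify_bearer(authorization_header: str) -> bool:
    """Check an `Authorization: Bearer <key>` header against API_KEY.

    Uses a constant-time comparison to avoid leaking key material
    through timing differences.
    """
    scheme, _, token = authorization_header.partition(" ")
    if scheme != "Bearer" or not token:
        return False
    return hmac.compare_digest(token, API_KEY)
```

In the real endpoint, a failed check would translate into the 401 response documented under the API specification.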

chat.html

  • Template: HTML structure for the chat interface with API key management
  • Layout:
    • Header with API key input and save button
    • Chat window area with message history
    • Message input field
    • Send button
  • Static Assets: Links to CSS and JavaScript files

static/script.js

  • Features:
    • Real-time message streaming from the API
    • Message display in chat format
    • Enter key support for sending messages
    • Stream parsing to handle JSON responses
    • API key management with localStorage persistence
    • API key input UI with save functionality
  • API Communication:
    • Includes API key in Authorization header as Bearer token
    • POSTs to /chat_api endpoint
    • Receives streaming responses and displays incrementally
    • Handles error cases gracefully
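
The stream parsing that script.js performs in the browser can be sketched in Python (the project's backend language). This assumes Ollama-style newline-delimited JSON lines such as `{"response": "...", "done": false}`, and that network chunks may split a line at arbitrary byte boundaries; the function name is illustrative.

```python
import json
from typing import Iterable, Iterator


def iter_stream_tokens(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield incremental text from a newline-delimited JSON stream.

    Buffers partial lines across chunks, parses each complete line,
    and stops once a payload reports "done": true.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n" in buffer:
            line, _, buffer = buffer.partition(b"\n")
            if not line.strip():
                continue  # ignore keep-alive blank lines
            payload = json.loads(line)
            if payload.get("response"):
                yield payload["response"]
            if payload.get("done"):
                return
```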

static/style.css

  • Design: Minimal, clean chat interface with API key management section
  • Styling:
    • Distinct colors for user vs. bot messages
    • Responsive layout
    • API key section in header with input field and save button
    • Auto-scrolling to latest messages

Deployment

Dockerfile

  • Base Image: ollama/ollama
  • Environment: Sets up Ollama server and FastAPI gateway
  • Port Configuration: Listens on port 7860 (Hugging Face Spaces default)
  • Model Setup: Downloads specified model during build process
  • Dependencies: Installs Python, FastAPI, and related libraries

start.sh

  • Initialization Sequence:
    1. Starts Ollama server in background
    2. Health checks the Ollama server
    3. Starts FastAPI gateway on port 7860
  • Error Handling: Waits for Ollama to be ready before starting the gateway
  • API Key: If auto-generated, the API key will be displayed in the console logs during startup
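
The wait-for-Ollama step can be sketched as a simple retry loop. This is an illustration of the logic, not the shell code in start.sh; the probe is injected so the retry behavior can be shown in isolation (in practice it would issue an HTTP request to http://localhost:11434).

```python
import time
from typing import Callable


def wait_for_ollama(probe: Callable[[], bool],
                    retries: int = 30,
                    delay: float = 1.0,
                    sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `probe` until the Ollama server reports ready.

    Returns True as soon as a probe succeeds, or False after
    `retries` failed attempts, sleeping `delay` seconds between tries.
    """
    for _ in range(retries):
        if probe():
            return True
        sleep(delay)
    return False
```

Only if this returns True would the gateway be started on port 7860, matching the initialization sequence above.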

Configuration

Environment Variables

  • OLLAMA_HOST: 0.0.0.0 (allows external connections)
  • OLLAMA_ORIGINS: '*' (allows CORS requests)
  • OLLAMA_MODEL: qwen3:4b (default model, can be overridden)
  • OLLAMA_API_KEY: (optional) Secure API key (auto-generated if not provided)
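
The variables above resolve to a configuration like the following sketch. The defaults mirror the spec; the function name and dict shape are illustrative, and in real use the mapping passed in would be `os.environ`.

```python
def load_config(env: dict) -> dict:
    """Resolve OllamaSpace configuration from an environment mapping.

    Each value falls back to the documented default when the
    corresponding environment variable is unset.
    """
    return {
        "host": env.get("OLLAMA_HOST", "0.0.0.0"),
        "origins": env.get("OLLAMA_ORIGINS", "*"),
        "model": env.get("OLLAMA_MODEL", "qwen3:4b"),
        # None signals that a secure key should be auto-generated at startup.
        "api_key": env.get("OLLAMA_API_KEY"),
    }
```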

Default Model

  • Model: qwen3:4b
  • Fallback: If no model is specified in the request, the backend falls back to qwen3:4b

API Key Management

  • Generation: If no OLLAMA_API_KEY environment variable is set, a cryptographically secure random key is generated at startup
  • Access: Generated API key is displayed in the application logs during startup
  • Frontend Storage: API key is stored in browser's localStorage after being entered once
  • Authentication: All API requests require a valid Bearer token in the Authorization header

API Specification

/chat_api Endpoint

  • Method: POST
  • Authentication: Requires Bearer token in Authorization header
  • Content-Type: application/json
  • Request Headers:
    • Authorization: Bearer {your_api_key}
    • Content-Type: application/json
  • Request Body:
    {
      "model": "string (optional, defaults to qwen3:4b)",
      "prompt": "string (required)"
    }
    
  • Response: Streaming response with incremental model output
  • Error Handling:
    • Returns 401 for invalid API key
    • Returns 400 for missing prompt

Data Flow

  1. Frontend sends user message to /chat_api
  2. Backend forwards request to local Ollama API
  3. Ollama processes request with specified model
  4. Response is streamed back to frontend in real-time
  5. Frontend displays response incrementally as it arrives
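
Step 4 of this flow reduces to the gateway relaying upstream chunks without modification. A minimal sketch, with the Ollama response body stubbed as any iterable of bytes:

```python
from typing import Iterable, Iterator


def relay_stream(upstream: Iterable[bytes]) -> Iterator[bytes]:
    """Forward response chunks from Ollama to the browser unchanged.

    In the real gateway `upstream` is the body of a streaming HTTP
    response from http://localhost:11434; empty keep-alive chunks
    are dropped so the frontend only receives payload bytes.
    """
    for chunk in upstream:
        if chunk:
            yield chunk
```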

Security Considerations

  • API Key Authentication: Required for all API access using Bearer token authentication
  • Secure Key Generation: The API key is auto-generated using a cryptographically secure random generator (secrets.token_urlsafe(32))
  • Configurable Keys: API key can be set via environment variable (OLLAMA_API_KEY) or auto-generated
  • Storage: Client-side API key stored in browser's localStorage
  • CORS: Enabled for all origins (potential security concern in production)
  • Input Validation: Validates presence of prompt parameter
  • Local API: Communicates with Ollama through localhost only
  • Key Exposure: Auto-generated API key is displayed in console logs during startup (should be secured in production)

Performance Features

  • Streaming: Real-time response streaming for better UX
  • Client-side Display: Incremental message display as responses arrive
  • Efficient Communication: Uses streaming HTTP responses to minimize latency

Security Features

  • Authentication: Bearer token authentication for all API endpoints
  • Key Generation: Cryptographically secure random API key generation using secrets module
  • Key Storage: API key stored in browser localStorage (with option to enter via UI)
  • Transport Security: API key transmitted via Authorization header (should use HTTPS in production)

Technologies Used

  • Backend: Python, FastAPI
  • Frontend: HTML5, CSS3, JavaScript (ES6+)
  • Containerization: Docker
  • AI Model: Ollama with qwen3:4b by default
  • Web Server: Uvicorn ASGI server

File Structure

OllamaSpace/
├── main.py (FastAPI application)
├── chat.html (Chat interface)
├── start.sh (Container startup script)
├── Dockerfile (Container configuration)
├── README.md (Project description)
└── static/
    ├── script.js (Frontend JavaScript)
    └── style.css (Frontend styling)

Build Process

  1. Container built with Ollama and Python dependencies
  2. Model specified by OLLAMA_MODEL environment variable is pre-pulled
  3. Application files are copied into container
  4. FastAPI dependencies are installed
  5. Container starts with Ollama server and FastAPI gateway

Deployment Target

  • Platform: Designed for Hugging Face Spaces
  • Port: 7860 (standard for Hugging Face Spaces)
  • Runtime: Docker container
  • Model Serving: Ollama with FastAPI gateway