# OllamaSpace Technical Specifications
## Project Overview
OllamaSpace is a web-based chat application that serves as a frontend interface for interacting with Ollama language models. The application provides a real-time chat interface where users can communicate with AI models through a web browser.
## Architecture
### Backend
- **Framework**: FastAPI (Python)
- **API Gateway**: Acts as a proxy between the frontend and Ollama API
- **Streaming**: Supports real-time streaming of model responses
- **Default Model**: qwen3:4b
### Frontend
- **Technology**: Pure HTML/CSS/JavaScript (no frameworks)
- **Interface**: Simple chat interface with message history
- **Interaction**: Real-time message streaming with typing indicators
- **Styling**: Clean, minimal design with distinct user/bot message styling
## Components
### main.py
- **Framework**: FastAPI
- **Authentication**: Implements Bearer token authentication using HTTPBearer
- **Endpoints**:
- `GET /` - Redirects to `/chat`
- `GET /chat` - Serves the chat HTML page
- `POST /chat_api` - API endpoint that forwards requests to Ollama (requires authentication)
- **Functionality**:
- Proxies requests to the local Ollama API (`http://localhost:11434`)
- Streams model responses back to the frontend
- Handles error cases and validation
- Auto-generates secure API key if not provided via environment variable
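The Bearer-token check described above can be sketched as follows. This is a stdlib-only illustration; the actual main.py uses FastAPI's HTTPBearer dependency, and the helper name here is hypothetical:

```python
import secrets

def is_authorized(auth_header, expected_key):
    """Validate an `Authorization: Bearer <key>` header value.

    Uses secrets.compare_digest so the comparison takes constant
    time regardless of where the strings first differ.
    """
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return False
    return secrets.compare_digest(auth_header[len(prefix):], expected_key)
```

In the FastAPI version, the same logic lives inside a dependency that raises a 401 `HTTPException` when the check fails.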
### chat.html
- **Template**: HTML structure for the chat interface with API key management
- **Layout**:
- Header with API key input and save button
- Chat window area with message history
- Message input field
- Send button
- **Static Assets**: Links to CSS and JavaScript files
### static/script.js
- **Features**:
- Real-time message streaming from the API
- Message display in chat format
- Enter key support for sending messages
- Stream parsing to handle JSON responses
- API key management with localStorage persistence
- API key input UI with save functionality
- **API Communication**:
- Includes API key in Authorization header as Bearer token
- POSTs to `/chat_api` endpoint
- Receives streaming responses and displays incrementally
- Handles error cases gracefully
### static/style.css
- **Design**: Minimal, clean chat interface with API key management section
- **Styling**:
- Distinct colors for user vs. bot messages
- Responsive layout
- API key section in header with input field and save button
- Auto-scrolling to latest messages
## Deployment
### Dockerfile
- **Base Image**: ollama/ollama
- **Environment**: Sets up Ollama server and FastAPI gateway
- **Port Configuration**: Listens on port 7860 (Hugging Face Spaces default)
- **Model Setup**: Downloads specified model during build process
- **Dependencies**: Installs Python, FastAPI, and related libraries
### start.sh
- **Initialization Sequence**:
1. Starts Ollama server in background
2. Health checks the Ollama server
3. Starts FastAPI gateway on port 7860
- **Error Handling**: Waits for Ollama to be ready before starting the gateway
- **API Key**: If auto-generated, the API key will be displayed in the console logs during startup
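The wait-for-ready step can be modeled as a small polling loop. This is a sketch in Python for clarity; the actual start.sh presumably does the equivalent in shell against the Ollama port, and the names `probe` and `wait_for_ready` are illustrative:

```python
import time

def wait_for_ready(probe, retries=30, delay=1.0):
    """Poll `probe` until it reports success or retries run out.

    `probe` is any zero-argument callable that returns True once the
    Ollama server answers, e.g. an HTTP GET to http://localhost:11434.
    """
    for _ in range(retries):
        if probe():
            return True
        time.sleep(delay)
    return False
```

Starting the gateway only after this returns True avoids a race where the first chat request arrives before Ollama can serve it.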
## Configuration
### Environment Variables
- `OLLAMA_HOST`: 0.0.0.0 (allows external connections)
- `OLLAMA_ORIGINS`: '*' (allows CORS requests)
- `OLLAMA_MODEL`: qwen3:4b (default model, can be overridden)
- `OLLAMA_API_KEY`: (optional) Secure API key (auto-generated if not provided)
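Reading this configuration in Python could look like the following sketch. Only the environment variable names and defaults come from the spec; the module-level layout is illustrative:

```python
import os
import secrets

# Defaults mirror the table above; the API key is auto-generated
# from a cryptographically secure source when the variable is unset.
OLLAMA_HOST = os.environ.get("OLLAMA_HOST", "0.0.0.0")
OLLAMA_MODEL = os.environ.get("OLLAMA_MODEL", "qwen3:4b")
OLLAMA_API_KEY = os.environ.get("OLLAMA_API_KEY") or secrets.token_urlsafe(32)
```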
### Default Model
- **Model**: qwen3:4b
- **Fallback**: If no model is specified in the request, qwen3:4b is used
### API Key Management
- **Generation**: If no OLLAMA_API_KEY environment variable is set, a cryptographically secure random key is generated at startup
- **Access**: Generated API key is displayed in the application logs during startup
- **Frontend Storage**: API key is stored in browser's localStorage after being entered once
- **Authentication**: All API requests require a valid Bearer token in the Authorization header
## API Specification
### `/chat_api` Endpoint
- **Method**: POST
- **Authentication**: Requires Bearer token in Authorization header
- **Content-Type**: application/json
- **Request Headers**:
- `Authorization`: Bearer {your_api_key}
- `Content-Type`: application/json
- **Request Body**:
```json
{
  "model": "string (optional, defaults to qwen3:4b)",
  "prompt": "string (required)"
}
```
- **Response**: Streaming response with incremental model output
- **Error Handling**:
- Returns 401 for invalid API key
- Returns 400 for missing prompt
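A client could assemble such a request as follows. This is a stdlib sketch; `build_chat_request` is a hypothetical helper, not part of the project:

```python
import json

def build_chat_request(prompt, api_key, model=None):
    """Return (headers, body) for POST /chat_api per the spec above."""
    if not prompt:
        raise ValueError("prompt is required")  # server would answer 400
    payload = {"prompt": prompt}
    if model:  # optional; the server falls back to qwen3:4b
        payload["model"] = model
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return headers, json.dumps(payload)
```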
### Data Flow
1. Frontend sends user message to `/chat_api`
2. Backend forwards request to local Ollama API
3. Ollama processes request with specified model
4. Response is streamed back to frontend in real-time
5. Frontend displays response incrementally as it arrives
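Steps 4 and 5 hinge on parsing the stream incrementally. Assuming the newline-delimited JSON format used by Ollama's generate API, where each object carries a `response` fragment and a `done` flag, assembly could be sketched as:

```python
import json

def assemble_stream(chunks):
    """Join `response` fragments from newline-delimited JSON chunks.

    `chunks` is an iterable of raw text pieces as they arrive over the
    wire; a piece may contain part of a line, one line, or several.
    """
    parts = []
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if not line.strip():
                continue
            obj = json.loads(line)
            parts.append(obj.get("response", ""))
            if obj.get("done"):
                return "".join(parts)
    return "".join(parts)
```

The browser-side script.js does the same job in JavaScript, appending each fragment to the chat window as it arrives rather than waiting for `done`.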
## Security Considerations
- **API Key Authentication**: Required for all API access using Bearer token authentication
- **Secure Key Generation**: API key is auto-generated using a cryptographically secure random generator (`secrets.token_urlsafe(32)`)
- **Configurable Keys**: API key can be set via environment variable (OLLAMA_API_KEY) or auto-generated
- **Storage**: Client-side API key stored in browser's localStorage
- **CORS**: Enabled for all origins (potential security concern in production)
- **Input Validation**: Validates presence of prompt parameter
- **Local API**: Communicates with Ollama through localhost only
- **Key Exposure**: Auto-generated API key is displayed in console logs during startup (should be secured in production)
## Performance Features
- **Streaming**: Real-time response streaming for better UX
- **Client-side Display**: Incremental message display as responses arrive
- **Efficient Communication**: Uses streaming HTTP responses to minimize latency
## Security Features
- **Authentication**: Bearer token authentication for all API endpoints
- **Key Generation**: Cryptographically secure random API key generation using secrets module
- **Key Storage**: API key stored in browser localStorage (with option to enter via UI)
- **Transport Security**: API key transmitted via Authorization header (should use HTTPS in production)
## Technologies Used
- **Backend**: Python, FastAPI
- **Frontend**: HTML5, CSS3, JavaScript (ES6+)
- **Containerization**: Docker
- **AI Model**: Ollama serving qwen3:4b by default
- **Web Server**: Uvicorn ASGI server
## File Structure
```
OllamaSpace/
├── main.py (FastAPI application)
├── chat.html (Chat interface)
├── start.sh (Container startup script)
├── Dockerfile (Container configuration)
├── README.md (Project description)
└── static/
    ├── script.js (Frontend JavaScript)
    └── style.css (Frontend styling)
```
## Build Process
1. Container built with Ollama and Python dependencies
2. Model specified by OLLAMA_MODEL environment variable is pre-pulled
3. Application files are copied into container
4. FastAPI dependencies are installed
5. Container starts with Ollama server and FastAPI gateway
## Deployment Target
- **Platform**: Designed for Hugging Face Spaces
- **Port**: 7860 (standard for Hugging Face Spaces)
- **Runtime**: Docker container
- **Model Serving**: Ollama with FastAPI gateway