orchestratorText = """
# Chabo Orchestrator Documentation
### Table of Contents
1. Overview
2. System Architecture
3. Components
4. Configuration
5. Deployment Guide
6. API Reference
7. Usage Examples
8. Troubleshooting
## Overview
The Chabo Orchestrator is the central coordination module of the Chabo RAG system. \
It coordinates the flow between multiple microservices to provide intelligent \
document processing and question-answering capabilities. The system is designed for deployment on Hugging Face Spaces.
### Key Features:
- **Workflow Orchestration**: Uses LangGraph to manage complex processing pipelines
- **Multi-Modal Support**: Handles file types determined by the ChatUI and Ingestor configuration (e.g. PDF, DOCX, GeoJSON, and JSON)
- **Streaming Responses**: Real-time response generation with Server-Sent Events (SSE)
- **Dual Processing Modes**:
- **Direct Output Mode**: Returns ingestor results immediately (e.g. EUDR use case)
- **Standard RAG Mode**: Full retrieval-augmented generation pipeline
- **Intelligent Caching**: Prevents redundant file processing (e.g. EUDR use case)
- **Multiple Interfaces**: FastAPI endpoints for modules; LangServe endpoints for ChatUI; Gradio UI for testing
## System Architecture
### High-Level Architecture
```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ ChatUI β”‚
β”‚ Frontend β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜
β”‚ HTTP/SSE
β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Chabo Orchestrator β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚ LangGraph Workflow β”‚ β”‚
β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚
β”‚ β”‚ β”‚ Detect File β”‚ β”‚ β”‚
β”‚ β”‚ β”‚ Type β”‚ β”‚ β”‚
β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚
β”‚ β”‚ β”‚ β”‚ β”‚
β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ β”‚
β”‚ β”‚ β”‚ Ingest File β”‚ β”‚ β”‚
β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚
β”‚ β”‚ β”‚ β”‚ β”‚
β”‚ β”‚ β”Œβ”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β” β”‚ β”‚
β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
β”‚ β”‚ β”Œβ”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β” β”‚ β”‚
β”‚ β”‚ β”‚Directβ”‚ β”‚Retrieveβ”‚ β”‚ β”‚
β”‚ β”‚ β”‚Outputβ”‚ β”‚Context β”‚ β”‚ β”‚
β”‚ β”‚ β””β”€β”€β”¬β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”¬β”€β”€β”€β”˜ β”‚ β”‚
β”‚ β”‚ β”‚ β”‚ β”‚ β”‚
β”‚ β”‚ β”‚ β”Œβ”€β”€β”€β”€β–Όβ”€β”€β”€β” β”‚ β”‚
β”‚ β”‚ β”‚ β”‚Generateβ”‚ β”‚ β”‚
β”‚ β”‚ β”‚ β”‚Responseβ”‚ β”‚ β”‚
β”‚ β”‚ β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”˜
β”‚ β”‚ β”‚
β”Œβ”€β”€β”€β–Όβ”€β”€β” β”Œβ”€β”€β”€β–Όβ”€β”€β”€β” β”Œβ”€β”€β–Όβ”€β”€β”€β”€β”
β”‚Ingestβ”‚ β”‚Retrie-β”‚ β”‚Genera-β”‚
β”‚or β”‚ β”‚ver β”‚ β”‚tor β”‚
β””β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”˜
```
### Component Communication
All communication between modules happens over HTTP:
- **Orchestrator ↔ Ingestor**: Gradio Client (file upload, processing)
- **Orchestrator ↔ Retriever**: Gradio Client (semantic search)
- **Orchestrator ↔ Generator**: HTTP streaming (SSE for real-time responses)
- **ChatUI ↔ Orchestrator**: LangServe streaming endpoints
### Workflow Logic
The orchestrator implements two distinct workflows:
**Direct Output Workflow** (when `DIRECT_OUTPUT=True` and file is new):
```
File Upload β†’ Detect Type β†’ Ingest β†’ Direct Output β†’ Return Results
```
**Standard RAG Workflow** (default or cached files):
```
Query β†’ Retrieve Context β†’ Generate Response β†’ Stream to User
```
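The choice between these two workflows can be sketched as a routing function. This is an illustrative sketch only: the function name matches `route_workflow()` from `nodes.py`, but the state keys (`direct_output`, `file_uploaded`, `file_cached`) and return values are assumptions, not the actual implementation.

```python
# Sketch of the routing decision between the two workflows described above.
# State keys and node names are illustrative assumptions.

def route_workflow(state: dict) -> str:
    """Pick the next node: direct output for new files when enabled,
    otherwise the standard RAG path."""
    direct_output_enabled = state.get("direct_output", False)
    # A file only qualifies for direct output if it has not been seen before.
    has_new_file = state.get("file_uploaded", False) and not state.get("file_cached", False)
    if direct_output_enabled and has_new_file:
        return "direct_output"      # File Upload -> Ingest -> Direct Output
    return "retrieve_context"       # Query -> Retrieve -> Generate

# Example: a previously cached file falls back to the standard RAG workflow.
next_node = route_workflow({"direct_output": True, "file_uploaded": True, "file_cached": True})
```

Note that a cached file takes the RAG path even when `DIRECT_OUTPUT=True`, which is what prevents ChatUI's resent file history from triggering repeated ingestion.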
## Components
### 1. Main Application (`main.py`)
- LangServe endpoints for ChatUI integration
- Gradio web interface for testing
- FastAPI endpoints for diagnostics and future use (e.g. /health)
- Cache management endpoint (for direct output use cases)
**Key Functions:**
- `chatui_adapter()`: Handles text-only queries
- `chatui_file_adapter()`: Handles file uploads with queries
- `create_gradio_interface()`: Test UI
### 2. Workflow Nodes (`nodes.py`)
LangGraph nodes that implement the processing pipeline:
**Node Functions:**
- `detect_file_type_node()`: Identifies file type and determines routing
- `ingest_node()`: Processes files through appropriate ingestor
- `direct_output_node()`: Returns raw ingestor results
- `retrieve_node()`: Fetches relevant context from vector store
- `generate_node_streaming()`: Streams LLM responses
- `route_workflow()`: Conditional routing logic
**Helper Functions:**
- `process_query_streaming()`: Unified streaming interface
- `compute_file_hash()`: SHA256 hashing for deduplication
- `clear_direct_output_cache()`: Cache management
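The deduplication idea behind `compute_file_hash()` can be sketched with the standard library. The signature below is an assumption (the real helper may hash a path or a stream rather than raw bytes):

```python
import hashlib

# Sketch of SHA256-based deduplication as described for compute_file_hash().
# The bytes-in / hex-digest-out signature is an assumption.

def compute_file_hash(content: bytes) -> str:
    """Return the SHA256 hex digest used as a cache key for uploaded files."""
    return hashlib.sha256(content).hexdigest()

# The same bytes always map to the same key, so a re-uploaded file can be
# answered from the direct-output cache instead of being re-ingested.
cache: dict[str, str] = {}
key = compute_file_hash(b'{"type": "FeatureCollection"}')
if key not in cache:
    cache[key] = "ingestor result goes here"
```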
### 3. Data Models (`models.py`)
Pydantic models for type validation
### 4. Retriever Adapter (`retriever_adapter.py`)
Abstraction layer for managing different retriever configurations:
- Handles authentication
- Formats queries and filters
### 5. Utilities (`utils.py`)
Helper functions
#### Conversation Context Management
The `build_conversation_context()` function manages conversation history to provide relevant context to the generator while respecting token limits and conversation flow.
**Key Features:**
- **Context Selection**: Always includes the first user and assistant messages to maintain conversation context
- **Recent Turn Limiting**: Includes only the last N complete turns (user + assistant pairs) to focus on recent conversation (default: 3)
- **Character Limit Management**: Truncates to maximum character limits to prevent context overflow
**Function Parameters:**
```python
def build_conversation_context(
messages, # List of Message objects from conversation
max_turns: int = 3, # Maximum number of recent turns to include
max_chars: int = 8000 # Maximum total characters in context
) -> str
```
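The selection logic above can be sketched as follows. `Message` here is a stand-in for the real Pydantic model in `models.py`; only the `role` and `content` fields are assumed, and the real function may format or truncate differently:

```python
from dataclasses import dataclass

# Minimal sketch of build_conversation_context(); Message is a stand-in
# for the Pydantic model in models.py.

@dataclass
class Message:
    role: str      # "user" or "assistant"
    content: str

def build_conversation_context(messages, max_turns: int = 3, max_chars: int = 8000) -> str:
    # Always keep the opening exchange to anchor the conversation topic.
    head = messages[:2]
    # Keep only the last N complete turns (a turn = user + assistant pair).
    tail = messages[2:][-2 * max_turns:]
    context = "\n".join(f"{m.role}: {m.content}" for m in head + tail)
    # Truncate from the front so the most recent turns survive.
    return context[-max_chars:] if len(context) > max_chars else context
```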
## Configuration
### Configuration File (`params.cfg`)
```ini
[file_processing]
# Enable direct output mode: when True, ingestor results are returned directly
# without going through the generator. When False, all files go through full RAG pipeline.
# This also prevents ChatUI from resending the file in the conversation history with each turn
# Note: File type validation is handled by the ChatUI frontend
DIRECT_OUTPUT = True
[conversation_history]
# Limit the context window for the conversation history
MAX_TURNS = 3
MAX_CHARS = 12000
[retriever]
RETRIEVER = https://giz-chatfed-retriever0-3.hf.space/
# Optional
COLLECTION_NAME = EUDR
[generator]
GENERATOR = https://giz-eudr-chabo-generator.hf.space
[ingestor]
INGESTOR = https://giz-eudr-chabo-ingestor.hf.space
[general]
# Required to stay within HF inference endpoint context limits
MAX_CONTEXT_CHARS = 15000
```
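This file can be read with the standard library's `configparser`; the sketch below parses a trimmed-down sample inline, while the actual loading code in the repo may differ:

```python
import configparser

# Sketch of reading params.cfg; in the app you would call
# config.read("params.cfg") instead of parsing an inline sample.
SAMPLE = """
[file_processing]
DIRECT_OUTPUT = True
[conversation_history]
MAX_TURNS = 3
MAX_CHARS = 12000
[retriever]
RETRIEVER = https://giz-chatfed-retriever0-3.hf.space/
"""

config = configparser.ConfigParser()
config.read_string(SAMPLE)

direct_output = config.getboolean("file_processing", "DIRECT_OUTPUT")
max_turns = config.getint("conversation_history", "MAX_TURNS")
retriever_url = config.get("retriever", "RETRIEVER")
```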
### Environment Variables
Create a `.env` file with:
```bash
# Required for private HuggingFace Spaces
HF_TOKEN=hf_xxxxxxxxxxxxxxxxxxxxx
```
### ChatUI Configuration
ChatUI `DOTENV_LOCAL` example deployment configuration:
```javascript
MODELS=`[
{
"name": "asistente_eudr",
"displayName": "Asistente EUDR",
"description": "Retrieval-augmented generation on EUDR Whisp API powered by ChatFed modules.",
"instructions": {
"title": "EUDR Asistente: Instrucciones",
"content": "Hola, soy Asistente EUDR, un asistente conversacional basado en inteligencia artificial diseΓ±ado para ayudarle a comprender el cumplimiento y el anΓ‘lisis del Reglamento de la UE sobre la deforestaciΓ³n. ResponderΓ© a sus preguntas utilizando los informes EUDR y los archivos GeoJSON cargados.\\n\\nπŸ’‘ **CΓ³mo utilizarlo (panel a la derecha)**\\n\\n**Modo de uso:** elija entre subir un archivo GeoJSON para su anΓ‘lisis o consultar los informes EUDR filtrados por paΓ­s.\\n\\n**Ejemplos:** seleccione entre preguntas de ejemplo seleccionadas de diferentes categorΓ­as.\\n\\n**Referencias:** consulte las fuentes de contenido utilizadas para la verificaciΓ³n de datos.\\n\\n⚠️ Para conocer las limitaciones y la informaciΓ³n sobre la recopilaciΓ³n de datos, consulte la pestaΓ±a Β«ExenciΓ³n de responsabilidadΒ»\\n\\n⚠️ Al utilizar esta aplicaciΓ³n, usted acepta que recopilemos estadΓ­sticas de uso (como preguntas formuladas, comentarios realizados, duraciΓ³n de la sesiΓ³n, tipo de dispositivo e informaciΓ³n geogrΓ‘fica anΓ³nima) para comprender el rendimiento y mejorar continuamente la herramienta, basΓ‘ndonos en nuestro interΓ©s legΓ­timo por mejorar nuestros servicios."
},
"multimodal": true,
"multimodalAcceptedMimetypes": [
"application/geojson"
],
"chatPromptTemplate": "{{#each messages}}{{#ifUser}}{{content}}{{/ifUser}}{{#ifAssistant}}{{content}}{{/ifAssistant}}{{/each}}",
"parameters": {
"temperature": 0.0,
"max_new_tokens": 2048
},
"endpoints": [{
"type": "langserve-streaming",
"url": "https://giz-eudr-chabo-orchestrator.hf.space/chatfed-ui-stream",
"streamingFileUploadUrl": "https://giz-eudr-chabo-orchestrator.hf.space/chatfed-with-file-stream",
"inputKey": "text",
"fileInputKey": "files"
}]
}
]`
PUBLIC_ANNOUNCEMENT_BANNERS=`[
{
"title": "This is Chat Prototype for DSC users",
"linkTitle": "Keep it Clean"
}
]`
PUBLIC_APP_DISCLAIMER_MESSAGE="Disclaimer: AI is an area of active research with known problems such as biased generation and misinformation. Do not use this application for high-stakes decisions or advice. Do not insert your personal data, especially sensitive, like health data."
PUBLIC_APP_DESCRIPTION="Internal Chat-tool for DSC users for testing"
PUBLIC_APP_NAME="EUDR ChatUI"
ENABLE_ASSISTANTS=false
ENABLE_ASSISTANTS_RAG=false
COMMUNITY_TOOLS=false
MONGODB_URL=mongodb://localhost:27017
# Disable LLM-based title generation to prevent template queries
LLM_SUMMARIZATION=false
```
Key things to ensure here:
- `multimodalAcceptedMimetypes`: the file types ChatUI accepts for upload
- `endpoints`: the orchestrator URL and its two streaming endpoint paths
## Deployment Guide
### Local Development
**Prerequisites:**
- Python 3.10+
- pip
**Steps:**
1. Clone the repository:
```bash
git clone <your-repo-url>
cd chabo-orchestrator
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Configure the system:
```bash
# Create .env file
echo "HF_TOKEN=your_token_here" > .env
# Edit params.cfg with your service URLs
nano params.cfg
```
4. Run the application:
```bash
python app/main.py
```
5. Access interfaces:
- Gradio UI: http://localhost:7860/gradio
- API Docs: http://localhost:7860/docs
- Health Check: http://localhost:7860/health
### Docker Deployment
**Build the image:**
```bash
docker build -t chabo-orchestrator .
```
**Run the container:**
```bash
docker run -d --name chabo-orchestrator -p 7860:7860 chabo-orchestrator
```
### HuggingFace Spaces Deployment
**Repository Structure:**
```
your-space/
β”œβ”€β”€ app/
β”‚ β”œβ”€β”€ main.py
β”‚ β”œβ”€β”€ nodes.py
β”‚ β”œβ”€β”€ models.py
β”‚ β”œβ”€β”€ retriever_adapter.py
β”‚ └── utils.py
β”œβ”€β”€ Dockerfile
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ params.cfg
└── README.md
```
**Steps:**
1. Create a new Space on HuggingFace
2. Select "Docker" as the SDK
3. Push your code to the Space repository
4. Add secrets in Space settings:
- `HF_TOKEN`: Your HuggingFace token
5. The Space will automatically build and deploy
**Important:** Ensure all service URLs in `params.cfg` are publicly accessible.
### Docker Compose (Multi-Service)
Example orchestrated deployment for the entire Chabo stack (*NOTE - docker-compose will not run on Hugging Face Spaces*)
```yaml
version: '3.8'
services:
orchestrator:
build: ./orchestrator
ports:
- "7860:7860"
environment:
- HF_TOKEN=${HF_TOKEN}
- RETRIEVER=http://retriever:7861
- GENERATOR=http://generator:7862
- INGESTOR=http://ingestor:7863
depends_on:
- retriever
- generator
- ingestor
retriever:
build: ./retriever
ports:
- "7861:7861"
environment:
- QDRANT_API_KEY=${QDRANT_API_KEY}
generator:
build: ./generator
ports:
- "7862:7862"
environment:
- HF_TOKEN=${HF_TOKEN}
ingestor:
build: ./ingestor
ports:
- "7863:7863"
```
## API Reference
### Endpoints
#### Health Check
```
GET /health
```
Returns service health status.
**Response:**
```json
{
"status": "healthy"
}
```
#### Root Information
```
GET /
```
Returns API metadata and available endpoints.
#### Text Query (Streaming)
```
POST /chatfed-ui-stream/stream
Content-Type: application/json
```
**Request Body:**
```json
{
"input": {
"text": "What are EUDR requirements?"
}
}
```
**Response:** Server-Sent Events stream
```
event: data
data: "The EUDR requires..."
event: sources
data: {"sources": [...]}
event: end
data: ""
```
#### File Upload Query (Streaming)
```
POST /chatfed-with-file-stream/stream
Content-Type: application/json
```
**Request Body:**
```json
{
"input": {
"text": "Analyze this GeoJSON",
"files": [
{
"name": "boundaries.geojson",
"type": "base64",
"content": "base64_encoded_content"
}
]
}
}
```
#### Clear Cache
```
POST /clear-cache
```
Clears the direct output file cache.
**Response:**
```json
{
"status": "cache cleared"
}
```
### Gradio Interface
#### Interactive Query
Gradio's default API endpoint for UI interactions. If running on Hugging Face Spaces, access it at: https://[ORG_NAME]-[SPACE_NAME].hf.space/gradio/
## Troubleshooting
### Common Issues
#### 1. File Upload Fails
**Symptoms:** "Error reading file" or "Failed to decode uploaded file"
**Solutions:**
- Verify file is properly base64 encoded
- Check file size limits (default: varies by deployment)
- Ensure MIME type is in `multimodalAcceptedMimetypes`
#### 2. Slow Responses
**Symptoms:** Long wait times for responses
**Solutions:**
- Check network latency to external services
- Verify `MAX_CONTEXT_CHARS` isn't too high
- Consider enabling `DIRECT_OUTPUT` for suitable file types
- Check logs for retrieval/generation bottlenecks
#### 3. Cache Not Clearing
**Symptoms:** Same file shows cached results when it shouldn't
**Solutions:**
- Call `/clear-cache` endpoint
- Restart the service (clears in-memory cache)
- Check if `DIRECT_OUTPUT=True` in config
#### 4. Service Connection Errors
**Symptoms:** "Connection refused" or timeout errors
**Solutions:**
- Verify all service URLs in `params.cfg` are accessible
- Check HF_TOKEN is valid and has access to private spaces (*NOTE - THE ORCHESTRATOR CURRENTLY MUST BE PUBLIC*)
- Test each service independently with health checks
- Review firewall/network policies
### Version History
- **v1.0.0**: Initial release with LangGraph orchestration
- Current implementation supports streaming, caching, and dual-mode processing
---
**Documentation Last Updated:** 2025-10-01
**Compatible With:** Python 3.10+, LangGraph 0.2+, FastAPI 0.100+
"""