---
title: RAG Research Assistant API
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# RAG Research Assistant API
This backend provides a FastAPI-based service for a Retrieval-Augmented Generation (RAG) system designed to assist with research-paper search and information retrieval.
## Features
- ArXiv paper search with customizable filtering
- Document chunking and processing
- Embedding generation using Sentence Transformers
- Vector search using FAISS
- LLM response generation with multiple model options
- Markdown-formatted research results
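The chunking, embedding, and vector-search features above fit together as a small pipeline. Here is a minimal sketch, assuming the common `all-MiniLM-L6-v2` Sentence Transformers model and a flat FAISS index; the actual services may use different models, chunking, and index types:

```python
from sentence_transformers import SentenceTransformer
import faiss
import numpy as np

# Assumed model choice; the service's actual embedding model may differ
model = SentenceTransformer("all-MiniLM-L6-v2")

# Stand-in chunks; in the real pipeline these come from processed papers
chunks = [
    "Retrieval-augmented generation combines search with LLMs.",
    "FAISS supports efficient similarity search over dense vectors.",
]

# Encode chunks (normalized, so inner product equals cosine similarity)
embeddings = model.encode(chunks, normalize_embeddings=True)
index = faiss.IndexFlatIP(embeddings.shape[1])
index.add(np.asarray(embeddings, dtype=np.float32))

# Embed the query the same way and retrieve the best-matching chunk
query = model.encode(["How does RAG work?"], normalize_embeddings=True)
scores, ids = index.search(np.asarray(query, dtype=np.float32), k=1)
print(chunks[ids[0][0]], scores[0][0])
```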
## Requirements
- Python 3.8+
- FastAPI
- Sentence Transformers
- FAISS
- Hugging Face API access
## Installation
- Clone the repository
- Navigate to the backend directory
- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Create a `.env` file based on `.env.example` and add your Hugging Face API key
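The contents of `.env.example` aren't shown here; judging from the Docker example below, it presumably defines at least the Hugging Face key:

```
# Hypothetical .env contents; HF_API_KEY is the variable the Docker example expects
HF_API_KEY=your_key_here
```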
## Running the Application
To run the development server:

```bash
python run.py
```
Or with uvicorn directly:

```bash
uvicorn app.main:app --reload
```
The API will be available at http://localhost:8000, and the API documentation at http://localhost:8000/docs.
## API Endpoints

- `/rag/query`: Process a query through the RAG pipeline
- `/rag/search`: Search for papers without LLM processing
- `/rag/models`: Get available LLM models
- `/rag/stats`: Get system statistics
- `/rag/clear/cache`: Clear the paper cache
- `/rag/clear/database`: Clear the vector database
- `/health`: Simple health check endpoint
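As a rough client sketch for the main endpoint (the request schema isn't documented in this README, so the `query` field name and the response shape are assumptions):

```python
import requests

# Hypothetical payload; the actual /rag/query schema may differ
response = requests.post(
    "http://localhost:8000/rag/query",
    json={"query": "recent advances in retrieval-augmented generation"},
    timeout=60,
)
response.raise_for_status()
print(response.json())
```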
## Docker

You can also run the application using Docker:

```bash
docker build -t rag-backend .
docker run -p 8000:8000 -e HF_API_KEY=your_key_here rag-backend
```
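If you keep the key in a `.env` file, you can pass it with Docker's standard `--env-file` flag instead of `-e`:

```bash
docker run --env-file .env -p 8000:8000 rag-backend
```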
## Architecture

The application follows a service-oriented architecture:

- `ArxivService`: Interface to the ArXiv API
- `DocumentService`: Process papers into chunks
- `EmbeddingService`: Generate embeddings
- `VectorService`: Store and search vectors
- `LlmService`: Generate responses using LLMs
- `FormatterService`: Format results
- `RagService`: Orchestrate the entire RAG pipeline
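As an illustration of how `RagService` might orchestrate the other services, here is a rough sketch; none of these method names come from this README, so treat the interfaces as assumptions:

```python
# Illustrative only: the service interfaces below are assumptions, not the actual code.
class RagService:
    def __init__(self, arxiv, documents, embeddings, vectors, llm, formatter):
        self.arxiv = arxiv
        self.documents = documents
        self.embeddings = embeddings
        self.vectors = vectors
        self.llm = llm
        self.formatter = formatter

    def query(self, question: str) -> str:
        papers = self.arxiv.search(question)          # ArxivService: fetch candidate papers
        chunks = self.documents.chunk(papers)         # DocumentService: split into chunks
        vectors = self.embeddings.embed(chunks)       # EmbeddingService: encode chunks
        self.vectors.add(chunks, vectors)             # VectorService: index the vectors
        hits = self.vectors.search(                   # VectorService: retrieve relevant chunks
            self.embeddings.embed([question])
        )
        answer = self.llm.generate(question, hits)    # LlmService: answer from retrieved context
        return self.formatter.to_markdown(answer)     # FormatterService: markdown output
```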