---
title: RAG Research Assistant API
emoji: π
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
---
# RAG Research Assistant API
This backend is a FastAPI service for a Retrieval-Augmented Generation (RAG) system that searches research papers and answers questions from the retrieved content.
## Features
- ArXiv paper search with customizable filtering
- Document chunking and processing
- Embedding generation using Sentence Transformers
- Vector search using FAISS
- LLM response generation with multiple model options
- Markdown-formatted research results
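To illustrate the document chunking step listed above, here is a minimal sketch of overlapping word-based chunking. The function name `chunk_text` and its `chunk_size` and `overlap` parameters are hypothetical and are not taken from this repository; the actual chunking strategy may differ.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into word-based chunks that share `overlap` words.

    Hypothetical helper for illustration; not the project's DocumentService.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks
```

Overlap keeps a sentence that straddles a chunk boundary visible in both neighboring chunks, which tends to help retrieval at the cost of some duplicated text.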
## Requirements
- Python 3.8+
- FastAPI
- Sentence Transformers
- FAISS
- Hugging Face API access
## Installation
1. Clone the repository
2. Navigate to the backend directory
3. Install dependencies:
```bash
pip install -r requirements.txt
```
4. Create a `.env` file based on `.env.example` and add your Hugging Face API key
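
As a rough sketch of how the key from step 4 could be read at startup, the snippet below parses simple `KEY=VALUE` lines using only the standard library. It assumes the variable is named `HF_API_KEY` (the name used in the Docker command in this README); the helper names `read_env_file` and `get_hf_api_key` are hypothetical, and the application itself may load its settings differently.

```python
import os
from pathlib import Path
from typing import Dict


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_hf_api_key(path: str = ".env") -> str:
    """Prefer the process environment, then fall back to the .env file.

    ASSUMPTION: the variable is called HF_API_KEY.
    """
    key = os.environ.get("HF_API_KEY") or read_env_file(path).get("HF_API_KEY")
    if not key:
        raise RuntimeError("HF_API_KEY is not set; add it to .env or export it")
    return key
```

Checking the environment first means the same code works both locally with a `.env` file and in a container started with `-e HF_API_KEY=...`.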
## Running the Application
To run the development server:
```bash
python run.py
```
Or with uvicorn directly:
```bash
uvicorn app.main:app --reload
```
The API will be available at http://localhost:8000, and the API documentation at http://localhost:8000/docs.
## API Endpoints
- `/rag/query`: Process a query through the RAG pipeline
- `/rag/search`: Search for papers without LLM processing
- `/rag/models`: Get available LLM models
- `/rag/stats`: Get system statistics
- `/rag/clear/cache`: Clear the paper cache
- `/rag/clear/database`: Clear the vector database
- `/health`: Simple health check endpoint
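The endpoints above can be called from any HTTP client. The sketch below uses only the standard library and makes several assumptions that this README does not confirm: that `/health` accepts `GET`, that `/rag/query` accepts `POST` with a JSON body containing a `query` field, and that both return JSON. Check the generated documentation at `/docs` for the real request and response schemas.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # development address given in this README


def build_health_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a request for the health check (ASSUMPTION: GET)."""
    return urllib.request.Request(f"{base_url}/health", method="GET")


def build_query_request(question: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a request for the RAG pipeline.

    ASSUMPTION: POST with a JSON body of the form {"query": "..."};
    the field name is hypothetical.
    """
    body = json.dumps({"query": question}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/rag/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 30.0):
    """Send a request and decode the body (ASSUMPTION: the response is JSON)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example usage against a running server:
#   print(send(build_health_request()))
#   print(send(build_query_request("recent work on retrieval-augmented generation")))
```

Building the request separately from sending it makes it easy to inspect the URL and payload before any network call is made.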
## Docker
You can also run the application using Docker:
```bash
docker build -t rag-backend .
docker run -p 8000:8000 -e HF_API_KEY=your_key_here rag-backend
```
## Architecture
The application follows a service-oriented architecture:
- `ArxivService`: Interface to ArXiv API
- `DocumentService`: Process papers into chunks
- `EmbeddingService`: Generate embeddings
- `VectorService`: Store and search vectors
- `LlmService`: Generate responses using LLMs
- `FormatterService`: Format results
- `RagService`: Orchestrate the entire RAG pipeline
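
To make the division of responsibilities concrete, here is a toy sketch of how an orchestrator might wire the embedding and vector services together. Only the class names come from the list above; every method name and signature is hypothetical. The embedding is a bag-of-words count rather than Sentence Transformers, the search is brute-force cosine similarity rather than a FAISS index, and the LLM call is left out, so this shows the data flow only, not the project's implementation.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


class EmbeddingService:
    """Toy stand-in: bag-of-words counts instead of Sentence Transformers."""

    def embed(self, text: str) -> Dict[str, int]:
        return Counter(text.lower().split())


class VectorService:
    """Toy stand-in: brute-force cosine search instead of a FAISS index."""

    def __init__(self) -> None:
        self._items: List[Tuple[str, Dict[str, int]]] = []

    def add(self, chunk: str, vector: Dict[str, int]) -> None:
        self._items.append((chunk, vector))

    @staticmethod
    def _cosine(a: Dict[str, int], b: Dict[str, int]) -> float:
        dot = sum(count * b.get(token, 0) for token, count in a.items())
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    def search(self, query_vector: Dict[str, int], k: int = 2) -> List[str]:
        # Rank every stored chunk by similarity to the query, best first.
        ranked = sorted(
            self._items,
            key=lambda item: self._cosine(query_vector, item[1]),
            reverse=True,
        )
        return [chunk for chunk, _ in ranked[:k]]


class RagService:
    """Orchestrates the steps: embed chunks, index them, retrieve, build a prompt."""

    def __init__(self, embedder: EmbeddingService, vectors: VectorService) -> None:
        self.embedder = embedder
        self.vectors = vectors

    def index(self, chunks: List[str]) -> None:
        for chunk in chunks:
            self.vectors.add(chunk, self.embedder.embed(chunk))

    def build_prompt(self, question: str, k: int = 2) -> str:
        context = self.vectors.search(self.embedder.embed(question), k)
        # In a full pipeline, this prompt would be passed on to the LLM service.
        lines = "\n".join(f"- {chunk}" for chunk in context)
        return f"Context:\n{lines}\n\nQuestion: {question}"
```

Keeping each step behind its own small class means a component such as the vector store can be swapped without touching the orchestrator, which is the main benefit of the service-oriented layout described above.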