---
title: Rag As A Service
emoji: π
colorFrom: purple
colorTo: pink
sdk: gradio
sdk_version: 5.42.0
app_file: app.py
pinned: false
license: apache-2.0
short_description: Minimal RAG API with MiniLM embeddings and FAISS
---
# RAG API (Minimal): MiniLM + FAISS (Gradio)

Minimal Retrieval-Augmented Generation (RAG) service built with:

- **Sentence-Transformers MiniLM** for embeddings
- **FAISS** for vector search (cosine similarity)
- **Gradio** for both UI and API exposure

---

## Features

- Ingest documents (one per line) with configurable chunk size/overlap
- Query top-K relevant chunks with similarity search
- Get concise answers composed from retrieved context
- Reset index at any time
- Call endpoints via **UI or API** (`/api/ingest`, `/api/answer`, `/api/reset`)
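
The ingestion feature listed above splits each input line into overlapping chunks before embedding. The following is a minimal sketch of how that could work, not the code in `app.py`. It assumes chunk size and overlap are measured in characters (the README does not say whether the unit is characters, words, or tokens), and `chunk_text` and `chunk_documents` are hypothetical names. Consecutive chunks share `overlap` characters so that a sentence cut at a boundary still appears whole in at least one chunk.

```python
def chunk_text(text: str, chunk_size: int = 256, overlap: int = 32) -> list[str]:
    """Split text into windows of chunk_size characters that overlap by `overlap`."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far the window advances each time
    chunks: list[str] = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size].strip()
        if piece:
            chunks.append(piece)
        # Stop once the window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def chunk_documents(raw: str, chunk_size: int = 256, overlap: int = 32) -> list[str]:
    """Treat each non-empty line as one document and chunk it (assumed behavior)."""
    chunks: list[str] = []
    for line in raw.splitlines():
        line = line.strip()
        if line:
            chunks.extend(chunk_text(line, chunk_size, overlap))
    return chunks


print(chunk_text("abcdefghij", chunk_size=4, overlap=1))
```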
---

## Quick Start

1. **Load sample docs → Ingest → Ask a query** using the Gradio UI.
2. Programmatic access:

```bash
# Ingest
curl -s -X POST https://<your-space>.hf.space/api/ingest \
  -H "content-type: application/json" \
  -d '{"data": ["PySpark scales ETL across clusters.\nFAISS powers fast vector similarity search used in retrieval.", 256, 32]}'

# Answer
curl -s -X POST https://<your-space>.hf.space/api/answer \
  -H "content-type: application/json" \
  -d '{"data": ["What does FAISS do?", 5, 1000]}'
```
## Python Client

```python
from gradio_client import Client

client = Client("https://<your-space>.hf.space")

# Ingest a document; returns a status message and the index size
status, size = client.predict("FAISS powers fast vector search.", 256, 32, api_name="/ingest")

# Query the index and print the composed answer
res = client.predict("What does FAISS do?", 5, 1000, api_name="/answer")
print(res["answer"])
```
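
The `answer` field is described under Features as composed from retrieved context. One way that composition might look is sketched below. It assumes the retrieved chunks arrive ranked best first and that the final argument to `/answer` (1000 in the calls above) is a character budget, which this README does not confirm. `compose_answer` is a hypothetical helper, not a function taken from `app.py`.

```python
def compose_answer(chunks: list[str], max_chars: int = 1000) -> str:
    """Join ranked chunks, best first, skipping duplicates, within a character budget."""
    parts: list[str] = []
    seen: set[str] = set()
    used = 0
    for chunk in chunks:
        text = chunk.strip()
        if not text or text in seen:
            continue  # ignore empty and repeated chunks
        seen.add(text)

        separator = 1 if parts else 0  # one space between joined chunks
        remaining = max_chars - used - separator
        if remaining <= 0:
            break
        if len(text) > remaining:
            # Truncate the last chunk so the total stays within the budget.
            text = text[:remaining].rstrip()
            if not text:
                break
        parts.append(text)
        used += separator + len(text)
    return " ".join(parts)


retrieved = ["FAISS powers search.", "FAISS powers search.", "PySpark scales ETL."]
print(compose_answer(retrieved, max_chars=1000))
```

A budget like this keeps responses short and predictable in size, at the cost of possibly cutting the last chunk mid-sentence.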
## Tech Stack

- Embeddings: sentence-transformers/all-MiniLM-L6-v2 (384-dim)
- Vector DB: FAISS (flat inner-product index over normalized vectors, which is equivalent to cosine similarity)
- UI & API: Gradio Blocks
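
Searching an inner-product index over unit-length vectors ranks results by cosine similarity, because cos(a, b) = a·b / (|a||b|) and both norms equal 1. The sketch below shows that idea in plain Python as a stand-in for the FAISS index. `TinyFlatIP` is a hypothetical class written for illustration, and real embeddings would be 384-dimensional rather than the 2-dimensional toy vectors used here.

```python
import heapq
import math


def normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


class TinyFlatIP:
    """Brute-force inner-product search over normalized vectors (illustrative only)."""

    def __init__(self) -> None:
        self.vectors: list[list[float]] = []

    def add(self, vec: list[float]) -> None:
        # Normalize once at insert time so search is a plain dot product.
        self.vectors.append(normalize(vec))

    def search(self, query: list[float], k: int = 5) -> list[tuple[float, int]]:
        """Return up to k (score, position) pairs, highest similarity first."""
        q = normalize(query)
        scored = (
            (sum(a * b for a, b in zip(q, v)), i)
            for i, v in enumerate(self.vectors)
        )
        return heapq.nlargest(k, scored)


index = TinyFlatIP()
for v in ([1.0, 0.0], [0.0, 2.0], [3.0, 3.0]):
    index.add(v)

hits = index.search([2.0, 0.0], k=2)
print(hits)
```

With FAISS itself, the same effect is typically achieved by L2-normalizing the embeddings before adding them to an `IndexFlatIP`.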
## Notes

- In-memory index only; resets when Space sleeps.
- For persistence, extend with save/load to ./data/.
- Demo-focused: fast, light, minimal surface.
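
The save/load extension mentioned above could take several forms. The sketch below persists chunk texts and their vectors as JSON under `./data/` so the index can be rebuilt on startup. The file name `store.json` and the functions `save_store` and `load_store` are hypothetical. FAISS also provides `faiss.write_index` and `faiss.read_index` for serializing an index directly, which would likely be the better fit for larger collections.

```python
import json
from pathlib import Path

DATA_DIR = Path("./data")  # location suggested in the notes above


def save_store(chunks: list[str], vectors: list[list[float]], data_dir: Path = DATA_DIR) -> None:
    """Write chunks and vectors to data_dir/store.json, replacing any previous file."""
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    payload = {"chunks": chunks, "vectors": vectors}
    # Write to a temporary file first so a crash cannot leave a half-written store.
    tmp_path = data_dir / "store.json.tmp"
    tmp_path.write_text(json.dumps(payload), encoding="utf-8")
    tmp_path.replace(data_dir / "store.json")


def load_store(data_dir: Path = DATA_DIR) -> tuple[list[str], list[list[float]]]:
    """Read chunks and vectors back; return empty lists if nothing was saved yet."""
    path = Path(data_dir) / "store.json"
    if not path.exists():
        return [], []
    payload = json.loads(path.read_text(encoding="utf-8"))
    return payload["chunks"], payload["vectors"]
```

Whether files under `./data/` survive a restart depends on the Space's storage configuration, so this only helps where the disk itself is persistent.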
## Author/Developer: Naga Adithya Kaushik (GenAIDevTOProd)

## AI copilot used during development: Yes (minimal), for debugging, test cases, and experimentation only
| Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference | |