---
title: SBERT + FAISS Semantic Search
emoji: 🔍
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
---
# SBERT + FAISS Semantic Search + Evaluation Metrics
This Hugging Face Space hosts a **semantic search system** built with:
- [Sentence-BERT (SBERT)](https://www.sbert.net/) for embeddings
- [FAISS](https://faiss.ai/) for fast vector search
- [MS MARCO v1.1 dataset](https://microsoft.github.io/msmarco/) (10,000-passage subset)
- [Gradio](https://gradio.app/) for the interactive interface
---
## 🔹 Features
- Enter a **query** to retrieve the **Top-10 most similar passages**.
- Computes **true IR metrics** when the query matches one in the MS MARCO validation set:
  - Precision@10
  - Recall@10
  - F1-score
  - Mean Reciprocal Rank (MRR)
  - Normalized Discounted Cumulative Gain (nDCG@10)
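
The metrics listed above can be sketched in plain Python. This is a minimal illustration, not the code in `app.py`: it assumes binary relevance (a passage is either relevant or not), and the function names and passage IDs are hypothetical. For a single query, MRR reduces to the reciprocal rank of the first relevant passage.

```python
import math


def precision_at_k(retrieved, relevant, k=10):
    """Fraction of the top-k retrieved passages that are relevant."""
    hits = sum(1 for pid in retrieved[:k] if pid in relevant)
    return hits / k


def recall_at_k(retrieved, relevant, k=10):
    """Fraction of all relevant passages that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for pid in retrieved[:k] if pid in relevant)
    return hits / len(relevant)


def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def reciprocal_rank(retrieved, relevant, k=10):
    """1 / rank of the first relevant passage in the top k, else 0.0."""
    for rank, pid in enumerate(retrieved[:k], start=1):
        if pid in relevant:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(retrieved, relevant, k=10):
    """nDCG with binary gains: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 1)
        for rank, pid in enumerate(retrieved[:k], start=1)
        if pid in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    return dcg / idcg if idcg > 0 else 0.0


def evaluate(retrieved, relevant, k=10):
    """Bundle all five metrics for one query into a dictionary."""
    p = precision_at_k(retrieved, relevant, k)
    r = recall_at_k(retrieved, relevant, k)
    return {
        "precision": p,
        "recall": r,
        "f1": f1_score(p, r),
        "mrr": reciprocal_rank(retrieved, relevant, k),
        "ndcg": ndcg_at_k(retrieved, relevant, k),
    }
```

Calling `evaluate(ranked_ids, relevant_ids)` with a ranked list of passage IDs and a set of relevant IDs returns all five scores for that query.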
---
## 🔹 How to Use
1. Type a query into the input box.
2. Press **Submit**.
3. View:
   - **Top-10 retrieved passages** with similarity scores
   - **Evaluation metrics** if the query exists in the validation set
---
## 🔹 Tech Stack
- **Embeddings:** `sentence-transformers/all-mpnet-base-v2`
- **Indexing:** FAISS (L2 distance)
- **Dataset:** MS MARCO v1.1 (first 10,000 passages)
- **Interface:** Gradio
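
A rough sketch of how these pieces could fit together is shown below. It is not the Space's actual `app.py`: the helper names (`load_model`, `build_index`, `search`) are hypothetical, and the exact flat index `faiss.IndexFlatL2` is an assumption, since the list above states only that L2 is used. The heavy libraries are imported inside the functions so the module can be loaded without them.

```python
MODEL_NAME = "sentence-transformers/all-mpnet-base-v2"  # model named in the Tech Stack list


def load_model():
    """Load the SBERT model (downloads weights on first use)."""
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(MODEL_NAME)


def build_index(passages, model):
    """Embed all passages and store them in an exact L2 index."""
    import faiss

    embeddings = model.encode(passages, convert_to_numpy=True).astype("float32")
    # Assumption: a flat (brute-force) index; the README does not name the index type.
    index = faiss.IndexFlatL2(embeddings.shape[1])
    index.add(embeddings)
    return index


def search(query, passages, index, model, k=10):
    """Return up to k (passage, squared L2 distance) pairs, nearest first."""
    query_vec = model.encode([query], convert_to_numpy=True).astype("float32")
    distances, ids = index.search(query_vec, k)
    # FAISS pads with -1 when fewer than k vectors are indexed; skip those slots.
    return [
        (passages[i], float(d))
        for i, d in zip(ids[0], distances[0])
        if i != -1
    ]
```

With a list of passage strings in hand, `model = load_model()`, `index = build_index(passages, model)`, and `search("some query", passages, index, model)` would return the nearest passages. `IndexFlatL2` reports squared L2 distances, so smaller values indicate closer matches.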
---
## 🔹 Citation
If you use this system in research, please cite:
- [Sentence-BERT](https://arxiv.org/abs/1908.10084)
- [MS MARCO](https://microsoft.github.io/msmarco/)
---
## 🔹 Author
Built for a research project on **user-centered evaluation of semantic search systems**.