---
title: SBERT + FAISS Semantic Search
emoji: π
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
---
# SBERT + FAISS Semantic Search + Evaluation Metrics
This Hugging Face Space hosts a **semantic search system** built with:
- [Sentence-BERT (SBERT)](https://www.sbert.net/) for embeddings
- [FAISS](https://faiss.ai/) for fast vector search
- [MS MARCO v1.1 dataset](https://microsoft.github.io/msmarco/) (10,000-passage subset)
- [Gradio](https://gradio.app/) for the interactive interface
---
## 🔹 Features
- Enter a **query** to retrieve the **Top-10 most similar passages**.
- Computes **true IR metrics** when the query matches one in the MS MARCO validation set:
  - Precision@10
  - Recall@10
  - F1-score
  - Mean Reciprocal Rank (MRR)
  - Normalized Discounted Cumulative Gain (nDCG@10)
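
As a rough illustration of how the metrics listed above can be computed for a single query, here is a minimal sketch assuming binary relevance labels. The function name `ir_metrics`, the passage IDs, and the choice to divide precision by `k` are assumptions made for this example; the Space's `app.py` may compute them differently. For one query, MRR reduces to the reciprocal rank of the first relevant hit.

```python
import math


def ir_metrics(retrieved, relevant, k=10):
    """Binary-relevance metrics for one query (hypothetical helper).

    retrieved: ranked list of passage IDs, best match first
    relevant:  set of passage IDs judged relevant for the query
    """
    top_k = retrieved[:k]
    hits = [1 if pid in relevant else 0 for pid in top_k]
    num_hits = sum(hits)

    # Assumption: precision is divided by k even if fewer than k results exist.
    precision = num_hits / k
    recall = num_hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0

    # Reciprocal rank of the first relevant passage (0 if none is retrieved).
    rr = 0.0
    for rank, hit in enumerate(hits, start=1):
        if hit:
            rr = 1.0 / rank
            break

    # DCG with binary gains, normalized by the best achievable ordering.
    dcg = sum(hit / math.log2(rank + 1) for rank, hit in enumerate(hits, start=1))
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(rank + 1) for rank in range(1, ideal_hits + 1))
    ndcg = dcg / idcg if idcg > 0 else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "rr": rr,
        "ndcg": ndcg,
    }


# Made-up passage IDs: relevant passages appear at ranks 2 and 5.
ranked = ["p3", "p7", "p1", "p9", "p2", "p4", "p5", "p6", "p8", "p0"]
print(ir_metrics(ranked, {"p7", "p2"}))
```

Averaging the `rr` value over all evaluated queries gives MRR for a query set.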
---
## 🔹 How to Use
1. Type a query into the input box.
2. Press **Submit**.
3. View:
   - **Top-10 retrieved passages** with similarity scores
   - **Evaluation metrics** if the query exists in the validation set
---
## 🔹 Tech Stack
- **Embeddings:** `sentence-transformers/all-mpnet-base-v2`
- **Indexing:** FAISS (L2 distance)
- **Dataset:** MS MARCO v1.1 (first 10,000 passages)
- **Interface:** Gradio
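
To make the L2-based retrieval step concrete, the following library-free sketch shows what an exact (brute-force) L2 nearest-neighbour search computes. It is an illustration only: the helper names and the toy 2-D vectors are made up, the real Space uses SBERT embeddings with FAISS, and the README does not state which FAISS index type is used.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_l2(index_vectors, query, k=10):
    """Return up to k (distance, position) pairs, closest first.

    Hypothetical stand-in for an exact L2 index: every stored vector is
    compared against the query, and the k smallest distances are kept.
    """
    scored = [(squared_l2(vec, query), i) for i, vec in enumerate(index_vectors)]
    return heapq.nsmallest(k, scored)


# Toy 2-D "embeddings"; real sentence embeddings have hundreds of dimensions.
passages = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
for dist, pos in search_l2(passages, [1.0, 0.0], k=2):
    print(pos, dist)
```

With L2, a smaller value means a closer match, so results are sorted in ascending order of distance.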
---
## 🔹 Citation
If you use this system in research, please cite:
- [Sentence-BERT](https://arxiv.org/abs/1908.10084)
- [MS MARCO](https://microsoft.github.io/msmarco/)
---
## 🔹 Author
Built for a research project on **user-centered evaluation of semantic search systems**.