cloudzy_ai_challenge / context.txt
matinsn2000's picture
Added context.txt
57860a9
# 🧭 Cloudzy AI - Cloud Photo Management Service
A FastAPI-based cloud photo management service with AI tagging, captioning, and semantic search using FAISS.
## 🎯 Features
- **Photo Upload** - Upload images with automatic metadata generation
- **AI Analysis** - Automatic tag and caption generation
- **Semantic Search** - FAISS-powered similarity search on embeddings
- **Image-to-Image Search** - Find similar photos to a reference image
- **RESTful API** - Full REST API with automatic documentation
- **Docker Support** - Production-ready Docker and Docker Compose setup
## πŸ› οΈ Tech Stack
- **Backend**: FastAPI
- **Database**: SQLModel + SQLite (PostgreSQL ready)
- **Search Engine**: FAISS (Fast Approximate Nearest Neighbors)
- **Image Processing**: Pillow
- **ORM**: SQLModel
- **API Documentation**: Swagger/OpenAPI
## πŸ“‹ Prerequisites
- Python 3.10+
- Docker & Docker Compose (optional)
- 2GB+ RAM for FAISS index
## βš™οΈ Installation
### Local Development
1. **Clone and setup**
```bash
cd image_embedder
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
2. **Install dependencies**
```bash
pip install -r requirements.txt
```
3. **Create uploads directory**
```bash
mkdir -p uploads
```
4. **Run the server**
```bash
uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
```
Server will start at `http://localhost:8000`
### Docker
```bash
# Build and run
docker compose up --build
# Run in background
docker compose up -d
# View logs
docker compose logs -f cloudzy_api
# Stop
docker compose down
```
## πŸš€ API Endpoints
### Upload Photo
```bash
POST /upload
Content-Type: multipart/form-data
# Returns:
{
"id": 1,
"filename": "photo_20231023_120000.jpg",
"tags": ["nature", "landscape", "mountain"],
"caption": "A beautiful nature photograph",
"message": "Photo uploaded successfully with ID 1"
}
```
### Get Photo Metadata
```bash
GET /photo/{id}
# Returns:
{
"id": 1,
"filename": "photo_20231023_120000.jpg",
"tags": ["nature", "landscape"],
"caption": "A beautiful landscape",
"embedding": [0.123, -0.456, ...], # 512-dim vector
"created_at": "2023-10-23T12:00:00"
}
```
### List All Photos
```bash
GET /photos?skip=0&limit=10
# Returns: List of photo objects with pagination
```
### Semantic Search
```bash
GET /search?q=mountain&top_k=5
# Returns:
{
"query": "mountain",
"results": [
{
"photo_id": 1,
"filename": "photo_1.jpg",
"tags": ["nature", "mountain"],
"caption": "Mountain landscape",
"distance": 0.123
},
...
],
"total_results": 5
}
```
### Image-to-Image Search
```bash
POST /search/image-to-image?reference_photo_id=1&top_k=5
# Returns similar photos to reference photo 1
```
### Health Check
```bash
GET /health
# Returns service status and FAISS index stats
```
## πŸ“š API Documentation
**Interactive Docs (Swagger UI)**:
```
http://localhost:8000/docs
```
**Alternative Docs (ReDoc)**:
```
http://localhost:8000/redoc
```
## πŸ—‚οΈ Project Structure
```
image_embedder/
β”œβ”€β”€ app/
β”‚ β”œβ”€β”€ __init__.py
β”‚ β”œβ”€β”€ main.py # FastAPI app entry point
β”‚ β”œβ”€β”€ database.py # SQLModel engine + session
β”‚ β”œβ”€β”€ models.py # Photo database model
β”‚ β”œβ”€β”€ schemas.py # Pydantic response models
β”‚ β”œβ”€β”€ ai_utils.py # AI generation (tags, captions, embeddings)
β”‚ β”œβ”€β”€ search_engine.py # FAISS index manager
β”‚ β”‚
β”‚ β”œβ”€β”€ routes/
β”‚ β”‚ β”œβ”€β”€ __init__.py
β”‚ β”‚ β”œβ”€β”€ upload.py # POST /upload endpoint
β”‚ β”‚ β”œβ”€β”€ photo.py # GET /photo/:id and /photos endpoints
β”‚ β”‚ └── search.py # GET /search and image-to-image endpoints
β”‚ β”‚
β”‚ └── utils/
β”‚ β”œβ”€β”€ __init__.py
β”‚ └── file_utils.py # File saving and management
β”‚
β”œβ”€β”€ uploads/ # Stored images (created at runtime)
β”œβ”€β”€ faiss_index.bin # FAISS index file (created at runtime)
β”œβ”€β”€ photos.db # SQLite database (created at runtime)
β”‚
β”œβ”€β”€ requirements.txt # Python dependencies
β”œβ”€β”€ Dockerfile
β”œβ”€β”€ docker-compose.yml
└── README.md
```
## πŸ”„ Development Workflow
### Test Upload
```bash
# Use curl
curl -X POST -F "file=@/path/to/image.jpg" http://localhost:8000/upload
# Or use Python
import requests
with open("image.jpg", "rb") as f:
response = requests.post(
"http://localhost:8000/upload",
files={"file": f}
)
print(response.json())
```
### Test Search
```bash
# Query-based search
curl "http://localhost:8000/search?q=tree&top_k=5"
# Image-to-image search
curl -X POST "http://localhost:8000/search/image-to-image?reference_photo_id=1&top_k=5"
```
### View Database
```bash
# Install sqlite3 CLI and view database
sqlite3 photos.db
> .tables
> SELECT * FROM photo;
> .quit
```
## 🧠 AI Features (Placeholder Phase)
Currently, AI functions use placeholder implementations:
- **Tags**: Generated from filename patterns + random selection from common tags
- **Captions**: Template-based generation from tags
- **Embeddings**: Deterministic random vectors (reproducible from filename)
### Upgrade Path (Production)
1. **CLIP Integration** (Recommended)
- Zero-shot image understanding
- Excellent for tagging and search
- ~1-2 sec per image on GPU
2. **BLIP Integration** (Alternative)
- Visual question answering
- Better captions
- ~2-3 sec per image on GPU
3. **Fine-tuned Models**
- Train on domain-specific data
- Improved accuracy
- Higher latency/complexity
## πŸ“Š Performance Considerations
- **FAISS Index**: Supports millions of embeddings
- **Database**: SQLite suitable for 100k+ photos; PostgreSQL for larger scale
- **Embeddings**: 512-dim vectors (adjustable)
- **Search**: <100ms for 100k+ embeddings on CPU
## 🚨 Troubleshooting
### FAISS Installation Issues
```bash
# If faiss-cpu fails, try:
pip install faiss-cpu==1.7.4 --no-cache-dir
```
### SQLite Lock Error
```bash
# Restart the application or remove locked database
rm photos.db
```
### Docker Build Issues
```bash
# Rebuild without cache
docker compose build --no-cache
```
## πŸ” Security Notes
- ⚠️ Currently no authentication - add for production
- ⚠️ CORS allows all origins - restrict for production
- ⚠️ File upload validation needed - add size limits
- ⚠️ Use PostgreSQL + proper secrets management for production
## πŸ“ Next Steps
1. βœ… Core backend working
2. ⬜ Add authentication (JWT)
3. ⬜ Implement real AI models (CLIP/BLIP)
4. ⬜ Add background job processing (Celery)
5. ⬜ Frontend dashboard
6. ⬜ Production deployment (Railway/AWS)
## πŸ“„ License
MIT License
## 🀝 Contributing
Contributions welcome! Please test thoroughly before submitting.
---
**Questions?** Check the interactive docs at `/docs` or review the code comments.