Spaces:
Running
Running
| # π§ Cloudzy AI - Cloud Photo Management Service | |
| A FastAPI-based cloud photo management service with AI tagging, captioning, and semantic search using FAISS. | |
| ## π― Features | |
| - **Photo Upload** - Upload images with automatic metadata generation | |
| - **AI Analysis** - Automatic tag and caption generation | |
| - **Semantic Search** - FAISS-powered similarity search on embeddings | |
| - **Image-to-Image Search** - Find similar photos to a reference image | |
| - **RESTful API** - Full REST API with automatic documentation | |
| - **Docker Support** - Production-ready Docker and Docker Compose setup | |
| ## π οΈ Tech Stack | |
| - **Backend**: FastAPI | |
| - **Database**: SQLModel + SQLite (PostgreSQL ready) | |
| - **Search Engine**: FAISS (Fast Approximate Nearest Neighbors) | |
| - **Image Processing**: Pillow | |
| - **ORM**: SQLModel | |
| - **API Documentation**: Swagger/OpenAPI | |
| ## π Prerequisites | |
| - Python 3.10+ | |
| - Docker & Docker Compose (optional) | |
| - 2GB+ RAM for FAISS index | |
| ## βοΈ Installation | |
| ### Local Development | |
| 1. **Clone and setup** | |
| ```bash | |
| cd image_embedder | |
| python -m venv venv | |
| source venv/bin/activate # On Windows: venv\Scripts\activate | |
| ``` | |
| 2. **Install dependencies** | |
| ```bash | |
| pip install -r requirements.txt | |
| ``` | |
| 3. **Create uploads directory** | |
| ```bash | |
| mkdir -p uploads | |
| ``` | |
| 4. **Run the server** | |
| ```bash | |
| uvicorn app.main:app --reload --host 0.0.0.0 --port 8000 | |
| ``` | |
| Server will start at `http://localhost:8000` | |
| ### Docker | |
| ```bash | |
| # Build and run | |
| docker compose up --build | |
| # Run in background | |
| docker compose up -d | |
| # View logs | |
| docker compose logs -f cloudzy_api | |
| # Stop | |
| docker compose down | |
| ``` | |
| ## π API Endpoints | |
| ### Upload Photo | |
| ```bash | |
| POST /upload | |
| Content-Type: multipart/form-data | |
| # Returns: | |
| { | |
| "id": 1, | |
| "filename": "photo_20231023_120000.jpg", | |
| "tags": ["nature", "landscape", "mountain"], | |
| "caption": "A beautiful nature photograph", | |
| "message": "Photo uploaded successfully with ID 1" | |
| } | |
| ``` | |
| ### Get Photo Metadata | |
| ```bash | |
| GET /photo/{id} | |
| # Returns: | |
| { | |
| "id": 1, | |
| "filename": "photo_20231023_120000.jpg", | |
| "tags": ["nature", "landscape"], | |
| "caption": "A beautiful landscape", | |
| "embedding": [0.123, -0.456, ...], # 512-dim vector | |
| "created_at": "2023-10-23T12:00:00" | |
| } | |
| ``` | |
| ### List All Photos | |
| ```bash | |
| GET /photos?skip=0&limit=10 | |
| # Returns: List of photo objects with pagination | |
| ``` | |
| ### Semantic Search | |
| ```bash | |
| GET /search?q=mountain&top_k=5 | |
| # Returns: | |
| { | |
| "query": "mountain", | |
| "results": [ | |
| { | |
| "photo_id": 1, | |
| "filename": "photo_1.jpg", | |
| "tags": ["nature", "mountain"], | |
| "caption": "Mountain landscape", | |
| "distance": 0.123 | |
| }, | |
| ... | |
| ], | |
| "total_results": 5 | |
| } | |
| ``` | |
| ### Image-to-Image Search | |
| ```bash | |
| POST /search/image-to-image?reference_photo_id=1&top_k=5 | |
| # Returns similar photos to reference photo 1 | |
| ``` | |
| ### Health Check | |
| ```bash | |
| GET /health | |
| # Returns service status and FAISS index stats | |
| ``` | |
| ## π API Documentation | |
| **Interactive Docs (Swagger UI)**: | |
| ``` | |
| http://localhost:8000/docs | |
| ``` | |
| **Alternative Docs (ReDoc)**: | |
| ``` | |
| http://localhost:8000/redoc | |
| ``` | |
| ## ποΈ Project Structure | |
| ``` | |
| image_embedder/ | |
| βββ app/ | |
| β βββ __init__.py | |
| β βββ main.py # FastAPI app entry point | |
| β βββ database.py # SQLModel engine + session | |
| β βββ models.py # Photo database model | |
| β βββ schemas.py # Pydantic response models | |
| β βββ ai_utils.py # AI generation (tags, captions, embeddings) | |
| β βββ search_engine.py # FAISS index manager | |
| β β | |
| β βββ routes/ | |
| β β βββ __init__.py | |
| β β βββ upload.py # POST /upload endpoint | |
| β β βββ photo.py # GET /photo/:id and /photos endpoints | |
| β β βββ search.py # GET /search and image-to-image endpoints | |
| β β | |
| β βββ utils/ | |
| β βββ __init__.py | |
| β βββ file_utils.py # File saving and management | |
| β | |
| βββ uploads/ # Stored images (created at runtime) | |
| βββ faiss_index.bin # FAISS index file (created at runtime) | |
| βββ photos.db # SQLite database (created at runtime) | |
| β | |
| βββ requirements.txt # Python dependencies | |
| βββ Dockerfile | |
| βββ docker-compose.yml | |
| βββ README.md | |
| ``` | |
| ## π Development Workflow | |
| ### Test Upload | |
| ```bash | |
| # Use curl | |
| curl -X POST -F "file=@/path/to/image.jpg" http://localhost:8000/upload | |
| # Or use Python | |
| import requests | |
| with open("image.jpg", "rb") as f: | |
| response = requests.post( | |
| "http://localhost:8000/upload", | |
| files={"file": f} | |
| ) | |
| print(response.json()) | |
| ``` | |
| ### Test Search | |
| ```bash | |
| # Query-based search | |
| curl "http://localhost:8000/search?q=tree&top_k=5" | |
| # Image-to-image search | |
| curl -X POST "http://localhost:8000/search/image-to-image?reference_photo_id=1&top_k=5" | |
| ``` | |
| ### View Database | |
| ```bash | |
| # Install sqlite3 CLI and view database | |
| sqlite3 photos.db | |
| > .tables | |
| > SELECT * FROM photo; | |
| > .quit | |
| ``` | |
| ## π§ AI Features (Placeholder Phase) | |
| Currently, AI functions use placeholder implementations: | |
| - **Tags**: Generated from filename patterns + random selection from common tags | |
| - **Captions**: Template-based generation from tags | |
| - **Embeddings**: Deterministic random vectors (reproducible from filename) | |
| ### Upgrade Path (Production) | |
| 1. **CLIP Integration** (Recommended) | |
| - Zero-shot image understanding | |
| - Excellent for tagging and search | |
| - ~1-2 sec per image on GPU | |
| 2. **BLIP Integration** (Alternative) | |
| - Visual question answering | |
| - Better captions | |
| - ~2-3 sec per image on GPU | |
| 3. **Fine-tuned Models** | |
| - Train on domain-specific data | |
| - Improved accuracy | |
| - Higher latency/complexity | |
| ## π Performance Considerations | |
| - **FAISS Index**: Supports millions of embeddings | |
| - **Database**: SQLite suitable for 100k+ photos; PostgreSQL for larger scale | |
| - **Embeddings**: 512-dim vectors (adjustable) | |
| - **Search**: <100ms for 100k+ embeddings on CPU | |
| ## π¨ Troubleshooting | |
| ### FAISS Installation Issues | |
| ```bash | |
| # If faiss-cpu fails, try: | |
| pip install faiss-cpu==1.7.4 --no-cache-dir | |
| ``` | |
| ### SQLite Lock Error | |
| ```bash | |
| # Restart the application or remove locked database | |
| rm photos.db | |
| ``` | |
| ### Docker Build Issues | |
| ```bash | |
| # Rebuild without cache | |
| docker compose build --no-cache | |
| ``` | |
| ## π Security Notes | |
| - β οΈ Currently no authentication - add for production | |
| - β οΈ CORS allows all origins - restrict for production | |
| - β οΈ File upload validation needed - add size limits | |
| - β οΈ Use PostgreSQL + proper secrets management for production | |
| ## π Next Steps | |
| 1. β Core backend working | |
| 2. β¬ Add authentication (JWT) | |
| 3. β¬ Implement real AI models (CLIP/BLIP) | |
| 4. β¬ Add background job processing (Celery) | |
| 5. β¬ Frontend dashboard | |
| 6. β¬ Production deployment (Railway/AWS) | |
| ## π License | |
| MIT License | |
| ## π€ Contributing | |
| Contributions welcome! Please test thoroughly before submitting. | |
| --- | |
| **Questions?** Check the interactive docs at `/docs` or review the code comments. |