metadata
title: FastVLM Screen Observer
emoji: 🖥️👁️
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: '3.9'
app_port: 7860
pinned: false
license: mit
models:
- apple/FastVLM-7B
suggested_hardware: t4-small
custom_headers:
  cross-origin-embedder-policy: require-corp
  cross-origin-opener-policy: same-origin
FastVLM Screen Observer 🖥️👁️
Real-time screen observation and analysis using Apple's FastVLM-7B model, optimized for low-RAM systems (3-8GB).
Features
- 🎯 Real-time screen capture and analysis
- 🤖 FastVLM-7B vision-language model integration
- UI element detection
- Text extraction from screenshots
- ⚠️ Risk detection for security concerns
- 🎮 Browser automation demo
- 💾 Export logs and captured frames
- Optimized for 3-8GB RAM with 4-bit quantization
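As a rough sanity check on the 3-8GB figure, the size of the model weights can be estimated from the number of bits stored per parameter. The sketch below is back-of-the-envelope arithmetic only. It assumes roughly 7 billion parameters (taken from the "7B" in the model name) and ignores activations, the KV cache, and quantization metadata, so real memory use will be higher. The function name `weight_footprint_gib` is illustrative and not part of this project.

```python
def weight_footprint_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / (1024 ** 3)


# Assumption: roughly 7 billion parameters, as the "7B" in the model name suggests.
NUM_PARAMS = 7e9

for bits in (16, 8, 4):
    size = weight_footprint_gib(NUM_PARAMS, bits)
    print(f"{bits:>2}-bit weights: ~{size:.1f} GiB")
```

Under these assumptions, 4-bit weights come to about 3.3 GiB, compared with about 13 GiB at 16-bit precision, which is consistent with the lower end of the stated range.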
How to Use
- Click "Capture Screen" to analyze your current screen
- Enable "Auto Capture" for continuous monitoring
- Use "Run Demo" to see the browser automation demo
- Export logs as a ZIP archive
Model Information
- Model: Apple FastVLM-7B
- Optimization: 4-bit quantization to minimize memory use
- Memory: Runs on 3-8GB RAM systems
- Device: Supports CPU, CUDA, and MPS (Apple Silicon)
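Choosing among the three supported device types can be sketched as follows. This is a minimal illustration, not the project's actual loading code: the priority order (CUDA, then MPS, then CPU) is an assumption, and `pick_device` and `detect_device` are hypothetical helper names. The PyTorch calls used are `torch.cuda.is_available()` and `torch.backends.mps.is_available()`.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return the preferred device name: CUDA first, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Probe PyTorch for available backends; fall back to CPU if torch is absent."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds may not expose the MPS backend at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


print(detect_device())
```

Keeping the decision in a pure function (`pick_device`) separate from the hardware probe makes the priority order easy to test without a GPU.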
API Endpoints
- GET /api/ - Status check
- POST /api/analyze - Screen analysis
- POST /api/demo - Automation demo
- GET /api/export - Export logs
- GET /api/logs/stream - Stream logs via SSE
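The two read-only endpoints can be exercised with the Python standard library alone. The sketch below makes several assumptions: the base URL `http://localhost:7860` (derived from `app_port`), a JSON body from the status check, and log events that follow the standard SSE `data:` line format. The helper names are illustrative. Request bodies for the POST endpoints are not documented here, so they are left out rather than guessed.

```python
import json
import urllib.request
from typing import Iterable, Iterator

# Assumption: the app is reachable locally on the port given by app_port.
BASE_URL = "http://localhost:7860"


def get_json(path: str, base_url: str = BASE_URL, timeout: float = 10.0):
    """GET an endpoint and decode the body as JSON (assumes a JSON response)."""
    with urllib.request.urlopen(base_url + path, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each server-sent event from decoded text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields (event:, id:, retry:) and ":" comments are ignored here.


def stream_logs(base_url: str = BASE_URL) -> Iterator[str]:
    """Connect to the log stream and yield each event's data as it arrives."""
    with urllib.request.urlopen(base_url + "/api/logs/stream") as resp:
        yield from iter_sse_data(chunk.decode("utf-8") for chunk in resp)
```

Against a running instance, `get_json("/api/")` performs the status check, and `for entry in stream_logs(): print(entry)` prints log events until the connection closes.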
GitHub Repository
https://github.com/crosse712/fastvlm-screen-observer
Built with ❤️ using FastAPI, React, and FastVLM-7B