metadata
title: FastVLM Screen Observer
emoji: 🖥️👁️
colorFrom: blue
colorTo: purple
sdk: docker
sdk_version: '3.9'
app_port: 7860
pinned: false
license: mit
models:
  - apple/FastVLM-7B
suggested_hardware: t4-small
custom_headers:
  cross-origin-embedder-policy: require-corp
  cross-origin-opener-policy: same-origin

FastVLM Screen Observer 🖥️👁️

Real-time screen observation and analysis using Apple's FastVLM-7B model, optimized for low-RAM systems (3-8GB).

Features

  • 🎯 Real-time screen capture and analysis
  • 🤖 FastVLM-7B vision-language model integration
  • 🔍 UI element detection
  • 📝 Text extraction from screenshots
  • ⚠️ Risk detection for security concerns
  • 🎮 Browser automation demo
  • 💾 Export logs and captured frames
  • 🚀 Optimized for 3-8GB RAM with 4-bit quantization

How to Use

  1. Click "Capture Screen" to analyze your current screen
  2. Enable "Auto Capture" for continuous monitoring
  3. Use "Run Demo" to see browser automation
  4. Export the logs as a ZIP archive

Model Information

  • Model: Apple FastVLM-7B
  • Optimization: 4-bit quantization to reduce memory use
  • Memory: Runs on 3-8GB RAM systems
  • Device: Supports CPU, CUDA, and MPS (Apple Silicon)
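
As a rough sanity check on the 3-8GB figure, the sketch below estimates the memory needed for the model weights alone at different precisions. It assumes a nominal 7 billion parameters (read off the "7B" in the model name) and ignores activations, caches, and any quantization overhead, so real usage will be higher. The function name `weight_memory_gib` is illustrative and not part of this project.

```python
def weight_memory_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = num_params * bits_per_weight / 8
    return total_bytes / (1024 ** 3)


if __name__ == "__main__":
    # Assumption: a nominal 7e9 parameters, taken from the "7B" model name.
    params = 7e9
    for bits in (16, 8, 4):
        print(f"{bits:>2}-bit weights: {weight_memory_gib(params, bits):.2f} GiB")
```

Under these assumptions, 4-bit weights come to about 3.26 GiB, against roughly 13.04 GiB at 16-bit, which is in line with the lower end of the stated 3-8GB range.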

API Endpoints

  • GET /api/ - Status check
  • POST /api/analyze - Screen analysis
  • POST /api/demo - Automation demo
  • GET /api/export - Export logs
  • GET /api/logs/stream - Stream logs via SSE
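
The log stream endpoint uses Server-Sent Events (SSE), so a client has to split the response into events itself. The following is a minimal sketch of such a client using only the Python standard library. The base URL is a placeholder, and the content of each event's payload is not documented here, so the code treats it as opaque text. The helper names `parse_sse` and `stream_logs` are illustrative, not part of this project.

```python
import urllib.request
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete event in an SSE line stream."""
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data_parts:
                yield "\n".join(data_parts)
                data_parts = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format strips one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            data_parts.append(value)
        # Other fields (event, id, retry) are ignored in this sketch.


def stream_logs(base_url: str) -> None:
    """Print log events from GET /api/logs/stream until the stream closes."""
    # Assumption: base_url points at a running instance of this Space.
    with urllib.request.urlopen(base_url + "/api/logs/stream") as resp:
        for payload in parse_sse(chunk.decode("utf-8") for chunk in resp):
            print(payload)
```

Calling `stream_logs("http://localhost:7860")` would print events as they arrive, assuming the app is served locally on the port given by `app_port` in the metadata above.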

GitHub Repository

https://github.com/crosse712/fastvlm-screen-observer


Built with ❤️ using FastAPI, React, and FastVLM-7B