Felipe Meres
Remove false GPU advertising and fix misleading claims
ce3fde5

metadata
title: Granite Docling 258M Demo
emoji: 🔬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
models:
  - ibm-granite/granite-docling-258M

🔬 Granite Docling 258M - Online Demo

Experience IBM's cutting-edge Vision-Language Model for document processing and conversion directly in your browser on Hugging Face Spaces!

🤖 This Space uses the IBM Granite Docling 258M Vision-Language Model hosted on Hugging Face Hub

🌟 What is Granite Docling 258M?

The IBM Granite Docling 258M is a state-of-the-art Vision-Language Model (VLM) designed for advanced document understanding and conversion. This model excels at:

  • 📄 Multi-Format Processing: PDF, DOCX, images
  • 🔍 Intelligent Analysis: Document structure detection
  • 📝 Smart Conversion: Semantic Markdown generation
  • ⚡ Fast Processing: Document Analysis mode runs 19x faster than full conversion
  • 🖼️ Vision Understanding: OCR and image analysis

🚀 Features Available in This Demo

🔍 Document Analysis (Fast) - Recommended

  • 19x faster than full conversion
  • Quick structural insights and metadata
  • Perfect for understanding document layout
  • Ideal for the free tier with processing time limits

📝 Full Markdown Conversion

  • Complete document-to-Markdown transformation
  • Preserves formatting and structure
  • Comprehensive text extraction
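
A minimal sketch of what full conversion could look like with the Docling Python API, assuming the `docling` package is installed. The Space's own `app.py` is not shown here, so `convert_to_markdown` and `markdown_output_path` are illustrative names, and this uses Docling's default pipeline rather than any Granite Docling specific configuration.

```python
from pathlib import Path


def markdown_output_path(source: str) -> Path:
    """Return an output path next to the source, with a .md suffix."""
    return Path(source).with_suffix(".md")


def convert_to_markdown(source: str) -> str:
    """Convert a document to Markdown using Docling's default pipeline."""
    # Imported lazily so the path helper works without docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return result.document.export_to_markdown()


def convert_and_save(source: str) -> Path:
    """Convert `source` and write the Markdown beside it."""
    target = markdown_output_path(source)
    target.write_text(convert_to_markdown(source), encoding="utf-8")
    return target
```

Calling `convert_and_save("report.pdf")` would write `report.md` in the same directory, provided Docling can read the input file.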

📊 Table Extraction

  • Detects and extracts tabular data
  • Maintains table structure in Markdown format
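
To illustrate what "table structure in Markdown format" means, here is a small standard-library sketch that renders extracted rows as a pipe table. It is not the demo's extraction code; `to_markdown_table` and `escape_cell` are hypothetical helpers, and the input is assumed to be a header list plus a list of rows.

```python
def escape_cell(value: object) -> str:
    """Make a cell safe for a Markdown pipe table."""
    # A raw pipe would be read as a column separator; a newline would end the row.
    return str(value).replace("|", "\\|").replace("\n", " ")


def to_markdown_table(header: list[str], rows: list[list[object]]) -> str:
    """Render a header and rows as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(escape_cell(h) for h in header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    for row in rows:
        # Pad short rows and trim long ones so each row matches the header width.
        cells = list(row[: len(header)]) + [""] * (len(header) - len(row))
        lines.append("| " + " | ".join(escape_cell(c) for c in cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_markdown_table(["Item", "Qty"], [["Widget", 3], ["Gadget", 12]]))
```

Padding short rows keeps the column count constant, which most Markdown renderers need in order to display the table correctly.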

👀 Quick Preview

  • Fast content sampling
  • Great for quick document verification

💡 How to Use

  1. 📤 Upload your document (PDF, DOCX, or image)
  2. ⚙️ Select processing mode (try "Document Analysis" first!)
  3. 🚀 Click "Process Document"
  4. 📊 View results in the tabs below
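
The upload and mode-selection steps above can be sketched as a small validation routine. This is an assumption-laden illustration, not the Space's code: the image extension list is a guess (the demo only says "image"), the mode names are taken from the feature headings, and `classify_upload` and `validate_request` are hypothetical helpers.

```python
from pathlib import Path

# Assumed image extensions; the demo does not list which formats it accepts.
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}

# Mode names taken from the feature headings; the real UI labels may differ.
MODES = (
    "Document Analysis",
    "Full Markdown Conversion",
    "Table Extraction",
    "Quick Preview",
)


def classify_upload(filename: str) -> str:
    """Map a filename to 'pdf', 'docx', or 'image', or raise ValueError."""
    suffix = Path(filename).suffix.lower()
    if suffix == ".pdf":
        return "pdf"
    if suffix == ".docx":
        return "docx"
    if suffix in IMAGE_EXTENSIONS:
        return "image"
    raise ValueError(f"Unsupported file type: {suffix or '(none)'}")


def validate_request(filename: str, mode: str = "Document Analysis") -> tuple[str, str]:
    """Check the upload and the selected mode before any processing starts."""
    if mode not in MODES:
        raise ValueError(f"Unknown processing mode: {mode}")
    return classify_upload(filename), mode
```

Rejecting unsupported inputs before the model runs avoids spending part of a limited processing window on a file that cannot be handled.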

⚡ Performance & Tips

  • Document Analysis mode is optimized for speed and works well on the free tier
  • The demo runs on CPU; processing time varies with document size and complexity
  • Very large documents may hit the free CPU tier's timeout limits
  • Upgrading the Space to a paid GPU tier gives faster processing
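
One way to cope with timeout limits on large documents is to process pages until a time budget runs out and return partial results. The sketch below shows that idea with the standard library only; it is not how the demo is known to work, and `process_within_budget`, the per-page `handler`, and the injectable `clock` are all hypothetical.

```python
import time
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def process_within_budget(
    items: Iterable[T],
    handler: Callable[[T], R],
    budget_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> tuple[list[R], bool]:
    """Apply `handler` to items until the budget is spent.

    Returns (results, completed). `completed` is False when the deadline
    was reached before every item had been handled.
    """
    deadline = clock() + budget_seconds
    results: list[R] = []
    for item in items:
        # The deadline is checked before each item; a running item is not interrupted.
        if clock() >= deadline:
            return results, False
        results.append(handler(item))
    return results, True
```

Because the check happens between items, a single slow page can still overrun the budget; a hard limit would need a separate process or thread with its own timeout.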

🛠️ Technical Details

  • Model: IBM Granite Docling 258M Vision-Language Model
  • Model Hub: Automatically loaded from ibm-granite/granite-docling-258M on Hugging Face
  • Backend: Docling framework with PyMuPDF optimization
  • Processing: CPU optimized (GPU available with paid tier upgrade)
  • Hosting: 🤗 Hugging Face Spaces (Free CPU Tier)
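
A hedged sketch of loading the model from the Hub with Hugging Face Transformers, falling back to CPU when no GPU is present. It assumes `torch` and `transformers` are installed and that the `AutoProcessor` and `AutoModelForVision2Seq` classes are available in the installed version; the demo may load the model differently (for example through Docling), and `pick_device` and `load_model` are illustrative names.

```python
MODEL_ID = "ibm-granite/granite-docling-258M"


def pick_device(cuda_available: bool) -> str:
    """Prefer the GPU when one is available, otherwise use the CPU."""
    return "cuda" if cuda_available else "cpu"


def load_model(model_id: str = MODEL_ID):
    """Download (or reuse the cached) processor and model from the Hub."""
    # Imported lazily so the helpers above work without the heavy dependencies.
    import torch
    from transformers import AutoModelForVision2Seq, AutoProcessor

    device = pick_device(torch.cuda.is_available())
    processor = AutoProcessor.from_pretrained(model_id)
    model = AutoModelForVision2Seq.from_pretrained(model_id).to(device)
    model.eval()  # inference only
    return processor, model, device
```

On the free CPU tier `pick_device` would return `"cpu"`, so the same code path runs unchanged if the Space is later moved to GPU hardware.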

🔗 Links & Resources

🎯 Perfect For

  • 📋 Document Analysis: Quick insights into document structure
  • 🔄 Format Conversion: PDF/DOCX to clean Markdown
  • 📊 Data Extraction: Tables and structured content
  • 🧪 Research: Testing document processing capabilities
  • 🚀 Prototyping: Exploring Vision-Language Model capabilities

🏗️ Built With

  • IBM Granite Docling 258M - State-of-the-art VLM
  • Gradio - Interactive web interface
  • PyMuPDF - Fast PDF processing optimization
  • Hugging Face Transformers - Model inference
  • PyTorch - Deep learning framework

🎉 Try it now! Upload a document above and experience IBM's Granite Docling model on the free CPU tier.

This demo showcases a production-ready implementation with comprehensive security auditing and performance optimizations.