Spaces:

fmeres
/

granite-docling-demo

Running

App Files Files Community

granite-docling-demo / README.md

Felipe Meres

Remove false GPU advertising and fix misleading claims

ce3fde5 about 2 months ago

preview code

raw

history blame contribute delete

3.99 kB

A newer version of the Gradio SDK is available: 5.49.1

Upgrade

metadata

title: Granite Docling 258M Demo
emoji: 🔬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
models:
  - ibm-granite/granite-docling-258M

🔬 Granite Docling 258M - Online Demo

Experience IBM's cutting-edge Vision-Language Model for document processing and conversion directly in your browser on Hugging Face Spaces!

🤖 This Space uses the IBM Granite Docling 258M Vision-Language Model hosted on Hugging Face Hub

🌟 What is Granite Docling 258M?

The IBM Granite Docling 258M is a state-of-the-art Vision-Language Model (VLM) designed for advanced document understanding and conversion. This model excels at:

📄 Multi-Format Processing: PDF, DOCX, images
🔍 Intelligent Analysis: Document structure detection
📝 Smart Conversion: Semantic Markdown generation
⚡ Fast Processing: 19x faster document insights
🖼️ Vision Understanding: OCR and image analysis

🚀 Features Available in This Demo

🔍 Document Analysis (Fast) - Recommended

19x faster than full conversion
Quick structural insights and metadata
Perfect for understanding document layout
Ideal for the free tier with processing time limits

📝 Full Markdown Conversion

Complete document-to-Markdown transformation
Preserves formatting and structure
Comprehensive text extraction

📊 Table Extraction

Detects and extracts tabular data
Maintains table structure in Markdown format

👀 Quick Preview

Fast content sampling
Great for quick document verification

💡 How to Use

📤 Upload your document (PDF, DOCX, or image)
⚙️ Select processing mode (try "Document Analysis" first!)
🚀 Click "Process Document"
📊 View results in the tabs below

⚡ Performance & Tips

Document Analysis mode is optimized for speed and works great on the free tier
CPU processing optimized for reliable performance
Processing time varies based on document size and complexity
Free CPU tier provides reliable processing with timeout limitations for very large documents
Upgrade to GPU tier for faster processing speeds

🛠️ Technical Details

Model: IBM Granite Docling 258M Vision-Language Model
Model Hub: Automatically loaded from ibm-granite/granite-docling-258M on Hugging Face
Backend: Docling framework with PyMuPDF optimization
Processing: CPU optimized (GPU available with paid tier upgrade)
Hosting: 🤗 Hugging Face Spaces (Free CPU Tier)

🔗 Links & Resources

📂 GitHub Repository: granite-docling-implementation
🤗 Model Hub: IBM Granite Docling 258M
📚 Documentation: Docling Framework
🏆 Production Ready: Full security audit with zero vulnerabilities

🎯 Perfect For

📋 Document Analysis: Quick insights into document structure
🔄 Format Conversion: PDF/DOCX to clean Markdown
📊 Data Extraction: Tables and structured content
🧪 Research: Testing document processing capabilities
🚀 Prototyping: Exploring Vision-Language Model capabilities

🏗️ Built With

IBM Granite Docling 258M - State-of-the-art VLM
Gradio - Interactive web interface
PyMuPDF - Fast PDF processing optimization
Hugging Face Transformers - Model inference
PyTorch - Deep learning framework

🎉 Try it now! Upload a document above and experience the power of IBM's Granite Docling model with free GPU acceleration!

This demo showcases a production-ready implementation with comprehensive security auditing and performance optimizations.