---
title: Granite Docling 258M Demo
emoji: 🔬
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: false
license: apache-2.0
models:
  - ibm-granite/granite-docling-258M
---

# 🔬 Granite Docling 258M - Online Demo

Try IBM's Vision-Language Model for document processing and conversion directly in your browser on Hugging Face Spaces!

> **🤖 This Space uses the [IBM Granite Docling 258M](https://huggingface.co/ibm-granite/granite-docling-258M) Vision-Language Model hosted on Hugging Face Hub**

## 🌟 What is Granite Docling 258M?

IBM Granite Docling 258M is a Vision-Language Model (VLM) designed for document understanding and conversion. The model excels at:

- **📄 Multi-Format Processing**: PDF, DOCX, images
- **🔍 Intelligent Analysis**: Document structure detection
- **📝 Smart Conversion**: Semantic Markdown generation
- **⚡ Fast Processing**: Document Analysis mode runs 19x faster than full conversion
- **🖼️ Vision Understanding**: OCR and image analysis

## 🚀 Features Available in This Demo

### 🔍 Document Analysis (Fast) - **Recommended**

- **19x faster** than full conversion
- Quick structural insights and metadata
- Perfect for understanding document layout
- Ideal for the free tier with processing time limits

### 📝 Full Markdown Conversion

- Complete document-to-Markdown transformation
- Preserves formatting and structure
- Comprehensive text extraction

### 📊 Table Extraction

- Detects and extracts tabular data
- Maintains table structure in Markdown format

### 👀 Quick Preview

- Fast content sampling
- Great for quick document verification

## 💡 How to Use

1. **📤 Upload** your document (PDF, DOCX, or image)
2. **⚙️ Select** processing mode (try "Document Analysis" first!)
3. **🚀 Click** "Process Document"
4. **📊 View** results in the tabs below

## ⚡ Performance & Tips

- **Document Analysis mode** is optimized for speed and works well on the free tier
- **CPU processing** is tuned for reliable performance
- **Processing time varies** with document size and complexity
- **Free CPU tier** processes documents reliably, but very large documents may hit timeout limits
- **Upgrade to GPU tier** for faster processing

## 🛠️ Technical Details

- **Model**: [IBM Granite Docling 258M](https://huggingface.co/ibm-granite/granite-docling-258M) Vision-Language Model
- **Model Hub**: Automatically loaded from `ibm-granite/granite-docling-258M` on Hugging Face
- **Backend**: Docling framework with PyMuPDF optimization
- **Processing**: CPU optimized (GPU available with paid tier upgrade)
- **Hosting**: 🤗 Hugging Face Spaces (Free CPU Tier)

## 🔗 Links & Resources

- **📂 GitHub Repository**: [granite-docling-implementation](https://github.com/felipemeres/granite-docling-implementation)
- **🤗 Model Hub**: [IBM Granite Docling 258M](https://huggingface.co/ibm-granite/granite-docling-258M)
- **📚 Documentation**: [Docling Framework](https://github.com/DS4SD/docling)
- **🏆 Production Ready**: Full security audit with zero vulnerabilities found

## 🎯 Perfect For

- **📋 Document Analysis**: Quick insights into document structure
- **🔄 Format Conversion**: PDF/DOCX to clean Markdown
- **📊 Data Extraction**: Tables and structured content
- **🧪 Research**: Testing document processing capabilities
- **🚀 Prototyping**: Exploring Vision-Language Model capabilities

## 🏗️ Built With

- **IBM Granite Docling 258M** - Vision-Language Model
- **Gradio** - Interactive web interface
- **PyMuPDF** - Fast PDF processing optimization
- **Hugging Face Transformers** - Model inference
- **PyTorch** - Deep learning framework

---

**🎉 Try it now!** Upload a document above and experience IBM's Granite Docling model on the free CPU tier!

*This demo showcases a production-ready implementation with comprehensive security auditing and performance optimizations.*
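
To try something like the Full Markdown Conversion mode outside the Space, the following is a minimal sketch using the Docling framework's `DocumentConverter`. It is not the code in this Space's `app.py`: it uses Docling's default pipeline, which may differ from the Granite Docling VLM setup used here. The helper names `is_supported` and `convert_to_markdown` and the extension list are assumptions made for illustration.

```python
from pathlib import Path

# Assumed from the "PDF, DOCX, or image" wording above; the exact image
# extensions the Space accepts are a guess.
SUPPORTED_SUFFIXES = {".pdf", ".docx", ".png", ".jpg", ".jpeg"}


def is_supported(path: str) -> bool:
    """Return True if the file extension looks like an accepted upload type."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def convert_to_markdown(path: str) -> str:
    """Convert one document to Markdown with Docling's default pipeline."""
    if not is_supported(path):
        raise ValueError(f"Unsupported file type: {path}")
    # Imported lazily so the extension check works without Docling installed.
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(path)
    return result.document.export_to_markdown()
```

With Docling installed (`pip install docling`), calling `convert_to_markdown("report.pdf")` on a local file (the file name is hypothetical) returns the Markdown as a string. Expect the first run to download model weights and to be slow on CPU.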