Spaces:
Sleeping
Sleeping
| title: Typhoon OCR | |
| emoji: 🌍 | |
| colorFrom: gray | |
| colorTo: red | |
| sdk: gradio | |
| sdk_version: 5.34.0 | |
| app_file: app.py | |
| pinned: false | |
| license: apache-2.0 | |
| short_description: Convert Image & PDF to Markdown | |
| ## Typhoon OCR | |
| Typhoon OCR is a model for extracting structured markdown from images or PDFs. It supports document layout analysis and table extraction, returning results in markdown or HTML. This package is a simple Gradio website to demonstrate the performance of Typhoon OCR. | |
| ### Features | |
| - Upload a PDF or image (single page) | |
| - Extracts and reconstructs document content as markdown | |
| - Supports different prompt modes for layout or structure | |
| - Language: English, Thai | |
| - Uses a local or remote OpenAI-compatible API (e.g., vllm) | |
| ### Install | |
| ```bash | |
| pip install -r requirements.txt | |
| # edit .env | |
| # pip install vllm # optional for hosting a local server | |
| ``` | |
| ### Mac specific | |
| ``` | |
| brew install poppler | |
| # The following binaries are required and provided by poppler: | |
| # - pdfinfo | |
| # - pdftoppm | |
| ``` | |
| ### Linux specific | |
| ``` | |
| sudo apt-get update | |
| sudo apt-get install poppler-utils | |
| # The following binaries are required and provided by poppler-utils: | |
| # - pdfinfo | |
| # - pdftoppm | |
| ``` | |
| ### Start vllm | |
| ```bash | |
| vllm serve scb10x/typhoon-ocr-7b --served-model-name typhoon-ocr --dtype bfloat16 --port 8101 | |
| ``` | |
| ### Run Gradio demo | |
| ```bash | |
| python app.py | |
| ``` | |
| ### Dependencies | |
| - openai | |
| - python-dotenv | |
| - ftfy | |
| - pypdf | |
| - gradio | |
| - vllm (for hosting an inference server) | |
| - pillow | |
| ### License | |
| This project is licensed under the Apache 2.0 License. See individual datasets and checkpoints for their respective licenses. |