ChengCui committed on
Commit 7bb1266 · verified · 1 Parent(s): 311e855

Update README.md

Files changed (1)
  1. README.md +0 -54
README.md CHANGED
@@ -141,60 +141,6 @@ for res in output:

  **For more usage details and parameter explanations, see the [documentation](https://www.paddleocr.ai/latest/en/version3.x/pipeline_usage/PaddleOCR-VL.html).**

- ## PaddleOCR-VL-0.9B Usage with transformers
-
-
- Currently, we support inference using the PaddleOCR-VL-0.9B model with the `transformers` library, which can recognize texts, formulas, tables, and chart elements. In the future, we plan to support full document parsing inference with `transformers`. Below is a simple script we provide to support inference using the PaddleOCR-VL-0.9B model with `transformers`.
-
- > [!NOTE]
- > Note: We currently recommend using the official method for inference, as it is faster and supports page-level document parsing. The example code below only supports element-level recognition.
-
-
- ```python
- from PIL import Image
- import torch
- from transformers import AutoModelForCausalLM, AutoProcessor
-
- DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
-
- CHOSEN_TASK = "ocr" # Options: 'ocr' | 'table' | 'chart' | 'formula'
- PROMPTS = {
-     "ocr": "OCR:",
-     "table": "Table Recognition:",
-     "formula": "Formula Recognition:",
-     "chart": "Chart Recognition:",
- }
-
- model_path = "PaddlePaddle/PaddleOCR-VL"
- image_path = "test.png"
- image = Image.open(image_path).convert("RGB")
-
- model = AutoModelForCausalLM.from_pretrained(
-     model_path, trust_remote_code=True, torch_dtype=torch.bfloat16
- ).to(DEVICE).eval()
- processor = AutoProcessor.from_pretrained(model_path, trust_remote_code=True)
-
- messages = [
-     {"role": "user",
-      "content": [
-          {"type": "image", "image": image},
-          {"type": "text", "text": PROMPTS[CHOSEN_TASK]},
-      ]
-     }
- ]
- inputs = processor.apply_chat_template(
-     messages,
-     tokenize=True,
-     add_generation_prompt=True,
-     return_dict=True,
-     return_tensors="pt"
- ).to(DEVICE)
-
- outputs = model.generate(**inputs, max_new_tokens=1024)
- outputs = processor.batch_decode(outputs, skip_special_tokens=True)[0]
- print(outputs)
- ```
-
  ## Performance

  ### Page-Level Document Parsing
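
Review note: the removed section selected one of four prompt strings by task key and wrapped it, together with the image, in a single user message. For readers who relied on that snippet, the lookup can be sketched on its own as below. This is a minimal sketch, not the project's API: `build_messages` is a hypothetical helper name, the prompt strings are copied from the removed lines, and a string placeholder stands in for the PIL image so the block runs without `transformers` or model weights.

```python
# Prompt strings copied verbatim from the removed README snippet.
PROMPTS = {
    "ocr": "OCR:",
    "table": "Table Recognition:",
    "formula": "Formula Recognition:",
    "chart": "Chart Recognition:",
}


def build_messages(task, image):
    """Hypothetical helper (not part of PaddleOCR or transformers).

    Returns the chat-style message list used in the removed snippet,
    rejecting task keys that have no prompt.
    """
    if task not in PROMPTS:
        raise ValueError(f"unknown task {task!r}; expected one of {sorted(PROMPTS)}")
    return [
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": PROMPTS[task]},
            ],
        }
    ]


# Placeholder string instead of a PIL image, purely for illustration.
messages = build_messages("table", image="test.png")
print(messages[0]["content"][1]["text"])
```

Validating the task key up front turns a bare `KeyError` from the dictionary lookup into an error that lists the supported tasks.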
 