infer_donut

About

Version: 1.0.2
License: MIT

OCR-free model for document understanding

Task: OCR
Tags: document, understanding, kie

Donut 🍩 (Document understanding transformer) is a new method for document understanding that uses an OCR-free, end-to-end Transformer model. Donut does not require off-the-shelf OCR engines or APIs, yet it shows state-of-the-art performance on various visual document understanding tasks, such as visual document classification and information extraction (a.k.a. document parsing).

[Image: OCR illustration]

🚀 Use with Ikomia API

1. Install Ikomia API

We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.

pip install ikomia

2. Create your workflow

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_donut", auto_connect=True)

# Run on your image
wf.run_on(url="https://github.com/Ikomia-hub/infer_donut/blob/main/images/example.jpg?raw=true")

# Display results
extracted_data = algo.get_output(1)
print(extracted_data.data)

☀️ Use with Ikomia Studio

Ikomia Studio offers a friendly UI with the same features as the API.

  • If you haven't started using Ikomia Studio yet, download and install it from this page.
  • For additional guidance on getting started with Ikomia Studio, check out this blog post.

📝 Set algorithm parameters

  • model_name (str) - default 'naver-clova-ix/donut-base-finetuned-docvqa': Name of the Donut pre-trained model; the default is fine-tuned for document VQA (visual question answering). Other models available:
    • naver-clova-ix/donut-base-finetuned-rvlcdip
    • naver-clova-ix/donut-base-finetuned-cord-v1
    • naver-clova-ix/donut-base-finetuned-cord-v2
  • prompt (str): Prompt passed to the model, for example a question about the document.
  • cuda (bool): If True, run inference on the GPU with CUDA. If False, run on the CPU.
  • custom_model_folder: Path to a custom model folder (optional).
  • task_name: Task corresponding to the custom model. Specify it whenever you use a custom model.

Parameter values must be passed as strings when added to the dictionary.

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_donut", auto_connect=True)

algo.set_parameters({
    "model_name": "naver-clova-ix/donut-base-finetuned-docvqa",
    "prompt": "What is the date of the document",
    "cuda": "True"
})

wf.run_on(url="https://github.com/Ikomia-hub/infer_donut/blob/main/images/example.jpg?raw=true")

# Display results
extracted_data = algo.get_output(1)
print(extracted_data.data)
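
Because every parameter value is passed as a string, and a custom model needs its task to be specified, it can help to build the dictionary in one place. The sketch below is only an illustration: the helper name build_donut_params is hypothetical and not part of the Ikomia API, and it assumes that str() of a Python value (for example True becoming "True") is the string form the algorithm expects, as in the example above.

```python
def build_donut_params(model_name="naver-clova-ix/donut-base-finetuned-docvqa",
                       prompt="", cuda=True,
                       custom_model_folder=None, task_name=None):
    """Hypothetical helper: return a dict of string values for set_parameters."""
    # The parameter list states that a custom model needs its task specified.
    if custom_model_folder and not task_name:
        raise ValueError("task_name is required when custom_model_folder is set")

    params = {"model_name": model_name, "prompt": prompt, "cuda": cuda}
    if custom_model_folder:
        params["custom_model_folder"] = custom_model_folder
        params["task_name"] = task_name

    # Convert every value to a string (assumption: str(True) == "True" is accepted).
    return {name: str(value) for name, value in params.items()}


params = build_donut_params(prompt="What is the date of the document", cuda=False)
print(params)
```

The resulting dictionary can then be passed to algo.set_parameters(params) in the workflow shown above.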

🔍 Explore algorithm outputs

Every algorithm produces specific outputs, yet they can all be explored the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_donut", auto_connect=True)

# Run on your image
wf.run_on(url="https://github.com/Ikomia-hub/infer_donut/blob/main/images/example.jpg?raw=true")

# Iterate over outputs
for output in algo.get_outputs():
    # Print information
    print(output)
    # Export it to JSON
    output.to_json()
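
The loop above calls output.to_json() but discards the result. As a minimal sketch, assuming to_json() returns a JSON-formatted string, the exported text could be validated and written to disk. The helper name save_json_outputs and the output_<i>.json file naming are hypothetical choices made for this illustration, not part of the Ikomia API.

```python
import json
from pathlib import Path


def save_json_outputs(json_strings, folder):
    """Hypothetical helper: write each JSON string to <folder>/output_<i>.json."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, text in enumerate(json_strings):
        # json.loads raises ValueError if the text is not valid JSON.
        parsed = json.loads(text)
        path = folder / f"output_{index}.json"
        path.write_text(json.dumps(parsed, indent=2), encoding="utf-8")
        paths.append(path)
    return paths
```

With a workflow that has already run, a call could look like save_json_outputs([output.to_json() for output in algo.get_outputs()], "donut_outputs"), where the folder name is arbitrary.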

Developer

  • Ikomia

License

A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.

  • Permissions: Commercial use, Modification, Distribution, Private use
  • Conditions: License and copyright notice
  • Limitations: Liability, Warranty

This is not legal advice: this description is for informational purposes only and does not constitute the license itself. Provided by choosealicense.com.