infer_donut
About
OCR-free model for document understanding
Donut 🍩, Document understanding transformer, is a new method of document understanding that utilizes an OCR-free end-to-end Transformer model. Donut does not require off-the-shelf OCR engines/APIs, yet it shows state-of-the-art performance on various visual document understanding tasks, such as visual document classification or information extraction (a.k.a. document parsing).
🚀 Use with Ikomia API
1. Install Ikomia API
We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.
```sh
pip install ikomia
```
2. Create your workflow
```python
from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_donut", auto_connect=True)

# Run on your image
wf.run_on(url="https://github.com/Ikomia-hub/infer_donut/blob/main/images/example.jpg?raw=true")

# Display results
extracted_data = algo.get_output(1)
print(extracted_data.data)
```
☀️ Use with Ikomia Studio
Ikomia Studio offers a friendly UI with the same features as the API.
- If you haven't started using Ikomia Studio yet, download and install it from this page.
- For additional guidance on getting started with Ikomia Studio, check out this blog post.
📝 Set algorithm parameters
- model_name (str) - default 'naver-clova-ix/donut-base-finetuned-docvqa': Name of the pre-trained Donut model. The default is fine-tuned for document VQA (visual question answering). Other models available:
  - naver-clova-ix/donut-base-finetuned-rvlcdip
  - naver-clova-ix/donut-base-finetuned-cord-v1
  - naver-clova-ix/donut-base-finetuned-cord-v2
- prompt (str): Prompt passed to the model, for example a question about the document.
- cuda (bool): If True, run inference on the GPU with CUDA. If False, run on CPU.
- custom_model_folder: Path to a folder containing a custom model (optional).
- task_name: Task the custom model corresponds to. Specify it when using a custom model.
Parameters must be passed as strings when added to the dictionary.
```python
from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_donut", auto_connect=True)

# Set parameters (all values are strings)
algo.set_parameters({
    "model_name": "naver-clova-ix/donut-base-finetuned-docvqa",
    "prompt": "What is the date of the document",
    "cuda": "True"
})

# Run on your image
wf.run_on(url="https://github.com/Ikomia-hub/infer_donut/blob/main/images/example.jpg?raw=true")

# Display results
extracted_data = algo.get_output(1)
print(extracted_data.data)
```
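Because every parameter value has to be a string, it can be convenient to keep typed values in your own code and convert them just before calling `set_parameters`. The snippet below is a minimal sketch of that idea. The helper name `to_str_params` is hypothetical and is not part of the Ikomia API.

```python
def to_str_params(params: dict) -> dict:
    """Convert every parameter value to its string form (hypothetical helper)."""
    return {key: str(value) for key, value in params.items()}


# Typed values as you might keep them in your own code
params = to_str_params({
    "model_name": "naver-clova-ix/donut-base-finetuned-docvqa",
    "prompt": "What is the date of the document",
    "cuda": True,
})
print(params)

# With `algo` created as in the example above, you would then call:
# algo.set_parameters(params)
```

The boolean `True` becomes the string `"True"`, which matches the format used for `cuda` in the example above.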
🔍 Explore algorithm outputs
Every algorithm produces specific outputs, yet they can all be explored the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.
```python
from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_donut", auto_connect=True)

# Run on your image
wf.run_on(url="https://github.com/Ikomia-hub/infer_donut/blob/main/images/example.jpg?raw=true")

# Iterate over outputs
for output in algo.get_outputs():
    # Print information
    print(output)
    # Export it to JSON
    output.to_json()
```
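If you want to work with the exported JSON as Python objects, one option is to parse it with the standard `json` module. The sketch below assumes that `to_json()` returns a JSON string. The `outputs_to_dicts` helper and the `FakeOutput` stand-in class are hypothetical and exist only so the snippet runs without Ikomia installed, and the payload shown is illustrative, not the actual structure of the `infer_donut` output.

```python
import json


def outputs_to_dicts(outputs):
    """Parse the JSON string of each output into a Python object (hypothetical helper)."""
    return [json.loads(output.to_json()) for output in outputs]


class FakeOutput:
    """Stand-in for an Ikomia output, used only to make this sketch runnable."""

    def __init__(self, payload):
        self._payload = payload

    def to_json(self):
        # Assumption: real outputs also return a JSON string here
        return json.dumps(self._payload)


parsed = outputs_to_dicts([FakeOutput({"answer": "example"})])
print(parsed)

# With a real workflow you would pass the algorithm outputs instead:
# parsed = outputs_to_dicts(algo.get_outputs())
```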
Developer
Ikomia
License
MIT License
A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
| Permissions | Conditions | Limitations |
|---|---|---|
| Commercial use | License and copyright notice | Liability |
| Modification | | Warranty |
| Distribution | | |
| Private use | | |
This is not legal advice: this description is for informational purposes only and does not constitute the license itself. Provided by choosealicense.com.