infer_mmlab_text_recognition

About

2.0.3
Apache-2.0

Inference for MMOCR from MMLAB text recognition models

Task: OCR
Tags: inference, mmlab, mmocr, ocr, text, recognition, pytorch, satrn, seg

Run text recognition algorithms from the MMLAB framework. This algorithm is usually applied after a text detection algorithm. You can use infer_mmlab_text_detection from Ikomia HUB for this task.

Models come from MMLAB's model zoo if custom training is disabled. Otherwise, you can load a model trained with the algorithm train_mmlab_text_recognition from Ikomia HUB. In this case, make sure to set the parameters for the config file (.py) and the model file (.pth). Both files are produced by the train algorithm.
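As a minimal sketch of the custom-model case, the helper below builds a parameter dictionary from a config file and a weights file and checks their extensions. The function name build_custom_params is hypothetical. The keys config_file and model_weight_file come from the parameter list of this algorithm, while the custom_training key and its "True" value are assumptions based on the parameter descriptions, so check them against the algorithm before relying on them.

```python
from pathlib import Path


def build_custom_params(config_file: str, model_weight_file: str) -> dict:
    """Return string parameters for a custom-trained model (hypothetical helper)."""
    if Path(config_file).suffix != ".py":
        raise ValueError(f"Expected a .py config file, got: {config_file}")
    if Path(model_weight_file).suffix != ".pth":
        raise ValueError(f"Expected a .pth weights file, got: {model_weight_file}")
    return {
        # Assumption: "custom_training" is the flag that enables custom models.
        "custom_training": "True",
        "config_file": config_file,
        "model_weight_file": model_weight_file,
    }


# Placeholder paths, to be replaced with the files produced by a training run.
params = build_custom_params("path/to/config.py", "path/to/model.pth")
print(params)
```

The resulting dictionary could then be passed to text_rec.set_parameters(params).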

Example image

🚀 Use with Ikomia API

1. Install Ikomia API

We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.

pip install ikomia

2. Create your workflow

from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

# Init your workflow
wf = Workflow()

# Add text detection algorithm
text_det = wf.add_task(name="infer_mmlab_text_detection", auto_connect=True)

# Add text recognition algorithm
text_rec = wf.add_task(name="infer_mmlab_text_recognition", auto_connect=True)

# Run the workflow on image
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-hub/infer_mmlab_text_recognition/main/images/billboard.jpg")

# Display results
img_output = text_rec.get_output(0)
recognition_output = text_rec.get_output(1)
display(img_output.get_image_with_mask_and_graphics(recognition_output), title="MMLAB text recognition")

☀️ Use with Ikomia Studio

Ikomia Studio offers a friendly UI with the same features as the API.

  • If you haven't started using Ikomia Studio yet, download and install it from this page.

  • For additional guidance on getting started with Ikomia Studio, check out this blog post.

📝 Set algorithm parameters

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add text detection algorithm
text_det = wf.add_task(name="infer_mmlab_text_detection", auto_connect=True)

# Add text recognition algorithm
text_rec = wf.add_task(name="infer_mmlab_text_recognition", auto_connect=True)

text_rec.set_parameters({
    "model_name": "satrn",
    "cfg": "satrn_shallow-small_5e_st_mj.py",
    "config_file": "",
    "model_weight_file": "",
    "batch_size": "64",
    "dict_file": "dicts/english_digits_symbols.txt",
})

# Run the workflow on image
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-hub/infer_mmlab_text_recognition/main/images/billboard.jpg")

  • model_name (str, default="satrn"): model name.
  • cfg (str, default="satrn_shallow-small_5e_st_mj"): name of the model configuration file.
  • conf_thres (float, default=0.5): object detection confidence.
  • config_file (str, default=""): path to the model config file (only if custom_training=True). The file is generated at the end of a custom training. Use the algorithm train_mmlab_text_recognition from Ikomia HUB to train a custom model.
  • model_weight_file (str, default=""): path to the model weights file (.pth) (only if custom_training=True). The file is generated at the end of a custom training.
  • batch_size (int, default=64): batch size; processing inputs in batches speeds up inference.
  • dict_file (str, default="dicts/english_digits_symbols.txt"): path to the character dictionary file.
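The example above passes batch_size as the string "64" although it is documented as an int, which suggests that parameter values are given as strings. Under that assumption, a small helper can convert typed Python values before calling set_parameters. The name to_str_params is hypothetical and not part of the Ikomia API.

```python
def to_str_params(params: dict) -> dict:
    """Convert every parameter value to its string form (hypothetical helper)."""
    return {key: str(value) for key, value in params.items()}


typed_params = {
    "model_name": "satrn",
    "cfg": "satrn_shallow-small_5e_st_mj.py",
    "batch_size": 64,    # int in Python, converted to "64"
    "conf_thres": 0.5,   # float in Python, converted to "0.5"
}
str_params = to_str_params(typed_params)
print(str_params)
```

The converted dictionary could then be passed as text_rec.set_parameters(str_params).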

The MMLab framework for text recognition offers a large range of models. To make it easier to choose a (model_name, cfg) pair, you can call the function get_model_zoo() to get a list of possible values.

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add text recognition algorithm
text_rec = wf.add_task(name="infer_mmlab_text_recognition", auto_connect=True)

# Get list of possible models (model_name, model_config)
print(text_rec.get_model_zoo())
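Once the list is available, it can help to group the configurations by model name. The sketch below assumes that get_model_zoo() yields (model_name, cfg) pairs, as the comment in the snippet suggests; the actual return type may differ. The helper group_by_model and the sample entries other than satrn are hypothetical.

```python
from collections import defaultdict


def group_by_model(zoo):
    """Group configuration names by model name."""
    grouped = defaultdict(list)
    for model_name, cfg in zoo:
        grouped[model_name].append(cfg)
    return dict(grouped)


# Stand-in for text_rec.get_model_zoo(). Only the satrn entry appears on this
# page; the other entries are hypothetical placeholders.
sample_zoo = [
    ("satrn", "satrn_shallow-small_5e_st_mj.py"),
    ("model_a", "model_a_config_1.py"),
    ("model_a", "model_a_config_2.py"),
]
configs = group_by_model(sample_zoo)
print(configs["satrn"])
```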

🔍 Explore algorithm outputs

Every algorithm produces specific outputs, yet they can all be explored the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add text detection algorithm
text_det = wf.add_task(name="infer_mmlab_text_detection", auto_connect=True)

# Add text recognition algorithm
text_rec = wf.add_task(name="infer_mmlab_text_recognition", auto_connect=True)

# Run the workflow on image
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-hub/infer_mmlab_text_recognition/main/images/billboard.jpg")

# Iterate over outputs
for output in text_rec.get_outputs():
    # Print information
    print(output)
    # Export it to JSON
    output.to_json()

The MMLab text recognition algorithm generates 2 outputs:

  1. Forwarded original image (CImageIO)
  2. Text recognition output (CTextIO)
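To keep the exported results, the JSON strings can be written to disk. The sketch below assumes that to_json() returns a JSON string. The helper save_json_outputs and the output_<index>.json naming scheme are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def save_json_outputs(json_strings, out_dir):
    """Write each JSON string to out_dir/output_<index>.json and return the paths."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, json_str in enumerate(json_strings):
        json.loads(json_str)  # Fail early if the string is not valid JSON
        file_path = out_path / f"output_{index}.json"
        file_path.write_text(json_str, encoding="utf-8")
        saved.append(file_path)
    return saved


# Stand-in data. In a workflow this list would be built as
# [output.to_json() for output in text_rec.get_outputs()]
demo_strings = ['{"demo": 1}', '{"demo": 2}']
demo_paths = save_json_outputs(demo_strings, tempfile.mkdtemp())
print([path.name for path in demo_paths])
```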

Developer

  • Ikomia

License

Apache License 2.0

A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.

  • Permissions: Commercial use, Modification, Distribution, Patent use, Private use
  • Conditions: License and copyright notice, State changes
  • Limitations: Trademark use, Liability, Warranty

This is not legal advice: this description is for informational purposes only and does not constitute the license itself. Provided by choosealicense.com.