infer_grounding_dino
About
Inference of the Grounding DINO model
This algorithm provides zero-shot object grounding: it localizes objects in an image from a natural language query.
🚀 Use with Ikomia API
1. Install Ikomia API
We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.
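For example, on Linux/macOS a virtual environment can be created and activated like this (a minimal sketch; adapt the path and activation command to your setup):

```shell
# Create an isolated environment (assumes python3 is on your PATH)
python3 -m venv .venv

# Activate it (Linux/macOS; on Windows use .venv\Scripts\activate)
. .venv/bin/activate
```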
```shell
pip install ikomia
```
2. Create your workflow
```python
from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

# Init your workflow
wf = Workflow()

# Add the Grounding DINO Object Detector
dino = wf.add_task(name="infer_grounding_dino", auto_connect=True)

# Run on your image
# wf.run_on(path="path/to/your/image.png")
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_dog.png")

# Inspect your results
display(dino.get_image_with_graphics())
```
☀️ Use with Ikomia Studio
Ikomia Studio offers a friendly UI with the same features as the API.
- If you haven't started using Ikomia Studio yet, download and install it from this page.
- For additional guidance on getting started with Ikomia Studio, check out this blog post.
📝 Set algorithm parameters
- model_name (str) - default 'Swin-T': The Grounding DINO algorithm offers two checkpoint models, 'Swin-T' and 'Swin-B', with 172M and 341M parameters respectively.
- prompt (str) - default 'car . person . dog .': Text prompt for the model; categories are separated by ' . '.
- conf_thres (float) - default '0.35': Box confidence threshold for predictions.
- conf_thres_text (float) - default '0.25': Text confidence threshold for predictions.
- cuda (bool): If True, run CUDA-based inference on GPU; if False, run on CPU.
Parameters should be passed as strings when added to the dictionary.
```python
from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display

# Init your workflow
wf = Workflow()

# Add the Grounding DINO Object Detector
dino = wf.add_task(name="infer_grounding_dino", auto_connect=True)

dino.set_parameters({
    "model_name": "Swin-B",
    "prompt": "laptops . smartphone . headphone .",
    "conf_thres": "0.35",
    "conf_thres_text": "0.25"
})

# Run on your image
# wf.run_on(path="path/to/your/image.png")
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_work.jpg")

# Inspect your results
display(dino.get_image_with_graphics())
```
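Conceptually, `conf_thres` acts as a simple cutoff on box confidence scores; a minimal pure-Python illustration (the detection values below are made up, not real model output):

```python
# Illustration only: how a confidence threshold filters predictions.
# The (label, score) pairs are invented; the real model returns object
# detections with confidence scores in [0, 1].
detections = [("car", 0.82), ("person", 0.41), ("dog", 0.12)]

conf_thres = 0.35
kept = [(label, score) for label, score in detections if score >= conf_thres]
print(kept)  # [('car', 0.82), ('person', 0.41)]
```

Raising the threshold trades recall for precision: fewer boxes survive, but those that do are higher-confidence.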
🔍 Explore algorithm outputs
Every algorithm produces specific outputs, yet they can all be explored the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.
```python
import ikomia
from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_grounding_dino", auto_connect=True)

# Run on your image
wf.run_on(url="https://raw.githubusercontent.com/Ikomia-dev/notebooks/main/examples/img/img_dog.png")

# Iterate over outputs
for output in algo.get_outputs():
    # Print information
    print(output)
    # Export it to JSON
    output.to_json()
```
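The JSON export can then be post-processed with standard tooling. A minimal sketch, assuming a hypothetical payload shape (the actual schema depends on the output type; inspect the result of `output.to_json()` on your own runs):

```python
import json

# Hypothetical detection payload; the real schema produced by to_json()
# may differ, so treat this structure as an assumption.
payload = '{"detections": [{"label": "dog", "confidence": 0.87, "box": [10, 20, 200, 220]}]}'

data = json.loads(payload)
for det in data["detections"]:
    print(det["label"], det["confidence"])  # dog 0.87
```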
⏩ Advanced usage
Check out the Grounding DINO blog post for more information on this algorithm.
Developer
Ikomia
License
Apache License 2.0
A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
| Permissions | Conditions | Limitations |
|---|---|---|
| Commercial use | License and copyright notice | Trademark use |
| Modification | State changes | Liability |
| Distribution | | Warranty |
| Patent use | | |
| Private use | | |
This is not legal advice: this description is for informational purposes only and does not constitute the license itself. Provided by choosealicense.com.