infer_qwen2_5_vl

About

1.1.1

Apache-2.0

Run the Qwen2.5-VL series of vision-language models

Task: VLM

VLM

Qwen

Qwen2.5

Vision-Language

Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.

🚀 Use with Ikomia API

1. Install Ikomia API

We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.

pip install ikomia

2. Create your workflow

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_qwen2_5_vl", auto_connect=True)

# Run on your image  
wf.run_on(url='https://github.com/Ikomia-dev/notebooks/blob/main/examples/img/img_people_workspace.jpg?raw=true')

# Save output .json
qwen_output = algo.get_output(1)
qwen_output.save('qwen_output.json')
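The saved file can be inspected with the Python standard library. The sketch below makes no assumption about the schema of `qwen_output.json` and only reports its top-level structure. The helper name `summarize_json` is hypothetical and is not part of the Ikomia API; the script also assumes the file was written to the path passed to `save()`.

```python
import json
from pathlib import Path


def summarize_json(path):
    """Describe the top level of a JSON file without assuming its schema.

    Returns {key: type name} when the root is an object,
    otherwise the type name of the root value.
    """
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    return type(data).__name__


if __name__ == "__main__":
    # Path taken from the workflow example; nothing is printed if it is missing.
    target = Path("qwen_output.json")
    if target.exists():
        print(summarize_json(target))
```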

☀️ Use with Ikomia Studio

Ikomia Studio offers a friendly UI with the same features as the API.

If you haven't started using Ikomia Studio yet, download and install it from this page.
For additional guidance on getting started with Ikomia Studio, check out this blog post.

📝 Set algorithm parameters

Parameters	Description
`model_name`	Name or path of the Qwen VL model. Default: `"Qwen/Qwen2.5-VL-3B-Instruct"`.
`prompt`	Custom prompt to guide the model's response for the given image. Default: `"Describe the image in detail."`
`system_prompt`	System prompt to set the behavior and context for the model. Default: `"You are a helpful assistant."`
`cuda`	If `True`, run inference on the GPU with CUDA. If `False`, run on the CPU.
`do_sample`	Whether to use sampling. If `False`, greedy decoding is used (the token with the highest probability is always selected). If `True`, tokens are sampled, which produces more diverse outputs. Acceptable values are `True` or `False`. Default: `False`.
`max_new_tokens`	The maximum number of tokens to generate, not counting the tokens in the prompt. Default: `512`. (For `essais` reports, reducing this value can significantly speed up inference. Lower values are recommended for `essais` to mitigate hallucinations.)
`temperature`	Sampling temperature for text generation. Default: `1`. (Only used if `do_sample` is `True`.)
`top_p`	Top-p sampling parameter. Default: `1`. (Only used if `do_sample` is `True`.)
`top_k`	Top-k sampling parameter. Default: `50`. (Only used if `do_sample` is `True`.)
`repetition_penalty`	Repetition penalty. `1.0` means no penalty. Default: `1.0`.
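To make the interaction between `do_sample`, `temperature`, `top_k` and `top_p` more concrete, here is a minimal pure-Python sketch of the general technique. It is an illustration only and not the code this algorithm runs. The function `next_token` and its `logits` and `rng` arguments are hypothetical names introduced for this example.

```python
import math
import random


def next_token(logits, do_sample=False, temperature=1.0, top_k=50, top_p=1.0, rng=None):
    """Pick a token index from raw scores (illustrative sketch only).

    Assumes temperature > 0 and 0 < top_p <= 1.
    """
    if not do_sample:
        # Greedy decoding: always take the highest-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])

    rng = rng or random.Random()

    # Temperature rescales the scores before the softmax:
    # values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    weights = [math.exp(x - peak) for x in scaled]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most probable tokens (0 disables the filter here).
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Sample among the surviving tokens, weighted by their probabilities.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]
```

With the defaults listed above (`do_sample` set to `False`), the sampling parameters have no effect, which matches the "Only used if" notes in the table.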

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_qwen2_5_vl", auto_connect=True)

algo.set_parameters({
    "model_name": "Qwen/Qwen2.5-VL-3B-Instruct",
    "cuda": "True",
    "prompt": "Describe the image in detail.",
    "max_new_tokens": "512", 
    "do_sample": "False",
    "temperature": "1",
    "top_p": "1",
    "top_k": "50",
    "repetition_penalty": "1.0"
})

# Run on your image  
wf.run_on(url='https://github.com/Ikomia-dev/notebooks/blob/main/examples/img/img_people_workspace.jpg?raw=true')

# Save output .json
qwen_output = algo.get_output(1)
qwen_output.save('qwen_output.json')
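The example above passes every parameter value as a string. Assuming that convention holds for all parameters, a small helper can build the dictionary from native Python values. The name `to_param_strings` is hypothetical and is not part of the Ikomia API.

```python
def to_param_strings(params):
    """Convert every parameter value to its string form."""
    return {name: str(value) for name, value in params.items()}


params = to_param_strings({
    "cuda": True,
    "max_new_tokens": 256,
    "do_sample": True,
    "temperature": 0.7,
})
print(params)
# {'cuda': 'True', 'max_new_tokens': '256', 'do_sample': 'True', 'temperature': '0.7'}
```

The resulting dictionary could then be passed to `algo.set_parameters(params)` in place of the hand-written one.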

🔍 Explore algorithm outputs

Every algorithm produces specific outputs, yet they can all be explored in the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.

from ikomia.dataprocess.workflow import Workflow

# Init your workflow
wf = Workflow()

# Add algorithm
algo = wf.add_task(name="infer_qwen2_5_vl", auto_connect=True)

# Run on your image  
wf.run_on(url='https://github.com/Ikomia-dev/notebooks/blob/main/examples/img/img_people_workspace.jpg?raw=true')

# Iterate over outputs
for output in algo.get_outputs():
    # Print information
    print(output)
    # Export it to JSON and keep the result for later use
    output_json = output.to_json()
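The loop above can be extended to write each export to disk. The sketch below assumes that `to_json()` returns a JSON string and skips any output for which that is not the case. The helper `export_outputs` and the file naming scheme are hypothetical and are not part of the Ikomia API.

```python
from pathlib import Path


def export_outputs(outputs, out_dir):
    """Write each output's to_json() result to <out_dir>/output_<index>.json.

    Returns the list of files written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, output in enumerate(outputs):
        text = output.to_json()
        if not isinstance(text, str):
            # Assumption not met for this output: skip it rather than guess.
            continue
        path = out_dir / f"output_{index}.json"
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written
```

After running the workflow, a call such as `export_outputs(algo.get_outputs(), "outputs")` would produce one file per output that exports to a string.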

Developer

Ikomia

License

Apache License 2.0

A permissive license whose main conditions require preservation of copyright and license notices. Contributors provide an express grant of patent rights. Licensed works, modifications, and larger works may be distributed under different terms and without source code.

Permissions	Conditions	Limitations
Commercial use	License and copyright notice	Trademark use
Modification	State changes	Liability
Distribution		Warranty
Patent use
Private use

This is not legal advice: this description is for informational purposes only and does not constitute the license itself. Provided by choosealicense.com.