pip install ikomia
infer_florence_2_segmentation
About
Run Florence-2 segmentation with or without a text prompt
Florence-2 is an advanced vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. With this algorithm, you can use Florence-2 for instance segmentation.
🚀 Use with Ikomia API
1. Install Ikomia API
We strongly recommend using a virtual environment. If you're not sure where to start, we offer a tutorial here.
2. Create your workflow
from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display
# Init your workflow
wf = Workflow()
# Add algorithm
algo = wf.add_task(name="infer_florence_2_segmentation", auto_connect=True)
# Run on your image
wf.run_on(url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true")
# Display results
display(algo.get_image_with_mask_and_graphics())
☀️ Use with Ikomia Studio
Ikomia Studio offers a friendly UI with the same features as the API.
- If you haven't started using Ikomia Studio yet, download and install it from this page.
- For additional guidance on getting started with Ikomia Studio, check out this blog post.
📝 Set algorithm parameters
- model_name (str) - default 'microsoft/Florence-2-base': Name of the Florence-2 pre-trained model. Other available models:
    - microsoft/Florence-2-large
    - microsoft/Florence-2-base-ft
    - microsoft/Florence-2-large-ft
- task_prompt (str) - default 'REFERRING_EXPRESSION_SEGMENTATION': Type of segmentation task. Available tasks:
    - REFERRING_EXPRESSION_SEGMENTATION
    - REGION_TO_SEGMENTATION: the expected format is '<loc_x1><loc_y1><loc_x2><loc_y2>', where [x1, y1, x2, y2] are the quantized coordinates in [0, 999].
- prompt (str): Text input to guide the segmentation task.
- num_beams (int) - default '3': Setting the number of beams higher than 1 switches from greedy search to beam search. This strategy evaluates several hypotheses at each time step and eventually chooses the hypothesis with the highest overall probability for the entire sequence. It can identify high-probability sequences that start with lower-probability initial tokens and would have been ignored by greedy search.
- do_sample (bool) - default 'False': If set to True, this parameter enables decoding strategies such as multinomial sampling, beam-search multinomial sampling, Top-K sampling and Top-p sampling. All these strategies select the next token from the probability distribution over the entire vocabulary with various strategy-specific adjustments.
- early_stopping (bool) - default 'False': Controls the stopping condition for beam-based methods such as beam search. It accepts the following values: True, where generation stops as soon as there are num_beams complete candidates; False, where a heuristic is applied and generation stops when it is very unlikely to find better candidates; "never", where the beam search procedure only stops when there cannot be better candidates (canonical beam search algorithm).
- cuda (bool): If True, run inference on the GPU with CUDA. If False, run on the CPU.
Parameters must be passed as strings when added to the dictionary.
from ikomia.dataprocess.workflow import Workflow
from ikomia.utils.displayIO import display
# Init your workflow
wf = Workflow()
# Add algorithm
algo = wf.add_task(name="infer_florence_2_segmentation", auto_connect=True)
algo.set_parameters({
"model_name":"microsoft/Florence-2-large",
"task_prompt":"REFERRING_EXPRESSION_SEGMENTATION",
"prompt":"a green car",
"max_new_tokens":"1024",
"num_beams":"3",
"do_sample":"False",
"early_stopping":"False",
"cuda":"True"
})
# Run on your image
wf.run_on(url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true")
# Display results
display(algo.get_image_with_mask_and_graphics())
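For the REGION_TO_SEGMENTATION task, the parameter list describes a '<loc_x1><loc_y1><loc_x2><loc_y2>' string with coordinates quantized to [0, 999]. The sketch below shows one way to build such a string from a pixel bounding box. The helper name `region_prompt`, its `bins` argument, and the floor-based rounding are assumptions made for illustration; they are not part of the Ikomia API, and the model's own processor may round slightly differently.

```python
def region_prompt(box, image_width, image_height, bins=1000):
    """Build a '<loc_x1><loc_y1><loc_x2><loc_y2>' string from a pixel box.

    Hypothetical helper: assumes each coordinate is scaled by the image
    size into `bins` equal-width bins and clamped to [0, bins - 1].
    """
    x1, y1, x2, y2 = box

    def quantize(value, size):
        # Scale the pixel coordinate to a bin index, then clamp it
        q = int(value * bins / size)
        return max(0, min(bins - 1, q))

    coords = (
        quantize(x1, image_width),
        quantize(y1, image_height),
        quantize(x2, image_width),
        quantize(y2, image_height),
    )
    return "".join(f"<loc_{c}>" for c in coords)


# Box covering the central region of a 640x480 image
print(region_prompt((320, 120, 480, 360), 640, 480))
```

The returned string could then be passed as the "prompt" parameter with "task_prompt" set to "REGION_TO_SEGMENTATION"; that pairing is inferred from the parameter list above and not confirmed by it.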
🔍 Explore algorithm outputs
Every algorithm produces specific outputs, yet they can all be explored the same way using the Ikomia API. For a more in-depth understanding of managing algorithm outputs, please refer to the documentation.
from ikomia.dataprocess.workflow import Workflow
# Init your workflow
wf = Workflow()
# Add algorithm
algo = wf.add_task(name="infer_florence_2_segmentation", auto_connect=True)
# Run on your image
wf.run_on(url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg?download=true")
# Iterate over outputs
for output in algo.get_outputs():
    # Print information
    print(output)
    # Export it to JSON
    output.to_json()
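The loop above discards the result of `to_json()`. As a minimal sketch, assuming `to_json()` returns a JSON string, the exports could be written to disk as follows. The helper name `save_outputs_as_json` and the `output_<i>.json` file naming are hypothetical and not part of the Ikomia API.

```python
import json
from pathlib import Path


def save_outputs_as_json(outputs, directory):
    """Write each output's JSON export to '<directory>/output_<i>.json'.

    Hypothetical helper: assumes every item exposes a to_json() method
    that returns a JSON string.
    """
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, output in enumerate(outputs):
        text = output.to_json()
        # Fail early if the export is not valid JSON
        json.loads(text)
        path = directory / f"output_{i}.json"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths
```

With the workflow above, this might be called as `save_outputs_as_json(algo.get_outputs(), "results")`.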
Developer
Ikomia
License
MIT License
A short and simple permissive license with conditions only requiring preservation of copyright and license notices. Licensed works, modifications, and larger works may be distributed under different terms and without source code.
Permissions | Conditions | Limitations |
---|---|---|
Commercial use | License and copyright notice | Liability |
Modification | | Warranty |
Distribution | | |
Private use | | |
This is not legal advice: this description is for informational purposes only and does not constitute the license itself. Provided by choosealicense.com.