Image Object Detection and Bounding Box Annotation Workflow


This n8n workflow demonstrates prompt-based object detection with Google’s Gemini 2.0 API, then annotates the detected objects with bounding boxes drawn directly on the image. The workflow begins with a manual trigger, so users can test the process with any image.

The first step retrieves the image via an HTTP request and extracts its width and height using the Get Image Info node. Any suitable image URL works, ideally one with clear subjects for accurate detection.
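The Get Image Info step essentially reads the dimensions out of the image file's header. As an illustration only (not the node's actual implementation), this is roughly what that lookup amounts to for a PNG, whose IHDR chunk stores width and height as big-endian 32-bit integers at byte offsets 16 and 20:

```javascript
// Illustrative sketch: read width/height from a PNG buffer's IHDR chunk.
// The n8n Get Image Info node handles this (and other formats) for you.
function pngDimensions(buf) {
  return {
    width: buf.readUInt32BE(16),  // IHDR width field
    height: buf.readUInt32BE(20), // IHDR height field
  };
}
```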

Next, the image data is sent to the Gemini 2.0 API through an HTTP POST request, with a prompt asking the model to return bounding boxes for specific objects, such as rabbits. The API responds with coordinates normalized to a 0–1000 range.
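As a sketch of what the HTTP Request node might send, the helper below builds a `generateContent`-style JSON body. The prompt wording, field names, and the `buildDetectionRequest` helper itself are assumptions for illustration; verify the exact request shape against the current Gemini API documentation:

```javascript
// Hypothetical helper: builds a Gemini generateContent request body.
// `imageBase64` is the base64-encoded image; `label` names the object to find.
function buildDetectionRequest(imageBase64, label) {
  return {
    contents: [{
      parts: [
        { text: `Detect all ${label} in the image. Return a JSON list of ` +
                `bounding boxes as [ymin, xmin, ymax, xmax], normalized to 0-1000.` },
        { inline_data: { mime_type: "image/jpeg", data: imageBase64 } },
      ],
    }],
  };
}
```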

The workflow then scales these normalized coordinates back to the original image dimensions using a JavaScript Code node, ensuring the annotations align accurately. Subsequently, these bounding box coordinates are used by the Draw Bounding Boxes node to overlay rectangles onto the original image, highlighting the detected objects.
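The scaling step in the Code node amounts to a simple proportion. A minimal sketch, assuming Gemini's usual `[ymin, xmin, ymax, xmax]` box order (check this against the actual response):

```javascript
// Scale one normalized box (0-1000 range) back to pixel coordinates.
// The [ymin, xmin, ymax, xmax] ordering is an assumption about the response.
function scaleBox(box, width, height) {
  const [ymin, xmin, ymax, xmax] = box;
  return {
    x: Math.round((xmin / 1000) * width),
    y: Math.round((ymin / 1000) * height),
    w: Math.round(((xmax - xmin) / 1000) * width),
    h: Math.round(((ymax - ymin) / 1000) * height),
  };
}
```

The resulting `x`/`y`/`w`/`h` values map directly onto what a draw-rectangle operation expects.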

Finally, a Sticky Note node provides an explanatory note with visual aids to summarize the process. This setup is ideal for scenarios requiring automated image annotation, such as content moderation, object tracking, or visual data analysis, especially when combined with prompt-based AI detection.

Node Count

11 – 20 Nodes

Nodes Used

code, editImage, httpRequest, manualTrigger, set, stickyNote
