Image Analysis with Local Ollama Vision Models


This workflow enables detailed image analysis by processing images stored on Google Drive through locally hosted Ollama Vision Language Models. It automates the download of an image, runs multiple models (e.g., Granite3.2, Llama3.2, Gemma3) to generate comprehensive descriptions, and saves the results in Google Docs. The process involves setting prompts for exhaustive analysis, including object inventory, contextual insights, spatial relationships, and text extraction, all formatted in markdown for clarity.
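At its core, each model call is an HTTP request to the local Ollama server's `/api/generate` endpoint, with the image passed as base64 in the `images` field. A minimal sketch of building that request body (the model tags and prompt text here are illustrative, not copied from the workflow):

```python
import base64
import json

def build_vision_request(model: str, prompt: str, image_bytes: bytes) -> dict:
    """Build the JSON body the workflow's HTTP Request node would send to a
    local Ollama server (default endpoint: http://localhost:11434/api/generate)."""
    return {
        "model": model,  # e.g. "granite3.2-vision", "llama3.2-vision", "gemma3"
        "prompt": prompt,
        # Ollama vision models accept images as a list of base64 strings
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # return one complete response instead of a token stream
    }

prompt = (
    "Describe this image exhaustively: list every object, note spatial "
    "relationships, extract any visible text, and format the answer in markdown."
)
# In the workflow, image_bytes would come from the Google Drive download step.
body = build_vision_request("llama3.2-vision", prompt, b"\x89PNG...")
print(json.dumps(body)[:60])
```

The `stream: False` flag keeps the response as a single JSON object, which is simpler to pass downstream to the Google Docs node than a token stream.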

The workflow starts with a manual trigger for testing, then downloads the specified image file from Google Drive. A list of vision models is split out and processed in a loop: each model analyzes the image and generates a detailed textual description based on the provided prompts. The descriptions are structured and saved to Google Docs for easy collaboration and review. Sticky notes guide users through the process, highlighting key steps such as downloading the image and creating the model list.
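The loop stage can be sketched in plain Python as iterating the model list and collecting one description per model. This is an equivalent sketch, not the workflow itself: in n8n the iteration is handled by the Split Out and Loop Over Items (splitInBatches) nodes, and `post_to_ollama` here is a hypothetical stand-in for the HTTP Request node:

```python
def post_to_ollama(model: str, prompt: str, image_b64: str) -> str:
    # Placeholder: the real workflow POSTs to the local Ollama server and
    # reads the "response" field of the returned JSON.
    return f"[{model}] description of the image"

def analyze_with_all_models(models: list[str], prompt: str, image_b64: str) -> dict[str, str]:
    """Run each vision model over the same image and collect the descriptions,
    mirroring the workflow's one-model-per-iteration loop."""
    results = {}
    for model in models:  # one iteration per model, like splitInBatches
        results[model] = post_to_ollama(model, prompt, image_b64)
    return results

descriptions = analyze_with_all_models(
    ["granite3.2-vision", "llama3.2-vision", "gemma3"],
    "Describe this image in detail.",
    "<base64-image>",
)
```

Collecting the results keyed by model name makes it straightforward to write one labelled section per model into the final Google Doc.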

This automation is ideal for developers, data analysts, real estate professionals, or AI enthusiasts who need in-depth image understanding, structured data extraction, and documentation. Use cases include real estate image analysis, product photography reviews, visual research, and AI training datasets.

Node Count

11 – 20 Nodes

Nodes Used

extractFromFile, googleDocs, googleDrive, httpRequest, manualTrigger, set, splitInBatches, splitOut, stickyNote
