This n8n workflow generates a caption for an image with AI and overlays that caption on the image. It is aimed at content creators, marketers, and social media managers. The workflow starts by importing an image from a URL, for example from Pexels or another stock photo site. It then uses Google’s Gemini model to analyze the image and generate a descriptive caption that covers the context of the scene along with who, when, and where details. Next, the workflow resizes and prepares the image, runs custom code to work out where the caption should sit (usually at the bottom), and draws a background and the caption text onto the image with n8n’s image editing nodes. The result is an image with an embedded caption, ready for publication or reuse, which is useful for social media posts, documentation, or watermarking images.
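
The text-placement step can be illustrated with a minimal sketch. This is not the workflow's actual Code node: the function name `layout_caption`, the `CaptionLayout` fields, and the character-width and line-height ratios are all assumptions chosen for illustration. It wraps the caption to fit the image width and anchors a background band to the bottom edge.

```python
import textwrap
from dataclasses import dataclass
from typing import List


@dataclass
class CaptionLayout:
    lines: List[str]   # caption split into lines that fit the image width
    box_top: int       # y of the background band's top edge
    box_bottom: int    # y of the band's bottom edge (the image bottom)
    text_x: int        # x where each text line starts
    text_y: int        # baseline y of the first text line


def layout_caption(image_width: int, image_height: int, caption: str,
                   font_size: int = 24, padding: int = 20,
                   char_width_ratio: float = 0.55,
                   line_height_ratio: float = 1.3) -> CaptionLayout:
    """Estimate where to draw a caption band at the bottom of an image.

    Assumption: the average glyph is about char_width_ratio * font_size
    pixels wide. Real fonts vary, so this is only an approximation.
    """
    usable_width = image_width - 2 * padding
    chars_per_line = max(1, int(usable_width / (font_size * char_width_ratio)))

    # Wrap on word boundaries; keep one empty line for an empty caption.
    lines = textwrap.wrap(caption, width=chars_per_line) or [""]

    line_height = int(font_size * line_height_ratio)
    box_height = len(lines) * line_height + 2 * padding

    # Anchor the band to the bottom edge, clamped so it never starts above y=0.
    box_top = max(0, image_height - box_height)

    return CaptionLayout(
        lines=lines,
        box_top=box_top,
        box_bottom=image_height,
        text_x=padding,
        text_y=box_top + padding + font_size,
    )
```

Under these assumed defaults, a 1024 x 768 image with a 47-character caption fits on a single line, and the background band starts at y = 697. The band and each text line would then be drawn by separate image editing steps using these coordinates.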

Node Count	11 – 20 Nodes
Nodes Used	@n8n/n8n-nodes-langchain.chainLlm, @n8n/n8n-nodes-langchain.lmChatGoogleGemini, @n8n/n8n-nodes-langchain.outputParserStructured, code, editImage, httpRequest, manualTrigger, merge, stickyNote