This n8n workflow automates video narration: it extracts frames from a video file, uses a multimodal AI model to write a script from them, and turns that script into a voiceover with text-to-speech. It suits content creators and marketers who want to automate narration or add voiceovers to multimedia presentations.

The workflow begins with manually triggering the process, followed by downloading a video from a specified URL. The ‘Capture Frames’ Python node then extracts evenly distributed frames from the video to ensure a representative sampling. These frames are split into batches, resized for consistency, and then processed by a multimodal language model to generate a cohesive narration script in the style of a specified narrator (e.g., David Attenborough).
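The sampling and batching steps can be sketched in plain Python. This is only an illustration under assumptions: the function names, the choice to sample the midpoint of each equal-width segment, and the batch size are hypothetical and not taken from the workflow's 'Capture Frames' node, which would also need a video library to decode the frames themselves.

```python
from typing import Iterator, List, Sequence


def evenly_spaced_indices(total_frames: int, max_frames: int) -> List[int]:
    """Return up to max_frames frame indices spread evenly across the video."""
    if total_frames <= 0 or max_frames <= 0:
        return []
    count = min(total_frames, max_frames)
    step = total_frames / count
    # Assumption: take the middle frame of each equal-width segment.
    return [int(step * i + step / 2) for i in range(count)]


def in_batches(items: Sequence[int], size: int) -> Iterator[List[int]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


if __name__ == "__main__":
    indices = evenly_spaced_indices(total_frames=100, max_frames=4)
    for batch in in_batches(indices, size=3):
        print(batch)
```

Sampling by segment rather than taking the first N frames keeps the selection representative of the whole clip, and batching limits how many images are sent to the language model in a single request.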

The full script is then sent to OpenAI’s text-to-speech API to generate the audio narration. The final step uploads the generated voiceover to Google Drive, where it can be retrieved and added to the video. Sticky notes throughout the workflow explain each stage and give performance tips, such as when and why to reduce resource usage.
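Outside n8n, the text-to-speech call could look roughly like the sketch below. It only builds the HTTP request and does not send it. The helper name `build_tts_request`, the `tts-1` model, and the `onyx` voice are illustrative assumptions; the source does not say which model or voice the workflow uses.

```python
import json
import urllib.request

TTS_URL = "https://api.openai.com/v1/audio/speech"


def build_tts_request(script: str, api_key: str,
                      voice: str = "onyx",
                      model: str = "tts-1") -> urllib.request.Request:
    """Build (but do not send) a POST request for OpenAI's speech endpoint."""
    # Assumption: model and voice are placeholders, not the workflow's settings.
    payload = {"model": model, "voice": voice, "input": script}
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    request = build_tts_request("A lone heron waits at the water's edge.", "sk-example")
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` with a valid API key would return the audio as binary data (MP3 by default), which is the file the workflow then uploads to Google Drive.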

This automation takes the manual work out of narrating videos: frame sampling, script writing, and voiceover generation run as a single pipeline, saving time and effort.

Node Count	>20 Nodes
Nodes Used	@n8n/n8n-nodes-langchain.chainLlm, @n8n/n8n-nodes-langchain.lmChatOpenAi, @n8n/n8n-nodes-langchain.openAi, aggregate, code, convertToFile, editImage, googleDrive, httpRequest, manualTrigger, splitInBatches, splitOut, stickyNote, wait