Video Narration Generation with Frame Extraction and TTS


This workflow automates extracting frames from a video, generating a narration script with AI, and producing a voiceover audio clip. It starts by downloading a video from a URL, then uses OpenCV within a Python code node to extract up to 90 evenly spaced frames, keeping processing efficient regardless of video length. The frames are batched into groups of 15, resized, and sent to a vision-capable language model (GPT-4) in a loop that builds a cohesive narration script in the style of a David Attenborough documentary. The complete script is then converted into an audio file using OpenAI's text-to-speech, and the resulting voiceover MP3 is uploaded to Google Drive. The workflow suits content creators, educators, and marketing teams who want to produce narrated videos from existing footage automatically.
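For illustration, here is a minimal Python sketch of the frame-extraction step, assuming the video has already been downloaded to a local file. The file name, JPEG encoding, and function names are illustrative choices rather than details taken from the workflow itself; only the 90-frame cap, the even spacing, and the batch size of 15 follow the description above.

# Sketch of evenly spaced frame extraction with OpenCV, plus batching for the model.
# Assumptions: the video is available locally as "input.mp4"; frames are JPEG-encoded
# and base64-encoded before being grouped, as a vision model would typically expect.
import base64
import cv2

MAX_FRAMES = 90   # upper bound on extracted frames, per the workflow description
BATCH_SIZE = 15   # frames sent to the language model per loop iteration

def extract_even_frames(video_path, max_frames=MAX_FRAMES):
    """Return up to max_frames JPEG-encoded frames, evenly spaced across the video."""
    capture = cv2.VideoCapture(video_path)
    total = int(capture.get(cv2.CAP_PROP_FRAME_COUNT))
    if total <= 0:
        capture.release()
        return []
    # Even spacing so short and long videos are sampled consistently.
    step = max(total // max_frames, 1)
    indices = list(range(0, total, step))[:max_frames]
    frames = []
    for index in indices:
        capture.set(cv2.CAP_PROP_POS_FRAMES, index)
        ok, frame = capture.read()
        if not ok:
            continue
        ok, jpeg = cv2.imencode(".jpg", frame)
        if ok:
            frames.append(jpeg.tobytes())
    capture.release()
    return frames

if __name__ == "__main__":
    frames = extract_even_frames("input.mp4")  # hypothetical local file
    encoded = [base64.b64encode(f).decode("ascii") for f in frames]
    batches = [encoded[i:i + BATCH_SIZE] for i in range(0, len(encoded), BATCH_SIZE)]
    print(f"{len(frames)} frames extracted in {len(batches)} batches")

In the workflow itself, comparable logic runs inside the Python code node, while the 15-frame batching and looping are handled by the splitInBatches node rather than in code.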

Node Count

>20 Nodes

Nodes Used

@n8n/n8n-nodes-langchain.chainLlm, @n8n/n8n-nodes-langchain.lmChatOpenAi, @n8n/n8n-nodes-langchain.openAi, aggregate, code, convertToFile, editImage, googleDrive, httpRequest, manualTrigger, splitInBatches, splitOut, stickyNote, wait
