This n8n workflow extracts structured data from bank statements stored in Google Drive, turning PDF documents into usable financial information. It handles both digitally generated PDFs and scanned pages, which makes it useful for accurate data retrieval when statements arrive in varying formats.

The workflow starts from a manual trigger, which can be replaced by a trigger from another system, and downloads a bank statement PDF from Google Drive. An external web service converts each PDF page into an image and returns the pages as a ZIP file, which the workflow unzips into individual images. These images are then resized to optimize them for optical recognition.
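The unzip and resize steps can be sketched outside n8n as follows. This is a minimal illustration, not the workflow's actual node configuration: the file names (`page-1.png` and so on), the ordering by page number, and the `max_side` limit are all assumptions made for the example, and only the target dimensions are computed rather than real pixel data.

```python
import io
import re
import zipfile


def page_sort_key(name: str) -> tuple[int, str]:
    """Sort by the last number in the file name so page-10 follows page-9.

    Assumes the conversion service numbers its output files (hypothetical).
    """
    numbers = re.findall(r"\d+", name)
    return (int(numbers[-1]) if numbers else 0, name)


def unzip_pages(zip_bytes: bytes) -> list[tuple[str, bytes]]:
    """Return (name, data) pairs for every file in the archive, in page order."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        # Skip directory entries; keep only real files.
        names = [n for n in archive.namelist() if not n.endswith("/")]
        names.sort(key=page_sort_key)
        return [(name, archive.read(name)) for name in names]


def fit_within(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Scale dimensions down proportionally so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, never upscale
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))
```

Sorting on the page number rather than the raw file name avoids the common problem of `page-10` sorting before `page-2`, which would otherwise scramble the statement's page order before transcription.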

Next, a vision-language model (such as Google Gemini) transcribes the images into markdown text. The transcription is meant to preserve the visible layout of each page, including tables and headings. A language model then processes the markdown to extract specific financial data, such as deposit transactions.
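To make the shape of the extraction step concrete, the sketch below pulls deposit rows out of a markdown table. Note that it swaps in a plain deterministic parser where the workflow uses a language model, so it only illustrates the kind of input and output involved. The column names `Date`, `Description`, and `Deposit` and the `Deposit` record type are hypothetical; real statements will differ.

```python
from dataclasses import dataclass


@dataclass
class Deposit:
    """Hypothetical output record for one deposit transaction."""
    date: str
    description: str
    amount: float


def parse_markdown_table(markdown: str) -> list[dict[str, str]]:
    """Turn pipe-delimited markdown table rows into dicts keyed by header cell."""
    rows: list[dict[str, str]] = []
    header = None
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            header = None  # a non-table line ends the current table
            continue
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        if header is None:
            header = cells  # first row of a table is its header
            continue
        if all(set(cell) <= set("-: ") for cell in cells):
            continue  # separator row such as |---|---|
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def extract_deposits(markdown: str) -> list[Deposit]:
    """Keep only rows whose (assumed) Deposit column holds a number."""
    deposits: list[Deposit] = []
    for row in parse_markdown_table(markdown):
        raw = row.get("Deposit", "").replace(",", "").replace("$", "")
        if not raw:
            continue  # withdrawals leave the deposit cell empty
        try:
            amount = float(raw)
        except ValueError:
            continue  # ignore cells that are not plain amounts
        deposits.append(
            Deposit(row.get("Date", ""), row.get("Description", ""), amount)
        )
    return deposits
```

A fixed parser like this breaks as soon as a bank labels its columns differently, which is one reason a language model is a reasonable fit for this step in the workflow.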

Sticky notes throughout the workflow explain each step, making it approachable for users who want to automate document processing and data extraction, especially for scanned PDFs or complex financial reports.

Node Count	11 – 20 Nodes
Nodes Used	@n8n/n8n-nodes-langchain.chainLlm, @n8n/n8n-nodes-langchain.informationExtractor, @n8n/n8n-nodes-langchain.lmChatGoogleGemini, aggregate, code, compression, editImage, googleDrive, httpRequest, manualTrigger, sort, stickyNote