This n8n workflow automates the process of extracting textual data from PDF documents and images stored in Google Drive, converting it into structured CSV files, and saving the results back to Google Drive. The primary goal is to streamline data extraction workflows, especially useful for processing bank statements, invoices, or receipts without manual data entry.

The workflow begins with a trigger that monitors a specified Google Drive folder for new files. When a file is added, it uses a switch node to route the process based on the MIME type — handling PDFs and images differently.

For PDF files, it downloads the file from Google Drive, extracts text data, and sends the content to an AI language model (Vertex AI or Gemini) for advanced data parsing. For images, it downloads the image, then sends it to Vertex AI for optical character recognition (OCR) to extract the textual content.

Once the text has been extracted from either source, the workflow makes an HTTP POST request to an OpenRouter API, instructing the AI to read the transactions, categorize them, and output a CSV format starting with a header row. The CSV data is then converted into a file format suitable for upload.

Finally, the generated CSV files are uploaded back to a designated Google Drive folder for easy access and further processing. Additionally, sticky notes provide instructions on setting up Google Drive folders, Google Cloud permissions, and OpenRouter API configuration.

This automation is ideal for finance teams, administrative assistants, or any organization that needs to regularly process large volumes of financial documents, receipts, or invoices efficiently and accurately—eliminating manual data entry and reducing errors.

Node Count	11 – 20 Nodes
Nodes Used	@n8n/n8n-nodes-langchain.chainLlm, @n8n/n8n-nodes-langchain.lmChatGoogleGemini, convertToFile, extractFromFile, googleDrive, googleDriveTrigger, httpRequest, stickyNote, switch