Context-Aware Document Processing from Google Drive to Pinecone

This workflow extracts a document from Google Drive, splits it into sections, and embeds those sections into a vector database for search and retrieval. It starts from a manual trigger, retrieves the document, extracts its text, and segments the text into sections based on custom delimiters. An OpenRouter Chat Model then generates a short contextual summary for each section, situating that section within the overall document. The summary and the section text are combined, converted into vector embeddings with Google Gemini, and stored in a Pinecone vector database for similarity search. The setup is particularly useful for large documents, research papers, and knowledge bases, where a section retrieved in isolation is hard to interpret without its surrounding context.
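The split-and-contextualize step can be sketched outside n8n as follows. This is a minimal illustration, not the workflow's actual node code: the delimiter value, the function names, and the stub summarizer are all assumptions made for the example, and the stub stands in for the chat model call so the sketch runs without any API access.

```python
from typing import Callable, List

# Hypothetical delimiter; the workflow uses its own custom delimiter.
SECTION_DELIMITER = "-----"


def split_sections(text: str, delimiter: str = SECTION_DELIMITER) -> List[str]:
    """Split the extracted text on the delimiter and drop empty sections."""
    return [part.strip() for part in text.split(delimiter) if part.strip()]


def contextualize(
    document: str,
    sections: List[str],
    summarize: Callable[[str, str], str],
) -> List[str]:
    """Prepend a contextual summary to each section.

    `summarize` receives the whole document and one section, and returns a
    short description of where that section sits in the document. In the
    workflow this role is played by the chat model.
    """
    return [f"{summarize(document, section)}\n\n{section}" for section in sections]


def stub_summarize(document: str, section: str) -> str:
    """Placeholder for the chat model: describes the section by its first line."""
    first_line = section.splitlines()[0]
    return f"Context: section starting with '{first_line}'."
```

Calling `contextualize(doc, split_sections(doc), stub_summarize)` returns one string per section, each beginning with its context line. Those combined strings are what would be passed on to the embedding model and then stored in the vector database.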

Node Count

11 – 20 Nodes

Nodes Used

@n8n/n8n-nodes-langchain.agent, @n8n/n8n-nodes-langchain.documentDefaultDataLoader, @n8n/n8n-nodes-langchain.embeddingsGoogleGemini, @n8n/n8n-nodes-langchain.lmChatOpenRouter, @n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter, @n8n/n8n-nodes-langchain.vectorStorePinecone, code, extractFromFile, googleDrive, manualTrigger, set, splitInBatches, splitOut, stickyNote
