Context-Aware Document Chunking with Google Drive to Pinecone

This n8n workflow is designed for efficient processing and semantic search of large documents stored in Google Drive. It automates the extraction, segmentation, contextual analysis, and vectorization of document chunks, enabling rapid retrieval and analysis in applications such as AI-powered search or knowledge management systems. The workflow starts from a manual trigger and retrieves a Google Doc by its file ID. The document text is extracted, cleaned, and split into sections based on custom delimiters. For each section, the OpenRouter Chat Model generates a contextual summary that situates the chunk within the larger document. These enriched sections are then converted into embedding vectors with Google Gemini's API and stored in a Pinecone vector database. The result is highly relevant, context-aware search across extensive text data. Practical use cases include advanced document search systems, knowledge base indexing, and AI assistants that require deep contextual understanding of large texts.
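The splitting and context-enrichment steps can be pictured roughly as follows. This is a minimal sketch, not the workflow's actual Code node or prompt: the `[SECTIONEND]` delimiter, the `Section` shape, and the prompt wording are assumptions for illustration only.

```typescript
// Sketch of delimiter-based section splitting plus a context prompt per section.
// Delimiter, field names, and prompt text are hypothetical placeholders.

interface Section {
  index: number;
  text: string;
}

function splitIntoSections(documentText: string, delimiter = "[SECTIONEND]"): Section[] {
  return documentText
    .split(delimiter)
    .map((chunk) => chunk.replace(/\s+/g, " ").trim()) // basic cleaning: collapse whitespace
    .filter((chunk) => chunk.length > 0)               // drop empty sections
    .map((text, index) => ({ index, text }));
}

// Each section is then paired with an LLM-generated summary that situates it
// within the whole document before embedding, along these lines:
function buildContextPrompt(fullDocument: string, section: Section): string {
  return [
    "Here is the full document:",
    fullDocument,
    "Here is one section of it:",
    section.text,
    "Write a short context that situates this section within the overall document.",
  ].join("\n\n");
}

// Example usage
const doc = "Intro text...[SECTIONEND]Chapter on chunking...[SECTIONEND]Appendix...";
const sections = splitIntoSections(doc);
console.log(sections.length, "sections");
console.log(buildContextPrompt(doc, sections[0]).slice(0, 120));
```

In the workflow itself, the enriched text for each section is what gets embedded with Google Gemini and upserted into Pinecone, so retrieval matches on both the chunk and its document-level context.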
| Node Count | 11 – 20 Nodes |
| --- | --- |
| Nodes Used | @n8n/n8n-nodes-langchain.agent, @n8n/n8n-nodes-langchain.documentDefaultDataLoader, @n8n/n8n-nodes-langchain.embeddingsGoogleGemini, @n8n/n8n-nodes-langchain.lmChatOpenRouter, @n8n/n8n-nodes-langchain.textSplitterRecursiveCharacterTextSplitter, @n8n/n8n-nodes-langchain.vectorStorePinecone, code, extractFromFile, googleDrive, manualTrigger, set, splitInBatches, splitOut, stickyNote |