This n8n workflow scrapes web content from a list of URLs, converts it to Markdown, and saves it as documents in Google Drive. It is designed to streamline content collection for knowledge bases, competitive analysis, or website migration projects. The workflow starts from either a chat trigger or manual input, reads the URLs from a Google Sheets template, and scrapes each page in sequence using the Firecrawl API. For each valid URL, the extracted Markdown is saved as an individual Google Doc in a specified Drive folder. The workflow writes a status marker back to the sheet for each URL, which tracks progress and prevents duplicate scraping. On completion, it returns a webhook response containing a link to the folder of scraped content as an end-of-process notification.
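The control flow described above can be sketched in plain Python. This is only an illustration of the loop logic, not the workflow's actual node configuration: the `Row` class, the status strings `"Scraped"` and `"Invalid URL"`, and the `scrape` and `save_doc` callables are all hypothetical stand-ins for the Google Sheets rows, the Firecrawl node, and the Google Drive node.

```python
from dataclasses import dataclass
from typing import Callable, Iterable
from urllib.parse import urlparse


@dataclass
class Row:
    """One sheet row (hypothetical shape): a URL plus a status marker."""
    url: str
    status: str = ""  # assumed marker values: "", "Scraped", "Invalid URL"


def is_valid_url(url: str) -> bool:
    """Accept only absolute http(s) URLs that include a host."""
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def run_workflow(
    rows: Iterable[Row],
    scrape: Callable[[str], str],          # stand-in for the Firecrawl scrape step
    save_doc: Callable[[str, str], None],  # stand-in for creating a Google Doc
    folder_link: str,
) -> str:
    """Process rows one at a time and return the folder link when done."""
    for row in rows:
        if row.status == "Scraped":
            continue  # already handled in a previous run: no duplicate scraping
        if not is_valid_url(row.url):
            row.status = "Invalid URL"
            continue
        markdown = scrape(row.url)
        save_doc(row.url, markdown)
        row.status = "Scraped"  # write the status marker back to the row
    return folder_link  # plays the role of the final webhook response
```

Passing `scrape` and `save_doc` in as callables keeps the sketch runnable without network access or credentials; in the real workflow these steps are handled by the Firecrawl and Google Drive nodes.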
Automated Web Content Scraping to Google Drive
Node Count | 6 – 10 Nodes |
---|---|
Nodes Used | @mendable/n8n-nodes-firecrawl.firecrawl, @n8n/n8n-nodes-langchain.chatTrigger, filter, googleDrive, googleSheets, if, respondToWebhook, splitInBatches, stickyNote |