Automated Web Scraper & Content Storage with n8n


This n8n workflow automates scraping multiple web pages, extracting their content, and storing it in Google Drive, which makes it useful for content aggregation, research, and website monitoring. Starting from a manual trigger, the workflow fetches a list of URLs from a sitemap (or a predefined list), processes the URLs in batches, and filters them down to specific topics or pages. The scraping step (an HTTP Request node) then retrieves each page's content, which a custom JavaScript function parses to extract the title and the body as markdown. The extracted content is saved directly to Google Drive under a clear naming convention. This makes the workflow well suited to large-scale content collection, monitoring website updates, or building a curated content repository.
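
As a rough illustration only, the parsing step could be implemented in an n8n Code node along these lines; the field names (`data`, `title`, `markdown`, `url`) and the regex-based extraction are assumptions for the sketch, not the workflow's actual code:

```javascript
// Hypothetical n8n Code node ("Run Once for All Items" mode).
// Assumes the upstream HTTP Request node placed the raw HTML in item.json.data
// and the original page URL in item.json.url.
return $input.all().map((item) => {
  const html = item.json.data || '';

  // Extract the <title> tag; fall back to the URL if none is found.
  const titleMatch = html.match(/<title[^>]*>([^<]*)<\/title>/i);
  const title = titleMatch ? titleMatch[1].trim() : item.json.url;

  // Very naive HTML-to-markdown conversion for illustration:
  // strip scripts/styles, map headings and paragraphs to markdown, drop other tags.
  const markdown = html
    .replace(/<script[\s\S]*?<\/script>/gi, '')
    .replace(/<style[\s\S]*?<\/style>/gi, '')
    .replace(/<h1[^>]*>([\s\S]*?)<\/h1>/gi, '# $1\n\n')
    .replace(/<h2[^>]*>([\s\S]*?)<\/h2>/gi, '## $1\n\n')
    .replace(/<p[^>]*>([\s\S]*?)<\/p>/gi, '$1\n\n')
    .replace(/<[^>]+>/g, '')
    .replace(/\n{3,}/g, '\n\n')
    .trim();

  return { json: { title, markdown, url: item.json.url } };
});
```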

Node Count

11 – 20 Nodes

Nodes Used

code, filter, googleDrive, httpRequest, limit, manualTrigger, set, splitInBatches, splitOut, stickyNote, wait, xml
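
Before the googleDrive node uploads each item, a small naming step (a Code or Set node) can turn the extracted title into a safe file name. The sketch below is an assumption about what such a convention might look like (date prefix plus slugified title), not the workflow's documented naming scheme:

```javascript
// Hypothetical naming step: build a safe, dated file name from each extracted title
// before handing the item to the Google Drive node.
return $input.all().map((item) => {
  const safeTitle = (item.json.title || 'untitled')
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '-')   // collapse anything non-alphanumeric into dashes
    .replace(/^-+|-+$/g, '')       // trim leading/trailing dashes
    .slice(0, 80);                 // keep names reasonably short

  const date = new Date().toISOString().slice(0, 10); // YYYY-MM-DD
  return { json: { ...item.json, fileName: `${date}-${safeTitle}.md` } };
});
```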

Reviews

There are no reviews yet.
