Vision-Based Web Scraper with AI and Google Sheets

This n8n workflow extracts structured product data from web pages by combining ScrapingBee screenshots, a vision-capable AI agent, and Google Sheets. It is triggered manually with the ‘Test workflow’ button, but the manual trigger can be replaced with another trigger as needed. The workflow first reads a list of URLs from a Google Sheet, then takes a full-page screenshot of each URL with ScrapingBee. The core component is an AI agent (using Google Gemini 1.5 Pro) that analyzes each screenshot and extracts key product information such as titles, prices, brands, and promotional details.
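The screenshot step can be pictured as one HTTP request per URL read from the sheet. The sketch below only builds the request URLs and does not call the network. The endpoint and the `screenshot_full_page` parameter reflect ScrapingBee's public HTTP API as commonly documented and should be checked against the current documentation; the function names are hypothetical and are not taken from the workflow itself.

```python
from urllib.parse import urlencode

# Assumed ScrapingBee endpoint; verify against the current API documentation.
SCRAPINGBEE_ENDPOINT = "https://app.scrapingbee.com/api/v1/"


def build_screenshot_request(api_key: str, target_url: str) -> str:
    """Return a GET URL asking ScrapingBee for a full-page screenshot."""
    params = {
        "api_key": api_key,
        "url": target_url,
        # Assumed parameter name for capturing the whole page, not just the viewport.
        "screenshot_full_page": "true",
    }
    return f"{SCRAPINGBEE_ENDPOINT}?{urlencode(params)}"


def build_requests(api_key: str, urls: list[str]) -> list[str]:
    """Build one request per non-empty URL, mirroring the per-URL loop."""
    return [build_screenshot_request(api_key, u.strip()) for u in urls if u.strip()]
```

In the workflow itself this request is made by the httpRequest node; the sketch only shows which inputs that node needs (an API key and the target URL).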

If the agent cannot extract the data reliably from the screenshot, the workflow falls back to fetching the page's HTML and extracting from that instead. The extracted data is formatted as a structured JSON object, split into individual entries, and appended to a dedicated results sheet in Google Sheets. The workflow suits automated product data collection from e-commerce sites, for example for inventory tracking or market analysis, and removes the need to copy product details by hand.
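The fallback and split-out steps described above can be sketched as follows. This is a simplified illustration, not the workflow's actual logic: in n8n the agent decides when to call the HTML tool, whereas here the fallback is a plain conditional. The extractor callables, the column names, and the row layout are all assumptions made for the example.

```python
from typing import Callable

Product = dict[str, str]
Extractor = Callable[[str], list[Product]]

# Hypothetical column order for the results sheet.
COLUMNS = ("title", "price", "brand", "promotion")


def extract_products(url: str, from_screenshot: Extractor, from_html: Extractor) -> list[Product]:
    """Try the screenshot-based extractor first; fall back to HTML if it finds nothing."""
    products = from_screenshot(url)
    if not products:
        products = from_html(url)
    return products or []


def to_rows(url: str, products: list[Product]) -> list[list[str]]:
    """Split the product list into one sheet row per product, leaving missing fields empty."""
    return [[url] + [str(p.get(c, "")) for c in COLUMNS] for p in products]
```

Each row returned by `to_rows` corresponds to one entry appended to the results sheet; fields the extractor could not find are left blank rather than guessed.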

Node Count

>20 Nodes

Nodes Used

@n8n/n8n-nodes-langchain.agent, @n8n/n8n-nodes-langchain.lmChatGoogleGemini, @n8n/n8n-nodes-langchain.outputParserStructured, @n8n/n8n-nodes-langchain.toolWorkflow, executeWorkflowTrigger, googleSheets, httpRequest, manualTrigger, markdown, set, splitOut, stickyNote
