Automated Vision-Based Web Data Extraction with AI and Google Sheets

This n8n workflow extracts structured data from web pages by combining an AI model, a web scraping service, and Google Sheets. It is designed for e-commerce product data but can be adapted to other use cases. A manual trigger starts the workflow, which reads a list of URLs from Google Sheets. For each URL, it captures a full-page screenshot with ScrapingBee and passes the image to an AI model (Google Gemini 1.5 Pro) for vision-based extraction. If the model cannot extract all the required data from the image, the workflow falls back to retrieving and parsing the page's HTML. The extracted data (product titles, prices, brands, and promotional details) is structured as JSON and appended to a Google Sheet. The setup is useful for market research, price monitoring, or product data collection for e-commerce management.
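The vision-first, HTML-fallback logic described above can be sketched outside n8n in a few lines of Python. This is only an illustration of the control flow, not the workflow's actual implementation: the field names, the `extract_product` and `to_sheet_row` helpers, and the two extractor callables (stand-ins for the screenshot-plus-model step and the HTML parsing step) are all assumptions made for the example.

```python
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical field names, based on the data types the description lists.
REQUIRED_FIELDS = ("title", "price", "brand", "promotion")

# An extractor takes a URL and returns whatever fields it could find.
Extractor = Callable[[str], Dict[str, Optional[str]]]


def missing_fields(record: Dict[str, Optional[str]],
                   required: Sequence[str] = REQUIRED_FIELDS) -> List[str]:
    """Return the required fields that are absent or empty in the record."""
    return [name for name in required if not record.get(name)]


def extract_product(url: str,
                    from_screenshot: Extractor,
                    from_html: Extractor,
                    required: Sequence[str] = REQUIRED_FIELDS) -> Dict[str, Optional[str]]:
    """Try the vision-based extractor first; use HTML only to fill gaps."""
    record = dict(from_screenshot(url))
    gaps = missing_fields(record, required)
    if gaps:
        # Fallback step: only runs when the screenshot left fields empty.
        html_record = from_html(url)
        for name in gaps:
            if html_record.get(name):
                record[name] = html_record[name]
    record["url"] = url
    return record


def to_sheet_row(record: Dict[str, Optional[str]],
                 columns: Sequence[str] = ("url",) + REQUIRED_FIELDS) -> List[str]:
    """Flatten a record into one spreadsheet row, blank where data is missing."""
    return [record.get(name) or "" for name in columns]
```

In a real run, `from_screenshot` would wrap the ScrapingBee screenshot request and the model call, and `from_html` would wrap the HTML fetch and parse; passing them in as callables keeps the fallback logic testable with simple stubs. Values found in the screenshot are never overwritten by the HTML step, which matches the description of HTML as a fallback rather than a second source of truth.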

Node Count

>20 Nodes

Nodes Used

@n8n/n8n-nodes-langchain.agent, @n8n/n8n-nodes-langchain.lmChatGoogleGemini, @n8n/n8n-nodes-langchain.outputParserStructured, @n8n/n8n-nodes-langchain.toolWorkflow, executeWorkflowTrigger, googleSheets, httpRequest, manualTrigger, markdown, set, splitOut, stickyNote
