Automated PDF to HTML Conversion and URL Extraction Workflow


This n8n workflow takes an uploaded PDF, converts it to HTML, and extracts the URLs found in the converted content. A form trigger accepts the PDF upload and passes the file to PDF.co, which converts it to HTML and returns a URL for the result. An HTTP request node fetches the HTML from that URL, and a JavaScript code node scans it with a regular expression to collect every embedded URL. Sticky notes on the canvas act as visual guides to each step. The workflow suits tasks such as processing uploaded PDFs, collecting links for SEO analysis, or content scraping.
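The URL extraction step could look roughly like the sketch below. The workflow's actual regex is not shown here, so the pattern, the `extractUrls` helper name, and the sample HTML are all assumptions for illustration. The sketch matches `http` and `https` links, trims trailing sentence punctuation, and removes duplicates.

```javascript
// Hypothetical helper: the pattern below is an assumption, not the
// regex used by the original workflow.
function extractUrls(html) {
  // Match http/https URLs up to whitespace, quotes, angle brackets, or parentheses.
  const pattern = /https?:\/\/[^\s"'<>()]+/g;
  const matches = html.match(pattern) || [];

  // Drop punctuation that commonly trails a URL in running text.
  const cleaned = matches.map((url) => url.replace(/[.,;:!?]+$/, ""));

  // A Set keeps the first occurrence of each URL and preserves order.
  return [...new Set(cleaned)];
}

// Sample input (made up for this example).
const sampleHtml =
  '<p>See <a href="https://example.com/a">A</a> and https://example.com/b. ' +
  "Again: https://example.com/a</p>";

console.log(extractUrls(sampleHtml));
```

Inside an n8n Code node, the function body would typically read the HTML from the incoming item (for example via `$input.first().json`, with the field name depending on how the HTTP request node is configured) and return the result as an array of items. A regex of this kind does not decode HTML entities such as `&amp;`, so URLs with query strings may need an extra cleanup step.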

Node Count

6 – 10 Nodes

Nodes Used

code, formTrigger, httpRequest, n8n-nodes-pdfco.PDFco Api, stickyNote
