Data Processing & Analysis · Beginner
October 14, 2025 · 5 min read · Estimated build time: 30 minutes
Extract Amazon Product Data Using Scrape.do, GPT, and Google Sheets: A Step-by-Step Workflow
Automate Amazon product tracking with n8n: scrape prices, reviews, ratings, clean data with AI, save to Google Sheets, and get instant notifications.
By Kazi Sakib

Tracking product prices, reviews, and ratings across Amazon can feel like a full-time job. You want to monitor competitor products, analyze market trends, or keep tabs on inventory details, but manually copying data from product pages gets old fast.
This n8n workflow transforms that tedious process into a smooth, automated system. Feed it a list of Amazon product URLs, and it scrapes the data, cleans it with AI, saves everything to Google Sheets, and even sends you notifications via Telegram and WhatsApp. No copy-paste marathons. No spreadsheet headaches. Just clean product data delivered exactly where you need it.
What You Need Before Starting
Before diving into the workflow build, make sure you have these essentials ready:
API Credentials
- Scrape.do API token: This handles the heavy lifting of fetching Amazon HTML without getting blocked
- OpenAI API key: Powers the GPT model that cleans and structures your scraped data
- Google Sheets OAuth2: Connects n8n to your spreadsheet for reading URLs and writing results
- Telegram Bot token: Optional but useful for instant notifications
- WhatsApp Business API: Another notification channel if you prefer WhatsApp
Key Components
This workflow uses several n8n nodes working in harmony:
- Manual Trigger: Kicks off the workflow when you're ready
- Google Sheets: Reads product URLs and writes extracted data
- Split in Batches: Processes URLs one at a time to avoid overload
- HTTP Request: Communicates with Scrape.do API
- HTML Extractor: Pulls specific elements from the page HTML
- LangChain LLM: Leverages GPT to clean and standardize data
- Code Node: Handles JSON transformation
- Telegram & WhatsApp: Deliver instant notifications
Building Your Amazon Scraper: Step by Step
Step 1: Set Up Your Product URL Source
Start by creating a Google Sheet with a column named "Products" containing your Amazon product URLs. The workflow begins with a Manual Trigger node, which connects to a Google Sheets node configured to read from your spreadsheet.
The Google Sheets node pulls every URL you've listed, creating the foundation for the automation: each URL sits queued up, ready to be processed in sequence.
Step 2: Loop and Scrape Each Product
Here's where the magic starts. The Split in Batches node takes your URL list and processes them one by one. This prevents overwhelming the scraper and keeps everything running smoothly.
Each URL gets passed to an HTTP Request node configured with the Scrape.do API endpoint. The request includes your API token, the product URL, and parameters like geoCode set to "us" for US-based results. The timeout is set to 60 seconds because Amazon pages can be hefty.
The Scrape.do API handles the complexity of rendering JavaScript and avoiding blocks, returning clean HTML you can actually parse.
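To make the request concrete, here's a sketch of the URL the HTTP Request node assembles. It assumes Scrape.do's query-string endpoint (`https://api.scrape.do/`) with `token`, `url`, and `geoCode` parameters; `geoCode` comes from the article, and the other parameter names should be checked against Scrape.do's docs:

```javascript
// Build the Scrape.do request URL the HTTP Request node would call.
// URLSearchParams handles the encoding of the Amazon URL for us.
function buildScrapeUrl(token, productUrl) {
  const params = new URLSearchParams({
    token: token,       // your Scrape.do API token
    url: productUrl,    // the Amazon product page to fetch
    geoCode: "us",      // US-based results, as configured in the node
  });
  return `https://api.scrape.do/?${params.toString()}`;
}

// In n8n the node adds a 60-second timeout; with fetch() you would pass
// { signal: AbortSignal.timeout(60000) } to get the same behavior.
```

The key detail is that the Amazon URL must be percent-encoded inside the query string, which the HTTP Request node (and `URLSearchParams` here) does automatically.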
Step 3: Extract the Good Stuff
Raw HTML is messy. The HTML Extractor node uses CSS selectors to pinpoint exactly what you need from each product page:
- Product Title: Grabbed from #productTitle or h1 elements
- Price: Extracted from .a-price elements and their variants
- Rating: Pulled from star rating elements
- Review Count: Found in customer review sections
- Feature Bullets: The key selling points listed by sellers
- Product Description: The detailed description text
This node outputs a structured object, but the data still needs cleaning. Ratings might say "4.5 out of 5 stars" instead of just "4.5", and review counts could be "1,234 ratings" instead of a clean number.
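To illustrate what the HTML Extractor node is doing under the hood, here's a minimal sketch using regexes on a tiny made-up HTML fragment. The real node uses CSS selectors like `#productTitle` and `.a-price` against the full page; the fragment and regexes below are purely illustrative:

```javascript
// Toy HTML fragment mimicking Amazon's title and price markup.
const html = `
  <h1><span id="productTitle"> Example Widget, 2-Pack </span></h1>
  <span class="a-price"><span class="a-offscreen">$19.99</span></span>
`;

// Pull the text content of the elements the selectors would target.
const title = html.match(/id="productTitle"[^>]*>([^<]+)</)[1].trim();
const price = html.match(/class="a-offscreen"[^>]*>([^<]+)</)[1].trim();

console.log(title); // "Example Widget, 2-Pack"
console.log(price); // "$19.99"
```

In the actual workflow you never write this code: the HTML Extractor node takes the selector strings and returns the matched text as fields on the item.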
Step 4: Let AI Clean Your Data
This is where GPT-4o-mini enters the picture. The LangChain LLM node receives your extracted data and follows detailed instructions to standardize everything.
The AI prompt tells GPT exactly how to handle each field. Extract numeric values from ratings. Remove commas from review counts. Combine feature bullets or use product descriptions for a concise 150-character summary. Handle missing data gracefully with null values or "No description" defaults.
The model is configured with a temperature of 0 for consistent results, a 500 token limit to keep things efficient, and JSON output format to ensure clean, parseable responses.
A Code node then transforms the output, extracting the nested "output" object to make the data ready for your spreadsheet.
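For a sense of what the prompt asks GPT to do, here's the same cleanup written as plain JavaScript. This is an illustration of the transformation, not the workflow's actual code; the LLM performs it from instructions, and the Code node only unwraps the result:

```javascript
// Standardize the raw extracted fields the way the prompt instructs GPT to.
function cleanProduct(raw) {
  const ratingMatch = (raw.rating || "").match(/[\d.]+/);   // "4.5 out of 5 stars" -> 4.5
  const reviewMatch = (raw.reviews || "").match(/[\d,]+/);  // "1,234 ratings" -> 1234
  return {
    rating: ratingMatch ? parseFloat(ratingMatch[0]) : null,
    reviews: reviewMatch ? parseInt(reviewMatch[0].replace(/,/g, ""), 10) : null,
    description: (raw.description || "No description").slice(0, 150),
  };
}

// The Code node after the LLM then simply unwraps the model's JSON,
// roughly: return items.map(item => ({ json: item.json.output }));
```

Having the model emit strict JSON (temperature 0, JSON output format) is what makes the unwrapping step this trivial.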
Step 5: Save and Notify
The cleaned data flows into another Google Sheets node, this time configured to append rows. Each product gets written with columns for Name, Description, Rating, Reviews, Price, and image URLs.
But why stop there? The workflow branches to send notifications through both Telegram and WhatsApp. You get instant messages with product details and a direct link back to the Amazon page. Perfect for quick reviews or sharing with team members.
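The notification text itself is just string templating over the same fields written to the sheet. Here's a hypothetical formatter; the field names mirror the spreadsheet columns above and aren't an actual n8n API:

```javascript
// Build the message body sent through the Telegram and WhatsApp nodes.
function formatNotification(p) {
  return [
    `🛒 ${p.name}`,
    `⭐ ${p.rating} (${p.reviews} reviews)`,
    `💲 ${p.price}`,
    p.url, // direct link back to the Amazon page
  ].join("\n");
}
```

In n8n you'd express the same thing with expressions like `{{ $json.name }}` directly in the Telegram node's message field.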
After notifications are sent, the workflow loops back to process the next URL in your list. Rinse and repeat until every product is scraped, cleaned, saved, and announced.
Why This Workflow Changes the Game
The real power here isn't just automation. It's the combination of reliable scraping, intelligent data cleaning, and multi-channel distribution.
For E-commerce Teams: Monitor competitor pricing and product launches without manual checks. Set this to run daily and track market movements automatically.
For Market Researchers: Build datasets of product information across categories. Analyze rating trends, price points, and feature patterns at scale.
For Affiliate Marketers: Keep your product comparison pages updated with fresh data. Never show outdated prices or discontinued products again.
For Dropshippers: Track supplier products and pricing changes in real time. Get alerted when prices drop or inventory shifts.
The workflow handles all the messy parts: dealing with Amazon's complex HTML structure, cleaning inconsistent data formats, and organizing everything into a usable format. You just maintain a list of URLs and let n8n do the heavy lifting.
Making It Your Own
This workflow is a starting point, not a rigid template. Want to track additional fields like seller information or shipping details? Add more CSS selectors to the HTML Extractor node. Need to filter products by rating or price? Insert a Filter node after the AI cleaning step. Prefer Slack notifications instead of Telegram? Swap the nodes.
The modular nature of n8n means you can extend this workflow in countless directions. Add a scheduler to run it automatically every morning. Connect it to a database instead of Google Sheets. Build dashboards with the data. The infrastructure is here. The possibilities are yours.
Scraping Amazon product data doesn't have to be complicated. With the right tools working together, it becomes almost trivial. Feed in URLs, get structured data. That's the promise this workflow delivers on, and it does so without breaking a sweat.