Amazon Product Scraper & Analytics Dashboard

A Python web scraping application that collects product data from Amazon India and displays it in a beautiful vintage-styled analytics dashboard.

Features

Multi-Category Scraping: Scrapes 5 product categories (Laptops, Headphones, Keyboards, Gaming Mice, Monitors)
Concurrent Scraping: Uses ThreadPoolExecutor for fast parallel scraping
SQLite Database: Stores product data with name, price, rating, image URL, and timestamp
Beautiful Dashboard: Flask web app with vintage notebook aesthetic
Rich Analytics:
- 4 stat cards (total products, avg/min/max prices)
- Bar chart (average price by category)
- Pie chart (product distribution)
- Histogram (price distribution)
- Data table with all products

Tech Stack

Python 3.x
BeautifulSoup4 - HTML parsing
Requests - HTTP client
Flask - Web framework
Pandas - Data manipulation
Matplotlib & Seaborn - Data visualization
SQLite3 - Database

Installation

Clone the repository:

git clone <your-repo-url>
cd web_scraper

Create and activate virtual environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt

Usage

Quick Start (Single Command)

./run.sh

This script will:

Run the scraper to collect ~110 products from Amazon
Start the Flask dashboard at http://127.0.0.1:5000/

Manual Run

Run scraper only:

python main.py

Run dashboard only:

python web/app.py

Then open http://127.0.0.1:5000/ in your browser.

Project Structure

web_scraper/
├── main.py                 # Main scraping orchestration
├── requirements.txt        # Python dependencies
├── run.sh                 # Convenience script to run everything
├── scrapers/
│   ├── base_scraper.py    # Abstract base scraper class
│   └── specific_scraper.py # Amazon product scraper implementation
├── utils/
│   └── database_manager.py # SQLite database context manager
└── web/
    ├── app.py             # Flask application
    ├── static/
    │   └── style.css      # Additional styles (if needed)
    └── templates/
        └── index.html     # Dashboard HTML template

Data Collected

For each product:

Product name
Price (₹)
Rating
Image URL
Source URL
Scraped timestamp

Dashboard Features

Vintage Aesthetic: Beige/brown color palette with notebook-style design
Responsive Design: Works on desktop, tablet, and mobile
Real-time Stats: Live calculations from database
Visual Analytics: Multiple chart types for data insights
Clean Tables: Top 50 products displayed with full details

Notes

Scraper uses realistic browser headers to avoid detection
Concurrent scraping with 5 workers for optimal performance
Data persists in data.db SQLite database
Charts use vintage color palette matching the dashboard theme

License

MIT License - Feel free to use and modify!

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Uh oh!

Repository files navigation

Amazon Product Scraper & Analytics Dashboard

Features

Tech Stack

Installation

Usage

Quick Start (Single Command)

Manual Run

Project Structure

Data Collected

Dashboard Features

Notes

License

About

Uh oh!

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
scrapers		scrapers
utils		utils
web		web
.gitignore		.gitignore
README.md		README.md
main.py		main.py
requirements.txt		requirements.txt
run.sh		run.sh

Uh oh!

Uh oh!

mohar-xe/Amazon-product-scraper

Folders and files

Latest commit

History

Repository files navigation

Amazon Product Scraper & Analytics Dashboard

Features

Tech Stack

Installation

Usage

Quick Start (Single Command)

Manual Run

Project Structure

Data Collected

Dashboard Features

Notes

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages