An enterprise-grade competitor intelligence platform for retail and e-commerce.
It scrapes product data from major retailers (e.g., Walmart, Amazon, Target), processes it through ETL workflows, stores it in PostgreSQL, and serves interactive analytics via a Streamlit dashboard.
| Dashboard Home | Filter Sidebar | Cards and Products |
|---|---|---|
| ![]() | ![]() | ![]() |

| Top Products | Insights | Bestseller |
|---|---|---|
| ![]() | ![]() | ![]() |
### Key Features
- Multi-site scraping with Scrapy + Playwright (Walmart, Amazon, Target)
- ETL pipeline → Extract (scraping), Transform (cleaning/normalization), Load (PostgreSQL)
- Streamlit dashboard → visualize competitor trends and export data

### Tech Stack
- Scraping: Scrapy + Playwright
- Database: PostgreSQL
- ETL: Python (data cleaning + normalization)
- Dashboard: Streamlit (charts, tables, exports)
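The Transform stage of the pipeline above can be sketched as a small pandas function. This is a minimal illustration, not the project's actual code: the column names (`price`, `rating`, `platform`, `title`) and the raw string formats are assumptions about what the scrapers emit.

```python
import pandas as pd

def transform(raw: pd.DataFrame) -> pd.DataFrame:
    """Clean and normalize scraped product rows (column names are hypothetical)."""
    df = raw.copy()
    # Strip currency symbols and thousands separators, coerce price to float
    df["price"] = (
        df["price"].astype(str)
        .str.replace(r"[$,]", "", regex=True)
        .pipe(pd.to_numeric, errors="coerce")
    )
    # Some sites render ratings as strings like "4.5 out of 5 stars"
    df["rating"] = (
        df["rating"].astype(str)
        .str.extract(r"([\d.]+)")[0]
        .pipe(pd.to_numeric, errors="coerce")
    )
    # Normalize platform labels and drop rows missing essentials
    df["platform"] = df["platform"].str.strip().str.title()
    return df.dropna(subset=["title", "price"])
```

Keeping the transform as a pure DataFrame-in/DataFrame-out function makes it easy to unit-test independently of Scrapy and PostgreSQL.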
### Database Integration
- Real-time connection with PostgreSQL
- Automatic data cleaning and type conversion
### Interactive Filters
- 🔍 Search products by title
- 📦 Filter by platform
- 💲 Price range slider
- 🏷️ Bestseller filter (Yes/No)
- ↕️ Sort by price or rating
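Behind sidebar widgets like these, the filtering typically reduces to plain pandas operations. The sketch below shows one possible shape, assuming columns named `title`, `platform`, `price`, and `bestseller`; the Streamlit widgets themselves would simply supply the arguments.

```python
import pandas as pd

def apply_filters(df, search="", platform=None, price_range=None,
                  bestseller=None, sort_by=None, ascending=True):
    """Filter/sort logic behind the sidebar widgets (column names assumed)."""
    out = df
    if search:  # case-insensitive substring match on the title
        out = out[out["title"].str.contains(search, case=False, na=False)]
    if platform:
        out = out[out["platform"] == platform]
    if price_range:  # (low, high) tuple from the price slider
        lo, hi = price_range
        out = out[out["price"].between(lo, hi)]
    if bestseller is not None:  # Yes/No filter
        out = out[out["bestseller"] == bestseller]
    if sort_by:  # e.g. "price" or "rating"
        out = out.sort_values(sort_by, ascending=ascending)
    return out
```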
### Smart Product Cards
- Product image, title, and price
- ⭐ Star rating with review counts
- Platform and bestseller badges
- 🔗 Direct “Buy Now” product link
### Pagination
- 8 products per page (2 rows × 4 columns)
- Sidebar page navigation
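The 8-per-page slicing can be expressed as a tiny helper; this is an illustrative sketch, not the project's actual implementation.

```python
def paginate(items, page, per_page=8):
    """Return the slice of items for a 1-indexed page, plus the total page count."""
    total_pages = max(1, -(-len(items) // per_page))  # ceiling division
    start = (page - 1) * per_page
    return items[start:start + per_page], total_pages
```

A sidebar `st.number_input` (or similar widget) bounded by `total_pages` would then drive `page`.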
### Top Products
- ⭐ Top 5 Rated Products
- 💬 Top 5 Most Reviewed Products
### Data Export
- ⬇️ Download filtered results as CSV
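A CSV export in Streamlit usually wraps `DataFrame.to_csv` and hands the bytes to `st.download_button`. A minimal sketch (the widget call is shown as a comment since it needs a running Streamlit app):

```python
import pandas as pd

def to_csv_bytes(df: pd.DataFrame) -> bytes:
    """Encode a filtered DataFrame as UTF-8 CSV bytes for st.download_button."""
    return df.to_csv(index=False).encode("utf-8")

# In the dashboard, something like:
# st.download_button("⬇️ Download CSV", to_csv_bytes(filtered_df),
#                    file_name="products.csv", mime="text/csv")
```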
### 📊 Visual Insights (Altair Charts)
- Platform share (bar chart)
- Price distribution (histogram)
- Bestseller breakdown (pie chart)
- Rating vs Price (scatter chart with tooltips)
- Average price per platform (bar chart)
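Charts like the platform-share bar and the average-price-per-platform bar are driven by a simple groupby aggregation, which Altair then renders. The aggregation might look like this (column names `platform`, `title`, `price` are assumptions):

```python
import pandas as pd

def platform_insights(df: pd.DataFrame) -> pd.DataFrame:
    """Per-platform product count and average price, as fed to the bar charts."""
    return (
        df.groupby("platform", as_index=False)
          .agg(products=("title", "count"), avg_price=("price", "mean"))
    )
```

Feeding a pre-aggregated frame to Altair (rather than aggregating inside the chart spec) keeps the numbers reusable for tables and exports as well.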
```bash
git clone https://github.com/your-username/retail-intelligence-platform.git
cd retail-intelligence-platform

python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate

pip install -r requirements.txt
playwright install
```

Project structure:

```
retail_intelligence/
│
├── retail_intelligence/      # Scrapy project files
│   ├── settings.py
│   ├── spiders/
│   └── ...
├── .env                      # 🔑 Environment variables here
├── scrapy.cfg
├── requirements.txt
└── README.md
```

Add your ScrapeOps key to `.env`:

```
SCRAPEOPS_API_KEY=your_scrapeops_api_key
```

Create the database:

```sql
CREATE DATABASE retail_intelligence;
```

Configure the connection in the `.env` file:
```
DB_HOST=localhost
DB_PORT=5432
DB_NAME=retail_intelligence
DB_USER=your_username
DB_PASS=your_password
```
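One way the application might consume these variables is via `os.getenv` with sensible defaults (loading `.env` into the environment, e.g. with python-dotenv, is an assumption here, not something the source confirms):

```python
import os

def db_settings() -> dict:
    """Read PostgreSQL connection settings from environment variables."""
    return {
        "host": os.getenv("DB_HOST", "localhost"),
        "port": int(os.getenv("DB_PORT", "5432")),
        "dbname": os.getenv("DB_NAME", "retail_intelligence"),
        "user": os.getenv("DB_USER", ""),
        "password": os.getenv("DB_PASS", ""),
    }
```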
7. Run the Scrapy spider with custom arguments to fetch products for the given query and save their HTML content locally:

```bash
scrapy crawl snapshot_spider -a query="mobile" -a product_limit=150
```

8. After completing step 7, run the parser spider to parse the saved HTML files, clean the data, and store it in your PostgreSQL database. You can run it in two ways:

```bash
# Store results directly in PostgreSQL
scrapy crawl snapshot_parser_spider

# Or save the output to a JSON file
scrapy crawl snapshot_parser_spider -o data.json
```

Finally, launch the dashboard:

```bash
streamlit run dashboard/app.py
```

This project is licensed under the MIT License – completely free for both personal and commercial use.
See the LICENSE file for details.
👨‍💻 Created & maintained by Shahzaib Ali
📬 For collaboration or freelance work: sa4715228@gmail.com
Contributions, issues, and feature requests are welcome!
Feel free to open an issue or submit a PR.





