Retail Intelligence Platform

An enterprise-grade competitor intelligence solution for retail and ecommerce.
This platform scrapes data from major retailers (e.g., Walmart, Amazon, Target), processes it via ETL workflows, stores in PostgreSQL, and provides interactive analytics with Streamlit.

📸 Streamlit Dashboard Screenshots

Dashboard Home	Filter Side-bar	Cards And Products

Top Products	Insights	bestseller

🚀 Scrapy Features

Multi-site scraping with Scrapy + Playwright (Walmart, Amazon, Target).
ETL Pipeline → Extract (scraping), Transform (cleaning/normalization), Load (PostgreSQL).
Streamlit Dashboard → Visualize competitor trends, and export data.

🛠️ Tech Stack

Scraping: Scrapy + Playwright
Database: PostgreSQL
ETL: Python (data cleaning + normalization)
Dashboard: Streamlit (charts, tables, exports)

🚀 Stream-Lit Features

Database Integration
- Real-time connection with PostgreSQL
- Auto data cleaning and type conversion
Interactive Filters
- 🔍 Search products by title
- 📦 Filter by platform
- 💲 Price range slider
- 🏷️ Bestseller filter (Yes/No)
- ↕️ Sort by price or rating
Smart Product Cards
- Product image, title, and price
- ⭐ Star rating with review counts
- Platform and bestseller badges
- 🔗 Direct “Buy Now” product link
Pagination
- 8 products per page (2 rows × 4 columns)
- Sidebar page navigation
Top Products
- ⭐ Top 5 Rated Products
- 💬 Top 5 Most Reviewed Products
Data Export
- ⬇️ Download filtered results as CSV
📊 Visual Insights (Altair Charts)
- Platform share (bar chart)
- Price distribution (histogram)
- Bestseller breakdown (pie chart)
- Rating vs Price (scatter chart with tooltips)
- Average price per platform (bar chart)

⚡ Setup Instructions

1. Clone the Repository

git clone https://github.com/your-username/retail-intelligence-platform.git
cd retail-intelligence-platform

2. Create Virtual Environment & Install Dependencies

python -m venv venv
source venv/bin/activate   # On Windows: venv\Scripts\activate
pip install -r requirements.txt

3. Install Playwright

playwright install

4. Create .env File

retail_intelligence/
│
├── retail_intelligence/   # Scrapy project files
│   ├── settings.py
│   ├── spiders/
│   └── ...
├── .env                   # 🔑 Environment variables here
├── scrapy.cfg
├── requirements.txt
└── README.md

5. Inside .env File

SCRAPEOPS_API_KEY=your_scrapeops_api_key

6. Setup PostgreSQL Database

CREATE DATABASE retail_intelligence;

Configure connection in .env file:

DB_HOST=localhost
DB_PORT=5432
DB_NAME=retail_intelligence
DB_USER=your_username
DB_PASS=your_password

7. Run the Scrapy spider with custom arguments to fetch products from the given URLs and save their HTML content locally...

scrapy crawl snapshot_spider -a query="mobile" -a product_limit=150

8. After completing step 7, run the parser spider to parse the saved HTML files, clean the data, and store it in your PostgreSQL database. You can run it in two ways:

# Store results directly in PostgreSQL
scrapy crawl snapshot_parser_spider
# Or save output to a JSON file
scrapy crawl snapshot_parser_spider -o data.json

9. After Streamlit installed.

streamlit run dashboard/app.py

🔹 License

This project is licensed under the MIT License – completely free for both personal and commercial use.
See the LICENSE file for details.

🔹 Author

👨‍💻 Created & maintained by Shahzaib Ali
📬 For collaboration or freelance work: sa4715228@gmail.com

🤝 Contributions

Contributions, issues, and feature requests are welcome!
Feel free to open an issue or submit a PR.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Retail Intelligence Platform

📸 Streamlit Dashboard Screenshots

🚀 Scrapy Features

🛠️ Tech Stack

🚀 Stream-Lit Features

⚡ Setup Instructions

1. Clone the Repository

2. Create Virtual Environment & Install Dependencies

3. Install Playwright

4. Create .env File

5. Inside .env File

6. Setup PostgreSQL Database

7. Run the Scrapy spider with custom arguments to fetch products from the given URLs and save their HTML content locally...

8. After completing step 7, run the parser spider to parse the saved HTML files, clean the data, and store it in your PostgreSQL database. You can run it in two ways:

9. After Streamlit installed.

🔹 License

🔹 Author

🤝 Contributions

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
images		images
retail_intelligence		retail_intelligence
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
app.py		app.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

Retail Intelligence Platform

📸 Streamlit Dashboard Screenshots

🚀 Scrapy Features

🛠️ Tech Stack

🚀 Stream-Lit Features

⚡ Setup Instructions

1. Clone the Repository

2. Create Virtual Environment & Install Dependencies

3. Install Playwright

4. Create .env File

5. Inside .env File

6. Setup PostgreSQL Database

7. Run the Scrapy spider with custom arguments to fetch products from the given URLs and save their HTML content locally...

8. After completing step 7, run the parser spider to parse the saved HTML files, clean the data, and store it in your PostgreSQL database. You can run it in two ways:

9. After Streamlit installed.

🔹 License

🔹 Author

🤝 Contributions

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages