Rottentomatoes Reviews Scraper

Rottentomatoes Reviews Scraper collects structured critic and audience review data from Rotten Tomatoes pages, so you can analyze opinions at scale without manual copy-paste. It’s built for research workflows where consistent fields (scores, freshness, verification, spoilers, and more) matter. Use this Rotten Tomatoes reviews scraper to power dashboards, datasets, and media insights.

Created by Bitbash, built to showcase our approach to Scraping and Automation!

Introduction

This project extracts user and critic reviews from Rotten Tomatoes movie and TV pages and returns normalized, analysis-ready records. It solves the common problem of having review data scattered across paginated pages with inconsistent presentation. It’s for developers, analysts, and researchers who need repeatable review collection for sentiment, scoring trends, and audience-vs-critic comparisons.

Built for Review Intelligence

Supports both critic and audience/verified audience review pages via URL inputs
Captures structured review metadata (scores, freshness, verification, spoiler/profanity flags)
Handles large pagination ranges with resilient retries and consistent output schema
Produces clean datasets suitable for BI tools, ML pipelines, and monitoring jobs
Works across multiple titles by accepting a list of review page URLs

Features

Feature	Description
Multi-URL scraping	Process multiple Rotten Tomatoes review pages in a single run for batch collection.
Audience + critic coverage	Extract reviews for audience, verified audience, and critic sections based on the provided URLs.
Rich review metadata	Collects rating/score, freshness flags, verification, spoiler/profanity indicators, and IDs.
Stable pagination handling	Automatically traverses review pages to gather complete review sets reliably.
Retry & resilience	Built-in retries and defensive parsing to reduce failures from transient network or layout issues.
Clean, analysis-ready output	Produces consistent JSON records ideal for sentiment analysis and trend reporting.
Proxy-ready networking	Optional proxy support for safer large-scale runs and reduced rate-limit risk.
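The "Retry & resilience" row can be illustrated with a small exponential-backoff wrapper. This is a minimal sketch, not the project's actual retry code: the names backoffDelay and withRetry and the default delays are assumptions chosen for illustration.

```javascript
// Hypothetical helpers: names and defaults are illustrative, not taken from the project.

// Delay before the retry that follows attempt number `attempt` (0-based):
// baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt, baseMs = 500, maxMs = 8000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task` until it resolves or `retries` extra attempts are used up,
// then rethrows the last error.
async function withRetry(task, { retries = 3, baseMs = 500, maxMs = 8000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        await sleep(backoffDelay(attempt, baseMs, maxMs));
      }
    }
  }
  throw lastError;
}
```

A page fetch would be wrapped as withRetry(() => fetchSomething(url)), where fetchSomething stands for whatever request function is in use. Capping the delay keeps long runs from stalling on a single page.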

What Data This Scraper Extracts

Field Name	Field Description
rating	Normalized rating value (often numeric, may include halves depending on page type).
score	Computed score value when available (can match rating for audience entries).
originalScore	Raw score as shown on the page (string form for exact fidelity).
quote	The full review text/quote provided by the reviewer.
reviewId	Unique identifier for the review entry.
reviewUrl	Direct URL to the review when available.
creationDate	Display date for when the review was created/published.
isVerified	Indicates whether the review is verified (e.g., verified audience).
isSuperReviewer	Indicates elevated reviewer status where supported.
isTopCritic	Indicates top critic status for critic reviews where supported.
isFresh	Indicates “fresh” classification when provided by the page.
isRotten	Indicates “rotten” classification when provided by the page.
hasSpoilers	True if the review is flagged as containing spoilers.
hasProfanity	True if the review is flagged as containing profanity.
userDisplayName	Display name of the reviewer (audience).
userId	Reviewer identifier when present.
userRealm	Source realm/provider label when present (e.g., ticketing/account provider).
name	Reviewer name field (normalized convenience alias when present).
publicationName	Publication/outlet name for critic reviews when present.
criticName	Critic identity field when present.
avatarImageUrl	Reviewer/critic profile image URL when present.

Example Output

[
      {
            "rating": 5,
            "quote": "Fun, entertaining, and suspenseful! Worth every penny! I think this movie can be enjoyed by anyone not just motor sport enthusiasts. Go watch it - youll be glad you did!",
            "reviewId": "b595b8b1-32ad-49da-9856-65402a765869",
            "isVerified": true,
            "isSuperReviewer": false,
            "hasSpoilers": false,
            "hasProfanity": false,
            "score": 5,
            "creationDate": "Jun 29, 2025",
            "userDisplayName": "Donnie",
            "userRealm": "Fandango",
            "userId": "2A48F1D8-971E-4636-BB2F-367192ED6B1C",
            "originalScore": "5",
            "name": "Donnie"
      },
      {
            "rating": 2.5,
            "quote": "First half of the movie is very slow. Pacing is off. Movie could be about an hour shorter overall. Great cinematography and driving sequences but the story is stale and predictable with lots of contrived elements.",
            "reviewId": "4a85e68d-fd03-4821-9c22-b604b6102646",
            "isVerified": true,
            "isSuperReviewer": false,
            "hasSpoilers": false,
            "hasProfanity": false,
            "score": 2.5,
            "creationDate": "Jun 29, 2025",
            "userDisplayName": "Josh",
            "userRealm": "Fandango",
            "userId": "847768b1-452e-4d3a-aae3-95d308383088",
            "originalScore": "2.5",
            "name": "Josh"
      }
]
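As a usage illustration, records shaped like the example above can be summarized in a few lines. The sketch below uses two records trimmed from that example; the summarize function is a hypothetical helper, not part of the project.

```javascript
// Two records trimmed from the example output above.
const reviews = [
  { rating: 5, isVerified: true, hasSpoilers: false },
  { rating: 2.5, isVerified: true, hasSpoilers: false },
];

// Hypothetical helper: basic counts and averages over review records.
function summarize(records) {
  const rated = records.filter((r) => typeof r.rating === 'number');
  const sum = rated.reduce((total, r) => total + r.rating, 0);
  const verified = records.filter((r) => r.isVerified === true).length;
  return {
    count: records.length,
    averageRating: rated.length ? sum / rated.length : null,
    verifiedShare: records.length ? verified / records.length : null,
  };
}

console.log(summarize(reviews));
// { count: 2, averageRating: 3.75, verifiedShare: 1 }
```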

Directory Structure Tree

rottentomatoes-reviews-scraper/
├── src/
│   ├── main.js
│   ├── router/
│   │   ├── routes.js
│   │   └── validators.js
│   ├── crawlers/
│   │   ├── reviewsCrawler.js
│   │   └── pagination.js
│   ├── extractors/
│   │   ├── parseReview.js
│   │   ├── parseCriticReview.js
│   │   ├── parseAudienceReview.js
│   │   └── normalizeFields.js
│   ├── utils/
│   │   ├── http.js
│   │   ├── retry.js
│   │   ├── logger.js
│   │   └── url.js
│   └── config/
│       ├── defaults.json
│       └── schema.json
├── data/
│   ├── input.example.json
│   └── output.sample.json
├── tests/
│   ├── parseReview.test.js
│   └── normalizeFields.test.js
├── .env.example
├── .gitignore
├── package.json
├── package-lock.json
└── README.md

Use Cases

Media analysts use it to track audience vs critic sentiment over time, so they can spot polarization and momentum early.
Data teams use it to build movie review datasets for NLP, so they can train sentiment and topic models with consistent fields.
Marketing teams use it to monitor release reception across titles, so they can adjust messaging based on real review signals.
Researchers use it to study spoiler/profanity prevalence and review behavior, so they can quantify qualitative patterns at scale.
Product teams use it to compare freshness and scoring distributions, so they can create ranking and recommendation experiments.
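The audience-vs-critic comparison in the first use case could be sketched as follows. This assumes critic records carry the isFresh flag and audience records carry a numeric rating, as described in the field table; the function names and the 3.5-star cutoff for a "positive" audience review are assumptions made for illustration.

```javascript
// Share of critic reviews flagged fresh, or null if no record carries the flag.
function freshShare(criticReviews) {
  const flagged = criticReviews.filter((r) => typeof r.isFresh === 'boolean');
  if (flagged.length === 0) return null;
  return flagged.filter((r) => r.isFresh).length / flagged.length;
}

// Share of audience reviews rated at or above `threshold` stars (assumed cutoff).
function positiveShare(audienceReviews, threshold = 3.5) {
  const rated = audienceReviews.filter((r) => typeof r.rating === 'number');
  if (rated.length === 0) return null;
  return rated.filter((r) => r.rating >= threshold).length / rated.length;
}

// Positive gap: audience is warmer than critics. Negative gap: the reverse.
function sentimentGap(criticReviews, audienceReviews, threshold = 3.5) {
  const critics = freshShare(criticReviews);
  const audience = positiveShare(audienceReviews, threshold);
  if (critics === null || audience === null) return null;
  return audience - critics;
}
```

Tracking this gap per title and per week is one simple way to spot the polarization mentioned above.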

FAQs

How do I choose the right URLs to scrape? Use full review-page URLs for the titles you want, including any query parameters that select the review type (for example, verified audience). The scraper follows pagination from those starting points and collects all accessible reviews for each URL.
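Following pagination from a starting point can be pictured as a cursor loop. The README does not document how the scraper pages through reviews, so the sketch below is generic: fetchPage and the { reviews, nextCursor } shape are hypothetical stand-ins, not the project's or Rotten Tomatoes' API.

```javascript
// Hypothetical: `fetchPage(cursor)` resolves to { reviews, nextCursor },
// where a falsy nextCursor means there are no more pages.
async function collectAllReviews(fetchPage, maxPages = 1000) {
  const all = [];
  let cursor = null; // null requests the first page
  for (let page = 0; page < maxPages; page += 1) {
    const { reviews, nextCursor } = await fetchPage(cursor);
    all.push(...reviews);
    if (!nextCursor) break; // last page reached
    cursor = nextCursor;
  }
  return all;
}
```

The maxPages guard is a design choice for safety: it stops the loop if a source keeps returning a next cursor indefinitely.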

Does it scrape both critic and audience reviews automatically? It scrapes what your input URLs point to. Provide critic review URLs for critic data and audience/verified audience URLs for audience data. You can include multiple URLs in one run to mix both.

What should I do if I hit rate limits or partial loads? Enable proxy usage and reduce concurrency if your environment supports it. This scraper is designed with retries, but proxies and moderate request pacing improve stability on large runs.
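Reducing concurrency amounts to capping how many requests are in flight at once. The helper below is a minimal sketch of that idea; mapWithConcurrency is a hypothetical name and is not an option exposed by the scraper.

```javascript
// Hypothetical helper: runs `worker` over `items` with at most `limit` calls in flight.
// Results are returned in the same order as `items`.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0; // index of the next item to hand out

  async function runLane() {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

With a limit of 1 the URLs are processed strictly one after another, which is the gentlest setting when rate limits appear.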

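Why do some fields appear missing in certain records? Some properties are specific to a page type (e.g., top critic status and publication name apply only to critic reviews). When the source page doesn’t expose a field for a review type, the scraper leaves that field out of the record, as in the example output above, where the audience records carry no critic fields. Fields that are present keep the same names across records.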
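If a downstream tool needs every record to have the same columns (a CSV export, for instance), absent fields can be filled in on the consumer side. The sketch below pads a record to the full field list from the table above; withFullSchema is a hypothetical helper, and using null for absent fields is an assumption.

```javascript
// Field names as listed in "What Data This Scraper Extracts".
const REVIEW_FIELDS = [
  'rating', 'score', 'originalScore', 'quote', 'reviewId', 'reviewUrl',
  'creationDate', 'isVerified', 'isSuperReviewer', 'isTopCritic', 'isFresh',
  'isRotten', 'hasSpoilers', 'hasProfanity', 'userDisplayName', 'userId',
  'userRealm', 'name', 'publicationName', 'criticName', 'avatarImageUrl',
];

// Hypothetical helper: returns a copy with every listed field present,
// using null where the scraped record has no value.
function withFullSchema(record) {
  const padded = {};
  for (const field of REVIEW_FIELDS) {
    padded[field] = record[field] ?? null;
  }
  return padded;
}

const padded = withFullSchema({ rating: 5, reviewId: 'b595b8b1-32ad-49da-9856-65402a765869' });
console.log(padded.rating, padded.isTopCritic);
// 5 null
```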

Performance Benchmarks and Results

Primary Metric: Typical throughput of 250–600 reviews/minute depending on page complexity, pagination depth, and network conditions.

Reliability Metric: 97–99% successful page fetch rate on stable networks with retries enabled; higher stability when using proxies for larger batches.

Efficiency Metric: Low-memory streaming extraction that writes records incrementally, keeping runs stable even on long paginations.
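Incremental writing of this kind is commonly done by serializing one record per line as it arrives, so the full result set never sits in memory. The sketch below shows that pattern under stated assumptions: writeNdjson is a hypothetical name, and sink is any object with a write(string) method, such as process.stdout or a stream from fs.createWriteStream.

```javascript
// Hypothetical sketch: consumes a sync or async iterable of records and
// writes each one as a single JSON line. Returns the number of records written.
async function writeNdjson(records, sink) {
  let written = 0;
  for await (const record of records) {
    sink.write(`${JSON.stringify(record)}\n`);
    written += 1;
  }
  return written;
}

// Example source: an async generator standing in for page-by-page extraction.
async function* sampleRecords() {
  yield { reviewId: 'a', rating: 5 };
  yield { reviewId: 'b', rating: 2.5 };
}
```

Calling writeNdjson(sampleRecords(), process.stdout) prints two JSON lines. This sketch ignores backpressure; with a real Node.js stream, a fuller version would wait for the 'drain' event whenever write returns false.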

Quality Metric: High field completeness for audience reviews (IDs, quotes, scores, flags) with consistent normalization across records; critic-specific fields populate when present on critic pages.
