
[Bug/Feature]: LUA script OOM error with Dragonfly causing workers to stop (requesting queue size limit) #3834

@NickZambrano

Version

v5.70.1

Platform

NodeJS

What happened?

Description
Hi!
We are experiencing a critical issue in our production environment where workers completely stop consuming tasks when a massive influx of jobs saturates our Dragonfly instance.

We are using BullMQ Pro v7.39.1 with Dragonfly, configured with maxmemory-policy=noeviction.

Expected behavior
Ideally, if the queue reaches a defined memory or task limit, subsequent add operations should be rejected or dropped without crashing the LUA script execution for existing tasks. Workers should be able to continue consuming the tasks already present in the queue to clear the backlog.

Actual behavior
We get Out of Memory errors directly inside the LUA script executed by BullMQ (evalsha). Once this occurs, workers completely fail to consume any remaining tasks, effectively paralyzing the queue. The Dragonfly instance also throws LUA script errors downstream.
Our workers consume jobs with these options:

      removeOnComplete: {
        age: 1,
        count: 1,
      },
      removeOnFail: {
        age: 1,
        count: 1,
      },

Feature Request
Is there a recommended pattern or an existing feature to define a hard limit on the queue size (e.g., maximum number of wait jobs) to prevent this OOM state? In our use case, it is completely acceptable to lose/drop incoming jobs if the queue is full, as long as the workers remain alive to process the existing backlog.
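To make the request concrete, here is a rough sketch of the kind of producer-side guard we have in mind. It is not an existing BullMQ feature: `addIfBelowLimit`, `BoundedQueueLike` and `maxWaiting` are hypothetical names, and the sketch only assumes the queue exposes `getWaitingCount()` and `add()`. The count and the add are separate commands, so with several producers the limit would only be approximate.

```typescript
// Minimal shape of the queue this sketch relies on. The interface keeps the
// example self-contained; a BullMQ Queue is assumed to satisfy it.
interface BoundedQueueLike<T> {
  getWaitingCount(): Promise<number>;
  add(name: string, data: T): Promise<unknown>;
}

// Hypothetical helper (not part of BullMQ): adds the job only while the
// number of waiting jobs is below maxWaiting. Returns false when the job
// was dropped. The check is not atomic with the add, so concurrent
// producers can overshoot the limit slightly.
async function addIfBelowLimit<T>(
  queue: BoundedQueueLike<T>,
  name: string,
  data: T,
  maxWaiting: number,
): Promise<boolean> {
  const waiting = await queue.getWaitingCount();
  if (waiting >= maxWaiting) {
    return false; // queue is full: drop the incoming job
  }
  await queue.add(name, data);
  return true;
}
```

A guard like this only bounds the number of waiting jobs, not the memory they use, so the limit would have to be sized against the average payload.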

How to reproduce.

1. Configure Dragonfly with maxmemory-policy=noeviction.
2. Rapidly insert hundreds of thousands of tasks into a single queue.
3. The queue saturates the available memory.
4. BullMQ workers can no longer consume any jobs.
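The insertion step can be sketched roughly as below. This is not our production code: `floodQueue`, `BulkQueueLike`, the job name `flood` and the payload shape are all made up for illustration, and the sketch only assumes the queue exposes an `addBulk()` method that accepts an array of `{ name, data }` entries.

```typescript
// Minimal shape of the queue used by this sketch (assumed to be satisfied
// by a BullMQ Queue through its addBulk() method).
interface BulkQueueLike {
  addBulk(jobs: { name: string; data: { index: number } }[]): Promise<unknown>;
}

// Hypothetical helper: inserts `total` jobs in batches of `batchSize`
// and returns how many jobs were submitted.
async function floodQueue(
  queue: BulkQueueLike,
  total: number,
  batchSize: number,
): Promise<number> {
  if (batchSize < 1) {
    throw new Error("batchSize must be at least 1");
  }
  let added = 0;
  while (added < total) {
    const size = Math.min(batchSize, total - added);
    const offset = added;
    const jobs = Array.from({ length: size }, (_, i) => ({
      name: "flood", // hypothetical job name
      data: { index: offset + i },
    }));
    await queue.addBulk(jobs);
    added += size;
  }
  return added;
}
```

Calling this with a total in the hundreds of thousands against a memory-limited instance should approximate step 2; the exact count needed depends on payload size and the configured memory limit.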

Relevant log output

ReplyError: ERR Error running script (call to 861ba35a698a949808d371213e913903f59472ad): @user_script:247: Out of memory
    at parseError (/usr/src/A/node_modules/redis-parser/lib/parser.js:179:12)
    at parseType (/usr/src/A/node_modules/redis-parser/lib/parser.js:302:14) {
  command: {
    name: 'evalsha',
    args: [
      '861ba35a698a949808d371213e913903f59472ad',
      '8',
      '{tasks_queue_A}:AAAA:stalled',
      '{tasks_queue_A}:AAAA:wait',
      '{tasks_queue_A}:AAAA:active',
      '{tasks_queue_A}:AAAA:stalled-check',
      '{tasks_queue_A}:AAAA:meta',
      '{tasks_queue_A}:AAAA:paused',
      '{tasks_queue_A}:AAAA:marker',
      '{tasks_queue_A}:AAAA:events',
      '1',
      '{tasks_queue_A}:AAAA:',
      '1772119814080',
      '30000',
      ''
    ]
  }
}

Code of Conduct

  • I agree to follow this project's Code of Conduct

Labels

bug (Something isn't working)