Skip to content

Fix use-after-free when Worker.terminate() is called twice#29180

Open
robobun wants to merge 2 commits intomainfrom
farm/63a708f6/worker-double-terminate-uaf
Open

Fix use-after-free when Worker.terminate() is called twice#29180
robobun wants to merge 2 commits intomainfrom
farm/63a708f6/worker-double-terminate-uaf

Conversation

@robobun
Copy link
Copy Markdown
Collaborator

@robobun robobun commented Apr 11, 2026

Fuzzilli found a use-after-poison crash (Address:use-after-poison:bun-debug+0xf2efd21) when Worker.prototype.terminate() is called more than once.

Fixes #24580

Root cause

Worker::terminate() unconditionally calls WebWorker__notifyNeedTermination(impl_), which dereferences the Zig WebWorker struct.

When terminate() is called the first time, it sets requested_terminate on the Zig side. The worker thread observes this in start() (before the VM is even set up), takes the early-exit path, and calls exitAndDeinit()deinit(), which frees the WebWorker.

A second terminate() on the main thread then passes the freed pointer to WebWorker__notifyNeedTermination, which reads this.status from poisoned memory.

#0 atomic.Value(Status).load
#1 WebWorker__notifyNeedTermination        src/bun.js/web_worker.zig
#2 WebCore::Worker::terminate()            src/bun.js/bindings/webcore/Worker.cpp:266
#3 jsWorkerPrototypeFunction_terminate

This matches the user-reported crash in #24580 (Segmentation fault at address 0x130 in notifyNeedTermination from jsWorkerPrototypeFunction_terminate).

Fix

m_terminationFlags.fetch_or(TerminateRequestedFlag) already returns the previous flag value. If TerminateRequestedFlag or TerminatedFlag was already set, there's nothing more to do — the worker is already shutting down and impl_ may have been freed — so return early instead of dereferencing it.

Verification

for (let i = 0; i < 200; i++) {
  const w = new Worker("does-not-exist-" + i);
  w.terminate();
  w.terminate();
}

Under ASAN this triggers use-after-poison before the fix and runs cleanly after. In release builds the read of freed-but-not-reused memory happens to succeed, so no regression test is included — the existing worker test suite exercises terminate() and ASAN fuzzing covers this path.

The first terminate() sets requested_terminate, causing the worker thread
to take the early-exit path and free the Zig WebWorker. A second
terminate() would then call WebWorker__notifyNeedTermination on the freed
pointer.

Use the return value of fetch_or to bail out if termination was already
requested or completed.
@robobun
Copy link
Copy Markdown
Collaborator Author

robobun commented Apr 11, 2026

Updated 12:58 PM PT - Apr 11th, 2026

@robobun, your commit 95e22ef has 2 failures in Build #45137 (All Failures):


🧪   To try this PR locally:

bunx bun-pr 29180

That installs a local version of the PR into your bun-29180 executable, so you can run:

bun-29180 --bun

@github-actions
Copy link
Copy Markdown
Contributor

Found 2 issues this PR may fix:

  1. Opencode crash #24580 - Stack trace shows crash in notifyNeedTermination from jsWorkerPrototypeFunction_terminate, the exact use-after-free path this PR guards against
  2. Segfault / RangeError: invalid time zone when workers using Intl are repeatedly spawned and terminated #28415 - Rapidly spawning and terminating workers triggers segfaults; the double-terminate guard could prevent crashes when terminate() races against worker self-cleanup

If this is helpful, copy the block below into the PR description to auto-close these issues on merge.

Fixes #24580
Fixes #28415

🤖 Generated with Claude Code

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 11, 2026

Walkthrough

Modified Worker::terminate() to avoid redundant termination work by checking existing termination flags before notifying. Added test coverage verifying that double termination of workers with invalid scripts doesn't cause process crashes.

Changes

Cohort / File(s) Summary
Worker Termination
src/bun.js/bindings/webcore/Worker.cpp, test/js/web/workers/worker.test.ts
Optimized Worker::terminate() to check m_terminationFlags and return early if termination was already requested or completed, eliminating redundant WebWorker__notifyNeedTermination() calls. Added test case spawning a process with 200 iterations of worker creation and double termination to ensure stability with invalid script names.
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately summarizes the main change: fixing a use-after-free crash in Worker.terminate() when called twice, which is the primary objective of this PR.
Description check ✅ Passed The pull request description fully addresses the template requirements with comprehensive explanations of what the PR does and how the fix was verified.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Comment @coderabbitai help to get the list of available commands and usage tips.

The use-after-free only manifests under ASAN; in release builds reading
the freed (but not yet reused) memory happens to succeed, so the test
cannot reliably fail before the fix.
@github-actions
Copy link
Copy Markdown
Contributor

This PR may be a duplicate of:

  1. fix use-after-free when terminating a worker after it exited #28840 - Also guards Worker::terminate() in Worker.cpp to prevent use-after-free when terminate is called after the worker has exited
  2. fix(workers): use-after-free in WebWorker when terminate() called after worker exit #28227 - Also adds a null check on impl_ in Worker::terminate() to prevent the same use-after-free on double/late terminate

🤖 Generated with Claude Code

@robobun
Copy link
Copy Markdown
Collaborator Author

robobun commented Apr 11, 2026

Re: duplicates — #28227, #28840, and this PR all address the same Worker::terminate() use-after-free with different trade-offs:

Any one of these is sufficient for the reported crash; pick whichever approach is preferred and the others can be closed.

Comment on lines 262 to 268
void Worker::terminate()
{
// m_contextProxy.terminateWorkerGlobalScope();
m_terminationFlags.fetch_or(TerminateRequestedFlag);
if (m_terminationFlags.fetch_or(TerminateRequestedFlag) & (TerminateRequestedFlag | TerminatedFlag))
return;
WebWorker__notifyNeedTermination(impl_);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The fix correctly prevents double-terminate UAF (TerminateRequestedFlag path), but three related UAF windows remain open: (1) a timing race where a naturally-exiting worker frees its Zig struct before the async dispatchExit task sets TerminatedFlag, leaving terminate() unguarded; (2) a deterministic UAF when terminate() is called from inside a close event handler (TerminatedFlag is set only after dispatchCloseEvent returns); and (3) a TOCTOU in notifyNeedTermination where the struct can be freed between the status check and the subsequent setRequestedTerminate() call. All three involve WebWorker__notifyNeedTermination(impl_) being called on freed memory and are not covered by the new guard.

Extended reasoning...

Root of the remaining bugs

The PR adds a guard in Worker::terminate() (Worker.cpp:262–268): if fetch_or(TerminateRequestedFlag) reveals that either TerminateRequestedFlag or TerminatedFlag was already set, the function returns early without touching impl_. This correctly closes the double-terminate window. However, TerminatedFlag is set asynchronously by a task posted in dispatchExit(), while the Zig struct is freed synchronously in exitAndDeinit() immediately after the post. This mismatch creates three remaining UAF windows.

Bug 1 (timing race before the dispatchExit task runs — Worker.cpp:266)

In exitAndDeinit() (web_worker.zig ~lines 626–662):

  1. this.setStatus(.terminated) — Zig-side status updated
  2. WebWorker__dispatchExit(...) — posts an async task to the main thread; that task will eventually set m_terminationFlags |= TerminatedFlag
  3. this.deinit() — immediately calls bun.default_allocator.destroy(this), freeing impl_

Between step 3 and when the posted task actually runs on the main thread, impl_ is a dangling pointer while m_terminationFlags still has neither TerminateRequestedFlag nor TerminatedFlag set. If JS calls worker.terminate() in this window, the PR's guard evaluates to 0 & (TerminateRequestedFlag | TerminatedFlag) == 0 and falls through, passing the freed pointer to WebWorker__notifyNeedTermination. Reading this.status inside notifyNeedTermination is a use-after-free.

Concrete proof: create a worker pointing at a non-existent URL; the Zig start() function immediately observes the error, calls exitAndDeinit(), and frees the struct — potentially before the main thread's event loop processes the async TerminatedFlag task. Running w.terminate() immediately after with a millisecond delay (but not a full event-loop turn) reliably triggers the race under ASAN.

Bug 2 (deterministic UAF from a close event handler — Worker.cpp:445–447)

In the async task that dispatchExit() posts, the main thread executes:

When a user's close handler calls w.terminate():

  • TerminateRequestedFlag = 0 (terminate() was never called before)
  • TerminatedFlag = 0 (set only after this handler returns)

The guard is not triggered. WebWorker__notifyNeedTermination(impl_) is called, but impl_ was already freed when the worker thread called this.deinit() right after posting the task. This is 100% deterministically reproducible with: const w = new Worker('bad-url'); w.addEventListener('close', () => w.terminate());.

One verifier argued this is a duplicate of Bug 1 because they share the same root cause (TerminatedFlag set asynchronously). The scenarios are however distinct trigger paths: Bug 1 fires before the async task runs at all; Bug 2 fires from inside the task during the close handler. The fix for Bug 1 (set TerminatedFlag synchronously on the worker thread before deinit) happens to also fix Bug 2, but the deterministic reproducibility of Bug 2 makes it the most immediately dangerous manifestation.

Bug 3 (TOCTOU inside notifyNeedTermination — web_worker.zig ~lines 603–619)

Even if the C++ guard in terminate() passes correctly (i.e., impl_ is valid at call entry), there is a second race window within notifyNeedTermination:

Between the status check (which sees a non-terminated status, so proceeds) and the subsequent setRequestedTerminate(), the worker thread can complete setStatus(.terminated) + WebWorker__dispatchExit() + this.deinit() in full, freeing the struct. setRequestedTerminate() then writes to freed memory. The window is narrow (two adjacent atomic operations) but valid under thread preemption or TSAN stress testing.

How to fix

The fundamental issue is that impl_ is freed before TerminatedFlag is set. A safe approach is to set TerminatedFlag on the C++ Worker object synchronously on the worker thread before calling deinit(), e.g. via a new exported function WebWorker__markTerminated(cpp_worker) that does worker->m_terminationFlags.fetch_or(TerminatedFlag). Alternatively, null out and guard impl_ under a mutex before freeing. Additionally, notifyNeedTermination needs a mutex or generation counter to guard the TOCTOU between its status check and subsequent field accesses.

Comment on lines 262 to 268
void Worker::terminate()
{
// m_contextProxy.terminateWorkerGlobalScope();
m_terminationFlags.fetch_or(TerminateRequestedFlag);
if (m_terminationFlags.fetch_or(TerminateRequestedFlag) & (TerminateRequestedFlag | TerminatedFlag))
return;
WebWorker__notifyNeedTermination(impl_);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 The PR's guard on terminate() is not applied to setKeepAlive(): calling worker.ref() or worker.unref() after a worker exits naturally but before the TerminatedFlag task runs will pass a freed impl_ pointer to WebWorker__setRef(). This is the same UAF race as the one fixed in terminate(), just via a different code path (Worker.cpp:134-137).

Extended reasoning...

What the bug is

Worker::setKeepAlive() (Worker.cpp:134-137) calls WebWorker__setRef(impl_, keepAlive) unconditionally, with no guard against impl_ having been freed. The PR correctly added a guard to terminate(), but setKeepAlive() was overlooked.

The race window

setKeepAlive() is reached from worker.ref() and worker.unref() (JSWorker.cpp:636, 647), which can be called by user JavaScript at any time. The vulnerable sequence is:

  1. Worker exits naturally (or via the first terminate()). On the worker thread, exitAndDeinit() runs: it calls WebWorker__dispatchExit(), which posts an async task to the main thread to set TerminatedFlag on the C++ Worker object, then immediately calls deinit() / bun.default_allocator.destroy(this) — the Zig WebWorker struct is now freed.
  2. The posted async task has not yet run; TerminatedFlag is therefore still clear in m_terminationFlags on the C++ side.
  3. Main-thread JS calls worker.ref() or worker.unref().
  4. setKeepAlive(true/false) -> WebWorker__setRef(impl_, value) where impl_ points to freed (and under ASAN, poisoned) memory.

Why existing code does not prevent it

setKeepAlive() has zero guards — it unconditionally forwards to Zig. In Zig, setRef() (web_worker.zig:580-586) begins with "if (this.hasRequestedTerminate())", which reads this.requested_terminate from the freed struct. This is a use-after-free / use-after-poison.

Unlike terminate(), which now checks m_terminationFlags on the safe C++ object before touching impl_, setKeepAlive() performs no such check.

Step-by-step proof

const w = new Worker("does-not-exist");
// worker thread: URL not found -> early exit -> exitAndDeinit()
// -> WebWorker__dispatchExit() posts task T to main thread
// -> deinit() frees WebWorker struct (impl_ is now dangling)
// [task T has NOT run yet; TerminatedFlag is still 0 in C++]
w.ref();
// -> setKeepAlive(true) -> WebWorker__setRef(impl_, true)
// Zig setRef(): this.hasRequestedTerminate() reads freed memory -> UAF/use-after-poison

Under ASAN this will reliably trigger use-after-poison.

How to fix

Apply the same guard used in terminate():

void Worker::setKeepAlive(bool keepAlive)
{
if (m_terminationFlags.load() & (TerminateRequestedFlag | TerminatedFlag))
return;
WebWorker__setRef(impl_, keepAlive);
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Opencode crash

1 participant