Fix use-after-free when Worker.terminate() is called twice#29180
Fix use-after-free when Worker.terminate() is called twice#29180
Conversation
The first terminate() sets requested_terminate, causing the worker thread to take the early-exit path and free the Zig WebWorker. A second terminate() would then call WebWorker__notifyNeedTermination on the freed pointer. Use the return value of fetch_or to bail out if termination was already requested or completed.
|
Updated 12:58 PM PT - Apr 11th, 2026
❌ @robobun, your commit 95e22ef has 2 failures in
🧪 To try this PR locally: bunx bun-pr 29180That installs a local version of the PR into your bun-29180 --bun |
|
Found 2 issues this PR may fix:
🤖 Generated with Claude Code |
WalkthroughModified Changes
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Comment |
The use-after-free only manifests under ASAN; in release builds reading the freed (but not yet reused) memory happens to succeed, so the test cannot reliably fail before the fix.
|
This PR may be a duplicate of:
🤖 Generated with Claude Code |
|
Re: duplicates — #28227, #28840, and this PR all address the same
Any one of these is sufficient for the reported crash; pick whichever approach is preferred and the others can be closed. |
| void Worker::terminate() | ||
| { | ||
| // m_contextProxy.terminateWorkerGlobalScope(); | ||
| m_terminationFlags.fetch_or(TerminateRequestedFlag); | ||
| if (m_terminationFlags.fetch_or(TerminateRequestedFlag) & (TerminateRequestedFlag | TerminatedFlag)) | ||
| return; | ||
| WebWorker__notifyNeedTermination(impl_); | ||
| } |
There was a problem hiding this comment.
🔴 The fix correctly prevents double-terminate UAF (TerminateRequestedFlag path), but three related UAF windows remain open: (1) a timing race where a naturally-exiting worker frees its Zig struct before the async dispatchExit task sets TerminatedFlag, leaving terminate() unguarded; (2) a deterministic UAF when terminate() is called from inside a close event handler (TerminatedFlag is set only after dispatchCloseEvent returns); and (3) a TOCTOU in notifyNeedTermination where the struct can be freed between the status check and the subsequent setRequestedTerminate() call. All three involve WebWorker__notifyNeedTermination(impl_) being called on freed memory and are not covered by the new guard.
Extended reasoning...
Root of the remaining bugs
The PR adds a guard in Worker::terminate() (Worker.cpp:262–268): if fetch_or(TerminateRequestedFlag) reveals that either TerminateRequestedFlag or TerminatedFlag was already set, the function returns early without touching impl_. This correctly closes the double-terminate window. However, TerminatedFlag is set asynchronously by a task posted in dispatchExit(), while the Zig struct is freed synchronously in exitAndDeinit() immediately after the post. This mismatch creates three remaining UAF windows.
Bug 1 (timing race before the dispatchExit task runs — Worker.cpp:266)
In exitAndDeinit() (web_worker.zig ~lines 626–662):
this.setStatus(.terminated)— Zig-side status updatedWebWorker__dispatchExit(...)— posts an async task to the main thread; that task will eventually setm_terminationFlags |= TerminatedFlagthis.deinit()— immediately callsbun.default_allocator.destroy(this), freeingimpl_
Between step 3 and when the posted task actually runs on the main thread, impl_ is a dangling pointer while m_terminationFlags still has neither TerminateRequestedFlag nor TerminatedFlag set. If JS calls worker.terminate() in this window, the PR's guard evaluates to 0 & (TerminateRequestedFlag | TerminatedFlag) == 0 and falls through, passing the freed pointer to WebWorker__notifyNeedTermination. Reading this.status inside notifyNeedTermination is a use-after-free.
Concrete proof: create a worker pointing at a non-existent URL; the Zig start() function immediately observes the error, calls exitAndDeinit(), and frees the struct — potentially before the main thread's event loop processes the async TerminatedFlag task. Running w.terminate() immediately after with a millisecond delay (but not a full event-loop turn) reliably triggers the race under ASAN.
Bug 2 (deterministic UAF from a close event handler — Worker.cpp:445–447)
In the async task that dispatchExit() posts, the main thread executes:
When a user's close handler calls w.terminate():
TerminateRequestedFlag= 0 (terminate() was never called before)TerminatedFlag= 0 (set only after this handler returns)
The guard is not triggered. WebWorker__notifyNeedTermination(impl_) is called, but impl_ was already freed when the worker thread called this.deinit() right after posting the task. This is 100% deterministically reproducible with: const w = new Worker('bad-url'); w.addEventListener('close', () => w.terminate());.
One verifier argued this is a duplicate of Bug 1 because they share the same root cause (TerminatedFlag set asynchronously). The scenarios are however distinct trigger paths: Bug 1 fires before the async task runs at all; Bug 2 fires from inside the task during the close handler. The fix for Bug 1 (set TerminatedFlag synchronously on the worker thread before deinit) happens to also fix Bug 2, but the deterministic reproducibility of Bug 2 makes it the most immediately dangerous manifestation.
Bug 3 (TOCTOU inside notifyNeedTermination — web_worker.zig ~lines 603–619)
Even if the C++ guard in terminate() passes correctly (i.e., impl_ is valid at call entry), there is a second race window within notifyNeedTermination:
Between the status check (which sees a non-terminated status, so proceeds) and the subsequent setRequestedTerminate(), the worker thread can complete setStatus(.terminated) + WebWorker__dispatchExit() + this.deinit() in full, freeing the struct. setRequestedTerminate() then writes to freed memory. The window is narrow (two adjacent atomic operations) but valid under thread preemption or TSAN stress testing.
How to fix
The fundamental issue is that impl_ is freed before TerminatedFlag is set. A safe approach is to set TerminatedFlag on the C++ Worker object synchronously on the worker thread before calling deinit(), e.g. via a new exported function WebWorker__markTerminated(cpp_worker) that does worker->m_terminationFlags.fetch_or(TerminatedFlag). Alternatively, null out and guard impl_ under a mutex before freeing. Additionally, notifyNeedTermination needs a mutex or generation counter to guard the TOCTOU between its status check and subsequent field accesses.
| void Worker::terminate() | ||
| { | ||
| // m_contextProxy.terminateWorkerGlobalScope(); | ||
| m_terminationFlags.fetch_or(TerminateRequestedFlag); | ||
| if (m_terminationFlags.fetch_or(TerminateRequestedFlag) & (TerminateRequestedFlag | TerminatedFlag)) | ||
| return; | ||
| WebWorker__notifyNeedTermination(impl_); | ||
| } |
There was a problem hiding this comment.
🔴 The PR's guard on terminate() is not applied to setKeepAlive(): calling worker.ref() or worker.unref() after a worker exits naturally but before the TerminatedFlag task runs will pass a freed impl_ pointer to WebWorker__setRef(). This is the same UAF race as the one fixed in terminate(), just via a different code path (Worker.cpp:134-137).
Extended reasoning...
What the bug is
Worker::setKeepAlive() (Worker.cpp:134-137) calls WebWorker__setRef(impl_, keepAlive) unconditionally, with no guard against impl_ having been freed. The PR correctly added a guard to terminate(), but setKeepAlive() was overlooked.
The race window
setKeepAlive() is reached from worker.ref() and worker.unref() (JSWorker.cpp:636, 647), which can be called by user JavaScript at any time. The vulnerable sequence is:
- Worker exits naturally (or via the first terminate()). On the worker thread, exitAndDeinit() runs: it calls WebWorker__dispatchExit(), which posts an async task to the main thread to set TerminatedFlag on the C++ Worker object, then immediately calls deinit() / bun.default_allocator.destroy(this) — the Zig WebWorker struct is now freed.
- The posted async task has not yet run; TerminatedFlag is therefore still clear in m_terminationFlags on the C++ side.
- Main-thread JS calls worker.ref() or worker.unref().
- setKeepAlive(true/false) -> WebWorker__setRef(impl_, value) where impl_ points to freed (and under ASAN, poisoned) memory.
Why existing code does not prevent it
setKeepAlive() has zero guards — it unconditionally forwards to Zig. In Zig, setRef() (web_worker.zig:580-586) begins with "if (this.hasRequestedTerminate())", which reads this.requested_terminate from the freed struct. This is a use-after-free / use-after-poison.
Unlike terminate(), which now checks m_terminationFlags on the safe C++ object before touching impl_, setKeepAlive() performs no such check.
Step-by-step proof
const w = new Worker("does-not-exist");
// worker thread: URL not found -> early exit -> exitAndDeinit()
// -> WebWorker__dispatchExit() posts task T to main thread
// -> deinit() frees WebWorker struct (impl_ is now dangling)
// [task T has NOT run yet; TerminatedFlag is still 0 in C++]
w.ref();
// -> setKeepAlive(true) -> WebWorker__setRef(impl_, true)
// Zig setRef(): this.hasRequestedTerminate() reads freed memory -> UAF/use-after-poison
Under ASAN this will reliably trigger use-after-poison.
How to fix
Apply the same guard used in terminate():
void Worker::setKeepAlive(bool keepAlive)
{
if (m_terminationFlags.load() & (TerminateRequestedFlag | TerminatedFlag))
return;
WebWorker__setRef(impl_, keepAlive);
}
Fuzzilli found a use-after-poison crash (
Address:use-after-poison:bun-debug+0xf2efd21) whenWorker.prototype.terminate()is called more than once.Fixes #24580
Root cause
Worker::terminate()unconditionally callsWebWorker__notifyNeedTermination(impl_), which dereferences the ZigWebWorkerstruct.When
terminate()is called the first time, it setsrequested_terminateon the Zig side. The worker thread observes this instart()(before the VM is even set up), takes the early-exit path, and callsexitAndDeinit()→deinit(), which frees theWebWorker.A second
terminate()on the main thread then passes the freed pointer toWebWorker__notifyNeedTermination, which readsthis.statusfrom poisoned memory.This matches the user-reported crash in #24580 (
Segmentation fault at address 0x130innotifyNeedTerminationfromjsWorkerPrototypeFunction_terminate).Fix
m_terminationFlags.fetch_or(TerminateRequestedFlag)already returns the previous flag value. IfTerminateRequestedFlagorTerminatedFlagwas already set, there's nothing more to do — the worker is already shutting down andimpl_may have been freed — so return early instead of dereferencing it.Verification
Under ASAN this triggers use-after-poison before the fix and runs cleanly after. In release builds the read of freed-but-not-reused memory happens to succeed, so no regression test is included — the existing worker test suite exercises
terminate()and ASAN fuzzing covers this path.