fix(url): reject malformed percent-encoding in fileURLToPath by robobun · Pull Request #29176 · oven-sh/bun

robobun · 2026-04-11T14:37:08Z

Reproduction

import { fileURLToPath } from 'node:url';
fileURLToPath('file:///tmp/%%20users.txt');

Node.js: throws URIError: URI malformed.
Bun (before): silently returned /tmp/% users.txt.

Cause

functionFileURLToPath in src/bun.js/bindings/BunObject.cpp was handing the URL pathname to WebKit's WTF::URL::fileSystemPath(), which uses a lenient percent-decoder that treats invalid sequences as literal characters. Node, in contrast, implements fileURLToPath via decodeURIComponent(pathname) which throws URIError: URI malformed for any % not followed by exactly two hex digits.

Fix

Before the call to url.fileSystemPath(), iterate the pathname and throw ERR_INVALID_URI (a URIError with message URI malformed, matching Node) if any % is not followed by two hex digits. The existing encoded-slash check (%2F, %5C) still runs first, preserving Node's error precedence.

Verification

test/regression/issue/29174.test.ts — 16 cases covering:

malformed sequences: %%20, %GG, %2., %., lone trailing %, lone /%
valid encoding still decodes: %20, %7E
unencoded paths untouched
encoded slash still throws ERR_INVALID_FILE_URL_PATH (unchanged)

Each case was cross-checked against Node.js — same error name (URIError), code, and message.

fileURLToPath was delegating to WebKit's lenient percent-decoder, which silently interpreted malformed sequences like `%%20` as a literal `%` followed by a space. Node.js throws `URIError: URI malformed` for any `%` not followed by two hex digits. Validate the URL pathname strictly before handing it to `fileSystemPath()`, matching Node's `decodeURIComponent` semantics. Fixes #29174

robobun · 2026-04-11T14:37:19Z

^{Updated 12:22 PM PT - Apr 11th, 2026}

❌ @robobun, your commit 60792ec has 2 failures in Build #45134 (All Failures):

test/js/web/broadcastchannel/broadcast-channel-worker-gc.test.ts - pid 13476 error: addresssanitizer on 🐧 13 x64-asan
test/js/web/fetch/fetch.upgrade.test.ts - code 1 on 🐧 25.04 aarch64

🧪 To try this PR locally:

bunx bun-pr 29176

That installs a local version of the PR into your bun-29176 executable, so you can run:

bun-29176 --bun

coderabbitai · 2026-04-11T14:42:32Z

Walkthrough

Adds strict validation for percent-encoded sequences in file:// URL pathnames: every % must be followed by two ASCII hex digits and decoded bytes must be valid UTF‑8. On invalid sequences the function throws URI malformed (ERR_INVALID_URI) and returns before converting the URL to a filesystem path. Includes regression tests.

Changes

Cohort / File(s)	Summary
URL pathname validation `src/bun.js/bindings/BunObject.cpp`	Added `#include <wtf/ASCIICType.h>`; added strict percent-sequence validator (every `%` must be followed by two ASCII hex digits), percent-decode to a byte buffer, validate UTF‑8 via `simdutf::validate_utf8`, and throw `ERR_INVALID_URI` (`URI malformed`) on failures before calling `url.fileSystemPath()`.
Regression tests `test/regression/issue/29174.test.ts`	New test suite asserting `fileURLToPath` throws `URIError` with `code: "ERR_INVALID_URI"` and message `URI malformed` for malformed percent-encodings (invalid hex, truncated `%`, `%%20`, and invalid UTF‑8), plus platform-scoped assertions for valid decodings and `%2F` behavior.

🚥 Pre-merge checks | ✅ 4

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'fix(url): reject malformed percent-encoding in fileURLToPath' clearly and concisely describes the main change: adding validation to reject malformed percent-encoding sequences in the fileURLToPath function.
Description check	✅ Passed	The PR description comprehensively covers the template requirements: explains what the PR does (fixes malformed percent-encoding validation), provides reproduction steps, identifies the root cause, describes the fix, and includes verification details with 16 test cases.
Linked Issues check	✅ Passed	The PR fully addresses issue `#29174` by implementing strict percent-encoding validation that throws URIError for malformed sequences, matching Node.js behavior. Changes include validation logic in BunObject.cpp and comprehensive regression tests.
Out of Scope Changes check	✅ Passed	All changes are directly related to the objective of fixing `#29174`: the C++ implementation adds percent-encoding validation, and the test file validates the fix with 16 regression test cases. No unrelated modifications detected.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/29174.test.ts`:
- Around line 52-60: The tests hardcode POSIX paths; on Windows fileURLToPath
returns Windows-formatted paths, so update the three assertions that call
fileURLToPath to branch by platform: if process.platform === "win32" compute
expected values with path.win32.join("\\", "tmp", ...) (e.g.
path.win32.join("\\", "tmp", " space.txt"), path.win32.join("\\", "tmp",
"a~b.txt"), path.win32.join("\\", "tmp", "~.txt")), otherwise keep the POSIX
literals ("/tmp/..."); apply the same platform branch for the plain.txt test,
and import/use the path module where needed.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 20291d0f-f4df-4767-a5d2-29949d2389c2

📥 Commits

Reviewing files that changed from the base of the PR and between 4b1d889 and 74b524b.

📒 Files selected for processing (2)

src/bun.js/bindings/BunObject.cpp
test/regression/issue/29174.test.ts

Addresses coderabbit review — on Windows, fileURLToPath('/tmp/...') rejects the path as non-absolute. Skip the posix assertions on Windows and mirror them with drive-letter variants under test.if(isWindows).

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/29174.test.ts`:
- Around line 2-6: Remove the multi-line bug-history prose comment block that
follows the single-line issue URL at the top of the test file and leave only the
single-line issue URL comment; in other words, delete the descriptive lines
about Bun's fileURLToPath/WebKit decoding while keeping the one-line issue
reference so the test header follows the standard single-line format used in
test/regression/issue/*.test.ts.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 50d84c21-9624-4f21-a173-8a405170b1de

📥 Commits

Reviewing files that changed from the base of the PR and between 74b524b and 213ed81.

📒 Files selected for processing (1)

test/regression/issue/29174.test.ts

claude

LGTM — small, correct fix with good test coverage.

Extended reasoning...

Overview

Adds a strict percent-encoding validation pass in functionFileURLToPath (BunObject.cpp) before delegating to WTF's lenient fileSystemPath(). Only two files changed: the C++ fix and a regression test.

Security risks

None. The change makes validation stricter, not looser.

Level of scrutiny

Low — the logic is a simple linear scan with short-circuit bounds checking. The condition i + 2 >= length correctly detects trailing % and single-char sequences, and C++ short-circuit evaluation prevents any out-of-bounds access on p[i+1]/p[i+2].

Other factors

No bugs found by the bug hunting system. The fix restores Node.js compatibility for a well-defined edge case, and the 16 test cases cover malformed sequences, valid encoding, and the unchanged encoded-slash error path.

Use explicit unsigned loop counter matching StringView::length()'s return type, and bind the two hex-digit chars to named locals before calling isASCIIHexDigit. The previous size_t loop + templated call through operator[] failed to compile on Windows baseline.

claude

LGTM — simple, correct fix with comprehensive test coverage; the multi-line comment nit is already flagged inline.

Extended reasoning...

Overview

The PR modifies in to reject malformed percent-encoded sequences. It adds a single loop that scans the URL pathname for characters not followed by exactly two ASCII hex digits, throwing before delegating to WTF's lenient decoder. A new regression test () covers 16 cases: malformed inputs, valid encodings, platform variants, and the existing encoded-slash check.

Security risks

None. This is a correctness/compatibility fix that makes Bun stricter — it can only cause previously-silent bad input to throw, which is the desired behavior.

Level of scrutiny

Low. The change is small (one loop, ~10 lines), touches a single well-isolated code path, uses existing helpers (), and has no side effects on the happy path. The boundary check correctly handles all edge cases (lone trailing , single-char remainder).

Other factors

The Windows path assertion issue raised by CodeRabbit was addressed in commit 213ed81 using / . The only remaining open item is the multi-line bug-history prose comment (lines 2–6 of the test file), which is a trivial style nit already posted as an inline comment — it has no functional impact.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/bun.js/bindings/BunObject.cpp`:
- Around line 933-951: The current loop only checks percent-encoding hex digits
but not UTF-8 validity; update the logic in BunObject.cpp (around the
fileURLToPath-related validation that inspects p and uses isASCIIHexDigit) to
either (A) perform UTF-8 sequence validation on the percent-decoded bytes and
throw ERR_INVALID_URI ("URI malformed") for invalid UTF-8 sequences, or (B)
after calling WTF::URL::fileSystemPath(), check its return (e.g., null/empty or
an error indicator) and throw ERR_INVALID_URI if it indicates invalid UTF-8;
locate the code that builds/uses p and the call to WTF::URL::fileSystemPath()
and add the chosen validation/ null-check to ensure inputs like "%E0%A4%A" are
rejected consistently with Node's decodeURIComponent.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 7b65e470-272b-4452-95b1-0b2b0723d508

📥 Commits

Reviewing files that changed from the base of the PR and between 213ed81 and 3303760.

📒 Files selected for processing (1)

src/bun.js/bindings/BunObject.cpp

Node's decodeURIComponent rejects percent sequences that are syntactically valid but decode to invalid UTF-8 (e.g. truncated multi-byte sequences, overlong encodings, lone continuation bytes). WTF's fileSystemPath decoder is lenient on both counts, so decode the pathname bytes ourselves in the same pass that validates the hex syntax and run simdutf::validate_utf8 over the result.

Per repo convention in test/regression/issue/*.test.ts, only the single-line issue URL comment should live at the top of the file.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@test/regression/issue/29174.test.ts`:
- Around line 81-88: Extend the existing test "encoded slash still throws
ERR_INVALID_FILE_URL_PATH (unchanged)" to assert that Windows-encoded
backslashes (%5C and %5c) also throw the same error by calling
fileURLToPath("file:///tmp/%5Chack") and fileURLToPath("file:///tmp/%5chack")
when running on Windows; wrap those two additional expect(...) assertions in a
platform check (process.platform === "win32") so they only run on Windows, and
keep the same expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" })
matcher as the existing assertions.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 169132c8-9f03-4963-b9cd-004a2a7e3e57

📥 Commits

Reviewing files that changed from the base of the PR and between 3303760 and 60792ec.

📒 Files selected for processing (2)

src/bun.js/bindings/BunObject.cpp
test/regression/issue/29174.test.ts

coderabbitai · 2026-04-11T15:59:41Z

+  test("encoded slash still throws ERR_INVALID_FILE_URL_PATH (unchanged)", () => {
+    expect(() => fileURLToPath("file:///tmp/%2Fhack")).toThrow(
+      expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }),
+    );
+    expect(() => fileURLToPath("file:///tmp/%2fhack")).toThrow(
+      expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }),
+    );
+  });


🧹 Nitpick | 🔵 Trivial

Add Windows %5C/%5c encoded-backslash assertions for completeness.

This test currently validates encoded / only. Adding encoded \ checks on Windows would fully lock in the unchanged platform-specific behavior.

Suggested test extension

test("encoded slash still throws ERR_INVALID_FILE_URL_PATH (unchanged)", () => { expect(() => fileURLToPath("file:///tmp/%2Fhack")).toThrow( expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }), ); expect(() => fileURLToPath("file:///tmp/%2fhack")).toThrow( expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }), ); }); + + test.if(isWindows)("encoded backslash still throws ERR_INVALID_FILE_URL_PATH (unchanged)", () => { + expect(() => fileURLToPath("file:///C:/tmp/%5Chack")).toThrow( + expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }), + ); + expect(() => fileURLToPath("file:///C:/tmp/%5chack")).toThrow( + expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }), + ); + }); });

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

test("encoded slash still throws ERR_INVALID_FILE_URL_PATH (unchanged)", () => {

expect(() => fileURLToPath("file:///tmp/%2Fhack")).toThrow(

expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }),

);

expect(() => fileURLToPath("file:///tmp/%2fhack")).toThrow(

expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }),

);

});

test("encoded slash still throws ERR_INVALID_FILE_URL_PATH (unchanged)", () => {

expect(() => fileURLToPath("file:///tmp/%2Fhack")).toThrow(

expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }),

);

expect(() => fileURLToPath("file:///tmp/%2fhack")).toThrow(

expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }),

);

});

test.if(isWindows)("encoded backslash still throws ERR_INVALID_FILE_URL_PATH (unchanged)", () => {

expect(() => fileURLToPath("file:///C:/tmp/%5Chack")).toThrow(

expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }),

);

expect(() => fileURLToPath("file:///C:/tmp/%5chack")).toThrow(

expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }),

);

});

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@test/regression/issue/29174.test.ts` around lines 81 - 88, Extend the existing test "encoded slash still throws ERR_INVALID_FILE_URL_PATH (unchanged)" to assert that Windows-encoded backslashes (%5C and %5c) also throw the same error by calling fileURLToPath("file:///tmp/%5Chack") and fileURLToPath("file:///tmp/%5chack") when running on Windows; wrap those two additional expect(...) assertions in a platform check (process.platform === "win32") so they only run on Windows, and keep the same expect.objectContaining({ code: "ERR_INVALID_FILE_URL_PATH" }) matcher as the existing assertions.

claude

LGTM — all review feedback addressed.

Extended reasoning...

Overview

This PR fixes fileURLToPath in src/bun.js/bindings/BunObject.cpp to reject malformed percent-encodings (invalid hex digits, truncated sequences) and invalid UTF-8 byte sequences, matching Node.js behavior. Changes touch two files: the C++ binding and a new regression test.

Security risks

None. The change adds stricter input validation (rejecting bad input rather than accepting it), which is strictly safer than the prior lenient behavior.

Level of scrutiny

Moderate. The C++ change adds a pathname-scan loop and a simdutf UTF-8 validation call before delegating to WTF::URL::fileSystemPath(). The logic is straightforward and follows the existing error-throwing pattern in the file. No memory safety concerns — Vector<uint8_t, 256> with reserveInitialCapacity is a standard WTF pattern.

Other factors

Three rounds of reviewer feedback were raised and all addressed: (1) Windows path assertions scoped with test.skipIf/test.if (commit 213ed81), (2) UTF-8 validation added via simdutf (commit 47a7aee), (3) multi-line prose comment trimmed to single-line issue URL (commit 60792ec). No bugs were found by the automated bug hunting system. The remaining CodeRabbit comment (add %5C Windows backslash tests) is a trivial enhancement, not a correctness issue.

github-actions bot added the claude label Apr 11, 2026

robobun mentioned this pull request Apr 11, 2026

Inconsistent validation of percent-encoded file URLs compared to Node.js #29174

Open

coderabbitai bot reviewed Apr 11, 2026

View reviewed changes

Comment thread test/regression/issue/29174.test.ts Outdated

test: gate posix path assertions to non-windows

213ed81

Addresses coderabbit review — on Windows, fileURLToPath('/tmp/...') rejects the path as non-absolute. Skip the posix assertions on Windows and mirror them with drive-letter variants under test.if(isWindows).

coderabbitai bot reviewed Apr 11, 2026

View reviewed changes

Comment thread test/regression/issue/29174.test.ts Outdated

claude bot reviewed Apr 11, 2026

View reviewed changes

Comment thread test/regression/issue/29174.test.ts Outdated

coderabbitai bot reviewed Apr 11, 2026

View reviewed changes

Comment thread src/bun.js/bindings/BunObject.cpp

robobun added 2 commits April 11, 2026 15:54

test: trim multi-line prose comment to single issue URL

60792ec

Per repo convention in test/regression/issue/*.test.ts, only the single-line issue URL comment should live at the top of the file.

coderabbitai bot reviewed Apr 11, 2026

View reviewed changes

claude bot reviewed Apr 11, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(url): reject malformed percent-encoding in fileURLToPath#29176

fix(url): reject malformed percent-encoding in fileURLToPath#29176
robobun wants to merge 5 commits intomainfrom
farm/ec839e12/validate-fileurltopath-percent-encoding

robobun commented Apr 11, 2026

Uh oh!

robobun commented Apr 11, 2026 •

edited

Loading

Uh oh!

coderabbitai bot commented Apr 11, 2026 •

edited

Loading

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

claude bot left a comment

Uh oh!

claude bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Apr 11, 2026

Uh oh!

claude bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

robobun commented Apr 11, 2026

Reproduction

Cause

Fix

Verification

Uh oh!

robobun commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot commented Apr 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 11, 2026

Choose a reason for hiding this comment

Uh oh!

claude bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

robobun commented Apr 11, 2026 •

edited

Loading

coderabbitai bot commented Apr 11, 2026 •

edited

Loading