Skip to content

Desktop: Fixes #13903: Avoid OOM when printing notes with large attachment links#15026

Open
Rygaa wants to merge 12 commits intolaurent22:devfrom
Rygaa:issue13903
Open

Desktop: Fixes #13903: Avoid OOM when printing notes with large attachment links#15026
Rygaa wants to merge 12 commits intolaurent22:devfrom
Rygaa:issue13903

Conversation

@Rygaa
Copy link
Copy Markdown
Collaborator

@Rygaa Rygaa commented Apr 6, 2026

Summary

Printing could crash the renderer with an out-of-memory error when a note contained links to large non-image resources (e.g. archives). Packing HTML for print/export inlined those <a href="..."> targets as base64 data: URLs, which forced the whole file through encoding and blew memory.

Solution

  • Remove anchor-tag inlining from packToString so links keep their original href (no data: URL for attachments). Images, stylesheets, and scripts are unchanged.
  • Normalize CRLF to LF on the packed HTML string so output is consistent regardless of fixture or platform newlines; the integration test builds the expected document with .join('\n') so it stays stable on Windows/Git Bash.

This matches the agreed product trade-off: standalone HTML will no longer embed those files as data: links, but printing no longer OOMs.

Testing

Create a new note or open an existing one.
Attach a large ZIP file to the note.
In Joplin, click File → Print.
Joplin should no longer crash.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough
📝 Walkthrough
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: fixing an out-of-memory crash when printing notes with large attachment links by removing anchor-tag inlining.
Description check ✅ Passed The description is directly related to the changeset, explaining the problem (OOM from inlining large attachment links), the solution (removing anchor-tag inlining, normalising CRLF to LF), and testing steps.
Pr Description Must Follow Guidelines ✅ Passed The pull request description fully adheres to the specified guidelines with proper platform prefix, issue reference, and all three required sections including problem statement, solution description, and test plan.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot added bug It's a bug export labels Apr 6, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/htmlpack/packToString.ts (1)

188-209: ⚠️ Potential issue | 🟠 Major

Restoring raw <a href> here breaks single-file export links after resource cleanup.

With no anchor processing in the open-tag path, packed HTML keeps relative link targets. In packages/lib/services/interop/InteropService_Exporter_Html.ts (Lines 225-234 in the provided snippet), resource directories are removed after writing the packed file, so those links become dead in exported output.

💡 Proposed direction (make anchor inlining configurable per caller)
-import { dirname } from 'path';
+import { basename, dirname } from 'path';

+export type PackOptions = {
+	inlineAnchorsAsDataUri?: boolean;
+};

-const packToString = async (baseDir: string, inputFileText: string, fs: FileApi) => {
+const packToString = async (baseDir: string, inputFileText: string, fs: FileApi, options: PackOptions = { inlineAnchorsAsDataUri: true }) => {

+	const processAnchorTag = async (_name: string, attrs: HtmlAttrs) => {
+		const href = attrValue(attrs, 'href');
+		if (!href) return null;
+		const filePath = `${baseDir}/${href}`;
+		if (!await fs.exists(filePath)) return null;
+		const modAttrs = { ...attrs, href: await readFileDataUriSafe(filePath), download: basename(href) };
+		return `<a ${attributesHtml(modAttrs)}>`;
+	};

 	await parseHtmlAsync(inputFileText, {
 		onopentag: async (name: string, attrs: HtmlAttrs) => {
 			name = name.toLowerCase();
 			let processedResult = '';

+			if (name === 'a' && options.inlineAnchorsAsDataUri) {
+				processedResult = await processAnchorTag(name, attrs);
+			}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/packToString.ts` around lines 188 - 209, The open-tag path
currently skips anchor handling, leaving relative <a href> links intact which
break single-file exports after resource cleanup; restore anchor processing by
treating name === 'a' similarly to 'link'/'script'/'img'—call a dedicated
processor (e.g., processAnchorTag or reuse processLinkTag with an anchor mode)
and push its result into processedResult before falling back to the generic
attribute handling; additionally make anchor inlining configurable by adding a
flag/option to the packToString caller interface and passing that option into
processAnchorTag/processLinkTag so callers (like InteropService_Exporter_Html)
can disable inlining for cases where links must remain raw.
🧹 Nitpick comments (2)
packages/htmlpack/packToString.ts (1)

234-234: Normalise line endings before returning packed HTML.

output.join('') preserves whatever line endings came through parsing. Normalising to LF here keeps output deterministic across platforms and aligns with the PR goal.

🔧 Proposed change
-	return output.join('');
+	return output.join('').replace(/\r\n/g, '\n');
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/packToString.ts` at line 234, The returned string from
packToString preserves platform-specific line endings; normalize them to LF
before returning by replacing CRLF/CR occurrences. In the function packToString
(look for the return output.join('') statement), change the final return to join
the output and run a normalization replace (e.g., replace CRLF and lone CR with
'\n') so the packed HTML is deterministic across platforms.
packages/htmlpack/index.test.ts (1)

15-34: Add a dedicated CRLF normalisation regression test.

This assertion covers anchor behaviour, but it does not verify the stated \r\n\n normalisation. Please add one focused test with CRLF input to lock that in.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/index.test.ts` around lines 15 - 34, Add a new focused test
in packages/htmlpack/index.test.ts that feeds an HTML input containing CRLF line
endings (\r\n) to the htmlpack function and asserts the resulting output file
uses normalized LF (\n) line endings; create a temporary input file with CRLF
content (similar to the existing test patterns), call htmlpack(inputPath,
outputFile), read the output with readFile(outputFile, 'utf8'), and assert that
all \r characters are gone (or that the content equals the expected string with
only \n). Reference the existing test setup variables (outputDirectory,
outputFile pattern) and the htmlpack function to locate where to add this
regression test.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@packages/htmlpack/packToString.ts`:
- Around line 188-209: The open-tag path currently skips anchor handling,
leaving relative <a href> links intact which break single-file exports after
resource cleanup; restore anchor processing by treating name === 'a' similarly
to 'link'/'script'/'img'—call a dedicated processor (e.g., processAnchorTag or
reuse processLinkTag with an anchor mode) and push its result into
processedResult before falling back to the generic attribute handling;
additionally make anchor inlining configurable by adding a flag/option to the
packToString caller interface and passing that option into
processAnchorTag/processLinkTag so callers (like InteropService_Exporter_Html)
can disable inlining for cases where links must remain raw.

---

Nitpick comments:
In `@packages/htmlpack/index.test.ts`:
- Around line 15-34: Add a new focused test in packages/htmlpack/index.test.ts
that feeds an HTML input containing CRLF line endings (\r\n) to the htmlpack
function and asserts the resulting output file uses normalized LF (\n) line
endings; create a temporary input file with CRLF content (similar to the
existing test patterns), call htmlpack(inputPath, outputFile), read the output
with readFile(outputFile, 'utf8'), and assert that all \r characters are gone
(or that the content equals the expected string with only \n). Reference the
existing test setup variables (outputDirectory, outputFile pattern) and the
htmlpack function to locate where to add this regression test.

In `@packages/htmlpack/packToString.ts`:
- Line 234: The returned string from packToString preserves platform-specific
line endings; normalize them to LF before returning by replacing CRLF/CR
occurrences. In the function packToString (look for the return output.join('')
statement), change the final return to join the output and run a normalization
replace (e.g., replace CRLF and lone CR with '\n') so the packed HTML is
deterministic across platforms.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: aa709472-bd44-477c-9d96-667b289c36e6

📥 Commits

Reviewing files that changed from the base of the PR and between cb27561 and 177d4b0.

📒 Files selected for processing (2)
  • packages/htmlpack/index.test.ts
  • packages/htmlpack/packToString.ts

Comment thread packages/htmlpack/packToString.ts Outdated
return `<a ${attributesHtml(modAttrs)}>`;
};

const output: string[] = [];
Copy link
Copy Markdown
Owner

@laurent22 laurent22 Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this is used in other contexts, isn't it? The idea of htmlpack is that all resources are included in the same file - it's used to export notes to HTML files too with all resources.

If that's a problem for PDF files, we can disable it for this, but it should still be enabled for other cases

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmmm it did not crash only in the html directory export but it did crash in html export.

## Crashed

- Print
- Export PDF
- Note export -> HTML File
- Sidebar notebook export -> HTML File
- Programmatic exportFolders -> HTML file target
- Programmatic exportFolders -> HTML directory target

## Worked

- Note export -> HTML Directory
- Sidebar notebook export -> HTML Directory
- Main menu -> Export all -> HTML Directory
- Programmatic exportNotes -> HTML directory target

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we need a more nuanced fix then. The idea was to disable embedding resources for PDF. We haven't discussed disabling it for HTML files and we don't really want this. What are the options?

Copy link
Copy Markdown
Collaborator Author

@Rygaa Rygaa Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The issue is caused by readFile(path, 'base64'). We should keep embedding for HTML file exports, but switch that part to use streaming or chunking instead of loading the entire file into memory at once.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The task took a bit more time than I expected because I first tried to keep output.push while adding streaming, so the change would stay small and safe. But that caused another oom problem, because output.push was still keeping a very large string in memory. For the PDF part, I also had to update app-desktop so it can know if the call is for a normal HTML export or for a PDF export, because these two cases need different behavior.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thanks @Rygaa. There are some CodeRabbit comments to check now

@coderabbitai coderabbitai bot added desktop All desktop platforms performance Performance issues bug It's a bug export and removed bug It's a bug export desktop All desktop platforms performance Performance issues labels Apr 8, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/htmlpack/packToString.ts`:
- Around line 183-186: The current branch overwrites modAttrs.href with an empty
string when fs.exists(filePath) is false, creating dead anchors for valid
external/non-local links; to fix, remove the assignment that sets modAttrs.href
= '' and simply return `<a ${attributesHtml(modAttrs)}>` so the original href is
preserved, or alternatively only clear href for proven-local-relative paths
(i.e., tighten the existence check around filePath before mutating
modAttrs.href) while leaving external (https:, mailto:, #) links unchanged;
update the code around fs.exists, filePath, modAttrs.href and attributesHtml in
packToString.ts accordingly.
- Around line 173-191: The code is still inlining <a> hrefs by calling
fs.streamFileDataUri; instead, for anchors (inside packToString) do not call
fs.streamFileDataUri — keep or set the href attribute to the file path/relative
href and write the opening tag with attributesHtml. Concretely: remove the
delete modAttrs.href + fs.streamFileDataUri + write(...) sequence and instead
set modAttrs.href = href (or filePath if you prefer absolute) and call write(`<a
${attributesHtml(modAttrs)}>`), preserving the download attribute and existence
check logic.
🪄 Autofix (Beta)

❌ Autofix failed (check again to retry)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 78c82963-9a5b-4b28-b310-5e67a484c81a

📥 Commits

Reviewing files that changed from the base of the PR and between 5b8e165 and c2b6272.

📒 Files selected for processing (2)
  • packages/htmlpack/packToString.ts
  • packages/lib/services/interop/InteropService_Exporter_Html.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/lib/services/interop/InteropService_Exporter_Html.ts

Comment thread packages/htmlpack/packToString.ts Outdated
Comment on lines +173 to +191
if (!fs.streamFileDataUri) return null;

const href = attrValue(attrs, 'href');
if (!href) return null;

const filePath = `${baseDir}/${href}`;
if (!await fs.exists(filePath)) return null;

const modAttrs = { ...attrs };
modAttrs.href = await readFileDataUriSafe(filePath);
modAttrs.download = basename(href);
return `<a ${attributesHtml(modAttrs)}>`;
};

const output: string[] = [];
if (!await fs.exists(filePath)) {
modAttrs.href = '';
return `<a ${attributesHtml(modAttrs)}>`;
}

delete modAttrs.href;
await write('<a href="');
await fs.streamFileDataUri(filePath, async (chunk) => { await write(chunk); });
await write(`" ${attributesHtml(modAttrs)}>`);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Anchor href is still embedded as a Data URI

Line 189–Line 191 still converts <a href="..."> into a streamed data URI. That keeps large non-image attachments inlined, which conflicts with this PR’s stated goal to avoid anchor inlining for print/export OOM prevention.

Suggested fix
 const processAnchorTag = async (_name: string, attrs: HtmlAttrs) => {
-	if (!fs.streamFileDataUri) return null;
-
-	const href = attrValue(attrs, 'href');
-	if (!href) return null;
-
-	const filePath = `${baseDir}/${href}`;
-
-	const modAttrs = { ...attrs };
-	modAttrs.download = basename(href);
-
-	if (!await fs.exists(filePath)) {
-		modAttrs.href = '';
-		return `<a ${attributesHtml(modAttrs)}>`;
-	}
-
-	delete modAttrs.href;
-	await write('<a href="');
-	await fs.streamFileDataUri(filePath, async (chunk) => { await write(chunk); });
-	await write(`" ${attributesHtml(modAttrs)}>`);
-	return '';
+	return null;
 };
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/packToString.ts` around lines 173 - 191, The code is still
inlining <a> hrefs by calling fs.streamFileDataUri; instead, for anchors (inside
packToString) do not call fs.streamFileDataUri — keep or set the href attribute
to the file path/relative href and write the opening tag with attributesHtml.
Concretely: remove the delete modAttrs.href + fs.streamFileDataUri + write(...)
sequence and instead set modAttrs.href = href (or filePath if you prefer
absolute) and call write(`<a ${attributesHtml(modAttrs)}>`), preserving the
download attribute and existence check logic.

Comment thread packages/htmlpack/packToString.ts
@coderabbitai coderabbitai bot added desktop All desktop platforms and removed bug It's a bug export labels Apr 8, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/lib/services/interop/InteropService_Exporter_Html.ts (1)

205-248: ⚠️ Potential issue | 🟠 Major

Do not truncate the destination until packing has succeeded.

Line 207 clears this.filePath_ before the streamed pack has completed. If any later read or write fails, the export is left empty or half-written. Please stream into a temporary file and replace the final path only after packToString(...) resolves successfully.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/services/interop/InteropService_Exporter_Html.ts` around lines
205 - 248, The code currently truncates this.filePath_ (writeFile to empty)
before packToString completes, risking a half-written export; change the flow to
write/append to a temporary file (e.g., derive a temp path from this.filePath_),
use the same packToString callback (streaming append) to write to that temp
file, and only after packToString resolves successfully atomically
replace/rename the temp file to this.filePath_ (using shim.fsDriver().rename or
equivalent) and clean up the temp on error; update references to initial
write/truncate (the writeFile call at the start) and the appendFile usage inside
the packToString output callback so they target the temp path instead of
this.filePath_.
♻️ Duplicate comments (1)
packages/htmlpack/packToString.ts (1)

180-185: ⚠️ Potential issue | 🟠 Major

Preserve the original href when the local file lookup misses.

Lines 183-185 turn valid non-local links into dead anchors as soon as streamFileDataUri is enabled. https:, mailto: and #fragment targets should fall back to the unmodified tag when ${baseDir}/${href} is not a local file.

Proposed fix
-		const modAttrs = { ...attrs };
-		modAttrs.download = basename(href);
-
-		if (!await fs.exists(filePath)) {
-			modAttrs.href = '';
-			return `<a ${attributesHtml(modAttrs)}>`;
-		}
+		if (!await fs.exists(filePath)) return null;
+
+		const modAttrs = { ...attrs };
+		modAttrs.download = basename(href);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/packToString.ts` around lines 180 - 185, The code mutates
modAttrs and clears href before verifying the file exists, turning valid
external/non-local links into dead anchors; update packToString so you check
fs.exists(filePath) before changing attributes: if the file does not exist
return the original tag using attributesHtml(attrs) (do not set modAttrs.href =
''), and only after a successful exists check create modAttrs, set
modAttrs.download = basename(href) and any modified href/data-uri values;
reference variables: modAttrs, attrs, href, filePath, attributesHtml.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@packages/lib/services/interop/InteropService_Exporter_Html.ts`:
- Around line 205-248: The code currently truncates this.filePath_ (writeFile to
empty) before packToString completes, risking a half-written export; change the
flow to write/append to a temporary file (e.g., derive a temp path from
this.filePath_), use the same packToString callback (streaming append) to write
to that temp file, and only after packToString resolves successfully atomically
replace/rename the temp file to this.filePath_ (using shim.fsDriver().rename or
equivalent) and clean up the temp on error; update references to initial
write/truncate (the writeFile call at the start) and the appendFile usage inside
the packToString output callback so they target the temp path instead of
this.filePath_.

---

Duplicate comments:
In `@packages/htmlpack/packToString.ts`:
- Around line 180-185: The code mutates modAttrs and clears href before
verifying the file exists, turning valid external/non-local links into dead
anchors; update packToString so you check fs.exists(filePath) before changing
attributes: if the file does not exist return the original tag using
attributesHtml(attrs) (do not set modAttrs.href = ''), and only after a
successful exists check create modAttrs, set modAttrs.download = basename(href)
and any modified href/data-uri values; reference variables: modAttrs, attrs,
href, filePath, attributesHtml.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 1e4a3932-6959-413e-a6b3-66793c64de94

📥 Commits

Reviewing files that changed from the base of the PR and between c2b6272 and d00b6bd.

📒 Files selected for processing (4)
  • packages/app-desktop/InteropServiceHelper.ts
  • packages/htmlpack/index.ts
  • packages/htmlpack/packToString.ts
  • packages/lib/services/interop/InteropService_Exporter_Html.ts
✅ Files skipped from review due to trivial changes (1)
  • packages/app-desktop/InteropServiceHelper.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/htmlpack/index.ts

@Rygaa Rygaa force-pushed the issue13903 branch 2 times, most recently from d72ad83 to d00b6bd Compare April 8, 2026 17:59
@coderabbitai coderabbitai bot added bug It's a bug export performance Performance issues and removed desktop All desktop platforms labels Apr 8, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/htmlpack/packToString.ts (1)

75-120: ⚠️ Potential issue | 🟡 Minor

Normalise packed CSS to LF instead of preserving CRLF.

Line 76 and Line 120 reintroduce \r\n whenever the source CSS uses CRLF. That is the opposite of the stated cross-platform normalisation, so packed HTML will still differ between Windows and Unix inputs.

Suggested change
 	const processCssContent = async (cssBaseDir: string, content: string) => {
-		const newline = content.includes('\r\n') ? '\r\n' : '\n';
 		const o = cssParse(content, {
 			silent: false,
 		});
@@
-		return cssStringify(o).replace(/\n/g, newline);
+		return cssStringify(o).replace(/\r\n/g, '\n');
 	};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/packToString.ts` around lines 75 - 120, The function
processCssContent currently preserves CRLF by detecting newline at the top
(newline = content.includes('\r\n') ? '\r\n' : '\n') and then using that when
returning cssStringify(o).replace(/\n/g, newline); change this to always
normalise to LF: set newline to '\n' (or simply remove the detection) and ensure
the return replaces CRLF (and lone CR if needed) with '\n' (e.g. replace
/\r\n?/g or at minimum /\r\n/g with '\n'), so cssStringify(o) output is
converted to LF regardless of the source; update references in processCssContent
to use this forced LF normalization.
♻️ Duplicate comments (1)
packages/htmlpack/packToString.ts (1)

173-191: ⚠️ Potential issue | 🟠 Major

Do not inline attachment hrefs in packToString.

Line 183–191 still rewrites local attachment links into data: URLs. That does not implement this PR’s stated fix, and packages/htmlpack/index.ts still buffers emitted chunks before the final join, so the HTML-file export path keeps the same large-memory pressure.

Suggested change
 	const modAttrs = { ...attrs };
 	modAttrs.download = basename(href);

-	delete modAttrs.href;
-	await write('<a href="');
-	if (fs.streamFileDataUri) {
-		await fs.streamFileDataUri(filePath, async (chunk) => { await write(chunk); });
-	} else {
-		await write(await readFileDataUriSafe(filePath));
-	}
-	await write(`" ${attributesHtml(modAttrs)}>`);
-	return '';
+	return `<a ${attributesHtml(modAttrs)}>`;
 	};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/packToString.ts` around lines 173 - 191, processAnchorTag
is currently inlining local attachment hrefs as data: URIs (using
fs.streamFileDataUri / readFileDataUriSafe) which defeats the PR's goal;
instead, if the local file exists, keep the original href attribute (do not
delete modAttrs.href), set modAttrs.download = basename(href), and write the
anchor tag using attributesHtml(modAttrs) without streaming or inlining the file
contents; remove calls to fs.streamFileDataUri and readFileDataUriSafe from
processAnchorTag so hrefs remain external links and avoid buffering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@packages/htmlpack/packToString.ts`:
- Around line 75-120: The function processCssContent currently preserves CRLF by
detecting newline at the top (newline = content.includes('\r\n') ? '\r\n' :
'\n') and then using that when returning cssStringify(o).replace(/\n/g,
newline); change this to always normalise to LF: set newline to '\n' (or simply
remove the detection) and ensure the return replaces CRLF (and lone CR if
needed) with '\n' (e.g. replace /\r\n?/g or at minimum /\r\n/g with '\n'), so
cssStringify(o) output is converted to LF regardless of the source; update
references in processCssContent to use this forced LF normalization.

---

Duplicate comments:
In `@packages/htmlpack/packToString.ts`:
- Around line 173-191: processAnchorTag is currently inlining local attachment
hrefs as data: URIs (using fs.streamFileDataUri / readFileDataUriSafe) which
defeats the PR's goal; instead, if the local file exists, keep the original href
attribute (do not delete modAttrs.href), set modAttrs.download = basename(href),
and write the anchor tag using attributesHtml(modAttrs) without streaming or
inlining the file contents; remove calls to fs.streamFileDataUri and
readFileDataUriSafe from processAnchorTag so hrefs remain external links and
avoid buffering.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f7273068-579a-4d12-8f31-a9496215bdfa

📥 Commits

Reviewing files that changed from the base of the PR and between d00b6bd and 75ab722.

📒 Files selected for processing (2)
  • packages/htmlpack/index.test.ts
  • packages/htmlpack/packToString.ts
✅ Files skipped from review due to trivial changes (1)
  • packages/htmlpack/index.test.ts

Comment thread packages/htmlpack/packToString.ts Outdated

// packToString should be able to run in React Native -- don't use fs-extra.
const packToString = async (baseDir: string, inputFileText: string, fs: FileApi) => {
const packToString = async (baseDir: string, inputFileText: string, fs: FileApi, write: FileApiChunkCallback) => {
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're already passing a FileApi to this function - is it not possible to use this to implement the file chunking functions?

Also this function is called packToString, which is no longer what it does. Please update the name

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

const exportOptions = {
customCss: options.customCss ? options.customCss : '',
plugins: options.plugins,
packIntoSingleFile: false,
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should that be here? I see that in InteropService_Exporter_Html.ts, packIntoSingleFile defaults to true but now it seems you make it default to false but from somewhere else. Also why do we need this now that you've implemented the streaming API?

Copy link
Copy Markdown
Collaborator Author

@Rygaa Rygaa Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The confusion is that I originally focused on large zip files, where most PDF viewers do not support opening a zip from a data: link. So I set packIntoSingleFile: false to skip packing for the PDF/print path, without considering that other file types can still be usefully embedded in a PDF. So yes, packIntoSingleFile should be removed so PDF can also embed other file types.

Laurent confirmed the following -> For this, I just want to confirm: most PDF viewers don't really support opening a zip from a data: link, even when the zip is small, right? I want to be sure before arguing for a different fix.

The remaining problem is that we should still ignore zip files in a PDF even with streaming working, having a zip inside a PDF is useless. Should I add logic to skip packing archive types (zip, 7z, ...) for PDF?

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So as discussed before we don't embed anything in the pdf, just images

@coderabbitai coderabbitai bot added desktop All desktop platforms and removed bug It's a bug export labels Apr 14, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
packages/htmlpack/packToString.ts (1)

178-200: ⚠️ Potential issue | 🟠 Major

Keep anchor hrefs intact instead of turning them into data: URLs.

This branch still rewrites local <a> targets into embedded data: URIs and clears unresolved ones. That means large attachment payloads still end up inside the final HTML, so the print/PDF OOM is not really fixed, and missing relative links become dead anchors. The safer behaviour here is to fall back to the normal tag emission and preserve the original href.

Suggested fix
 	const processAnchorTag = async (_name: string, attrs: HtmlAttrs) => {
-		const href = attrValue(attrs, 'href');
-		if (!href || !isLocalHref(href)) return null;
-
-		const filePath = `${baseDir}/${href}`;
-		const modAttrs = { ...attrs };
-		modAttrs.download = basename(href);
-
-		if (!await fs.exists(filePath)) {
-			modAttrs.href = '';
-			await fs.write(`<a ${attributesHtml(modAttrs)}>`);
-			return '';
-		}
-
-		delete modAttrs.href;
-		await fs.write('<a href="');
-		if (fs.streamFileDataUri) {
-			await fs.streamFileDataUri(filePath, async (chunk) => { await fs.write(chunk); });
-		} else {
-			await fs.write(await readFileDataUriSafe(filePath));
-		}
-		await fs.write(`" ${attributesHtml(modAttrs)}>`);
-		return '';
+		return null;
 	};
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/packToString.ts` around lines 178 - 200, The
processAnchorTag function currently rewrites local anchor hrefs into data: URIs
and clears unresolved hrefs; change it to preserve the original href instead: in
processAnchorTag (the function handling <a> tags) stop generating data URIs
(remove/ignore the fs.streamFileDataUri/readFileDataUriSafe branch) and emit the
normal anchor tag with the original href value (keep download basename if
desired but do not delete or blank modAttrs.href), and when the local file is
missing, leave attrs.href unchanged rather than setting it to ''. Ensure the
emitted tag uses attributesHtml(modAttrs) or the original attrs so relative
links remain intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/lib/services/interop/InteropService_Exporter_Html.ts`:
- Around line 247-248: The async write(chunk) sink currently appends chunks
verbatim via shim.fsDriver().appendFile(tempFilePath, chunk, 'utf8'); — modify
it to normalise line endings to LF before writing: convert chunk to a string
(handle Buffer or string), replace all CRLF and lone CR instances with '\n'
(e.g. use a global regex like /\r\n|\r/g), then pass the normalized string to
shim.fsDriver().appendFile; keep tempFilePath and the streaming write behavior
intact.

---

Duplicate comments:
In `@packages/htmlpack/packToString.ts`:
- Around line 178-200: The processAnchorTag function currently rewrites local
anchor hrefs into data: URIs and clears unresolved hrefs; change it to preserve
the original href instead: in processAnchorTag (the function handling <a> tags)
stop generating data URIs (remove/ignore the
fs.streamFileDataUri/readFileDataUriSafe branch) and emit the normal anchor tag
with the original href value (keep download basename if desired but do not
delete or blank modAttrs.href), and when the local file is missing, leave
attrs.href unchanged rather than setting it to ''. Ensure the emitted tag uses
attributesHtml(modAttrs) or the original attrs so relative links remain intact.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 03a43856-14af-422a-8a3a-e3606d6fdf89

📥 Commits

Reviewing files that changed from the base of the PR and between eca8512 and e122c4c.

📒 Files selected for processing (3)
  • packages/htmlpack/index.ts
  • packages/htmlpack/packToString.ts
  • packages/lib/services/interop/InteropService_Exporter_Html.ts
✅ Files skipped from review due to trivial changes (1)
  • packages/htmlpack/index.ts

Comment on lines +247 to +248
async write(chunk) {
await shim.fsDriver().appendFile(tempFilePath, chunk, 'utf8');
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Normalise line endings before appending chunks.

The new sink writes each chunk verbatim, so any \r\n from the rendered HTML survives in the packed file. That keeps the output platform-dependent and misses the LF normalisation called out in this PR. Converting to \n at the write boundary would keep the streaming approach while making the packed output deterministic.

Suggested fix
 						async write(chunk) {
-							await shim.fsDriver().appendFile(tempFilePath, chunk, 'utf8');
+							await shim.fsDriver().appendFile(tempFilePath, chunk.replace(/\r\n/g, '\n'), 'utf8');
 						},
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
async write(chunk) {
await shim.fsDriver().appendFile(tempFilePath, chunk, 'utf8');
async write(chunk) {
await shim.fsDriver().appendFile(tempFilePath, chunk.replace(/\r\n/g, '\n'), 'utf8');
},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/services/interop/InteropService_Exporter_Html.ts` around lines
247 - 248, The async write(chunk) sink currently appends chunks verbatim via
shim.fsDriver().appendFile(tempFilePath, chunk, 'utf8'); — modify it to
normalise line endings to LF before writing: convert chunk to a string (handle
Buffer or string), replace all CRLF and lone CR instances with '\n' (e.g. use a
global regex like /\r\n|\r/g), then pass the normalized string to
shim.fsDriver().appendFile; keep tempFilePath and the streaming write behavior
intact.

@coderabbitai coderabbitai bot added bug It's a bug export and removed desktop All desktop platforms labels Apr 14, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/htmlpack/packToWriter.ts (1)

218-245: ⚠️ Potential issue | 🟠 Major

Record whether the handler emitted a full element.

Line 236 always pushes the tag as though only the opening tag was written. That is not true for processScriptTag, which returns a complete <script>…</script> element, so onclosetag later appends a second </script> on Line 266 and corrupts the output.

Suggested fix
 interface Tag {
 	name: string;
+	emission: 'default' | 'open-only' | 'full';
 }
-			tagStack.push({ name });
+			tagStack.push({
+				name,
+				emission: processedResult === null ? 'default' : processedResult === '' ? 'open-only' : 'full',
+			});
-			if (isSelfClosingTag(name)) return;
+			if (isSelfClosingTag(name) || current.emission === 'full') return;
 			await fs.writeChunk(`</${name}>`);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/htmlpack/packToWriter.ts` around lines 218 - 245, The tag stack
currently assumes every start tag was written as an opening tag, causing
duplicate closers when a handler like processScriptTag returns a full element;
update the logic that handles start tags (the block using
processLinkTag/processScriptTag/processImgTag/processAnchorTag and
tagStack.push) to record whether the handler emitted the complete element (e.g.,
push { name, emittedFull: boolean } instead of just { name }). Determine
emittedFull by checking processedResult: if processedResult is null then
emittedFull=false, if processedResult is a non-empty string that represents a
full element (or if handlers can signal this via a convention), set
emittedFull=true. Then update the corresponding onclosetag handling to skip
writing an extra closing tag when the top-of-stack entry has emittedFull=true.
Ensure all references use the tagStack, processedResult, and the specific
handler names (processScriptTag, processLinkTag, processImgTag,
processAnchorTag) so the change is localized and consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/htmlpack/package.json`:
- Around line 12-14: The package.json export map changed "./packToString" to
"./packToWriter", which is a breaking API change; update the "version" field
from 3.6.0 to 4.0.0 to reflect a semver major bump and ensure the release
notes/changelog mention the breaking change (reference the export entry for
"./packToWriter" and the removed "./packToString"); also verify any package
publishing metadata or CI that reads the version uses the new 4.0.0 value.

In `@packages/htmlpack/packToWriter.ts`:
- Around line 63-64: The streamed output path in packToWriter still emits raw
CRLF sequences; update the chunked write flow to normalize CRLF → LF before
writing by funneling all emits through the existing writeChunk(...) helper
(instead of calling fs.writeChunk(...) directly) and perform normalization
inside writeChunk (or immediately before calling it). Locate packToWriter and
replace direct fs.writeChunk calls with writeChunk(...) so every chunk passes
through the CRLF-to-LF conversion, and ensure the same normalization is applied
where packages/htmlpack/index.ts reassembles chunks (the join logic) so no
Windows \r\n survives in the final packed output.
- Around line 178-200: In processAnchorTag, stop inlining local <a href> targets
as data: URIs and stop blanking missing local links; instead preserve the
original href attribute and only add the download attribute. Concretely: in
processAnchorTag (and around uses of readFileDataUriSafe, fs.streamFileDataUri,
modAttrs, attributesHtml, and fs.writeChunk) remove the logic that deletes
modAttrs.href, writes href="data:..." or sets modAttrs.href = '' when the file
exists or is missing; always keep the original attrs.href (or leave attrs
untouched) and only add modAttrs.download = basename(href) before serializing
with attributesHtml and writeChunk so the output HTML keeps the original local
target even when the file is missing and avoids embedding huge data: URIs into
href.

---

Outside diff comments:
In `@packages/htmlpack/packToWriter.ts`:
- Around line 218-245: The tag stack currently assumes every start tag was
written as an opening tag, causing duplicate closers when a handler like
processScriptTag returns a full element; update the logic that handles start
tags (the block using
processLinkTag/processScriptTag/processImgTag/processAnchorTag and
tagStack.push) to record whether the handler emitted the complete element (e.g.,
push { name, emittedFull: boolean } instead of just { name }). Determine
emittedFull by checking processedResult: if processedResult is null then
emittedFull=false, if processedResult is a non-empty string that represents a
full element (or if handlers can signal this via a convention), set
emittedFull=true. Then update the corresponding onclosetag handling to skip
writing an extra closing tag when the top-of-stack entry has emittedFull=true.
Ensure all references use the tagStack, processedResult, and the specific
handler names (processScriptTag, processLinkTag, processImgTag,
processAnchorTag) so the change is localized and consistent.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 96eecf5c-9385-48b5-86bd-324ddc457e7f

📥 Commits

Reviewing files that changed from the base of the PR and between e122c4c and cb3dc44.

📒 Files selected for processing (4)
  • packages/htmlpack/index.ts
  • packages/htmlpack/packToWriter.ts
  • packages/htmlpack/package.json
  • packages/lib/services/interop/InteropService_Exporter_Html.ts
✅ Files skipped from review due to trivial changes (1)
  • packages/htmlpack/index.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/lib/services/interop/InteropService_Exporter_Html.ts

Comment thread packages/htmlpack/package.json
Comment thread packages/htmlpack/packToWriter.ts
Comment thread packages/htmlpack/packToWriter.ts
@coderabbitai coderabbitai bot added desktop All desktop platforms and removed bug It's a bug export labels Apr 14, 2026
@laurent22
Copy link
Copy Markdown
Owner

@Rygaa please address code rabbit comments

@Rygaa
Copy link
Copy Markdown
Collaborator Author

Rygaa commented Apr 15, 2026

coderabbitai has been addressed. except this one https://github.com/laurent22/joplin/pull/15026#discussion_r3081848541

This problem is not specific to packToWriter. The same issue exists in many other places in the codebase that read and inject CSS or JS files. Fixing it only here would be inconsistent. This is already being tracked as a broader Windows issue in . It needs a dedicated PR that addresses the issue as a whole.

@coderabbitai coderabbitai bot added bug It's a bug export performance Performance issues and removed desktop All desktop platforms labels Apr 15, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
packages/lib/services/interop/InteropService_Exporter_Html.ts (1)

251-252: ⚠️ Potential issue | 🟡 Minor

Normalise line endings before appending streamed chunks.

Line 252 still writes each chunk verbatim, so CRLFs from the rendered HTML survive in the packed file and the output remains platform-dependent.

Suggested fix
 						async writeChunk(chunk) {
-							await shim.fsDriver().appendFile(tempFilePath, chunk, 'utf8');
+							await shim.fsDriver().appendFile(tempFilePath, chunk.replace(/\r\n|\r/g, '\n'), 'utf8');
 						},
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/services/interop/InteropService_Exporter_Html.ts` around lines
251 - 252, The writeChunk implementation in InteropService_Exporter_Html
currently appends each chunk verbatim to tempFilePath; change it to normalise
line endings before writing (e.g., ensure CRLF -> LF) so the packed HTML is
platform-independent. In the writeChunk method, convert incoming chunk to a
UTF-8 string if it might be a Buffer, replace CRLF sequences (/\r\n/g) with '\n'
(and optionally collapse lone '\r' to '\n'), then call
shim.fsDriver().appendFile(tempFilePath, normalizedChunk, 'utf8') to persist the
normalized data.
🧹 Nitpick comments (1)
packages/lib/services/interop/InteropService_Exporter_Html.ts (1)

41-49: Give the exporter options a real type before adding more flags.

shouldEmbedOnlyImages is part of the exporter contract now, but Line 42 still keeps options as any. Please switch this to a concrete options interface and add the flag to the shared interop option type, otherwise this wiring stays easy to break silently during future refactors. As per coding guidelines, "Avoid using any type in TypeScript code; use proper TypeScript types instead".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/lib/services/interop/InteropService_Exporter_Html.ts` around lines
41 - 49, Replace the loose any options with a concrete interface and use it in
the init signature: define an ExporterOptions (or extend the shared interop
options type) that includes customCss?: string, packIntoSingleFile?: boolean,
and shouldEmbedOnlyImages?: boolean, then change public async init(path: string,
options: ExporterOptions = {}) to use that type and update usages of
options.customCss, options.packIntoSingleFile and options.shouldEmbedOnlyImages;
also add the new shouldEmbedOnlyImages field to the shared interop option type
so the exporter wiring cannot break silently (referenced symbols: init,
customCss_, packIntoSingleFile_, shouldEmbedOnlyImages_, metadata()).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@packages/lib/services/interop/InteropService_Exporter_Html.ts`:
- Around line 251-252: The writeChunk implementation in
InteropService_Exporter_Html currently appends each chunk verbatim to
tempFilePath; change it to normalise line endings before writing (e.g., ensure
CRLF -> LF) so the packed HTML is platform-independent. In the writeChunk
method, convert incoming chunk to a UTF-8 string if it might be a Buffer,
replace CRLF sequences (/\r\n/g) with '\n' (and optionally collapse lone '\r' to
'\n'), then call shim.fsDriver().appendFile(tempFilePath, normalizedChunk,
'utf8') to persist the normalized data.

---

Nitpick comments:
In `@packages/lib/services/interop/InteropService_Exporter_Html.ts`:
- Around line 41-49: Replace the loose any options with a concrete interface and
use it in the init signature: define an ExporterOptions (or extend the shared
interop options type) that includes customCss?: string, packIntoSingleFile?:
boolean, and shouldEmbedOnlyImages?: boolean, then change public async
init(path: string, options: ExporterOptions = {}) to use that type and update
usages of options.customCss, options.packIntoSingleFile and
options.shouldEmbedOnlyImages; also add the new shouldEmbedOnlyImages field to
the shared interop option type so the exporter wiring cannot break silently
(referenced symbols: init, customCss_, packIntoSingleFile_,
shouldEmbedOnlyImages_, metadata()).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 95f42f5a-c5ad-4c5e-8457-ab0a83d450bb

📥 Commits

Reviewing files that changed from the base of the PR and between 9854c34 and 8720fef.

📒 Files selected for processing (2)
  • packages/app-desktop/InteropServiceHelper.ts
  • packages/lib/services/interop/InteropService_Exporter_Html.ts

Comment thread packages/app-desktop/InteropServiceHelper.ts
Comment thread packages/htmlpack/package.json
@Rygaa Rygaa requested a review from laurent22 April 15, 2026 09:24
@laurent22 laurent22 added the v3.7 label Apr 19, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug It's a bug export performance Performance issues v3.7

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants