Commit b0b5a18

Author: Manuel Bucher
🐛 FIX: Correctly encode "&" in Markdown URLs by not HTML-escaping refuri
`escapeHtml` was called on the URL before storing it in the `refuri` attribute of a reference node, converting `&` to `&amp;`. This caused double-escaping when Sphinx's HTML writer later escaped the `&` in `&amp;` to produce `&amp;amp;` in the final `href` attribute, breaking URLs with query parameters. The `refuri` attribute should hold the raw URL; HTML-escaping is the responsibility of the output writer. The other characters `escapeHtml` converts (`<`, `>`, `"`) are already percent-encoded by `normalizeLink` before reaching this point, so removing the call has no other effect.
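
The double-escaping described above can be reproduced with a minimal sketch. It is not the myst-parser or Sphinx code: `store_refuri` and `write_href` are hypothetical stand-ins for the parser step and the HTML writer, and the standard library's `html.escape` is assumed to behave like `escapeHtml` for the `&` character.

```python
from html import escape


def store_refuri(uri: str, pre_escape: bool) -> str:
    """Hypothetical stand-in for the value assigned to ref_node["refuri"]."""
    # pre_escape=True mimics the old behaviour (escaping before storing).
    return escape(uri) if pre_escape else uri


def write_href(refuri: str) -> str:
    """Hypothetical stand-in for an HTML writer that escapes attribute values."""
    return f'<a href="{escape(refuri)}">'


url = "https://example.com?foo=bar&a=1"

# Old behaviour: escaped once when stored, then again by the writer.
before_fix = write_href(store_refuri(url, pre_escape=True))

# New behaviour: the raw URL is stored, so the writer escapes it exactly once.
after_fix = write_href(store_refuri(url, pre_escape=False))

print(before_fix)  # <a href="https://example.com?foo=bar&amp;amp;a=1">
print(after_fix)   # <a href="https://example.com?foo=bar&amp;a=1">
```

The second line of output matches the `href` expected in `test_references.html` in this commit.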
1 parent 9364edb commit b0b5a18

8 files changed

Lines changed: 53 additions & 2 deletions

myst_parser/mdit_to_docutils/base.py

Lines changed: 2 additions & 2 deletions
@@ -31,7 +31,7 @@
 from docutils.utils import Reporter, SystemMessage, new_document
 from docutils.utils.code_analyzer import Lexer, LexerError, NumberLines
 from markdown_it import MarkdownIt
-from markdown_it.common.utils import escapeHtml
+
 from markdown_it.renderer import RendererProtocol
 from markdown_it.token import Token
 from markdown_it.tree import SyntaxTreeNode
@@ -954,7 +954,7 @@ def render_link_url(
         if "classes" in conversion:
             ref_node["classes"].extend(conversion["classes"])
 
-        ref_node["refuri"] = escapeHtml(uri)
+        ref_node["refuri"] = uri
         if implicit_text is not None:
             with self.current_node_context(ref_node, append=True):
                 self.current_node.append(nodes.Text(implicit_text))

tests/test_renderers/fixtures/docutil_link_resolution.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,7 @@
 [alt2](https://www.google.com)
 [](https://www.google.com)
 <https://www.google.com>
+<https://example.com?foo=bar&a=1>
 .
 <document source="<src>/index.md">
     <paragraph>
@@ -13,6 +14,9 @@
 
         <reference refuri="https://www.google.com">
             https://www.google.com
+
+        <reference refuri="https://example.com?foo=bar&a=1">
+            https://example.com?foo=bar&a=1
 .
 
 [missing]

tests/test_renderers/fixtures/docutil_syntax_elements.md

Lines changed: 30 additions & 0 deletions
@@ -747,3 +747,33 @@ a = 1
     <paragraph>
         <reference refname="target">
 .
+
+URL with ampersand in query string
+.
+[link](https://example.com/search?q=foo&bar=baz)
+.
+<document source="notset">
+    <paragraph>
+        <reference refuri="https://example.com/search?q=foo&bar=baz">
+            link
+.
+
+URL with angle brackets (percent-encoded by normalizeLink, not HTML-escaped)
+.
+[link](https://example.com/path<with>brackets)
+.
+<document source="notset">
+    <paragraph>
+        <reference refuri="https://example.com/path%3Cwith%3Ebrackets">
+            link
+.
+
+URL with double quotes (percent-encoded by normalizeLink, not HTML-escaped)
+.
+[link](https://example.com/path"with"quotes)
+.
+<document source="notset">
+    <paragraph>
+        <reference refuri="https://example.com/path%22with%22quotes">
+            link
+.

tests/test_renderers/fixtures/sphinx_link_resolution.md

Lines changed: 4 additions & 0 deletions
@@ -3,6 +3,7 @@
 [alt2](https://www.google.com)
 [](https://www.google.com)
 <https://www.google.com>
+<https://example.com?foo=bar&a=1>
 .
 <document source="<src>/index.md">
     <paragraph>
@@ -13,6 +14,9 @@
 
         <reference refuri="https://www.google.com">
             https://www.google.com
+
+        <reference refuri="https://example.com?foo=bar&a=1">
+            https://example.com?foo=bar&a=1
 .
 
 [missing]

tests/test_sphinx/sourcedirs/references/index.md

Lines changed: 2 additions & 0 deletions
@@ -8,6 +8,8 @@
 
 [nested *syntax*](https://example.com)
 
+[query params](https://example.com?foo=bar&a=1)
+
 [](title)
 
 [plain text](title)

tests/test_sphinx/test_sphinx_builds/test_references.html

Lines changed: 5 additions & 0 deletions
@@ -33,6 +33,11 @@ <h1>
   </em>
  </a>
 </p>
+<p>
+ <a class="reference external" href="https://example.com?foo=bar&amp;a=1">
+  query params
+ </a>
+</p>
 <p>
  <a class="reference internal" href="#title">
   <span class="std std-ref">

tests/test_sphinx/test_sphinx_builds/test_references.resolved.xml

Lines changed: 3 additions & 0 deletions
@@ -18,6 +18,9 @@
                 nested
                 <emphasis>
                     syntax
+        <paragraph>
+            <reference refuri="https://example.com?foo=bar&a=1">
+                query params
         <paragraph>
             <reference internal="True" refid="title">
                 <inline classes="std std-ref">

tests/test_sphinx/test_sphinx_builds/test_references.xml

Lines changed: 3 additions & 0 deletions
@@ -18,6 +18,9 @@
                 nested
                 <emphasis>
                     syntax
+        <paragraph>
+            <reference refuri="https://example.com?foo=bar&a=1">
+                query params
         <paragraph>
             <pending_xref refdoc="index" refdomain="True" refexplicit="False" reftarget="title" reftype="myst">
                 <inline classes="xref myst">
