hntk03 commented Nov 14, 2025

Purpose

This PR ensures that URLs in search results are properly encoded.
Previously, search result URLs containing characters such as # or ? were not encoded correctly.
This change adds URL encoding in _displayItem, which renders each search result, to fix this issue.
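
To illustrate the problem described above, here is a minimal sketch of how a URL parser treats an unencoded "#" in a path. The base URL and the file name are hypothetical placeholders, not values taken from the related issue.

```javascript
// Placeholder base URL and a hypothetical document path containing "#".
const base = "https://example.invalid/docs/";

// Unencoded: the parser treats everything after "#" as the fragment,
// so the link points at the wrong path.
const rawLink = new URL("c#/index.html", base);
console.log(rawLink.pathname, rawLink.hash);

// Encoded: "%23" stays inside the path, so the link targets the intended file.
const encodedLink = new URL("c%23/index.html", base);
console.log(encodedLink.pathname, encodedLink.hash);
```

The same reasoning applies to "?", which would otherwise start the query string.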

References

}
let linkEl = listItem.appendChild(document.createElement("a"));
linkEl.href = linkUrl + anchor;
const encodedLinkUrl = linkUrl.split("/").map(encodeURIComponent).join("/");
Contributor

Is it possible for linkUrl to contain other components that we would not want to URL-encode? (for example, protocol scheme - http://, or similar?)

Author

Thanks for the comment.

If linkUrl contains something like http://, the colon would be encoded to %3A, so it would become http%3A//.
At least in the examples shown in the related issue, it doesn’t seem to contain any protocol scheme.

I’m checking the behavior of linkUrl, but if you know of any cases where it may contain a protocol scheme, please let me know.
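
As a rough check of the behaviour described in this comment, the sketch below wraps the expression from the diff in a helper. The helper name encodeSegments and the sample inputs are hypothetical and chosen only for illustration.

```javascript
// Same expression as in the diff, wrapped in a hypothetical helper:
// encode each "/"-separated segment and keep the slashes themselves.
const encodeSegments = (linkUrl) =>
  linkUrl.split("/").map(encodeURIComponent).join("/");

// A relative path containing "#" is encoded as intended.
console.log(encodeSegments("c#/index.html")); // c%23/index.html

// A relative path containing "?" is encoded as well.
console.log(encodeSegments("../search?.html")); // ../search%3F.html

// If a scheme were present, its colon would be encoded too.
console.log(encodeSegments("http://example.invalid/a.html")); // http%3A//example.invalid/a.html
```

The last case is the one the reviewer asked about: the approach is only safe while linkUrl never carries a scheme.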

Contributor

Thanks @hntk03 - I think in particular it is the behaviour and values of the content_root setting that we should inspect for this.

Currently I don't think that it could contain a URL scheme. However, perhaps we should still write this part of the code defensively, because this JavaScript code may run for a long time in various documentation contexts.

cc @AA-Turner (in case you can add any more thoughts about the content_root, re: 8e730ae)

Author

Thanks for the comment.

It seems that content_root does not contain a protocol scheme,
but I understand your concern.

To confirm — should we encode only the pathname part of the URL and leave any protocol (if present) untouched?
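
One possible shape for that defensive variant is sketched below. It is not the code from this PR: the function name encodePathOnly and the regular expression are assumptions, and it only recognises a leading "scheme://authority" prefix (not protocol-relative "//host" URLs).

```javascript
// Hypothetical defensive variant: leave a leading "scheme://authority"
// untouched and encode only the path segments that follow it.
function encodePathOnly(linkUrl) {
  // Scheme syntax follows RFC 3986: a letter, then letters, digits, "+", "-", or ".".
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*:\/\/[^/]*)(.*)$/.exec(linkUrl);
  const prefix = match ? match[1] : "";
  const path = match ? match[2] : linkUrl;
  return prefix + path.split("/").map(encodeURIComponent).join("/");
}

// Relative paths behave exactly like the original expression.
console.log(encodePathOnly("../c#/index.html"));

// With a scheme present, only the path part is encoded.
console.log(encodePathOnly("https://example.invalid/c#/index.html"));
```

Like the original expression, this sketch would double-encode input that is already percent-encoded, so it assumes linkUrl arrives unencoded.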

Member

AA-Turner left a comment

I think we could escape at the Python level: #14028 (comment)

We should also add a test here.

A

AA-Turner added the "html search" and "awaiting:response" (Waiting for a response from the author of this issue) labels Nov 25, 2025
Author

hntk03 commented Nov 28, 2025

Thanks for the feedback. I’ll look into the suggested approach.

Development

Successfully merging this pull request may close these issues.

Search results don't link correctly if the url contains a "#"