-
Notifications
You must be signed in to change notification settings - Fork 203
fix: Convert relative path to a file in Mardown to its URL on GitHub. #1070
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Shang Wang <[email protected]>
) Signed-off-by: ashors1 <[email protected]> Signed-off-by: Shang Wang <[email protected]>
Signed-off-by: Charlie Truong <[email protected]> Signed-off-by: Shang Wang <[email protected]>
Signed-off-by: Terry Kong <[email protected]> Signed-off-by: Shang Wang <[email protected]>
Signed-off-by: Zhiyu Li <[email protected]> Signed-off-by: Zhiyu Li <[email protected]> Co-authored-by: Yuki Huang <[email protected]> Signed-off-by: Shang Wang <[email protected]>
Signed-off-by: Terry Kong <[email protected]> Signed-off-by: Shang Wang <[email protected]>
Signed-off-by: Shang Wang <[email protected]>
Signed-off-by: Shang Wang <[email protected]>
c08bbde to
6052183
Compare
terrykong
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks @wangshangsam . a much needed QoL improvement. I just had a small comment
Signed-off-by: Shang Wang <[email protected]>
…o/RL into wangshangsam/fix-rel-src-link
Signed-off-by: Shang Wang <[email protected]>
|
Important Review skippedReview was skipped due to path filters ⛔ Files ignored due to path filters (1)
CodeRabbit blocks several paths by default. You can override this behavior by explicitly including those paths in the path filters. For example, including You can disable this status message by setting the WalkthroughAdds a Sphinx Transform that rewrites local Markdown Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor Dev as Developer
participant Sphinx as Sphinx App
participant Transform as _GitHubLinkTransform
participant Git as GitPython (Repo)
participant FS as Filesystem
Dev->>Sphinx: build docs
Sphinx->>Sphinx: setup(app) registers Transform\nand connects include-read handler
Sphinx->>Sphinx: include-read event\n(parent_docname, relative_path, contents)
Sphinx->>Sphinx: _convert_gh_admonitions(...) rewrites admonition lines
Sphinx->>Transform: run Transform.apply()
Transform->>Git: read remotes (prefer origin) and HEAD SHA
Transform->>FS: resolve relative target path and determine file vs dir
Transform->>Transform: build URL: repo_url/{blob|tree}/{sha}/{path}
Transform-->>Sphinx: replace download_reference nodes with GitHub URLs
Sphinx-->>Dev: built docs with rewritten GitHub links
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Pre-merge checks (1 passed, 4 warnings)❌ Failed checks (4 warnings)
✅ Passed checks (1 passed)
Pre-merge checks (1 passed, 4 warnings)❌ Failed checks (4 warnings)
✅ Passed checks (1 passed)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (1)
pyrefly.toml (1)
17-18: Good: narrow the type-check surface by Any-replacing sphinx/docutils.This addresses earlier feedback about avoiding adding Sphinx to dev deps. Consider scoping to files under docs/ only if broader replacement ever hides useful type errors elsewhere.
🧹 Nitpick comments (2)
docs/conf.py (2)
25-33: Optional: lazy-import GitPython.To avoid import-time failures when building tools parse conf.py without docs extras, import GitPython inside the transform. Low impact now that docs deps include GitPython, but it’s safer.
-from pathlib import Path -from typing import Any - -import git +from pathlib import Path +from typing import Any…and inside apply():
- repo = git.Repo(search_parent_directories=True) + import git # lazy import to keep conf.py importable without docs deps + repo = git.Repo(search_parent_directories=True)
118-131: include-read handler signature LGTM; tiny cleanup.Signature matches Sphinx docs; consider moving _github_admonitions to a module-level constant to avoid re-allocating each call.
-_github_admonitions = { +_GITHUB_ADMONITIONS = { @@ - for i, orig_content in enumerate(contents): + for i, orig_content in enumerate(contents): @@ - for admonition_key in _github_admonitions: + for admonition_key in _GITHUB_ADMONITIONS: @@ - "```{" + _github_admonitions[admonition_key] + "}", + "```{" + _GITHUB_ADMONITIONS[admonition_key] + "}",
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (3)
docs/conf.py(3 hunks)pyproject.toml(1 hunks)pyrefly.toml(1 hunks)
🧰 Additional context used
🪛 Ruff (0.12.2)
docs/conf.py
119-119: Unused function argument: app
(ARG001)
119-119: Unused function argument: relative_path
(ARG001)
119-119: Unused function argument: parent_docname
(ARG001)
171-171: Unused method argument: kwargs
(ARG002)
184-184: Consider ["docs", *node["refdoc"].split("/")[:-1]] instead of concatenation
Replace with ["docs", *node["refdoc"].split("/")[:-1]]
(RUF005)
🔇 Additional comments (2)
pyproject.toml (1)
100-116: CI workflows have nouv syncinvocations
The suggestion to add--group docsisn’t applicable because no workflow callsuv sync. If you need docs deps in CI, either add auv sync --group docsstep in the docs build workflow or use an equivalent install command (e.g.,pip install .[docs]).Likely an incorrect or invalid review comment.
docs/conf.py (1)
195-197: Registration looks right; confirm transform timing.Default priority 500 should run after MyST creates download_reference nodes. If you observe missed rewrites, consider adjusting priority or using a post-transform.
Build locally with one page linking a file via a relative path containing “../” to ensure normalization works.
Signed-off-by: Shang Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
♻️ Duplicate comments (3)
docs/conf.py (3)
25-33: Lazy-import GitPython to keep conf.py importable without docs extrasMove the import under apply() (and TYPE_CHECKING for annotations). This avoids failures when building docs in environments without GitPython.
-from pathlib import Path, PosixPath -from typing import Any - -import git +from pathlib import Path +from typing import Any, TYPE_CHECKING + +if TYPE_CHECKING: # avoid runtime import-time dependency + import git # type: ignore
202-205: Avoid iterator mutation and prep for URL-encodingUse findall() with a snapshot; import quote for encoding.
- for node in self.document.traverse(addnodes.download_reference): + from urllib.parse import quote + nodes_to_update = list(self.document.findall(addnodes.download_reference)) + for node in nodes_to_update:
172-189: Robust remote detection; avoid PosixPath for URLs; guard non-GitHub remotes
- 'origin' in repo.remotes is unreliable; prefer attribute/iteration.
- Returning PosixPath for a URL leads to malformed links when joined.
- Skip transform if remote isn’t GitHub.
- @staticmethod - def _get_github_source_url(repo: git.Repo) -> PosixPath: + @staticmethod + def _get_github_source_url(repo: "git.Repo") -> str: @@ - if "origin" in repo.remotes: - url = repo.remotes.origin.url - elif len(repo.remotes) == 1: - url = repo.remotes[0].url - else: - raise ValueError( - "Cannot determine which remote repo on GitHub this local repo is from." - ) + url = None + try: + origin = getattr(repo.remotes, "origin", None) + if origin is not None: + url = origin.url + else: + for r in getattr(repo, "remotes", []): + ru = getattr(r, "url", "") + if "github.com" in ru: + url = ru + break + except Exception: + url = None + if not url or "github.com" not in url: + return "" @@ - return PosixPath(url) + return url
🧹 Nitpick comments (3)
docs/conf.py (3)
190-201: Make git import lazy; narrow failure paths; support ref override
- Import git inside apply(); bail quietly if unavailable.
- Drop **kwargs to match Transform.apply signature.
- Allow DOCS_GITHUB_REF override for versioned builds.
- def apply(self, **kwargs: Any) -> None: # type: ignore[bad-override] - try: - local_repo = git.Repo(search_parent_directories=True) - remote_repo_url = self._get_github_source_url(local_repo) - except Exception: - # Cannot figure out which source url it should be; leave links as-is. - return + def apply(self) -> None: # type: ignore[bad-override] + # Lazy import keeps conf.py importable without docs extras + try: + import git # type: ignore + except Exception: + return + try: + local_repo = git.Repo(search_parent_directories=True) + except Exception: + return + remote_repo_url = self._get_github_source_url(local_repo) + if not remote_repo_url: + return if local_repo.working_tree_dir is None: # If the local repo is a bare repo, the method below won't work. return - wt_dir = local_repo.working_tree_dir + wt_dir = local_repo.working_tree_dir + ref = os.environ.get("DOCS_GITHUB_REF") or local_repo.head.object.hexsha
203-207: Anchor path resolution to the repo root for correctness across confdir/cwdEnsure resolution starts at working tree dir to avoid cwd-dependent behavior.
- md_dir = Path(node["refdoc"]).parent - dst_path = md_dir / Path(node["reftarget"]) + md_dir = Path(node["refdoc"]).parent + dst_path = Path(wt_dir) / md_dir / Path(node["reftarget"]) try: dst_path = dst_path.resolve(strict=True)
234-236: Optional: return Sphinx extension metadataReturning parallel safety hints helps Sphinx optimize builds.
def setup(app: Sphinx) -> None: app.add_transform(_GitHubLinkTransform) app.connect("include-read", _convert_gh_admonitions) + return { + "version": "1.0", + "parallel_read_safe": True, + "parallel_write_safe": True, + }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (2)
docs/conf.py(3 hunks)pyproject.toml(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- pyproject.toml
🧰 Additional context used
🪛 Ruff (0.12.2)
docs/conf.py
119-119: Unused function argument: app
(ARG001)
119-119: Unused function argument: relative_path
(ARG001)
119-119: Unused function argument: parent_docname
(ARG001)
180-182: Avoid specifying long messages outside the exception class
(TRY003)
190-190: Unused method argument: kwargs
(ARG002)
194-194: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (2)
docs/conf.py (2)
210-214: LGTM: blob vs tree detectionCorrectly distinguishes files vs directories for GitHub URLs.
225-231: LGTM: attribute preservationPreserving classes/title avoids styling regressions.
Signed-off-by: Shang Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/conf.py (1)
119-165: Close unterminated GitHub admonitions at EOF and silence unused-arg warnings.Prevents open blocks and clears Ruff ARG001.
-def _convert_gh_admonitions( - app: Sphinx, relative_path: Path, parent_docname: str, contents: list[str] -) -> None: +def _convert_gh_admonitions( + _app: Sphinx, _relative_path: Path, _parent_docname: str, contents: list[str] +) -> None: @@ for i, orig_content in enumerate(contents): orig_line_splits = orig_content.split("\n") replacing = False for j, line in enumerate(orig_line_splits): @@ orig_line_splits[j] = line - # swap line back in original + # If the admonition ran to EOF, close it + if replacing: + orig_line_splits.append("```") + replacing = False + # swap line back in original contents[i] = "\n".join(orig_line_splits)
♻️ Duplicate comments (1)
pyproject.toml (1)
126-127: Polish: fix comment grammar and use canonical package casing.Keep the version floor; adjust wording and casing.
- "gitpython>=3.1.45", # To git-related information + "GitPython>=3.1.45", # For Git operations
🧹 Nitpick comments (1)
docs/conf.py (1)
168-235: Optional: tighten remote selection and messages.
- Consider preferring
originby name (next((r for r in repo.remotes if r.name == "origin"), None)) to avoid reliance onIterableList.__contains__.- The long
ValueErrorstring can be shortened or moved to a custom exception to satisfy TRY003.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (2)
docs/conf.py(3 hunks)pyproject.toml(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
docs/conf.py (1)
docs/autodoc2_docstrings_parser.py (1)
parse(20-27)
🪛 Ruff (0.12.2)
docs/conf.py
120-120: Unused function argument: app
(ARG001)
120-120: Unused function argument: relative_path
(ARG001)
120-120: Unused function argument: parent_docname
(ARG001)
181-183: Avoid specifying long messages outside the exception class
(TRY003)
191-191: Unused method argument: kwargs
(ARG002)
195-195: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (1)
docs/conf.py (1)
237-239: Confirm Sphinx event name.Verify that
include-readis the intended event (vs.source-read). If it’s extension-specific, add a brief comment for maintainers.Would you confirm the event is emitted in your build? If needed, I can propose a small probe.
Signed-off-by: Shang Wang <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
docs/conf.py (2)
25-33: Harden the GitHub link transform: lazy-import GitPython, fix apply signature, anchor paths to srcdir, snapshot iteration, allow ref override, and narrow exceptions.Prevents import-time failures, incorrect path resolution, traversal-mutation issues, and lets CI pin the ref (branch/tag/SHA).
@@ -import urllib.parse +import urllib.parse from pathlib import Path from typing import Any -import git +# GitPython is imported lazily inside _GitHubLinkTransform.apply() from docutils import nodes from docutils.transforms import Transform from sphinx import addnodes from sphinx.application import Sphinx @@ class _GitHubLinkTransform(Transform): """Converting the relative path to a file in a Markdown to the URL of that file on GitHub.""" - default_priority = 500 # type: ignore[bad-override] + default_priority = 500 @staticmethod - def _get_github_source_url(repo: git.Repo) -> str: - # Find out which remote GitHub repo should be the source. - if "origin" in repo.remotes: - url = repo.remotes.origin.url - elif len(repo.remotes) == 1: - url = repo.remotes[0].url - else: - raise ValueError( - "Cannot determine which remote repo on GitHub this local repo is from." - ) + def _get_github_source_url(repo: Any) -> str: + # Prefer a GitHub remote named "origin"; otherwise the first GitHub remote; otherwise single remote. + url = None + for r in getattr(repo, "remotes", []): + if getattr(r, "name", "") == "origin" and "github.com" in getattr(r, "url", ""): + url = r.url + break + if url is None: + for r in getattr(repo, "remotes", []): + if "github.com" in getattr(r, "url", ""): + url = r.url + break + if url is None and len(repo.remotes) == 1: + url = repo.remotes[0].url + if not url: + raise ValueError("No GitHub remote found for this repository.") # Canonicalize the URL. if url.startswith("[email protected]:"): url = url.replace("[email protected]:", "https://github.com/", 1) if url.endswith(".git"): url = url[: -len(".git")] return url - def apply(self, **kwargs: Any) -> None: # type: ignore[bad-override] - try: - local_repo = git.Repo(search_parent_directories=True) - remote_repo_url = self._get_github_source_url(local_repo) - except Exception: - # Cannot figure out which source url it should be; leave links as-is. - return + def apply(self) -> None: + # Lazy import keeps conf.py importable without docs extras + try: + import git # type: ignore + except Exception: + return # GitPython not installed; skip transform + try: + local_repo = git.Repo(search_parent_directories=True) + remote_repo_url = self._get_github_source_url(local_repo) + except (git.InvalidGitRepositoryError, git.NoSuchPathError, ValueError): + return # Not a git checkout or no suitable remote; skip if local_repo.working_tree_dir is None: # If the local repo is a bare repo, the method below won't work. return wt_dir = local_repo.working_tree_dir + # Resolve Sphinx docs srcdir to anchor relative paths correctly + env = getattr(self.document.settings, "env", None) + if not env or not getattr(env, "srcdir", None): + return + srcdir = Path(env.srcdir).resolve() + # Allow CI to override the ref (branch/tag/SHA); default to exact commit for stability + ref = os.environ.get("DOCS_GITHUB_REF", local_repo.head.object.hexsha) - for node in self.document.traverse(addnodes.download_reference): - md_dir = Path(node["refdoc"]).parent - dst_path = md_dir / Path(node["reftarget"]) - try: - dst_path = dst_path.resolve(strict=True) - except OSError: - # If the path doesn't exist or a symlink loop is encountered. - continue + for node in list(self.document.findall(addnodes.download_reference)): + md_abs_dir = (srcdir / Path(node["refdoc"]).parent).resolve() + dst_path = (md_abs_dir / Path(node["reftarget"])).resolve(strict=False) + if not dst_path.exists(): + # If the path doesn't exist, skip + continue if dst_path.is_file(): kind = "blob" elif dst_path.is_dir(): kind = "tree" else: # Cannot figure out what type of thing this path is pointing to. continue refuri = "/".join( ( remote_repo_url.rstrip("/"), kind, - local_repo.head.object.hexsha, - urllib.parse.quote(dst_path.relative_to(wt_dir).as_posix()), + ref, + urllib.parse.quote(dst_path.relative_to(wt_dir).as_posix(), safe="/"), ) ) new_node = nodes.reference(rawsource=node.rawsource, refuri=refuri) # Preserve styling and title if present. if "classes" in node: new_node["classes"] = list(node["classes"]) if "title" in node: new_node["title"] = node["title"] if node.children: new_node += node.children node.replace_self(new_node)Also applies to: 168-235
119-121: Close unterminated GitHub-style admonitions at EOF and silence unused-arg warnings.Prevents open blocks when the admonition runs to end-of-file and clears Ruff ARG001.
-def _convert_gh_admonitions( - app: Sphinx, relative_path: Path, parent_docname: str, contents: list[str] -) -> None: +def _convert_gh_admonitions( + _app: Sphinx, _relative_path: Path, _parent_docname: str, contents: list[str] +) -> None: @@ for i, orig_content in enumerate(contents): orig_line_splits = orig_content.split("\n") replacing = False for j, line in enumerate(orig_line_splits): @@ orig_line_splits[j] = line - # swap line back in original + # Close an admonition that runs to EOF + if replacing: + orig_line_splits.append("```") + replacing = False + # swap line back in original contents[i] = "\n".join(orig_line_splits)Also applies to: 133-166
🧹 Nitpick comments (2)
docs/conf.py (2)
171-172: Remove unnecessary type-ignore on class attribute.
default_priorityoverride is valid; the ignore is noise.- default_priority = 500 # type: ignore[bad-override] + default_priority = 500
181-183: Shorten exception message to satisfy TRY003.Keep the message on a single line or reuse the refactored logic in the main diff.
- raise ValueError( - "Cannot determine which remote repo on GitHub this local repo is from." - ) + raise ValueError("No GitHub remote found for this repository.")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/conf.py(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
docs/conf.py (1)
docs/autodoc2_docstrings_parser.py (1)
parse(20-27)
🪛 Ruff (0.12.2)
docs/conf.py
120-120: Unused function argument: app
(ARG001)
120-120: Unused function argument: relative_path
(ARG001)
120-120: Unused function argument: parent_docname
(ARG001)
181-183: Avoid specifying long messages outside the exception class
(TRY003)
191-191: Unused method argument: kwargs
(ARG002)
195-195: Do not catch blind exception: Exception
(BLE001)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Lint check
…NVIDIA-NeMo#1070) Signed-off-by: Shang Wang <[email protected]> Signed-off-by: ashors1 <[email protected]> Signed-off-by: Charlie Truong <[email protected]> Signed-off-by: Terry Kong <[email protected]> Signed-off-by: Zhiyu Li <[email protected]> Signed-off-by: Zhiyu Li <[email protected]> Co-authored-by: Anna Shors <[email protected]> Co-authored-by: Charlie Truong <[email protected]> Co-authored-by: Terry Kong <[email protected]> Co-authored-by: Zhiyu Li <[email protected]> Co-authored-by: Yuki Huang <[email protected]>
What does this PR do ?
This PR adds a converter (for the Sphinx documentation) that converts "the relative path to a file" in the Markdown to "the URL of the file on GitHub" being displayed on the documentation website.
This PR also fixes minor coding styles issues from a previous PR (#837).
Issues
Fixes #577
Usage
Screencast.from.2025-09-03.06.59.58.PM.webm
Before your PR is "Ready for review"
Pre checks:
Additional Information
Summary by CodeRabbit