[ty] Implement docstring rendering to markdown #21550

Gankra · 2025-11-20T22:57:14Z

Summary

This introduces a very bad and naive python-docstring-flavoured-reStructuredText to github-flavor-markdown translator. The main goal is to try to preserve a lot of the formatting and plaintext, progressively enhance the content when we find things we know about, and escape the text when we find things that might get corrupt.

Previously I'd broken this out into rendering each different format, but with this approach you don't really need to?

Test Plan

Lots of snapshot tests, also messed around in some random stdlib modules.

Gankra · 2025-11-20T22:58:23Z

crates/ty_ide/src/docstring.rs

+    let mut output = String::new();
+    let mut first_line = true;
+    let mut block_indent = 0;
+    let mut in_doctest = false;
+    let mut starting_literal = false;
+    let mut in_literal = false;
+    let mut in_any_code = false;
+    for untrimmed_line in docstring.lines() {


This is the worst parser I've written in a long while so if you tell me "Aria for the love of god make a Parser type with methods and enums" I will. Writing it this way was conducive to experimentation and thinking about the format.

astral-sh-bot · 2025-11-20T22:58:54Z

Diagnostic diff on typing conformance tests

No changes detected when running ty on typing conformance tests ✅

Gankra · 2025-11-20T22:59:52Z

I spent a while salivating over the ruff docstring formatter as a basis for this, but I ultimately concluded it would be a nightmare for both codebases to try to share that logic, and it's not that much logic.

astral-sh-bot · 2025-11-20T23:00:55Z

`mypy_primer` results

No ecosystem changes detected ✅

No memory usage changes detected ✅

Gankra · 2025-11-20T23:03:50Z

The main risk of shipping this implementation is that the previous implementation was so wildly conservative that it "couldn't" ever mess up whereas insufficient escaping or getting confused could complete garble things and eat the docs.

On balance though this just looks so much better I think the risk is worth it (and shipping it will get me a lot of bug reports about what to fix :)

MichaReiser · 2025-11-21T07:00:38Z

crates/ty_ide/src/docstring.rs

+    let mut starting_literal = false;
+    let mut in_literal = false;
+    let mut in_any_code = false;
+    for untrimmed_line in docstring.lines() {


You want to use .universal_newlines here to get proper \r support

I need to signpost that better in the comments but this function assumes documentation_trim (in the same file) was already run, so our leading whitespace is only spaces, and our newlines are only \n (and unlike ruff we don't need to care about authentically preserving the flavour of whitespace).

MichaReiser · 2025-11-21T07:01:17Z

crates/ty_ide/src/docstring.rs

+///
+/// This code currently assumes docstrings are reStructuredText with significant
+/// line-breaks and indentation. Some constructs like *italic*, **bold**, and `inline code`
+/// are shared between the two formats. Other things like _italic_ are only markdown


What are the two formats that you're referring to here?

reStructuredText and markdown

Oh, that's not clear to me from the comment

MichaReiser · 2025-11-21T07:03:51Z

crates/ty_ide/src/docstring.rs

+        // Add this line's indentation.
+        // We could subtract the block_indent here but in practice it's uglier
+        for _ in 0..line_indent {
+            // If we're not in a codeblock use non-breaking spaces to preserve the indent


Can you say why this is necessary, given that it gets rendered as markdown?

Yes so the idea here is that docstrings really like using indentation for structure that we do not (care to) parse but we also don't want markdown turning into code-blocks. By replacing leading whitespace with   outside of codeblocks, we preserve the native indent of the docstring and stuff we don't understand still is rendered clearly and pretty.

(Even if I had a perfect parser for this format, the only clear way to render this to markdown, imo, is by rendering these blocks as indented)

Oh, I think I misread it as in a code block. This makes a ton of sense for indents outside code blocks.

I wonder if we should use the unicode code point directly so that we don't rely on the markdown renderer supporting HTML?

MichaReiser · 2025-11-21T07:04:16Z

crates/ty_ide/src/docstring.rs

+
+        // Add this line's indentation.
+        // We could subtract the block_indent here but in practice it's uglier
+        for _ in 0..line_indent {


This won't work for code using tab indentation (which our formatter supports).

(another thing handled by documentation_trim)

MichaReiser · 2025-11-21T07:05:45Z

crates/ty_ide/src/docstring.rs

+///
+/// This is by no means perfect but it handles a lot of common formats well.
+///
+/// This code currently assumes docstrings are reStructuredText with significant


What happens for docstrings using the numpy or google format? will they still be rendered correctly?

MichaReiser · 2025-11-21T07:06:50Z

crates/ty_ide/src/docstring.rs

+
+        // First thing's first, add a newline to start the new line
+        if !first_line {
+            // If we're not in a codeblock, add trailing space to the line to authentically wrap it


I don't think I understand this. Why is it necessary to add trailing whitespace to any line? This seems very undesired?

It's part of the approach I'm taking where I want to preserve the wrapping/indentation of the original docstring.

It prevents e.g. param1 and param2 from being merged together as a single line in:

'param1' -- The first parameter description 'param2' -- The second parameter description This is a continuation of param2 description. 'param3' -- A parameter without type annotation

MichaReiser · 2025-11-21T07:08:26Z

crates/ty_ide/src/docstring.rs

+/// line-breaks and indentation. Some constructs like *italic*, **bold**, and `inline code`
+/// are shared between the two formats. Other things like _italic_ are only markdown
+/// and should be escaped.
+fn render_markdown(docstring: &str) -> String {


Let's add a link to reStructuredText

MichaReiser · 2025-11-21T07:10:30Z

crates/ty_ide/src/docstring.rs

+            in_literal = true;
+            in_any_code = true;
+            block_indent = line_indent;
+            // TODO: should we not be this aggressive? Let it autodetect?


https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-code-block

Defaulting to Python doesn't seem unreasonable but we should respect any override

MichaReiser · 2025-11-21T07:15:26Z

crates/ty_ide/src/docstring.rs

+        }
+
+        // If we're not in a codeblock and we see something that signals a literal block, start one
+        if !in_any_code && let Some(without_lit) = line.strip_suffix("::") {


I'd have to play with restructuredText but the way I read this paragraph

Literal code blocks (ref) are introduced by ending a paragraph with the special marker ::. The literal block must be indented

is that what comes after is only a literal block if there's at least one blank line, and text is indented.

I think your code correctly handles the empty line. But it ends up pushing an empty markdown code block if the next paragraph isn't correctly indented.

Haha yes you caught me this part of the parser is distinctly janky. I'll tweak it to make it Spec Compliant (although I can't help but wonder if being sloppy will be the name of the game since we need to render whatever we find).

MichaReiser · 2025-11-21T07:16:40Z

crates/ty_ide/src/docstring.rs

+    let mut output = String::new();
+    let mut first_line = true;
+    let mut block_indent = 0;
+    let mut in_doctest = false;
+    let mut starting_literal = false;
+    let mut in_literal = false;
+    let mut in_any_code = false;


I'm not sure about mixing the parsing of docstrings with the formatting logic. It seems harder to follow. I'd be inclined to parse the docstring into an intermediate representation and then do a second pass to format it.

Dang, ok if the captain of the performance team wants more intermediate allocations and representations who am I to complain 😄

(I was half expecting to be told to remove some intermediate Strings)

Lol. I'm not too concerned here, given that we don't render markdown strings that frequently and having a more structured representation would ultimately be useful for Ruff too. Although I don't think we should aim to make it reusable for now.

Gankra · 2025-11-21T12:52:34Z

crates/ty_ide/src/docstring.rs

+        if !in_any_code {
+            // This is plain text, apply some escapes for reST that don't map to markdown:
+            // * underscores aren't used for italics/bold in reST (don't mess up __dunders__)
+            output.push_str(&line.replace('_', "\\_"));


Found an in-retrospect-obvious place where this is broken: this mangles inline code __init__ (I kept seeing it without any code markup, so that was my primary concern)

…d-typevar * origin/main: (24 commits) [ty] Remove brittle constraint set reveal tests (#21568) [`ruff`] Catch more dummy variable uses (`RUF052`) (#19799) [ty] Use the same snapshot handling as other tests (#21564) [ty] suppress autocomplete suggestions during variable binding (#21549) Set severity for non-rule diagnostics (#21559) [ty] Add `with_type` convenience to display code (#21563) [ty] Implement docstring rendering to markdown (#21550) [ty] Reduce indentation of `TypeInferenceBuilder::infer_attribute_load` (#21560) Bump 0.14.6 (#21558) [ty] Improve debug messages when imports fail (#21555) [ty] Add support for relative import completions [ty] Refactor detection of import statements for completions [ty] Use dedicated collector for completions [ty] Attach subdiagnostics to `unresolved-import` errors for relative imports as well as absolute imports (#21554) [ty] support PEP 613 type aliases (#21394) [ty] More low-hanging fruit for inlay hint goto-definition (#21548) [ty] implement `TypedDict` structural assignment (#21467) [ty] Add more random TypeDetails and tests (#21546) [ty] Add goto for `Unknown` when it appears in an inlay hint (#21545) [ty] Add type definitions for `Type::SpecialForm`s (#21544) ...

Gankra added 2 commits November 20, 2025 15:51

render signatures

07aefe1

implement docstring rendering to markdown

1f69ffb

Gankra requested review from AlexWaygood, MichaReiser, carljm, dcreager and sharkdp as code owners November 20, 2025 22:57

Gankra commented Nov 20, 2025

View reviewed changes

Gankra added server Related to the LSP server ty Multi-file analysis & type inference labels Nov 20, 2025

Gankra changed the title ~~[ty] implement docstring rendering to markdown~~ [ty] Implement docstring rendering to markdown Nov 20, 2025

MichaReiser reviewed Nov 21, 2025

View reviewed changes

AlexWaygood removed their request for review November 21, 2025 12:48

Gankra commented Nov 21, 2025

View reviewed changes

Gankra added 3 commits November 21, 2025 10:23

Handle inline code in escaping

b301556

fixup

1d727aa

add note

d55043e

Gankra merged commit 6292582 into main Nov 21, 2025
40 checks passed

Gankra deleted the gankra/doc-render branch November 21, 2025 15:47

[ty] Implement docstring rendering to markdown #21550

[ty] Implement docstring rendering to markdown #21550

Uh oh!

Conversation

Gankra commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test Plan

Uh oh!

Choose a reason for hiding this comment

Uh oh!

astral-sh-bot bot commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Diagnostic diff on typing conformance tests

Uh oh!

Gankra commented Nov 20, 2025

Uh oh!

astral-sh-bot bot commented Nov 20, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

mypy_primer results

Uh oh!

Gankra commented Nov 20, 2025

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Gankra Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Gankra Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Gankra commented Nov 20, 2025 •

edited

Loading

astral-sh-bot bot commented Nov 20, 2025 •

edited

Loading

astral-sh-bot bot commented Nov 20, 2025 •

edited

Loading

`mypy_primer` results

Gankra Nov 21, 2025 •

edited

Loading

Gankra Nov 21, 2025 •

edited

Loading