Apply formatting to markdown code blocks #22470

Merged
amyreese merged 45 commits into main from amy/ruffen-docs on Jan 28, 2026

@amyreese
Member

@amyreese amyreese commented Jan 9, 2026

Adds initial support for formatting Python code blocks inside Markdown files.

  • Adds Markdown source types/kinds
  • Maps .md file extension to Markdown by default
  • Uses simple regex adapted from blacken-docs to find and format fenced python code blocks
  • Dedents contents before formatting, and reapplies indent from fenced ```py header
  • Selects Python vs Stub options based on language label on code block
  • Silently skips formatting for any code block that has syntax errors or that produces formatting errors
  • CLI tests formatting via both stdin and from filesystem
  • Requires running with --preview, and otherwise emits formatting error when given a markdown file
  • Requires a user to extend-include = ["**/*.md"] if they want to format markdown files by default
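
A rough Python sketch of the approach listed above, for illustration only: find fenced blocks with a regex, dedent the body, format it, then reapply the fence's indent. The pattern is a simplified adaptation in the spirit of blacken-docs, not the regex from this PR. The names `format_markdown`, `toy_format`, and the `is_stub` flag are hypothetical. The toy formatter only swaps quote styles so the example runs without Ruff.

```python
import ast
import re
import textwrap

# Simplified, assumed pattern: an opening fence with a Python-ish label,
# the block body, and a closing fence at the same indent (backreference).
FENCE_RE = re.compile(
    r"(?P<before>^(?P<indent> *)```[ \t]*(?P<lang>python|pyi|py)[ \t]*\n)"
    r"(?P<code>.*?)"
    r"(?P<after>^(?P=indent)```[ \t]*$)",
    re.DOTALL | re.MULTILINE,
)


def toy_format(code, is_stub=False):
    """Stand-in for a real formatter: only normalizes quote style."""
    return code.replace("'", '"')


def format_markdown(source, format_code):
    """Format every fenced Python block in `source`, leaving the rest as-is."""

    def replace(match):
        indent = match["indent"]
        code = textwrap.dedent(match["code"])  # strip the fence's indent
        try:
            ast.parse(code)  # blocks with syntax errors are skipped silently
            formatted = format_code(code, is_stub=match["lang"] == "pyi")
        except Exception:
            return match[0]  # return the block untouched
        # Reapply the indent taken from the opening fence.
        return match["before"] + textwrap.indent(formatted, indent) + match["after"]

    return FENCE_RE.sub(replace, source)
```

Passing the formatter in as a callable keeps the fence handling separate from the formatting itself, which makes the skip-on-error path easy to exercise in isolation.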

Limitations:

  • Returns a formatting error if run with a range of any sort
  • Ignores implicit code blocks (no code fence)
  • Doesn't yet support ~~~ fences, arbitrary fence lengths, or code blocks inside blockquotes

Issue #3792

@astral-sh-bot

astral-sh-bot bot commented Jan 9, 2026

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@amyreese amyreese changed the title from "Ruffen docs prototype" to "WIP: Ruffen docs prototype" Jan 9, 2026
@amyreese
Member Author

amyreese commented Jan 13, 2026

minimal working prototype using regex adapted from blacken-docs:

amethyst@lunatone ~/workspace/ruff amy/ruffen-docs » cat ~/scratch/test.md
Hello, this is a *markdown* document.

This is a rust code block:

```rust
fn main() {
    for x in 0..10 {
        println!("x = {x}");
    }
}
```

This is a poorly formatted python code block:

```py
def foo(arg1,
           arg2):
    print( "hello world")



foo(1 , 2)
```

This is another python code block, also poorly formatted, but now in a list:

1. List item 1
2. List item 2

    ```python
    dataset = [1, 2, 3,
        4, 5, 6]
    if 1+2==3:
        print('yes')
    ```

And here's an unlabeled code block that happens to have valid python code — what do we do?

```
print("hello")
```

amethyst@lunatone ~/workspace/ruff amy/ruffen-docs » cargo run -p ruff -- format --no-cache --diff ~/scratch/test.md
   Compiling ruff v0.14.11 (/Users/amethyst/workspace/ruff/crates/ruff)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 1.66s
     Running `target/debug/ruff format --no-cache --diff /Users/amethyst/scratch/test.md`
--- /Users/amethyst/scratch/test.md
+++ /Users/amethyst/scratch/test.md
@@ -13,13 +13,11 @@
 This is a poorly formatted python code block:

 ```py
-def foo(arg1,
-           arg2):
-    print( "hello world")
+def foo(arg1, arg2):
+    print("hello world")


-
-foo(1 , 2)
+foo(1, 2)
 ```

 This is another python code block, also poorly formatted, but now in a list:
@@ -28,10 +26,9 @@
 2. List item 2

     ```python
-    dataset = [1, 2, 3,
-        4, 5, 6]
-    if 1+2==3:
-        print('yes')
+    dataset = [1, 2, 3, 4, 5, 6]
+    if 1 + 2 == 3:
+        print("yes")
     ```

 And here's an unlabeled code block that happens to have valid python code — what do we do?

1 file would be reformatted
> [1]

@amyreese
Member Author

Some notes on the regex adapted from blacken-docs:

  • it does not support ~~~ delimited code blocks
  • it does not support code blocks delimited by more than three backticks/tildes
  • it does verify that the indentation and the end delimiter match the starting delimiter
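
To make these notes concrete, here is a small Python sketch using an assumed, simplified pattern (not the one in this PR). A backreference to the opening fence's indent is one way a regex can require the closing fence to line up with it. The same pattern rejects `~~~` fences and fences longer than three backticks.

```python
import re

# Assumed, simplified pattern: the closing fence must repeat the opening
# fence's indent via the (?P=indent) backreference.
FENCE_RE = re.compile(
    r"^(?P<indent> *)```py(?:thon)?[ \t]*\n"
    r"(?P<code>.*?)"
    r"^(?P=indent)```[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def is_matched(markdown):
    """Return True if the pattern finds a fenced Python block."""
    return FENCE_RE.search(markdown) is not None


samples = {
    "backticks": "```py\nx = 1\n```\n",
    "tildes": "~~~py\nx = 1\n~~~\n",
    "four_backticks": "````py\nx = 1\n````\n",
    "mismatched_indent": "  ```py\n  x = 1\n```\n",
}

# Only the plain three-backtick block with matching indent is recognized.
results = {name: is_matched(text) for name, text in samples.items()}
```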

Prototype also silently discards any formatting error, and those should either be warnings or get tracked somehow to re-raise them outside of the closure.
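
One possible shape for tracking those errors, sketched in Python rather than in the PR's Rust: the replacement callback appends each failure to a list owned by the caller, which can then warn or re-raise once the substitution is done. The names `format_collecting_errors` and `toy_format` are hypothetical, and the fence pattern is deliberately minimal.

```python
import ast
import re

# Minimal assumed pattern: unindented ```py blocks only.
FENCE_RE = re.compile(r"^```py\n(?P<code>.*?)^```$", re.DOTALL | re.MULTILINE)


def toy_format(code):
    """Stand-in formatter: rejects invalid syntax, then normalizes quotes."""
    ast.parse(code)
    return code.replace("'", '"')


def format_collecting_errors(markdown, format_code):
    """Format each block; return the new text plus any errors encountered."""
    errors = []

    def replace(match):
        try:
            return "```py\n" + format_code(match["code"]) + "```"
        except Exception as exc:
            errors.append(exc)  # record the failure instead of discarding it
            return match[0]  # leave the offending block unchanged

    return FENCE_RE.sub(replace, markdown), errors
```

The caller can then decide whether a non-empty `errors` list becomes warnings or a hard failure, outside the closure.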

@amyreese
Member Author

Also need to decide on if/how to gate this behind --preview

@amyreese amyreese requested review from MichaReiser and ntBre January 20, 2026 21:23
@amyreese amyreese changed the title from "WIP: Ruffen docs prototype" to "Apply formatting to markdown code blocks" Jan 20, 2026
@amyreese amyreese marked this pull request as ready for review January 20, 2026 22:57
@amyreese
Member Author

Not familiar enough with ty to know why this changed what files ty looks at, or how to make ty ignore the md files when loading the project.

@amyreese amyreese requested a review from ntBre January 26, 2026 21:04
Contributor

@ntBre ntBre left a comment

Nice! I found a few nits, but this otherwise looks good to me!


settings.add_filter(&tempdir_filter(project_dir.to_str().unwrap()), "[TMP]/");
settings.add_filter(
&tempdir_filter(Self::crate_root().to_str().unwrap()),
Contributor

Does this need to be tempdir_filtered? I don't think it's a big deal either way, but I wouldn't think this is needed.

Member Author

Contributor

Yep, I saw his comment. I just thought it might have applied only to the crate_root part. I don't think the crate root should be in a temporary directory, so I think we could just leave off the tempdir_filter part. Again, it's okay to leave it since it's a no-op if the filter doesn't match, but the tests should also fail loudly if it was necessary, so I don't think it hurts to try removing the tempdir_filter call.

Member Author

tempdir_filter just transforms the path into a regex string, not sure why "tempdir" is in the name other than the expected use case:

format!(r"{}[\\/]?", regex::escape(path.as_ref()))
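
For readers less familiar with the Rust expression, a Python analogue might look like the sketch below. `path_filter` is a hypothetical name. It escapes the path so it matches literally and allows one optional trailing `/` or `\`, which is why nothing about it is specific to temporary directories.

```python
import re


def path_filter(path):
    """Turn a literal path into a regex with an optional trailing separator."""
    return re.escape(path) + r"[\\/]?"


pattern = path_filter("/tmp/proj.1")

# The matching prefix is replaced; the rest of the path is kept.
replaced = re.sub(pattern, "[TMP]/", "/tmp/proj.1/src/a.py")

# A path that does not match is left alone, so the filter is a no-op there.
untouched = re.sub(pattern, "[TMP]/", "/home/user/src/a.py")
```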

Contributor

Ohhh, I see. I had always seen this used with actual temporary directories, so I thought it was doing something special with /tmp. Sorry for the noise!

Comment on lines +244 to +247
Ok(Some(source_kind)) => match source_kind {
SourceKind::Markdown(_) => return Ok(Diagnostics::default()), // skip linting markdown
_ => source_kind,
},
Contributor

Is this the same as the comment on crates/ruff/resources/test/fixtures/unformatted.md? (Did we open a follow-up issue for that?)

@amyreese amyreese merged commit ffa07b5 into main Jan 28, 2026
49 checks passed
@amyreese amyreese deleted the amy/ruffen-docs branch January 28, 2026 01:26
@amyreese amyreese added the formatter Related to the formatter label Jan 28, 2026
@tvatter
Contributor

tvatter commented Jan 29, 2026

It would be great to also support Quarto markdown (qmd) files, where executability of code chunks is triggered by wrapping the language label in curly braces (```{python} vs ```python):

{python} # this is executable code

python # this isn't

Wondered if this comment had gone unnoticed? Is there a way to include this?

@ntBre
Contributor

ntBre commented Jan 29, 2026

Amy may have different thoughts, but I would expect Quarto formatting support to follow more general Quarto support tracked in #6140 rather than as part of this feature.

@tvatter
Contributor

tvatter commented Jan 29, 2026

Why would that be? It seems that quarto support follows immediately from this PR or am I missing something?

Is it because context is carried from one block to the next? Otherwise, wouldn't modifying this line be enough?

@ntBre
Contributor

ntBre commented Jan 29, 2026

Oh I may have misunderstood how Quarto notebooks are structured. They are just Markdown files where the python in the code block header is in brackets? Yes your updated comment makes sense to me in that case, sorry for the confusion.

For a demonstration of a line plot on a polar axis, see @fig-polar.

```{python}
#| label: fig-polar
#| fig-cap: "A line plot on a polar axis"

import numpy as np
import matplotlib.pyplot as plt

r = np.arange(0, 2, 0.01)
theta = 2 * np.pi * r
fig, ax = plt.subplots(
  subplot_kw = {'projection': 'polar'} 
)
ax.plot(theta, r)
ax.set_rticks([0.5, 1, 1.5, 2])
ax.grid(True)
plt.show()
```

From https://quarto.org/docs/computations/python.html#code-blocks

@tvatter
Contributor

tvatter commented Jan 29, 2026

@ntBre #22947 is what I had in mind. Forgive me if it's really basic, it's just to give an idea.

@amyreese
Member Author

I opened #22951 to track that feature request. I'll either include it as part of #22937 or adapt your PR at some point after that.

amyreese added a commit that referenced this pull request Feb 5, 2026
See [this comment](#22470 (comment)) from #22470. Also related to #6140.

## Summary

Add support for formatting code blocks with curly brace syntax (e.g.,
\`\`\`{python}, \`\`\`{py}, \`\`\`{pyi}) in Markdown files. This syntax
is commonly used in tools like Quarto and R Markdown.

The regex pattern now matches both the standard syntax (\`\`\`python)
and the curly brace variant (\`\`\`{python}).
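
As a hedged illustration of what matching both forms could look like, here is a small Python sketch. The pattern and the names `fence_language` and `source_type` are assumptions for this example, not the code from the commit. Mapping `pyi` to stub options is likewise inferred from the PR description.

```python
import re

# Assumed header pattern: the label may appear bare or wrapped in braces.
HEADER_RE = re.compile(
    r"^(?P<indent> *)```[ \t]*"
    r"(?:\{(?P<braced>python|pyi|py)\}|(?P<plain>python|pyi|py))"
    r"[ \t]*$"
)


def fence_language(header):
    """Return the language label of an opening fence line, or None."""
    match = HEADER_RE.match(header)
    if match is None:
        return None
    return match["braced"] or match["plain"]


def source_type(label):
    """Pick formatter options from the label (assumed mapping)."""
    return "stub" if label == "pyi" else "python"
```

Keeping the braced and bare labels in separate groups means an unbalanced header such as a lone opening brace is rejected rather than half-matched.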

## Test Plan

Added test cases.

Fix #22951

---------

Co-authored-by: Amethyst Reese <amethyst@n7.gg>

Labels

  • formatter: Related to the formatter
  • preview: Related to preview mode features

Development

Successfully merging this pull request may close these issues.

  • Markdown formatting options based on code block language
  • Handle markdown code blocks with syntax errors

6 participants