Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplify logic for unindenting doc comments #125396

Closed
wants to merge 1 commit into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
77 changes: 19 additions & 58 deletions compiler/rustc_resolve/src/rustdoc.rs
Original file line number Diff line number Diff line change
Expand Up @@ -84,84 +84,44 @@ pub enum MalformedGenerics {

/// Removes excess indentation on comments in order for the Markdown
/// to be parsed correctly. This is necessary because the convention for
/// writing documentation is to provide a space between the /// or //! marker
/// writing documentation is to provide a space between the `///` or `//!` marker
/// and the doc text, but Markdown is whitespace-sensitive. For example,
/// a block of text with four-space indentation is parsed as a code block,
/// so if we didn't unindent comments, these list items
///
/// ```
/// /// A list:
/// ///
/// /// - Foo
/// /// - Bar
/// # fn foo() {}
/// ```
///
/// would be parsed as if they were in a code block, which is likely not what the user intended.
pub fn unindent_doc_fragments(docs: &mut [DocFragment]) {
// `add` is used in case the most common sugared doc syntax is used ("/// "). The other
// fragments kind's lines are never starting with a whitespace unless they are using some
// markdown formatting requiring it. Therefore, if the doc block have a mix between the two,
// we need to take into account the fact that the minimum indent minus one (to take this
// whitespace into account).
//
// For example:
//
// /// hello!
// #[doc = "another"]
//
// In this case, you want "hello! another" and not "hello! another".
let add = if docs.windows(2).any(|arr| arr[0].kind != arr[1].kind)
&& docs.iter().any(|d| d.kind == DocFragmentKind::SugaredDoc)
{
// In case we have a mix of sugared doc comments and "raw" ones, we want the sugared one to
// "decide" how much the minimum indent will be.
1
} else {
0
};

// `min_indent` is used to know how much whitespaces from the start of each lines must be
// removed. Example:
//
// /// hello!
// #[doc = "another"]
//
// In here, the `min_indent` is 1 (because non-sugared fragment are always counted with minimum
// 1 whitespace), meaning that "hello!" will be considered a codeblock because it starts with 4
// (5 - 1) whitespaces.
let Some(min_indent) = docs
// Only sugared docs have the concept of indentation.
// We assume the docs overall are indented by the amount that the least-indented line is indented.
// Raw docs are taken literally.
let min_indent = docs
.iter()
.map(|fragment| {
fragment
.doc
.as_str()
.lines()
.filter(|line| line.chars().any(|c| !c.is_whitespace()))
.map(|line| {
// Compare against either space or tab, ignoring whether they are
// mixed or not.
let whitespace = line.chars().take_while(|c| *c == ' ' || *c == '\t').count();
whitespace
+ (if fragment.kind == DocFragmentKind::SugaredDoc { 0 } else { add })
})
.min()
.unwrap_or(usize::MAX)
})
.filter(|frag| frag.kind == DocFragmentKind::SugaredDoc)
.flat_map(|frag| frag.doc.as_str().lines())
.filter(|line| line.chars().any(|c| !c.is_whitespace()))
// FIXME: we should be more careful when spaces and tabs are mixed
.map(|line| line.chars().take_while(|c| *c == ' ' || *c == '\t').count())
.min()
else {
return;
};
.unwrap_or(0);

for fragment in docs {
if fragment.doc == kw::Empty {
continue;
}

let indent = if fragment.kind != DocFragmentKind::SugaredDoc && min_indent > 0 {
min_indent - add
} else {
min_indent
fragment.indent = match fragment.kind {
// Raw docs are taken literally.
DocFragmentKind::RawDoc => 0,
DocFragmentKind::SugaredDoc => min_indent,
};

fragment.indent = indent;
}
}

Expand All @@ -171,6 +131,7 @@ pub fn unindent_doc_fragments(docs: &mut [DocFragment]) {
///
/// Note: remove the trailing newline where appropriate
pub fn add_doc_fragment(out: &mut String, frag: &DocFragment) {
debug!("add_doc_fragment: {:?}", frag);
if frag.doc == kw::Empty {
out.push('\n');
return;
Expand Down
28 changes: 14 additions & 14 deletions tests/rustdoc-ui/unescaped_backticks.stderr
Original file line number Diff line number Diff line change
Expand Up @@ -280,11 +280,11 @@ LL | | /// level changes.
| |______________________^
|
= help: the opening backtick of a previous inline code may be missing
change: The Subscriber` may be accessed by calling [`WeakDispatch::upgrade`],
to this: The `Subscriber` may be accessed by calling [`WeakDispatch::upgrade`],
change: The Subscriber` may be accessed by calling [`WeakDispatch::upgrade`],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change seems surprising, no?

Copy link
Member Author

@camelid camelid May 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's strange indeed, especially because (based on the logging I'm looking at), the indent is correctly assessed to be 1. Looking into this more... EDIT: Never mind, that was a different line of code, but still weird.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so here's the reason. It's because this code is wrapped in a macro (which is just the identity function, returning exactly its input). However, this turns the sugared doc attrs into raw doc attrs; thus, the indent is computed as 0. This also makes the lint's suggestion go into the "uglier" path because source_span_for_markdown_range requires all inputs to be sugared doc attrs.

I don't know if there's something we want to change here. It does seem unfortunate that the indentation rules will change if the docs are passed through a macro, although that does seem like a rare occurrence.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you prefer then! r=me if it sounds good to you as is. ;)

to this: The `Subscriber` may be accessed by calling [`WeakDispatch::upgrade`],
= help: if you meant to use a literal backtick, escape it
change: `None`. Otherwise, it will return `Some(Dispatch)`.
to this: `None`. Otherwise, it will return `Some(Dispatch)\`.
change: `None`. Otherwise, it will return `Some(Dispatch)`.
to this: `None`. Otherwise, it will return `Some(Dispatch)\`.

error: unescaped backtick
--> $DIR/unescaped_backticks.rs:323:5
Expand All @@ -300,8 +300,8 @@ LL | | /// level changes.
|
= help: the opening or closing backtick of an inline code may be missing
= help: if you meant to use a literal backtick, escape it
change: or `None` if it isn't.
to this: or `None\` if it isn't.
change: or `None` if it isn't.
to this: or `None\` if it isn't.

error: unescaped backtick
--> $DIR/unescaped_backticks.rs:323:5
Expand All @@ -316,11 +316,11 @@ LL | | /// level changes.
| |______________________^
|
= help: a previous inline code might be longer than expected
change: Called before the filtered [`Layer]'s [`on_event`], to determine if
to this: Called before the filtered [`Layer`]'s [`on_event`], to determine if
change: Called before the filtered [`Layer]'s [`on_event`], to determine if
to this: Called before the filtered [`Layer`]'s [`on_event`], to determine if
= help: if you meant to use a literal backtick, escape it
change: `on_event` should be called.
to this: `on_event\` should be called.
change: `on_event` should be called.
to this: `on_event\` should be called.

error: unescaped backtick
--> $DIR/unescaped_backticks.rs:323:5
Expand All @@ -335,11 +335,11 @@ LL | | /// level changes.
| |______________________^
|
= help: a previous inline code might be longer than expected
change: Therefore, if the `Filter will change the value returned by this
to this: Therefore, if the `Filter` will change the value returned by this
change: Therefore, if the `Filter will change the value returned by this
to this: Therefore, if the `Filter` will change the value returned by this
= help: if you meant to use a literal backtick, escape it
change: [`rebuild_interest_cache`][rebuild] is called after the value of the max
to this: [`rebuild_interest_cache\`][rebuild] is called after the value of the max
change: [`rebuild_interest_cache`][rebuild] is called after the value of the max
to this: [`rebuild_interest_cache\`][rebuild] is called after the value of the max

error: unescaped backtick
--> $DIR/unescaped_backticks.rs:349:56
Expand Down
2 changes: 1 addition & 1 deletion tests/rustdoc/unindent.rs
Original file line number Diff line number Diff line change
Expand Up @@ -40,7 +40,7 @@ pub struct G;
pub struct H;

// @has foo/struct.I.html
// @matches - '//pre[@class="rust rust-example-rendered"]' '(?m)4 whitespaces!\Z'
// @has - '//div[@class="docblock"]/p' '4 whitespaces! something'
/// 4 whitespaces!
#[doc = "something"]
pub struct I;
Expand Down
Loading