fix(md): code info string, and fmt advancement#9979
Conversation
|
Parser conformance results onjs/262
jsx/babel
markdown/commonmark
symbols/microsoft
ts/babel
ts/microsoft
|
Merging this PR will not alter performance
Comparing Footnotes
|
WalkthroughUpdates Markdown formatter and parser handling for fenced code blocks and related auxiliary tokens. Formatter changes replace verbatim-node emissions with structured Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
crates/biome_markdown_formatter/src/markdown/auxiliary/fenced_code_block.rs (1)
28-30: Tilde-to-backtick normalisation: theoretical edge case.The formatter unconditionally normalises all fence characters to backticks. Per CommonMark, backtick-fenced info strings cannot contain backticks, but tilde-fenced ones can. Normalising a tilde fence with backticks in its info string would produce invalid Markdown.
That said, the parser already validates this (rejects backticks in backtick fence info strings), and no test cases with this scenario exist. If it becomes an issue in practice, adding a test case and handling would be straightforward.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_formatter/src/markdown/auxiliary/fenced_code_block.rs` around lines 28 - 30, The current code in fenced_code_block.rs always normalizes fences to backticks (using longest_fence_char_sequence(node, '`') and building normalized_fence), which can produce invalid Markdown if the original fence was '~' and the info string contains backticks; change the logic to preserve the original fence character when the original fence is '~' and the info string contains '`' (i.e., inspect node to determine the original fence char and the info string, and only normalize to backticks when doing so will not introduce backticks into the info string), otherwise proceed with the existing longest_fence_char_sequence(...) and repeat_n(...) logic to build normalized_fence.crates/biome_markdown_parser/src/lexer/mod.rs (1)
275-280: Drop unreachableCodeInfoStringbranch inWHSarm.The early return in
consume_tokenalready handlesCodeInfoString, so this extra branch cannot run and just adds noise.Suggested tidy-up
} else if matches!(context, MarkdownLexContext::LinkDefinition) { // In link definition context, whitespace separates tokens. // We consume it as textual literal so it's not treated as trivia by the parser. self.consume_link_definition_whitespace() - } else if matches!(context, MarkdownLexContext::CodeInfoString) { - return if current == b'\n' || current == b'\r' { - self.consume_newline() - } else { - self.consume_code_info_string() - }; } else if self.after_newline && matches!(current, b' ' | b'\t') {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/lexer/mod.rs` around lines 275 - 280, The WHS arm in consume_token contains an unreachable branch checking matches!(context, MarkdownLexContext::CodeInfoString) and returning consume_newline() or consume_code_info_string(); remove this conditional block from the WHS match arm (leaving the existing WHS behavior intact) because consume_token already returns early for MarkdownLexContext::CodeInfoString; update any surrounding match/else layout as needed to compile without changing semantics of consume_token, consume_newline, or consume_code_info_string.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/biome_markdown_formatter/src/markdown/auxiliary/fenced_code_block.rs`:
- Around line 28-30: The current code in fenced_code_block.rs always normalizes
fences to backticks (using longest_fence_char_sequence(node, '`') and building
normalized_fence), which can produce invalid Markdown if the original fence was
'~' and the info string contains backticks; change the logic to preserve the
original fence character when the original fence is '~' and the info string
contains '`' (i.e., inspect node to determine the original fence char and the
info string, and only normalize to backticks when doing so will not introduce
backticks into the info string), otherwise proceed with the existing
longest_fence_char_sequence(...) and repeat_n(...) logic to build
normalized_fence.
In `@crates/biome_markdown_parser/src/lexer/mod.rs`:
- Around line 275-280: The WHS arm in consume_token contains an unreachable
branch checking matches!(context, MarkdownLexContext::CodeInfoString) and
returning consume_newline() or consume_code_info_string(); remove this
conditional block from the WHS match arm (leaving the existing WHS behavior
intact) because consume_token already returns early for
MarkdownLexContext::CodeInfoString; update any surrounding match/else layout as
needed to compile without changing semantics of consume_token, consume_newline,
or consume_code_info_string.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: a6def3d3-f782-4091-adf1-90e523703da2
⛔ Files ignored due to path filters (16)
crates/biome_markdown_formatter/tests/specs/markdown/fenced_code_block_info_string.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/multiparser-js/meta-in-code-block.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-104.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-106.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-108.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-110.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-197.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-292.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-297.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-307.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-88.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-90.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-92.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-93.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-94.md.snapis excluded by!**/*.snapand included by**crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-95.md.snapis excluded by!**/*.snapand included by**
📒 Files selected for processing (11)
crates/biome_markdown_formatter/src/markdown/auxiliary/fenced_code_block.rscrates/biome_markdown_formatter/src/markdown/auxiliary/hard_line.rscrates/biome_markdown_formatter/src/markdown/auxiliary/header.rscrates/biome_markdown_formatter/src/markdown/auxiliary/indent_token.rscrates/biome_markdown_formatter/src/markdown/lists/code_name_list.rscrates/biome_markdown_formatter/tests/quick_test.rscrates/biome_markdown_formatter/tests/specs/markdown/fenced_code_block_info_string.mdcrates/biome_markdown_parser/src/lexer/mod.rscrates/biome_markdown_parser/src/parser.rscrates/biome_markdown_parser/src/syntax/fenced_code_block.rscrates/biome_markdown_parser/tests/quick_test.rs
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/biome_markdown_parser/src/lexer/mod.rs (1)
74-75: Tiny docstring polish.The comment reads “strings doesn't have particular meaning”; consider “strings don't have special meaning” for clarity.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/lexer/mod.rs` around lines 74 - 75, Update the doc comment for the enum variant CodeInfoString: replace the phrase "Inside the code block list, where strings doesn't have particular meaning" with grammatically correct wording such as "Inside the code block list, where strings don't have special meaning" in the comment above CodeInfoString in lexer/mod.rs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/biome_markdown_parser/src/lexer/mod.rs`:
- Around line 74-75: Update the doc comment for the enum variant CodeInfoString:
replace the phrase "Inside the code block list, where strings doesn't have
particular meaning" with grammatically correct wording such as "Inside the code
block list, where strings don't have special meaning" in the comment above
CodeInfoString in lexer/mod.rs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: fd0995c6-5825-4c14-8fee-5c05b7d6752b
⛔ Files ignored due to path filters (1)
crates/biome_markdown_formatter/tests/specs/markdown/fenced_code_block.md.snapis excluded by!**/*.snapand included by**
📒 Files selected for processing (1)
crates/biome_markdown_parser/src/lexer/mod.rs
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/biome_markdown_formatter/src/markdown/auxiliary/textual.rs (1)
72-80: Consider using.trim()for brevity.
trim_start().trim_end()is equivalent to.trim()— the latter is slightly more idiomatic.♻️ Suggested simplification
} else if self.print_mode.is_all() { - let trimmed_text = value_token.text().trim_start().trim_end(); + let trimmed_text = value_token.text().trim(); write!(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_formatter/src/markdown/auxiliary/textual.rs` around lines 72 - 80, Replace the verbose .trim_start().trim_end() call with .trim() where the trimmed_text is computed (inside the branch that checks self.print_mode.is_all()), i.e. update the assignment to trimmed_text that currently uses value_token.text().trim_start().trim_end() to use value_token.text().trim(); leave the surrounding write!( ..., format_replaced(&value_token, &text(trimmed_text, value_token.text_trimmed_range().start())) ) logic unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@crates/biome_markdown_formatter/src/markdown/auxiliary/textual.rs`:
- Around line 72-80: Replace the verbose .trim_start().trim_end() call with
.trim() where the trimmed_text is computed (inside the branch that checks
self.print_mode.is_all()), i.e. update the assignment to trimmed_text that
currently uses value_token.text().trim_start().trim_end() to use
value_token.text().trim(); leave the surrounding write!( ...,
format_replaced(&value_token, &text(trimmed_text,
value_token.text_trimmed_range().start())) ) logic unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 35a92a93-b883-4c14-9530-8ca8ffe40b6d
📒 Files selected for processing (10)
crates/biome_markdown_formatter/src/markdown/auxiliary/inline_image.rscrates/biome_markdown_formatter/src/markdown/auxiliary/inline_link.rscrates/biome_markdown_formatter/src/markdown/auxiliary/link_destination.rscrates/biome_markdown_formatter/src/markdown/auxiliary/link_title.rscrates/biome_markdown_formatter/src/markdown/auxiliary/reference_link.rscrates/biome_markdown_formatter/src/markdown/auxiliary/setext_header.rscrates/biome_markdown_formatter/src/markdown/auxiliary/textual.rscrates/biome_markdown_formatter/src/markdown/lists/code_name_list.rscrates/biome_markdown_formatter/src/markdown/lists/inline_item_list.rscrates/biome_markdown_formatter/src/shared.rs
🚧 Files skipped from review as they are similar to previous changes (1)
- crates/biome_markdown_formatter/src/markdown/lists/code_name_list.rs
|
@ematipico, yes the markdown dead code variants can be deleted. As for the backticks, we can add a diagnostic for friendlier feedback. |
Summary
This PR does the following:
quick_test, which was missing from the parser testing infra.[dead_code]). Can we delete them?Test Plan
Added new tests. Some prettier snapshots are now gone
Docs