refactor(markdown): replace skipped trivia with formatter-safe recovery by jfmcdowell · Pull Request #9746 · biomejs/biome

jfmcdowell · 2026-03-31T10:03:21Z

Note

This PR was created with AI assistance (Claude Code).

Summary

Replace Skipped trivia in Markdown parsing with formatter-safe representations. Normal parsing paths now use Whitespace trivia for structural whitespace, and quote/list depth-overflow recovery now wraps content in MdBogusBlock instead of attaching skipped trivia to normal nodes.

Test Plan

just test-crate biome_markdown_parser — all passed
just test-crate biome_markdown_formatter — all passed
just test-markdown-conformance — 652/652
just f
just l

Docs

N/A.

changeset-bot · 2026-03-31T10:03:25Z

⚠️ No Changeset found

Latest commit: 66269f9

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


codspeed-hq · 2026-03-31T10:10:17Z

Merging this PR will not alter performance

✅ 58 untouched benchmarks
⏩ 196 skipped benchmarks¹

Comparing jfmcdowell:refactor/markdown-remove-skipped-trivia (66269f9) with main (8837bf3)

¹ 196 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

coderabbitai · 2026-03-31T10:32:09Z

Walkthrough

Added MarkdownLexContext::HeadingContent plus a force_relex_heading_content path so heading text is lexed without treating trailing spaces as hard line breaks. Introduced consume_as_whitespace_trivia() and a BumpWithContext::skip_as_trivia_of_kind_with_context hook to classify leading/trailing indentation and blank-line spaces as Whitespace trivia instead of Skipped. Introduced an MdBogusBlock grammar node, wired its CST formatter, and updated parser recovery paths (lists, quotes, headers) to emit bogus blocks or attach whitespace as trivia. Tests for heading trailing-space behaviour were added.
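
The heading-content lexing described in the walkthrough can be pictured with a toy lexer. This is a minimal sketch and not Biome's lexer: `LexContext`, `Token`, and `lex_line` are hypothetical names, and it assumes the CommonMark rule that two or more trailing spaces before a newline form a hard line break.

```rust
/// Hypothetical stand-in for the lexing context; not Biome's real enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LexContext {
    Regular,
    HeadingContent,
}

/// Hypothetical, simplified token set for one line of input.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Token {
    Text(String),
    Whitespace(String),
    HardLineBreak(String),
    Newline,
}

/// Lex a single line. In `Regular` context, two or more trailing spaces plus
/// the newline become one hard-line-break token. In `HeadingContent` context
/// the same characters are split into a whitespace token and a newline token.
fn lex_line(line: &str, ctx: LexContext) -> Vec<Token> {
    let body = line.strip_suffix('\n').unwrap_or(line);
    let has_newline = body.len() != line.len();
    let text = body.trim_end_matches(' ');
    let trailing = &body[text.len()..];

    let mut tokens = Vec::new();
    if !text.is_empty() {
        tokens.push(Token::Text(text.to_string()));
    }
    match ctx {
        LexContext::Regular if has_newline && trailing.len() >= 2 => {
            tokens.push(Token::HardLineBreak(format!("{trailing}\n")));
        }
        _ => {
            if !trailing.is_empty() {
                tokens.push(Token::Whitespace(trailing.to_string()));
            }
            if has_newline {
                tokens.push(Token::Newline);
            }
        }
    }
    tokens
}
```

With the tokens split this way, a parser could attach only the whitespace token as trivia and keep the newline as a real token, which is the behaviour the walkthrough attributes to the heading path.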

Possibly related PRs

fix(markdown): preserve nested list indent tokens #9717 — closely overlaps list parsing changes and the treatment of MD_TEXTUAL_LITERAL whitespace-only tokens.
refactor(markdown-parser): promote remaining skipped trivia to explicit CST nodes #9427 — related changes to promoting skipped whitespace into explicit trivia pieces in the markdown parser.
feat(md/fmt): links, codeblocks #9699 — related formatter work adding/adjusting handling for MdBogusBlock in generated bindings and formatting code.

Suggested labels

A-Formatter, A-Tooling

Suggested reviewers

dyc3
ematipico

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description check	✅ Passed	The PR description clearly relates to the changeset, explaining the refactoring of Skipped trivia handling in Markdown parsing with specific technical details about the changes.
Title check	✅ Passed	The title accurately summarises the main refactoring: replacing skipped trivia with formatter-safe recovery mechanisms throughout the markdown parser.


coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/biome_markdown_parser/src/parser.rs (1)
396-445: ⚠️ Potential issue | 🟠 Major

Please add parser fixtures for the new whitespace-trivia path.

This flips normal parse paths from Skipped to Whitespace, but the diff ships no parser cases to pin the new CST/trivia shape. I’d want at least "# h  \n", "# h   \n", "# h ##  \n", and one blank-line/list continuation case in the same PR; these are exactly the sort of gremlins that regress quietly.

As per coding guidelines, "All code changes MUST include appropriate tests: lint rules require snapshot tests in 'tests/specs/{group}/{rule}/', formatters require snapshot tests with valid/invalid cases, parsers require test files covering valid and error cases, and bug fixes require tests that reproduce and validate the fix."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/parser.rs` around lines 396 - 445, Add
parser snapshot tests that exercise the new whitespace-trivia path so we pin the
CST/trivia shape produced by skip_line_indent and consume_as_whitespace_trivia:
create tests that parse the input lines "# h  \n", "# h   \n", "# h ##  \n", and
a blank-line/list-continuation case, then assert the produced CST/trivia
contains TriviaPieceKind::Whitespace (not Skipped) attached in the Regular
MarkdownLexContext; run/update snapshots to capture the new shape and ensure the
parser fixtures fail if skip_line_indent stops consuming or attaches Skipped
trivia instead.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/lexer/mod.rs`:
- Around line 271-278: The HeadingContent case still collapses 3+ trailing
spaces because consume_textual() bails on is_potential_hard_line_break() even
inside headings; update the mid-token hard-break guard in consume_textual() to
skip that bail when the current context is MarkdownLexContext::HeadingContent
(i.e., only treat is_potential_hard_line_break() as a hard-break bail-out when
context != MarkdownLexContext::HeadingContent), so that headings will consume
spaces normally; reference consume_textual(), is_potential_hard_line_break(),
and MarkdownLexContext::HeadingContent to locate and change the conditional
logic accordingly.
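
The guard suggested in the inline comment can be sketched as follows. This is an illustration under assumptions, not the code in `lexer/mod.rs`: the functions below reuse the names from the comment, but their bodies and signatures are hypothetical, and "potential hard line break" is taken here to mean two or more spaces followed by a newline.

```rust
/// Hypothetical stand-in for the lexing context; not Biome's real enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum LexContext {
    Regular,
    HeadingContent,
}

/// Assumed definition: true when `rest` starts with two or more spaces
/// immediately followed by a newline.
fn is_potential_hard_line_break(rest: &str) -> bool {
    let spaces = rest.len() - rest.trim_start_matches(' ').len();
    spaces >= 2 && rest[spaces..].starts_with('\n')
}

/// Return the byte length of the textual run at the start of `input`.
/// Outside headings the run stops before a potential hard line break;
/// inside `HeadingContent` the spaces are consumed as ordinary text.
fn consume_textual(input: &str, ctx: LexContext) -> usize {
    for (idx, ch) in input.char_indices() {
        if ch == '\n' {
            return idx;
        }
        if ctx != LexContext::HeadingContent && is_potential_hard_line_break(&input[idx..]) {
            return idx;
        }
    }
    input.len()
}
```

In this toy version, `consume_textual("h   \n", LexContext::Regular)` stops after `h`, while the same input in `HeadingContent` runs up to the newline.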


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: bb9dbab5-3f2c-4065-bc58-18c018ba635a

📥 Commits

Reviewing files that changed from the base of the PR and between 8837bf3 and 00430d0.

⛔ Files ignored due to path filters (7)

crates/biome_markdown_parser/tests/md_test_suite/ok/atx_heading_trailing_hash.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_in_list.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/header.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/html_block_in_list.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/list_continuation_edge_cases.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/list_indentation.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/setext_heading_edge_cases.md.snap is excluded by !**/*.snap and included by **

📒 Files selected for processing (7)

crates/biome_markdown_parser/src/lexer/mod.rs
crates/biome_markdown_parser/src/parser.rs
crates/biome_markdown_parser/src/syntax/header.rs
crates/biome_markdown_parser/src/syntax/list.rs
crates/biome_markdown_parser/src/syntax/mod.rs
crates/biome_markdown_parser/src/token_source.rs
crates/biome_parser/src/token_source.rs

Refs biomejs#9742.

`Skipped` trivia now only appears in genuine error-recovery paths (quote/list nesting depth limits). Spec-driven structural whitespace now uses `Whitespace` trivia instead.

Add `skip_as_trivia_of_kind_with_context` to `BumpWithContext` in `biome_parser` so parsers can consume tokens as a chosen trivia kind. Markdown uses this to emit `Whitespace` trivia for structural whitespace the CommonMark spec says to strip.

Add `MarkdownLexContext::HeadingContent` to split `MD_HARD_LINE_LITERAL` (trailing spaces + newline) into separate tokens in heading parsing. This lets the parser consume only the whitespace as trivia while keeping the newline visible as a real token, avoiding syntax-significant trivia.

Converted normal-path sites from `Skipped` to `Whitespace`:
- `parser.rs`: `skip_line_indent`
- `header.rs`: whitespace before/after trailing `#`
- `header.rs`: trailing spaces in heading content via re-lexing
- `mod.rs`: blank-line whitespace and post-hard-break spaces
- `list.rs`: blank-line whitespace in lists

Also convert `list.rs:skip_list_marker_indent` from `Skipped` to `Whitespace`; this helper is only used in nesting-depth recovery paths.

Retain `Skipped` only for recovery:
- `quote.rs`: quote nesting depth exceeded
- `list.rs`: list nesting depth exceeded

Formatter-facing CST structure for quote prefixes, list prefixes, and header indent/hash nodes was already explicit and did not need grammar changes.
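
To make the "consume tokens as a chosen trivia kind" idea concrete, here is a small model of a token source. It is a sketch under assumptions: `ToySource`, `Trivia`, and the method signatures are hypothetical and only echo the names in the commit message; the real `BumpWithContext` hook in `biome_parser` may look quite different.

```rust
/// Hypothetical, simplified trivia kinds; not Biome's real type.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TriviaKind {
    Whitespace,
    Skipped,
}

#[derive(Debug, Clone, PartialEq, Eq)]
struct Trivia {
    kind: TriviaKind,
    text: String,
}

/// Toy token source: raw token texts, a cursor, and the trivia collected so far.
struct ToySource {
    tokens: Vec<String>,
    pos: usize,
    trivia: Vec<Trivia>,
}

impl ToySource {
    fn new(tokens: &[&str]) -> Self {
        Self {
            tokens: tokens.iter().map(|t| t.to_string()).collect(),
            pos: 0,
            trivia: Vec::new(),
        }
    }

    /// Text of the current token, if any.
    fn current(&self) -> Option<&str> {
        self.tokens.get(self.pos).map(String::as_str)
    }

    /// Consume the current token as trivia of the caller's chosen kind.
    fn skip_as_trivia_of_kind(&mut self, kind: TriviaKind) {
        if let Some(text) = self.tokens.get(self.pos).cloned() {
            self.trivia.push(Trivia { kind, text });
            self.pos += 1;
        }
    }

    /// Consume a whitespace-only token as `Whitespace` trivia.
    /// Returns false and leaves the cursor alone for any other token.
    fn consume_as_whitespace_trivia(&mut self) -> bool {
        let is_whitespace = self
            .current()
            .map_or(false, |t| !t.is_empty() && t.chars().all(|c| c == ' ' || c == '\t'));
        if is_whitespace {
            self.skip_as_trivia_of_kind(TriviaKind::Whitespace);
        }
        is_whitespace
    }
}
```

The point of letting the caller pick the kind is that structural whitespace ends up labelled `Whitespace`, so later passes no longer see it as `Skipped` source text on a normal node.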

Add MdBogusBlock to the Markdown grammar as a valid AnyMdBlock child. Use it in quote and list depth-overflow recovery paths to wrap the consumed marker tokens instead of routing them through parse_as_skipped_trivia_tokens.

Previously, recovery tokens (e.g. an over-nested `>` marker) were attached as Skipped trivia on the next normal paragraph content. This made them visible to the formatter on normal nodes. Now they are contained inside MdBogusBlock, which the formatter skips.

With this change, zero Skipped trivia remains anywhere in the Markdown parser, neither on normal paths nor on recovery paths.

Remove the now-unused skip_optional_marker_space function.
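
The depth-overflow recovery can be illustrated with a toy quote parser. This is a sketch under assumptions, not the code in `quote.rs`: `Node`, `parse_quote_line`, and the `max_depth` parameter are hypothetical, and `Node::BogusBlock` merely stands in for `MdBogusBlock`.

```rust
/// Hypothetical, simplified block tree; not Biome's CST.
#[derive(Debug, PartialEq, Eq)]
enum Node {
    Quote(Vec<Node>),
    Paragraph(String),
    /// Stand-in for MdBogusBlock: holds marker tokens consumed during recovery.
    BogusBlock(Vec<String>),
}

/// Parse one line of nested `>` markers. Markers up to `max_depth` become
/// nested quotes; any markers beyond the limit are wrapped in a bogus block
/// instead of being attached to the paragraph that follows.
fn parse_quote_line(line: &str, max_depth: usize) -> Vec<Node> {
    let mut rest = line;
    let mut markers = 0usize;
    while let Some(after) = rest.strip_prefix('>') {
        markers += 1;
        // A single optional space may follow each marker.
        rest = after.strip_prefix(' ').unwrap_or(after);
    }

    let depth = markers.min(max_depth);
    let overflow = markers - depth;

    let mut blocks = Vec::new();
    if overflow > 0 {
        blocks.push(Node::BogusBlock(vec![">".to_string(); overflow]));
    }
    if !rest.is_empty() {
        blocks.push(Node::Paragraph(rest.to_string()));
    }
    // Wrap the innermost blocks in one quote per accepted marker.
    for _ in 0..depth {
        blocks = vec![Node::Quote(blocks)];
    }
    blocks
}
```

For the input `> > > deep` with a limit of 2, the extra marker sits in its own bogus node next to the paragraph, so the paragraph itself carries no recovery residue.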

coderabbitai

🧹 Nitpick comments (1)

crates/biome_markdown_parser/src/syntax/quote.rs (1)

89-93: Reuse try_bump_quote_marker in this recovery branch.

This branch duplicates marker-consumption logic already centralised in try_bump_quote_marker, which can drift over time.

♻️ Small refactor

-        if p.at(T![>]) {
-            p.bump(T![>]);
-        } else if p.at(MD_TEXTUAL_LITERAL) && p.cur_text() == ">" {
-            p.bump_remap(T![>]);
-        }
+        let _ = try_bump_quote_marker(p);

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/quote.rs` around lines 89 - 93, The
recovery branch duplicates the quote-marker consumption logic; replace the
conditional that checks p.at(T![>]) / p.at(MD_TEXTUAL_LITERAL) and calls p.bump
or p.bump_remap with a single call to the existing helper try_bump_quote_marker
so the centralized logic in try_bump_quote_marker is reused; locate the recovery
branch in quote.rs and call try_bump_quote_marker(p) (or the appropriate
receiver) instead of duplicating the two p.at/... branches, preserving existing
control flow and any surrounding error/recovery handling.


ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 48462d35-7339-4024-a346-a165d4623af6

📥 Commits

Reviewing files that changed from the base of the PR and between f2e7e03 and 66269f9.

⛔ Files ignored due to path filters (6)

crates/biome_markdown_factory/src/generated/node_factory.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_factory/src/generated/syntax_factory.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_parser/tests/md_test_suite/error/quote_nesting_too_deep.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_syntax/src/generated/kind.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_syntax/src/generated/macros.rs is excluded by !**/generated/**, !**/generated/** and included by **
crates/biome_markdown_syntax/src/generated/nodes.rs is excluded by !**/generated/**, !**/generated/** and included by **

📒 Files selected for processing (8)

crates/biome_markdown_formatter/src/generated.rs
crates/biome_markdown_formatter/src/markdown/any/block.rs
crates/biome_markdown_formatter/src/markdown/bogus/bogus_block.rs
crates/biome_markdown_formatter/src/markdown/bogus/mod.rs
crates/biome_markdown_parser/src/syntax/list.rs
crates/biome_markdown_parser/src/syntax/quote.rs
xtask/codegen/markdown.ungram
xtask/codegen/src/markdown_kinds_src.rs

✅ Files skipped from review due to trivial changes (2)

crates/biome_markdown_formatter/src/markdown/bogus/mod.rs
xtask/codegen/src/markdown_kinds_src.rs

🚧 Files skipped from review as they are similar to previous changes (1)

crates/biome_markdown_parser/src/syntax/list.rs

dyc3

snapshots look good

github-actions bot added A-Parser Area: parser L-Markdown Language: Markdown labels Mar 31, 2026

jfmcdowell marked this pull request as ready for review March 31, 2026 10:21

coderabbitai bot reviewed Mar 31, 2026


Comment thread crates/biome_markdown_parser/src/lexer/mod.rs

jfmcdowell force-pushed the refactor/markdown-remove-skipped-trivia branch from 00430d0 to f2e7e03 Compare March 31, 2026 10:56

github-actions bot added A-Formatter Area: formatter A-Tooling Area: internal tools labels Mar 31, 2026

coderabbitai bot reviewed Mar 31, 2026


jfmcdowell changed the title from "refactor(markdown): eliminate Skipped trivia from normal parsing paths" to "refactor(markdown): replace skipped trivia with formatter-safe recovery" Mar 31, 2026

jfmcdowell mentioned this pull request Mar 31, 2026

🐛 Markdown skipped trivia #9742

Open

1 task

dyc3 approved these changes Mar 31, 2026


dyc3 merged commit d94b8ac into biomejs:main Mar 31, 2026
34 checks passed

This was referenced Mar 31, 2026

refactor(markdown): cleanup nits from #9746 #9751

Merged

refactor(markdown): emit continuation indent as structural CST node #9737

Merged

dyc3 pushed a commit that referenced this pull request Mar 31, 2026

refactor(markdown): cleanup nits from #9746 (#9751)

16d37a4

jfmcdowell deleted the refactor/markdown-remove-skipped-trivia branch March 31, 2026 21:24

This was referenced Apr 1, 2026

feat(markdown): implement basic formatter features #9693

Merged

fix(markdown_parser): recognize setext heading inside blockquote #9782

Merged

fix(markdown_parser): incorrect tight/loose list classification at marker boundaries #9787

Merged

This was referenced Apr 11, 2026

feat(md/fmt): better formatting for some cases #9917

Merged

fix(md): code info string, and fmt advancement #9979

Merged

dyc3 merged 2 commits into biomejs:main from jfmcdowell:refactor/markdown-remove-skipped-trivia
