refactor(markdown-parser): promote fenced code block skipped trivia to explicit CST nodes by jfmcdowell · Pull Request #9321 · biomejs/biome

jfmcdowell · 2026-03-03T23:43:26Z

Note

AI Assistance Disclosure: This PR was developed with assistance from Claude Code.

Summary

Add MdIndentToken to AnyMdInline in the grammar for fence indent stripping tokens.
Replace 4 parse_as_skipped_trivia_tokens() call sites in fenced_code_block.rs with explicit CST node emission:
- Sites 1-3: Blockquote > prefixes on continuation lines within fenced code blocks now emit MdQuotePrefix nodes (with MdQuoteIndentList, marker, and optional post-marker space).
- Site 4: Fence indent stripping per CommonMark §4.5 now emits MdIndentToken nodes with MD_INDENT_CHAR tokens.
Add MdIndentToken no-op arm in to_html.rs extract_alt_text_inline exhaustive match.
Regenerate codegen output (biome_markdown_syntax, biome_markdown_formatter).
Add error fixture fenced_code_in_blockquote.md documenting pre-existing limitation where fenced code blocks inside blockquotes produce unterminated fence diagnostics.
Update fenced_code_advanced.md snapshot to reflect new CST shape.
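The fence indent rule referenced in Site 4 (an opening fence indented N spaces removes up to N spaces of indentation from each content line) can be sketched on plain strings. This is a minimal illustration, not Biome's implementation: `split_fence_indent` is a hypothetical helper, it only counts spaces, and it returns the removed prefix separately to mirror the idea of keeping that text as its own node rather than discarding it.

```rust
/// Splits one content line of a fenced code block into the indentation that
/// the fence rule strips and the remaining code text.
///
/// `fence_indent` is the number of spaces in front of the opening fence.
/// At most that many leading spaces are split off; fewer if the line has fewer.
fn split_fence_indent(line: &str, fence_indent: usize) -> (&str, &str) {
    let stripped = line
        .bytes()
        .take(fence_indent)
        .take_while(|b| *b == b' ')
        .count();
    // Only ASCII spaces were counted, so `stripped` is a valid char boundary.
    line.split_at(stripped)
}
```

With a fence indented by two spaces, a content line indented by four keeps two spaces of its own indentation, and a line with no indentation is left untouched.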

Continues the skipped trivia promotion series (#9219, #9274, #9313). Sites 1-3 (quote prefixes in code content) are structurally correct but exercised only via the pre-existing blockquote+fenced-code path which has a known limitation — the error fixture documents current behavior until a follow-up fix lands.

No user-facing behavior change. Parsed semantics are preserved; only the internal CST representation changes.

Test Plan

just test-crate biome_markdown_parser
just test-markdown-conformance
just f && just l

Docs

N/A — internal structural change, no new user-facing features.

changeset-bot · 2026-03-03T23:43:31Z

⚠️ No Changeset found

Latest commit: cf5d8f5

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

…o explicit CST nodes

Replace 4 parse_as_skipped_trivia_tokens() call sites in fenced_code_block.rs:
- Sites 1-3: blockquote > prefixes on continuation lines emit MdQuotePrefix nodes
- Site 4: fence indent stripping emits MdIndentToken nodes

Add MdIndentToken to AnyMdInline in the grammar and regenerate codegen.
Add MdIndentToken no-op arm in to_html.rs extract_alt_text_inline.
Add error fixture documenting pre-existing fenced-code-in-blockquote limitation.
Extract try_bump_quote_marker as pub(crate) to deduplicate marker-bumping logic.

coderabbitai · 2026-03-04T02:33:41Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Walkthrough

Adds MdIndentToken to the inline grammar and wires it through the formatter and HTML alt-text extraction. Refactors fenced-code parsing into a stateful loop with helpers for quote-prefix handling, virtual-line-start semantics and earlier closing-fence detection. Reorganises quote parsing (introducing emit_quote_prefix_tokens, try_bump_quote_marker, virtual-line-start helpers, and improved indent/prefix handling). Adds tests for fenced code blocks inside blockquotes. No public API signatures changed.

Possibly related PRs

refactor(markdown-parser): promote list structural tokens from skipped trivia to explicit CST nodes #9274: Adds and wires MdIndentToken across grammar, parser emission and formatter handling (direct overlap).
refactor(markdown-parser): promote pre-marker indent to explicit CST #9224: Implements quote-prefix refactor and explicit pre-marker indent tokens (emit_quote_prefix_tokens / try_bump_quote_marker) that this change extends.
fix(markdown-parser): promote blockquote prefix markers from skipped trivia to explicit CST nodes #9219: Modifies quote-prefix parsing and formatter paths with similar token/marker handling used here.

Suggested reviewers

ematipico
dyc3

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main structural refactoring: promoting fenced code block skipped trivia (indentation and quote prefixes) to explicit CST nodes (MdIndentToken).
Description check	✅ Passed	The description comprehensively details the changes across multiple files, the rationale for the refactoring, and links to related work—clearly related to the changeset throughout.

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/fenced_code_block.rs`:
- Around line 370-373: The call to try_bump_quote_marker(p) is inside
debug_assert! so it is skipped in release builds and the parser state won't be
updated; replace the debug_assert! invocation with an unconditional call to
try_bump_quote_marker(p) (so the marker is always consumed) and keep an optional
debug-only check if desired (e.g., call try_bump_quote_marker(p) and then
debug_assert!(result, "guard above guarantees marker present")); update the code
around the debug_assert! to call try_bump_quote_marker(p) unconditionally and
handle a false result only via debug assertion or by panicking with the same
message.
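The hazard described in this finding can be reproduced in isolation. The sketch below uses a toy `Cursor` in place of the real parser, and the body of `try_bump_quote_marker` is an assumption written for illustration (only the name comes from the PR). It contrasts a side effect placed inside `debug_assert!` with one that runs unconditionally.

```rust
/// Toy stand-in for the parser: a cursor over bytes (not Biome's MarkdownParser).
struct Cursor<'a> {
    src: &'a [u8],
    pos: usize,
}

/// Consumes a `>` at the cursor if present and reports whether it did.
fn try_bump_quote_marker(c: &mut Cursor<'_>) -> bool {
    if c.src.get(c.pos) == Some(&b'>') {
        c.pos += 1;
        true
    } else {
        false
    }
}

/// Fragile: the argument of `debug_assert!` is only evaluated when debug
/// assertions are enabled, so a default release build never moves the cursor.
fn bump_inside_assert(c: &mut Cursor<'_>) {
    debug_assert!(try_bump_quote_marker(c), "quote marker expected");
}

/// Robust: the side effect always runs; only the check is debug-only.
fn bump_then_assert(c: &mut Cursor<'_>) {
    let bumped = try_bump_quote_marker(c);
    debug_assert!(bumped, "quote marker expected");
}
```

Calling `bump_then_assert` on `">x"` advances the cursor in every build profile, while `bump_inside_assert` only advances it when debug assertions are enabled.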

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 02a409d7-7cbc-4b18-996b-0f5447559e8c

📥 Commits

Reviewing files that changed from the base of the PR and between 1022662 and 8d084b9.

⛔ Files ignored due to path filters (3)

crates/biome_markdown_parser/tests/md_test_suite/error/fenced_code_in_blockquote.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_advanced.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_syntax/src/generated/nodes.rs is excluded by !**/generated/**, !**/generated/** and included by **

📒 Files selected for processing (6)

crates/biome_markdown_formatter/src/markdown/any/inline.rs
crates/biome_markdown_parser/src/syntax/fenced_code_block.rs
crates/biome_markdown_parser/src/syntax/quote.rs
crates/biome_markdown_parser/src/to_html.rs
crates/biome_markdown_parser/tests/md_test_suite/error/fenced_code_in_blockquote.md
xtask/codegen/markdown.ungram

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/fenced_code_block.rs`:
- Around line 314-335: The code currently sets at_line_start = false immediately
after consume_quote_prefixes_in_code_content, which prevents the later
fence-indent stripping block (skip_fenced_content_indent and at_closing_fence)
from running for lines inside blockquotes; update the loop in
fenced_code_block.rs so fence-indent stripping runs after quote prefix
consumption: after calling consume_quote_prefixes_in_code_content (function
name) and before or regardless of resetting at_line_start, call
skip_fenced_content_indent when fence_indent > 0 and then re-check
at_closing_fence (function name) — or alternatively handle blockquote-nested
indentation explicitly by adding a branch that strips fence_indent even when
at_line_start was just true and quote prefixes were consumed; ensure
CodeContentLoopAction semantics and the at_line_start flag are preserved.

ℹ️ Review info

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7e51518c-9db4-4808-9efb-2a5a282df782

📥 Commits

Reviewing files that changed from the base of the PR and between 8d084b9 and 582f51a.

📒 Files selected for processing (1)

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

Use virtual_line_start in line_has_closing_fence so fence detection starts after consumed quote prefixes instead of seeing `>` as non-whitespace. Set virtual_line_start after quote prefix consumption and allow fence-indent stripping to run on blockquote lines.
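To illustrate why the starting offset matters, here is a minimal string-based sketch of closing-fence detection. It borrows the names `line_has_closing_fence` and `virtual_line_start` from the commit message, but the signature and body are assumptions for illustration and work on a single line of text rather than on parser tokens. The check follows the usual CommonMark shape: at most three spaces of indentation, a run of the fence character at least as long as the opening fence, then only spaces or tabs.

```rust
/// Returns true if `line`, read from byte offset `virtual_line_start`,
/// is a closing fence for an opening fence of `fence_len` `fence_char`s.
/// `line` is expected without its trailing newline.
fn line_has_closing_fence(
    line: &str,
    virtual_line_start: usize,
    fence_char: char,
    fence_len: usize,
) -> bool {
    // `get` returns None for an out-of-range or non-boundary offset.
    let Some(rest) = line.get(virtual_line_start..) else {
        return false;
    };
    let indent = rest.bytes().take_while(|b| *b == b' ').count();
    if indent > 3 {
        return false;
    }
    let body = &rest[indent..];
    let tail = body.trim_start_matches(fence_char);
    // Fence characters are ASCII, so the byte difference equals the run length.
    let run = body.len() - tail.len();
    run >= fence_len && tail.chars().all(|c| c == ' ' || c == '\t')
}
```

For the line ``> ``` `` the check fails from offset 0, because `>` is seen as non-whitespace, and succeeds from offset 2, the position just after the consumed quote prefix.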

ematipico · 2026-03-04T08:25:11Z

It's weird that the coverage job isn't triggered by these changes.

ematipico · 2026-03-04T08:30:34Z

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

+    indent_list_m.complete(p, MD_QUOTE_INDENT_LIST);
+
+    let marker_bumped = try_bump_quote_marker(p);
+    debug_assert!(marker_bumped, "guard above guarantees marker present");


We usually try to write messages that explain what went wrong and/or how to fix it. For example, if a developer lands here, the message should say what caused the problem and where to look for possible fixes (if applicable).

Fixed — replaced unreachable!() with a safe fallback (prefix_m.abandon(p); return false), and improved the debug_assert! message to explain the root cause and where to look:

"consume_quote_prefix_in_code_content: quote marker not found after guard confirmed `>` token — check that force_relex_regular and the guard condition are in sync"

ematipico · 2026-03-04T08:30:54Z

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

+    let marker_bumped = try_bump_quote_marker(p);
+    debug_assert!(marker_bumped, "guard above guarantees marker present");
+    if !marker_bumped {
+        unreachable!("guard above guarantees marker present");


No code that panics in production. Let's find a safer approach

Fixed — replaced unreachable!() with prefix_m.abandon(p); return false. The empty MD_QUOTE_INDENT_LIST that was already completed gets reparented to the parent via abandon, which is harmless in this theoretically unreachable path.
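The abandon-and-return-false fallback can be pictured with a toy event-based parser. Everything below is a simplified model written for illustration (`ToyParser`, `Marker`, `Event`, and the node names are hypothetical and not Biome's types). It only aims to show that abandoning the outer marker leaves the already completed child in the event stream, where a tree builder would attach it to the enclosing node, and that no panic is needed.

```rust
#[derive(Debug, PartialEq)]
enum Event {
    Tombstone, // a started node that was abandoned (ignored by a tree builder)
    Start(&'static str),
    Token(char),
    Finish,
}

/// Index of the placeholder event created by `ToyParser::start`.
struct Marker(usize);

struct ToyParser {
    src: Vec<char>,
    pos: usize,
    events: Vec<Event>,
}

impl ToyParser {
    fn new(src: &str) -> Self {
        Self { src: src.chars().collect(), pos: 0, events: Vec::new() }
    }
    fn start(&mut self) -> Marker {
        self.events.push(Event::Tombstone);
        Marker(self.events.len() - 1)
    }
    fn complete(&mut self, m: Marker, kind: &'static str) {
        self.events[m.0] = Event::Start(kind);
        self.events.push(Event::Finish);
    }
    fn abandon(&mut self, m: Marker) {
        // Drop the placeholder if nothing followed it; otherwise it stays a tombstone.
        if m.0 + 1 == self.events.len() {
            self.events.pop();
        }
    }
    fn try_bump(&mut self, c: char) -> bool {
        if self.src.get(self.pos) == Some(&c) {
            self.pos += 1;
            self.events.push(Event::Token(c));
            true
        } else {
            false
        }
    }
}

/// Emits a quote prefix node, or backs out without panicking.
fn quote_prefix(p: &mut ToyParser) -> bool {
    let prefix = p.start();
    let indent_list = p.start();
    p.complete(indent_list, "QUOTE_INDENT_LIST");
    if !p.try_bump('>') {
        // Safe fallback instead of `unreachable!()`.
        p.abandon(prefix);
        return false;
    }
    p.complete(prefix, "QUOTE_PREFIX");
    true
}
```

On input without a `>`, `quote_prefix` returns `false`, consumes nothing, and leaves only the empty indent list behind a tombstone.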

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

ematipico · 2026-03-04T08:34:27Z

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

+            CodeContentLoopAction::Continue => continue,
+            CodeContentLoopAction::ConsumeText => {}


Suggested change

-            CodeContentLoopAction::Continue => continue,
-            CodeContentLoopAction::ConsumeText => {}
+            CodeContentLoopAction::Continue |
+            CodeContentLoopAction::ConsumeText => continue,

These arms have different semantics — ConsumeText falls through to bump_code_textual(p) + at_line_start = false, while Continue skips both. Merging them would cause an infinite loop (parser position never advances).

Open to restructuring if you have a different approach in mind — what would you prefer here?

Maybe something like this that inverts the enum?

enum CodeContentTokenAction {
    Break,
    Skip,    // renamed from Continue
    Consume, // renamed from ConsumeText
}

fn parse_code_content(...) {
    // ...
    while !p.at(T![EOF]) {
        match prepare_next_code_content_token(...) {
            CodeContentTokenAction::Break => break,
            CodeContentTokenAction::Skip => continue,
            CodeContentTokenAction::Consume => {
                bump_code_textual(p);
                at_line_start = false;
            }
        }
    }
}

Much better yes!

Resolved: I've refactored the control flow to make all three code paths explicit within the match statement (d44b297)
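A runnable toy version of the three-arm loop may help show why `Skip` and `Consume` cannot share an arm. The enum name matches the thread, but `Tok`, `Toy`, `prepare_next`, and the choice of which token triggers `Skip` are assumptions made for this sketch. Here the helper has already advanced past a prefix when it answers `Skip`, whereas on `Consume` the loop itself must advance.

```rust
enum CodeContentTokenAction {
    Break,
    Skip,
    Consume,
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Tok {
    Prefix,
    Text,
    Newline,
    ClosingFence,
}

struct Toy {
    toks: Vec<Tok>,
    pos: usize,
    text_bumped: usize,
}

/// Decides what the loop should do next. A prefix is consumed here (Skip);
/// ordinary content is left for the caller to bump (Consume).
fn prepare_next(p: &mut Toy) -> CodeContentTokenAction {
    match p.toks.get(p.pos) {
        None | Some(Tok::ClosingFence) => CodeContentTokenAction::Break,
        Some(Tok::Prefix) => {
            p.pos += 1;
            CodeContentTokenAction::Skip
        }
        Some(_) => CodeContentTokenAction::Consume,
    }
}

fn parse_code_content(p: &mut Toy) {
    while p.pos < p.toks.len() {
        match prepare_next(p) {
            CodeContentTokenAction::Break => break,
            // Nothing left to do: the helper already advanced.
            CodeContentTokenAction::Skip => {}
            // Without this bump the position would never move past content.
            CodeContentTokenAction::Consume => {
                p.pos += 1;
                p.text_bumped += 1;
            }
        }
    }
}
```

If the `Consume` arm did nothing, as it would after merging it with `Skip`, the loop would keep seeing the same `Text` token forever.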

ematipico · 2026-03-04T08:35:55Z

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

+    true
+}
+
+fn consume_code_textual(p: &mut MarkdownParser) {


There's a bit of misalignment among these new functions. Some return a boolean, some don't, but they all start with consume_*. I would look for a better alignment in naming

Fixed — renamed consume_code_textual → bump_code_textual since it unconditionally bumps and doesn't return a bool. The consume_* functions all follow the try-consume pattern (return bool), while bump_* is unconditional.

- Add docstrings to prepare_next_code_content_token, consume_quote_prefixes_in_code_content, and consume_quote_prefix_in_code_content
- Replace unreachable!() with safe fallback (abandon + return false)
- Improve debug_assert! message with actionable diagnostic
- Rename consume_code_textual → bump_code_textual for naming alignment

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/fenced_code_block.rs`:
- Around line 350-360: The function consume_quote_prefixes_in_code_content
currently mutates parser state (virtual_line_start and by calling
skip_line_indent and consume_quote_prefix_in_code_content) as it iterates and
returns false on first failure, which can leave the parser mid-line; change it
to perform a preflight check or snapshot-and-restore: save the parser state (via
p.state()/p.state_mut() snapshot) before attempting to consume prefixes,
simulate or loop calling consume_quote_prefix_in_code_content on a
temporary/simulated parser (or perform the checks without mutating real state)
and only if all quote_depth prefixes succeed apply the real mutations
(virtual_line_start update, skip_line_indent and the actual
consume_quote_prefix_in_code_content calls); ensure that if any prefix fails no
real parser state is changed so outer-container parsing is not corrupted.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 76b463cf-10b6-47ec-b826-cc403def3478

📥 Commits

Reviewing files that changed from the base of the PR and between 870df65 and 4dbe781.

⛔ Files ignored due to path filters (2)

crates/biome_markdown_formatter/tests/specs/prettier/markdown/blockquote/code.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_formatter/tests/specs/prettier/markdown/blockquote/ignore-code.md.snap is excluded by !**/*.snap and included by **

📒 Files selected for processing (1)

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

…tion

Prevents partial consumption of outer blockquote markers when an inner prefix is missing (quote_depth > 1). Without this, the first `>` would be consumed into the code block content, stealing it from the outer blockquote and corrupting subsequent parsing.
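The all-or-nothing behaviour described here can be sketched with a checkpoint on a toy cursor. This is an assumed illustration, not the code from the PR: `Cursor`, `consume_quote_prefix`, and `consume_quote_prefixes` are hypothetical, and a quote prefix is taken to be up to three spaces, a `>`, and one optional space.

```rust
/// Toy stand-in for the parser: a cursor over bytes.
struct Cursor<'a> {
    src: &'a [u8],
    pos: usize,
}

/// Consumes one quote prefix; the cursor moves only on success.
fn consume_quote_prefix(c: &mut Cursor<'_>) -> bool {
    let mut probe = c.pos;
    let mut spaces = 0;
    while spaces < 3 && c.src.get(probe) == Some(&b' ') {
        probe += 1;
        spaces += 1;
    }
    if c.src.get(probe) != Some(&b'>') {
        return false;
    }
    probe += 1;
    if c.src.get(probe) == Some(&b' ') {
        probe += 1;
    }
    c.pos = probe;
    true
}

/// Consumes exactly `quote_depth` prefixes, or none at all.
fn consume_quote_prefixes(c: &mut Cursor<'_>, quote_depth: usize) -> bool {
    let checkpoint = c.pos;
    for _ in 0..quote_depth {
        if !consume_quote_prefix(c) {
            // Give back any outer markers already taken on this line.
            c.pos = checkpoint;
            return false;
        }
    }
    true
}
```

On `> x` with a depth of 2, the first marker is matched but the second is missing, so the cursor is restored to the start of the line and the outer blockquote can still claim its `>`.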

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/fenced_code_block.rs`:
- Around line 340-341: prepare_next_code_content_token can advance the parser to
EOF but still return CodeContentLoopAction::ConsumeText, causing later code
(around the MD_TEXTUAL_LITERAL remapping at the block that currently handles
p.at(T![EOF]) on lines ~293-295) to mis-handle EOF; update
prepare_next_code_content_token to check p.at(T![EOF]) immediately after
consuming the in-loop prefix/indent and return CodeContentLoopAction::Break
instead of ConsumeText when EOF is reached, and also modify the downstream logic
that remaps EOF to MD_TEXTUAL_LITERAL so it no longer treats EOF as text (i.e.,
ensure the EOF check precedes any remapping to MD_TEXTUAL_LITERAL).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: eacb4ca6-5a18-472f-990d-3fc24636b9ad

📥 Commits

Reviewing files that changed from the base of the PR and between 4dbe781 and a8280e0.

📒 Files selected for processing (1)

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

crates/biome_markdown_parser/src/syntax/fenced_code_block.rs

Adds EOF check in prepare_next_code_content_token before returning ConsumeText. Prevents bump_code_textual from remapping EOF as MD_TEXTUAL_LITERAL when quote prefix or indent consumption advances the parser to end-of-input mid-iteration.
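A reduced sketch of this guard follows, on a byte slice rather than parser tokens. `Action` and `prepare` are hypothetical names, and the prefix handling is simplified to a single `>` with an optional space. The point is the second end-of-input check, made after the helper has advanced on its own.

```rust
#[derive(Debug, PartialEq)]
enum Action {
    Break,
    Consume,
}

/// Skips one optional quote prefix, then decides whether there is still
/// content for the caller to consume.
fn prepare(src: &[u8], pos: &mut usize) -> Action {
    if *pos >= src.len() {
        return Action::Break;
    }
    if src[*pos] == b'>' {
        *pos += 1;
        if src.get(*pos) == Some(&b' ') {
            *pos += 1;
        }
    }
    // Re-check: the prefix may have been the last thing in the input,
    // in which case there is nothing left to treat as text.
    if *pos >= src.len() {
        return Action::Break;
    }
    Action::Consume
}
```

For input that ends right after a prefix, such as `"> "`, the function reports `Break` instead of asking the caller to consume a token that is not there.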

Renamed `CodeContentLoopAction` to `CodeContentTokenAction` and moved all control flow logic into explicit match arms, eliminating the fall-through pattern that was causing review confusion.

Changes:
- Renamed enum: `CodeContentLoopAction` → `CodeContentTokenAction`
- Renamed variants: `Continue` → `Skip`, `ConsumeText` → `Consume`
- Moved `bump_code_textual(p)` and `at_line_start = false` into the `Consume` match arm for clarity

All tests pass. Behavior unchanged.

github-actions bot added the A-Parser (Area: parser), A-Formatter (Area: formatter), and A-Tooling (Area: internal tools) labels Mar 3, 2026

jfmcdowell force-pushed the refactor/fenced-code-block-prefix branch from ccff014 to 041e82b on March 4, 2026 01:51

[autofix.ci] apply automated fixes

8d084b9

jfmcdowell marked this pull request as ready for review March 4, 2026 02:26

coderabbitai bot reviewed Mar 4, 2026

fix(markdown-parser): consume quote marker outside debug assert

582f51a

coderabbitai bot reviewed Mar 4, 2026

jfmcdowell added 2 commits March 3, 2026 22:31

test(markdown): update blockquote prettier snapshots

27f50ef

ematipico reviewed Mar 4, 2026

coderabbitai bot reviewed Mar 4, 2026

coderabbitai bot reviewed Mar 4, 2026

jfmcdowell added 3 commits March 4, 2026 08:27

fix(markdown-parser): remove needless continue in Skip arm

cf5d8f5

Clippy flagged the continue as redundant since nothing executes after the match. Using an empty block achieves the same result without the warning.

ematipico approved these changes Mar 4, 2026

ematipico merged commit e12a3c3 into biomejs:main Mar 4, 2026
14 checks passed

jfmcdowell deleted the refactor/fenced-code-block-prefix branch March 4, 2026 16:49

This was referenced Mar 5, 2026

refactor(markdown-parser): promote thematic break skipped trivia to explicit CST nodes #9337

Merged

refactor(markdown-parser): promote remaining skipped trivia to explicit CST nodes #9427

Merged

This was referenced Mar 11, 2026

docs(markdown-parser): add code-version comments to ungram nodes #9444

Merged

refactor(markdown-parser): simplify inline newline handling #9446

Merged
