refactor(markdown-parser): promote remaining skipped trivia to explicit CST nodes #9427

Merged
ematipico merged 1 commit into biomejs:main from jfmcdowell:refactor/phase5-skipped-trivia
Mar 11, 2026

Conversation

@jfmcdowell
Contributor

@jfmcdowell jfmcdowell commented Mar 10, 2026

Note

AI Assistance Disclosure: This PR was developed with assistance from Claude Code.

Summary

Promote structurally significant tokens from parse_as_skipped_trivia_tokens to explicit CST nodes, continuing #9337 and #9321.

  • Add indent: MdIndentTokenList grammar slots to MdHeader, MdFencedCodeBlock, MdHtmlBlock, MdLinkReferenceDefinition. Add r_fence_indent to MdFencedCodeBlock.
  • Add emit_line_indent() and emit_indent_tokens() parser methods; replace skip_line_indent() / consume_indent_prefix() / consume_partial_quote_prefix() across block parsers.
  • Add emit_optional_marker_space() and wrap quote-inside-list markers in proper MdQuotePrefix nodes.
  • Update to_html.rs to read indent from explicit slots instead of trivia.

Remaining parse_as_skipped_trivia_tokens calls are limited to error-recovery paths. No intended CLI or format output behavior change. Formatter snapshot updates are structural only (no new formatting logic).
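
The difference between hiding indentation as skipped trivia and promoting it to an explicit slot can be sketched with a toy model. Everything below is hypothetical and is not Biome's CST API: the `Token` and `Header` types, both parse functions, and the choice of one indent token per space are assumptions made for illustration only.

```rust
// Toy model only: these types are hypothetical and are not Biome's CST API.

/// A token with its source text and any skipped trivia attached in front of it.
#[derive(Debug, Clone, PartialEq)]
struct Token {
    leading_trivia: String,
    text: String,
}

fn tok(leading_trivia: &str, text: &str) -> Token {
    Token { leading_trivia: leading_trivia.to_string(), text: text.to_string() }
}

/// A header node. `indent` models an explicit slot such as `indent: MdIndentTokenList`.
#[derive(Debug, Clone, PartialEq)]
struct Header {
    indent: Vec<Token>,
    hashes: Token,
    content: Token,
}

impl Header {
    /// Lossless printing: trivia and token text are concatenated in source order.
    fn to_source(&self) -> String {
        let mut out = String::new();
        for t in self.indent.iter().chain([&self.hashes, &self.content]) {
            out.push_str(&t.leading_trivia);
            out.push_str(&t.text);
        }
        out
    }
}

/// Splits a line like "  # Title" into (indent, hashes, rest).
fn split_header(line: &str) -> Option<(&str, &str, &str)> {
    let body = line.trim_start_matches(' ');
    let indent = &line[..line.len() - body.len()];
    let rest = body.trim_start_matches('#');
    let hashes = &body[..body.len() - rest.len()];
    // ATX headings allow at most 3 spaces of indent and 1 to 6 `#` characters,
    // followed by a space, a tab, or the end of the line.
    let valid_tail = rest.is_empty() || rest.starts_with(' ') || rest.starts_with('\t');
    if indent.len() > 3 || hashes.is_empty() || hashes.len() > 6 || !valid_tail {
        return None;
    }
    Some((indent, hashes, rest))
}

/// "Before" shape: indentation is hidden as skipped trivia on the first token.
fn parse_skipping_indent(line: &str) -> Option<Header> {
    let (indent, hashes, rest) = split_header(line)?;
    Some(Header { indent: Vec::new(), hashes: tok(indent, hashes), content: tok("", rest) })
}

/// "After" shape: each indent space is a token in the explicit `indent` slot.
fn parse_emitting_indent(line: &str) -> Option<Header> {
    let (indent, hashes, rest) = split_header(line)?;
    let indent_tokens = indent.chars().map(|c| tok("", &c.to_string())).collect();
    Some(Header { indent: indent_tokens, hashes: tok("", hashes), content: tok("", rest) })
}
```

Both shapes print back to the same source text, which is consistent with the statement that no output change is intended. Only the second shape lets a traversal or formatter see the indentation without inspecting trivia.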

Test Plan

  • just test-crate biome_markdown_parser
  • just test-markdown-conformance (652/652, 100%)
  • just f && just l

Docs

N/A — internal structural change.

@changeset-bot
Copy link

changeset-bot bot commented Mar 10, 2026

⚠️ No Changeset found

Latest commit: 8531525

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.


@github-actions bot added the A-Parser (Area: parser), A-Formatter (Area: formatter), and A-Tooling (Area: internal tools) labels on Mar 10, 2026
…it CST nodes

Promote structurally significant tokens that were hidden via
parse_as_skipped_trivia_tokens to proper CST nodes visible to
traversal and the formatter.

Grammar: add indent: MdIndentTokenList to MdHeader, MdFencedCodeBlock,
MdHtmlBlock, and MdLinkReferenceDefinition. Add r_fence_indent slot
to MdFencedCodeBlock for closing fence indentation.

Parser: add emit_line_indent() for block prefix indent (emits
MdIndentTokenList) and emit_indent_tokens() for continuation indent
inside inline item lists. Add emit_optional_marker_space() for
quote post-marker space. Wrap quote-inside-list marker in proper
MdQuotePrefix node.

Remaining parse_as_skipped_trivia_tokens calls are all in error
recovery paths where tokens genuinely should be invisible.
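
One reason a renderer may want fence indentation available as an explicit value: CommonMark removes up to N leading spaces from each content line of a fenced code block whose opening fence is indented N spaces. A minimal sketch of that rule follows. The function names are hypothetical, the indent width is passed in as a plain number rather than read from a real `MdFencedCodeBlock` node, and tabs are ignored for brevity.

```rust
// Hypothetical helpers; not taken from to_html.rs.

/// Removes up to `fence_indent` leading spaces from one content line.
/// Lines with fewer leading spaces lose only the spaces they have.
fn strip_fence_indent(line: &str, fence_indent: usize) -> &str {
    let removable = line
        .bytes()
        .take(fence_indent)
        .take_while(|b| *b == b' ')
        .count();
    // Only ASCII spaces were counted, so this byte index is a valid char boundary.
    &line[removable..]
}

/// Applies the rule to every content line and joins the result with newlines.
fn render_code_lines(fence_indent: usize, lines: &[&str]) -> String {
    lines
        .iter()
        .map(|line| strip_fence_indent(line, fence_indent))
        .collect::<Vec<_>>()
        .join("\n")
}
```

With the indent stored in the tree, a renderer in this style can compute `fence_indent` from the node itself instead of re-scanning skipped trivia.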
@jfmcdowell force-pushed the refactor/phase5-skipped-trivia branch from cffe70a to 8531525 on March 10, 2026 01:32
@jfmcdowell marked this pull request as ready for review on March 10, 2026 01:53
@coderabbitai
Contributor

coderabbitai bot commented Mar 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: c499db12-f2fd-483f-8b5a-64c770a9450d

📥 Commits

Reviewing files that changed from the base of the PR and between 2de8362 and 8531525.

⛔ Files ignored due to path filters (45)
  • crates/biome_markdown_factory/src/generated/node_factory.rs is excluded by !**/generated/**, !**/generated/** and included by **
  • crates/biome_markdown_factory/src/generated/syntax_factory.rs is excluded by !**/generated/**, !**/generated/** and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/footnoteDefinition/long.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/math/remark-math.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-17.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-182.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-183.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-37.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-39.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-40.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-55.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-613.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-81.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/error/fenced_code_blockquote_eof_after_prefix.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/error/unterminated_code_fence.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/atx_heading_trailing_hash.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/edge_cases.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/emphasis_edge_cases.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_advanced.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_block.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_in_blockquote.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/fenced_code_indentation.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/header.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/html_block.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/indent_code_block.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/indented_code_blank_lines.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/inline_html_edge_cases.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/inline_html_invalid.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/lazy_continuation.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/link_definition.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/link_definition_edge_cases.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/list_blank_lines_between_items.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/list_continuation_edge_cases.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/list_indentation.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/list_tightness.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/multiline_label.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/multiline_list.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/paragraph_interruption.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/quote_pre_marker_indent.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/reference_links.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/setext_heading_edge_cases.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/setext_heading_negative.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_parser/tests/md_test_suite/ok/thematic_break_block.md.snap is excluded by !**/*.snap and included by **
  • crates/biome_markdown_syntax/src/generated/nodes.rs is excluded by !**/generated/**, !**/generated/** and included by **
  • crates/biome_markdown_syntax/src/generated/nodes_mut.rs is excluded by !**/generated/**, !**/generated/** and included by **
📒 Files selected for processing (12)
  • crates/biome_markdown_formatter/src/markdown/auxiliary/header.rs
  • crates/biome_markdown_parser/src/parser.rs
  • crates/biome_markdown_parser/src/syntax/fenced_code_block.rs
  • crates/biome_markdown_parser/src/syntax/header.rs
  • crates/biome_markdown_parser/src/syntax/html_block.rs
  • crates/biome_markdown_parser/src/syntax/link_block.rs
  • crates/biome_markdown_parser/src/syntax/list.rs
  • crates/biome_markdown_parser/src/syntax/mod.rs
  • crates/biome_markdown_parser/src/syntax/quote.rs
  • crates/biome_markdown_parser/src/syntax/thematic_break_block.rs
  • crates/biome_markdown_parser/src/to_html.rs
  • xtask/codegen/markdown.ungram

Walkthrough

This PR converts previously skipped indentation and quote-prefix trivia into explicit concrete syntax tree (CST) nodes. It introduces the parser methods emit_line_indent() and emit_indent_tokens() to generate MdIndentToken/MdIndentTokenList nodes. Multiple syntax modules (header, fenced code block, HTML block, link block, list, thematic break) now emit indentation rather than skip it. Quote prefix handling gains explicit emission functions. The grammar adds indent fields to headers, fenced code blocks, HTML blocks, and link reference definitions. The formatter and HTML rendering logic are updated to process explicit indent nodes.
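
The explicit quote prefix described above can be pictured as a small record holding the pre-marker indent, the `>` marker, and the optional space after it. The sketch below is an assumption-laden illustration: the `QuotePrefix` struct, its field names, and `parse_quote_prefix` are hypothetical and do not reflect the real `MdQuotePrefix` node layout.

```rust
// Hypothetical model of a block quote prefix; not Biome's MdQuotePrefix.

#[derive(Debug, PartialEq)]
struct QuotePrefix<'a> {
    /// Up to 3 spaces before the marker.
    pre_marker_indent: &'a str,
    /// Always ">".
    marker: &'a str,
    /// The single optional space after the marker, if present.
    post_marker_space: Option<&'a str>,
}

/// Splits one line into an explicit quote prefix and the remaining content.
/// Returns `None` when the line does not start a block quote.
fn parse_quote_prefix(line: &str) -> Option<(QuotePrefix<'_>, &str)> {
    let after_indent = line.trim_start_matches(' ');
    let indent = &line[..line.len() - after_indent.len()];
    // Four or more spaces would start an indented code block instead.
    if indent.len() > 3 {
        return None;
    }
    let after_marker = after_indent.strip_prefix('>')?;
    let marker = &after_indent[..1];
    let (space, rest) = match after_marker.strip_prefix(' ') {
        Some(rest) => (Some(&after_marker[..1]), rest),
        None => (None, after_marker),
    };
    let prefix = QuotePrefix {
        pre_marker_indent: indent,
        marker,
        post_marker_space: space,
    };
    Some((prefix, rest))
}
```

Because every piece of the prefix is kept as a borrowed slice, concatenating the three fields and the remainder reproduces the original line, which is the property a lossless tree needs.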

Suggested reviewers

  • ematipico
  • dyc3
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main refactoring work: promoting skipped trivia tokens to explicit CST nodes throughout the markdown parser. |
| Description check | ✅ Passed | The description clearly outlines the changes: grammar additions, parser method replacements, and updates to the HTML rendering logic, all aligned with the actual changeset. |

@jfmcdowell
Contributor Author

@tidefield for visibility.

Member

@ematipico ematipico left a comment

I noticed that some nodes don't have a comment that shows the "code version" of them. Would you mind making a pass at that and sending a PR that updates the ungram file?


// html block - content is stored as raw text (multiple textual tokens)
MdHtmlBlock = content: MdInlineItemList
MdHtmlBlock =
Member

The comment that shows the code is missing.

Contributor Author

addressed in #9444

// Link reference definition per CommonMark §4.7
// [label]: destination "title" or [label]: destination 'title' or [label]: destination (title)
// Labels are case-insensitive and whitespace-normalized
MdLinkReferenceDefinition =
Member

Same here

Contributor Author

addressed in #9444

@ematipico ematipico merged commit f2debed into biomejs:main Mar 11, 2026
13 checks passed
@jfmcdowell jfmcdowell deleted the refactor/phase5-skipped-trivia branch March 11, 2026 10:19