refactor(markdown-parser): promote pre-marker indent to explicit CST#9224
Conversation
…nodes Replace skip_line_indent(3) in emit_quote_prefix_tokens with explicit MdQuoteIndentList > MdQuoteIndent node emission, following the MdHash/MdHashList pattern. Each space before '>' is now a real CST node visible to the formatter harness instead of skipped trivia. Grammar: add MdQuoteIndent, MdQuoteIndentList; change MdQuotePrefix pre_marker_indent from optional token to list field.
|
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review infoConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
WalkthroughThe PR introduces explicit handling of quote pre‑marker indentation across parser, codegen and formatter. The parser adds MAX_BLOCK_PREFIX_INDENT and new token kinds (MD_QUOTE_PRE_MARKER_INDENT, MD_QUOTE_INDENT, MD_QUOTE_INDENT_LIST) and emits bounded indentation tokens. Codegen adds MdQuoteIndent and MdQuoteIndentList nodes. The formatter gains modules and crate‑scoped formatters (FormatMdQuoteIndent, FormatMdQuoteIndentList) plus AsFormat/IntoFormat/FormatRule wiring and tests for varied pre‑marker indent patterns. Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
ematipico
left a comment
There was a problem hiding this comment.
Great work! I left a couple of comments. I'll merge once you add the comment I asked for
| if text.is_empty() || !text.chars().all(|c| c == ' ' || c == '\t') { | ||
| break; | ||
| } | ||
| let indent: usize = text.chars().map(|c| if c == '\t' { 4 } else { 1 }).sum(); | ||
| if consumed + indent > 3 { | ||
| break; | ||
| } |
There was a problem hiding this comment.
You might want to add some comments that explain this logic, mostly because there are some magic numbers that don't give enough context of the business logic
There was a problem hiding this comment.
I’ll address the review comments here and replace the semantic/spec magic numbers that are in scope for this PR.
For the broader cleanup, I’ll open a follow-up PR to standardize the remaining semantic constants across the markdown parser in one pass, since that ended up being larger than expected.
| p.bump_remap(MD_QUOTE_PRE_MARKER_INDENT); | ||
| indent_m.complete(p, MD_QUOTE_INDENT); | ||
| } | ||
| indent_list_m.complete(p, MD_QUOTE_INDENT_LIST); |
There was a problem hiding this comment.
Is there a particular reason why we don't use ParseList for this? Just curious
There was a problem hiding this comment.
My thinking: this is a tiny bounded scan (<= 3 cols per CommonMark’s 0–3 indent before >), immediately followed by > validation, so a direct loop felt simpler than ParseNodeList. It also keeps this path strictly no-recovery/no-diagnostic.
Happy to switch to ParseNodeList if you prefer consistency.
There was a problem hiding this comment.
No, I think it's fine for now. Maybe can you leave a comment explaning the reasoning
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
crates/biome_markdown_parser/src/syntax/quote.rs (1)
139-162: Comments and reasoning look good — past feedback addressed.The bounded scan is well-documented, the tab-expansion logic matches the existing pattern in
fenced_code_block.rsandparser.rs, and the rationale for not usingParseNodeListis clearly stated. The always-emitted (possibly empty)MD_QUOTE_INDENT_LISTis consistent with how other list nodes work in biome.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/quote.rs` around lines 139 - 162, No change required: the bounded scan in quote handling is correct — keep the tab-expansion logic and the always-emitted MD_QUOTE_INDENT_LIST as-is; verify the existing symbols indent_list_m, indent_m, TAB_STOP_SPACES and MAX_BLOCK_PREFIX_INDENT remain used exactly as shown and leave the MD_QUOTE_PRE_MARKER_INDENT remap and completion to MD_QUOTE_INDENT untouched.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/biome_markdown_parser/src/syntax/quote.rs`:
- Around line 44-46: Define a new constant named MAX_BLOCK_PREFIX_INDENT in the
constants section of the syntax module (next to INDENT_CODE_BLOCK_SPACES and
TAB_STOP_SPACES) with visibility pub(crate), type usize, and value 3 so imports
of MAX_BLOCK_PREFIX_INDENT in quote.rs and other files resolve; ensure the
constant is declared alongside the existing constants in mod.rs.
---
Duplicate comments:
In `@crates/biome_markdown_parser/src/syntax/quote.rs`:
- Around line 139-162: No change required: the bounded scan in quote handling is
correct — keep the tab-expansion logic and the always-emitted
MD_QUOTE_INDENT_LIST as-is; verify the existing symbols indent_list_m, indent_m,
TAB_STOP_SPACES and MAX_BLOCK_PREFIX_INDENT remain used exactly as shown and
leave the MD_QUOTE_PRE_MARKER_INDENT remap and completion to MD_QUOTE_INDENT
untouched.
cb74be5 to
6cf94b3
Compare
Note
AI Assistance Disclosure: This PR was developed with assistance from Claude Code.
Summary
MdQuoteIndentandMdQuoteIndentListto the grammar, following theMdHash/MdHashListwrapper pattern for repeating over raw tokens.MdQuotePrefix.pre_marker_indentfrom a single optional token slot toMdQuoteIndentList, so each space before>gets its ownMdQuoteIndentnode.MD_QUOTE_INDENTandMD_QUOTE_INDENT_LISTinmarkdown_kinds_src.rs.skip_line_indent(3)inemit_quote_prefix_tokenswith an explicit loop that emitsMdQuoteIndentList > MdQuoteIndentnodes withMD_QUOTE_PRE_MARKER_INDENTtokens.FormatNodeRulestubs forMdQuoteIndentandFormatRuleforMdQuoteIndentList.This is the follow-up to #9219, completing Phase 1 parser-side work. Pre-marker indentation (0-3 spaces before
>) was the last remaining skipped trivia in blockquote parsing. Each indent space is now a real CST node visible to the formatter harness.No user-facing behavior change. Parsed semantics are preserved; only the internal CST representation changes.
Test Plan
cargo test -p biome_markdown_parser— 66 tests pass (65 existing + 1 new)cargo insta test -p biome_markdown_parserrg -n "pre_marker_indent: MdQuoteIndentList|MD_QUOTE_INDENT_LIST|MD_QUOTE_INDENT" crates/biome_markdown_parser/tests/md_test_suite/**/*.snap— verifies snapshots contain explicit pre-marker indent nodesDocs
N/A — internal structural change, no new user-facing features.