refactor(markdown): split list item parser#9167
Conversation
|
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR refactors list-item block-content parsing in the Markdown parser by replacing a single-pass loop with a multi‑phase, stateful workflow. It introduces internal types and enums (e.g. LoopAction, ContinuationResult, ListItemLoopState, NestedListMarker) and phases for blank-line classification, first-line block detection (fenced code, HTML block, ATX heading, blockquote, thematic break, nested list marker), quote-prefix handling, and indentation-based continuation including virtual-line restoration. Several internal helpers were made crate-visible and a new test file exercises list-continuation edge cases. Public API signatures are unchanged. Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
crates/biome_markdown_parser/src/syntax/list.rs (2)
1344-1350: Permanently deadunreachable!guard — simplify.
apply_blank_line_actionexhaustively matches allBlankLineActionvariants and only ever returnsBreakorContinue, so the guard at line 1347 never fires. It also silently becomes load-bearing ifFallThroughis ever added toapply_blank_line_action's return set (it would panic rather than propagate cleanly).♻️ Proposed cleanup
- let result = apply_blank_line_action(p, state, action, is_blank, false); - if !matches!(result, LoopAction::Break | LoopAction::Continue) { - unreachable!("classify_blank_line always produces Break or Continue"); - } - return (result, false); + return (apply_blank_line_action(p, state, action, is_blank, false), false);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1344 - 1350, The unreachable guard should be removed and the code simplified: call classify_blank_line, list_newline_is_blank_line, and apply_blank_line_action as before but do not assert that result matches LoopAction::Break | LoopAction::Continue; instead directly return (result, false). Update the block containing classify_blank_line, list_newline_is_blank_line, and apply_blank_line_action to eliminate the unreachable! check so the function returns the apply_blank_line_action result without a hard panic.
1435-1467: Dead_marker_line_breakparameter.
_marker_line_breakis never read inside the function and is always passed asfalsefrom the single call site. Consider dropping it —apply_blank_line_action_with_prefixhas a live analogue, but this one doesn't need it.♻️ Proposed cleanup
fn apply_blank_line_action( p: &mut MarkdownParser, state: &mut ListItemLoopState, action: BlankLineAction, is_blank: bool, - _marker_line_break: bool, ) -> LoopAction {- let result = apply_blank_line_action(p, state, action, is_blank, false); + let result = apply_blank_line_action(p, state, action, is_blank);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1435 - 1467, The parameter `_marker_line_break` on function apply_blank_line_action is unused and always passed false, so remove it from the signature and all call sites; update fn apply_blank_line_action(p: &mut MarkdownParser, state: &mut ListItemLoopState, action: BlankLineAction, is_blank: bool) -> LoopAction and delete the `_marker_line_break` argument where apply_blank_line_action(...) is invoked, then run tests/formatting to ensure no remaining references to `_marker_line_break`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1797-1815: The code handling the ">" quote marker should mirror
parse_quote: start the MD_QUOTE node before consuming/remapping the marker, then
consume the marker, call skip_optional_marker_space() to handle spaces/tabs and
indented-code heuristics, and call record_quote_indent() to record the quote
indent into parser state before parsing the inner block; specifically, move the
let quote_m = p.start() before p.bump_remap(T![>]), replace the manual
single-space skip with a call to skip_optional_marker_space(p), and invoke
record_quote_indent(p, quote_m) (or the equivalent record_quote_indent call used
by parse_quote) prior to parse_quote_block_list(p) and completing quote_m to
MD_QUOTE so the CST, quote_indents, and indentation heuristics remain consistent
with parse_quote.
- Around line 1752-1770: The manual construction of MD_HASH_LIST/MD_HASH and
subsequent calls to parse_header_content and parse_trailing_hashes in this block
duplicates parse_hash_list() logic and risks divergence; extract the repeated
hash-list creation into a shared helper (eg. new parse_hash_list_helper(p) or
build_hash_list_nodes(p)) and call it here instead of re-implementing the
sequence, or if extraction is infeasible, add a clear explanatory comment
referencing parse_hash_list(), parse_header, with_virtual_line_start, at_header,
and skip_line_indent to justify the manual steps; ensure the helper encapsulates
the start/completion of hash_m/hash_list_m and then calls parse_header_content
and parse_trailing_hashes so header_m.complete(p, MD_HEADER) and
state.record_first_line_block() remain here.
---
Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1344-1350: The unreachable guard should be removed and the code
simplified: call classify_blank_line, list_newline_is_blank_line, and
apply_blank_line_action as before but do not assert that result matches
LoopAction::Break | LoopAction::Continue; instead directly return (result,
false). Update the block containing classify_blank_line,
list_newline_is_blank_line, and apply_blank_line_action to eliminate the
unreachable! check so the function returns the apply_blank_line_action result
without a hard panic.
- Around line 1435-1467: The parameter `_marker_line_break` on function
apply_blank_line_action is unused and always passed false, so remove it from the
signature and all call sites; update fn apply_blank_line_action(p: &mut
MarkdownParser, state: &mut ListItemLoopState, action: BlankLineAction,
is_blank: bool) -> LoopAction and delete the `_marker_line_break` argument where
apply_blank_line_action(...) is invoked, then run tests/formatting to ensure no
remaining references to `_marker_line_break`.
There was a problem hiding this comment.
🧹 Nitpick comments (4)
crates/biome_markdown_parser/src/syntax/list.rs (4)
1401-1413: Secondifis redundant —elsesuffices.After the
if next_indent >= state.required_indentblock returns,next_indent < state.required_indentis tautologically true; anelseis clearer and avoids the redundant comparison.♻️ Suggested simplification
if next_indent >= state.required_indent { consume_quote_prefix(p, quote_depth); consume_blank_line(p); if !state.first_line { state.has_blank_line = true; } state.last_was_blank = true; state.first_line = false; return (LoopAction::Continue, line_has_quote_prefix); } - if next_indent < state.required_indent { - return (LoopAction::Break, line_has_quote_prefix); - } + return (LoopAction::Break, line_has_quote_prefix);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1401 - 1413, The second conditional comparing next_indent < state.required_indent is redundant because the previous branch returns when next_indent >= state.required_indent; replace that second if with an else to simplify control flow in the function in list.rs (the block that calls consume_quote_prefix, consume_blank_line and returns LoopAction::Continue/Break) so it becomes an else returning (LoopAction::Break, line_has_quote_prefix) instead of re-checking the comparison.
1775-1832: Triple manual save/restore is fragile.
prev_virtualandprev_requiredare restored identically in all three exit paths (lines 1815–1816, 1824–1825, 1829–1830). A RAII-style guard or a small closure would make this non-leakable and DRY — especially sinceparse_quoteitself (inquote.rs) doesn't have this pattern because it doesn't need to save/restore.Not a blocker, but worth tracking if the function grows.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1775 - 1832, The function parse_first_line_blockquote duplicates saving and restoring parser state (prev_virtual and prev_required) in three exit paths; create an RAII-style guard (e.g., a small struct Drop guard) or a closure that captures p and the saved values and restores p.state_mut().virtual_line_start and p.state_mut().list_item_required_indent on drop/return, then use it at the top of parse_first_line_blockquote (replacing the manual prev_virtual/prev_required saves and the three manual restores) so state is restored exactly once and non-leakably even if early returns or panics occur.
1329-1334: Redundanthas_quote_prefixguard.
consume_quote_prefixalready callshas_quote_prefixinternally and is a no-op when it returnsfalse, so the outerifis unnecessary.♻️ Suggested simplification
- if has_quote_prefix(p, quote_depth) { - consume_quote_prefix(p, quote_depth); - } + consume_quote_prefix(p, quote_depth);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1329 - 1334, Remove the redundant has_quote_prefix check and call consume_quote_prefix unconditionally: the outer if around has_quote_prefix(p, quote_depth) can be deleted since consume_quote_prefix(p, quote_depth) already guards internally and is a no-op when no prefix exists; update the sequence around m.complete(p, MD_NEWLINE) to directly call consume_quote_prefix(p, quote_depth) (referencing functions has_quote_prefix and consume_quote_prefix to locate the logic).
1436-1442: Dead parameter_marker_line_breakshould be removed.The parameter is never read (the
_prefix confirms this), and the call site always passesfalse. It inflates the signature and creates confusion with the analogousmarker_line_break: boolinapply_blank_line_action_with_prefixwhere it is used.♻️ Suggested fix
fn apply_blank_line_action( p: &mut MarkdownParser, state: &mut ListItemLoopState, action: BlankLineAction, is_blank: bool, - _marker_line_break: bool, ) -> LoopAction {- let result = apply_blank_line_action(p, state, action, is_blank, false); + let result = apply_blank_line_action(p, state, action, is_blank);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1436 - 1442, The function apply_blank_line_action currently has an unused parameter `_marker_line_break: bool`; remove this parameter from its signature and update all callers that pass the extraneous argument (they currently pass false) to call apply_blank_line_action with one fewer argument. Keep the rest of the signature and behavior intact; do not change apply_blank_line_action_with_prefix (which still uses marker_line_break). Search for references to apply_blank_line_action and adjust their argument lists, and run tests/compile to ensure no remaining uses of the removed parameter remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1752-1768: The code manually starts/completes MD_HASH and
MD_HASH_LIST using hash_m/hash_list_m which duplicates the logic in
parse_hash_list() and risks divergence; replace the inline start/complete
sequence with a single call to parse_hash_list(p) (or extract the shared
start/complete behavior into a helper used by both parse_hash_list and this
code) and then call parse_header_content(p) and parse_trailing_hashes(p) as
before; if parse_hash_list cannot be used directly for a specific reason, add a
concise comment near parse_header_content/parse_trailing_hashes explaining why
the shared helper is unsuitable to prevent future drift (referencing
parse_hash_list, MD_HASH, MD_HASH_LIST, hash_m, hash_list_m, and header_m).
---
Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1401-1413: The second conditional comparing next_indent <
state.required_indent is redundant because the previous branch returns when
next_indent >= state.required_indent; replace that second if with an else to
simplify control flow in the function in list.rs (the block that calls
consume_quote_prefix, consume_blank_line and returns LoopAction::Continue/Break)
so it becomes an else returning (LoopAction::Break, line_has_quote_prefix)
instead of re-checking the comparison.
- Around line 1775-1832: The function parse_first_line_blockquote duplicates
saving and restoring parser state (prev_virtual and prev_required) in three exit
paths; create an RAII-style guard (e.g., a small struct Drop guard) or a closure
that captures p and the saved values and restores
p.state_mut().virtual_line_start and p.state_mut().list_item_required_indent on
drop/return, then use it at the top of parse_first_line_blockquote (replacing
the manual prev_virtual/prev_required saves and the three manual restores) so
state is restored exactly once and non-leakably even if early returns or panics
occur.
- Around line 1329-1334: Remove the redundant has_quote_prefix check and call
consume_quote_prefix unconditionally: the outer if around has_quote_prefix(p,
quote_depth) can be deleted since consume_quote_prefix(p, quote_depth) already
guards internally and is a no-op when no prefix exists; update the sequence
around m.complete(p, MD_NEWLINE) to directly call consume_quote_prefix(p,
quote_depth) (referencing functions has_quote_prefix and consume_quote_prefix to
locate the logic).
- Around line 1436-1442: The function apply_blank_line_action currently has an
unused parameter `_marker_line_break: bool`; remove this parameter from its
signature and update all callers that pass the extraneous argument (they
currently pass false) to call apply_blank_line_action with one fewer argument.
Keep the rest of the signature and behavior intact; do not change
apply_blank_line_action_with_prefix (which still uses marker_line_break). Search
for references to apply_blank_line_action and adjust their argument lists, and
run tests/compile to ensure no remaining uses of the removed parameter remain.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
crates/biome_markdown_parser/src/syntax/list.rs (3)
1086-1214: Duplicated quote-prefix–stripping logic between two adjacent helpers.
quote_only_line_indent_at_currentandnext_quote_content_indenteach contain a nearly identicalfor _ in 0..depthbyte loop that consumesNlevels of>markers. The duplication spans ~20 lines. Extracting astrip_quote_prefix_bytes(bytes, pos, end, depth) -> Option<usize>helper (returning the position after all markers, orNoneif the prefix is absent) would keep both in sync.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1086 - 1214, Both functions quote_only_line_indent_at_current and next_quote_content_indent duplicate the same for _ in 0..depth loop that consumes N levels of '>' quote markers; extract that logic into a single helper strip_quote_prefix_bytes(bytes: &[u8], mut pos: usize, end: usize, depth: usize) -> Option<usize> which implements the identical behavior (handling up to 3 indentation columns with spaces/tabs, advancing past each '>' and an optional single following space/tab) and returns the new position after all markers or None if the prefix is absent, then call this helper from both quote_only_line_indent_at_current and next_quote_content_indent in place of the duplicated loop so both use the same implementation and return semantics.
1743-1772:atx_heading_infocarries ausizethat's never used.The lookahead returns
Option<usize>(hash count), but line 1770 only calls.is_none(). The inner value is silently dropped. Change the return type toboolto match intent and avoid the misleadingSome(hash_count)pattern.♻️ Proposed simplification
- let atx_heading_info = p.lookahead(|p| { + let is_atx_heading = p.lookahead(|p| { while p.at(MD_TEXTUAL_LITERAL) && is_whitespace_only(p.cur_text()) { p.bump(MD_TEXTUAL_LITERAL); } let is_hash = p.at(T![#]) || (p.at(MD_TEXTUAL_LITERAL) && p.cur_text().chars().all(|c| c == '#')); if !is_hash { - return None; + return false; } let text = p.cur_text(); let hash_count = text.len(); if !(1..=6).contains(&hash_count) { - return None; + return false; } p.bump(p.cur()); if p.at(NEWLINE) || p.at(T![EOF]) { - return Some(hash_count); + return true; } if p.at(MD_TEXTUAL_LITERAL) { let t = p.cur_text(); if t.starts_with(' ') || t.starts_with('\t') { - return Some(hash_count); + return true; } } - None + false }); - if atx_heading_info.is_none() { + if !is_atx_heading { return false; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1743 - 1772, The lookahead closure assigned to atx_heading_info returns Option<usize> but the usize is never used; change the closure to return bool instead: replace all returns of Some(hash_count) with true and None with false, remove the unused hash_count where not needed (or keep for validation but don't return it), and update the post-check from if atx_heading_info.is_none() to if !atx_heading_info (i.e., treat atx_heading_info as a bool). Keep the same checks inside the p.lookahead closure (MD_TEXTUAL_LITERAL, T![#], is_whitespace_only, NEWLINE/EOF, leading space/tab) but return true/false to reflect presence of an ATX heading.
1342-1351:unreachable!()guard is a latent production panic.
apply_blank_line_actioncan only returnBreakorContinuetoday, so the assertion is never triggered. But if someone adds aFallThrougharm later, this panics at runtime in both debug and release builds. Either drop the guard and return directly, or usedebug_unreachable!if you want the invariant documented.♻️ Simplification
- let result = apply_blank_line_action(p, state, action, is_blank, false); - if !matches!(result, LoopAction::Break | LoopAction::Continue) { - unreachable!("classify_blank_line always produces Break or Continue"); - } - return (result, false); + // apply_blank_line_action always returns Break or Continue + return (apply_blank_line_action(p, state, action, is_blank, false), false);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1342 - 1351, The unreachable!() guard after calling apply_blank_line_action is a latent runtime panic if new LoopAction variants are added; remove that guard and return the computed result directly: after computing action, is_blank and result from classify_blank_line, list_newline_is_blank_line and apply_blank_line_action, delete the if !matches!(result, LoopAction::Break | LoopAction::Continue) { unreachable!(...) } check and simply return (result, false). This change touches the block that calls classify_blank_line, list_newline_is_blank_line and apply_blank_line_action and eliminates the runtime panic risk.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1616-1630: Detected thematic break via is_thematic_break but
discarding parse_thematic_break_block() result can set state.first_line
incorrectly when parse returns Absent; modify the code to store the return value
of parse_thematic_break_block(p) (the ParseResult/BlockResult it returns), check
whether it actually produced a block (i.e., is not Absent) before calling
state.record_first_line_block() and returning LoopAction::Continue, and only
treat it as handled when parse_thematic_break_block succeeded; this ensures
parse_thematic_break_block, which gates on at_thematic_break_block / the
token-based is_thematic_break_pattern in thematic_break_block.rs, and the local
is_thematic_break_pattern stay consistent in list contexts and prevents
advancing first_line when nothing was emitted.
---
Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1086-1214: Both functions quote_only_line_indent_at_current and
next_quote_content_indent duplicate the same for _ in 0..depth loop that
consumes N levels of '>' quote markers; extract that logic into a single helper
strip_quote_prefix_bytes(bytes: &[u8], mut pos: usize, end: usize, depth: usize)
-> Option<usize> which implements the identical behavior (handling up to 3
indentation columns with spaces/tabs, advancing past each '>' and an optional
single following space/tab) and returns the new position after all markers or
None if the prefix is absent, then call this helper from both
quote_only_line_indent_at_current and next_quote_content_indent in place of the
duplicated loop so both use the same implementation and return semantics.
- Around line 1743-1772: The lookahead closure assigned to atx_heading_info
returns Option<usize> but the usize is never used; change the closure to return
bool instead: replace all returns of Some(hash_count) with true and None with
false, remove the unused hash_count where not needed (or keep for validation but
don't return it), and update the post-check from if atx_heading_info.is_none()
to if !atx_heading_info (i.e., treat atx_heading_info as a bool). Keep the same
checks inside the p.lookahead closure (MD_TEXTUAL_LITERAL, T![#],
is_whitespace_only, NEWLINE/EOF, leading space/tab) but return true/false to
reflect presence of an ATX heading.
- Around line 1342-1351: The unreachable!() guard after calling
apply_blank_line_action is a latent runtime panic if new LoopAction variants are
added; remove that guard and return the computed result directly: after
computing action, is_blank and result from classify_blank_line,
list_newline_is_blank_line and apply_blank_line_action, delete the if
!matches!(result, LoopAction::Break | LoopAction::Continue) { unreachable!(...)
} check and simply return (result, false). This change touches the block that
calls classify_blank_line, list_newline_is_blank_line and
apply_blank_line_action and eliminates the runtime panic risk.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
crates/biome_markdown_parser/src/syntax/list.rs (1)
1882-1901:state.first_line = falseat lines 1884 and 1894 are no-ops.The guard at line 1856 already returns
FallThroughwhenstate.first_lineistrue, so both assignments are unreachable in the truthy state. Easy to drop.♻️ Proposed cleanup
if at_bullet_list_item(p) { let _ = parse_bullet_list_item(p); state.last_block_was_paragraph = false; - state.first_line = false; p.state_mut().virtual_line_start = prev_virtual; return ContinuationResult { ... }; } if at_order_list_item(p) { let _ = parse_order_list_item(p); state.last_block_was_paragraph = false; - state.first_line = false; p.state_mut().virtual_line_start = prev_virtual; return ContinuationResult { ... }; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1882 - 1901, Remove the redundant no-op assignments to state.first_line = false inside the blocks that handle at_bullet_list_item and at_order_list_item: locate the branches where at_bullet_list_item(p) and at_order_list_item(p) call parse_bullet_list_item(p) / parse_order_list_item(p) and return a ContinuationResult, and delete the state.first_line = false statements (they're unreachable because the earlier guard returns FallThrough when state.first_line is true); leave the other statements (state.last_block_was_paragraph, p.state_mut().virtual_line_start, and the returned ContinuationResult) intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1622-1626: Capture and inspect the return value of
parse_thematic_break_block instead of discarding it: call
parse_thematic_break_block(p) when is_thematic_break is true, then only call
state.record_first_line_block() and return LoopAction::Continue if the parser
result indicates a successful block (e.g., Present); if
parse_thematic_break_block returns Absent (or otherwise not-successful), do not
flip first_line and allow normal parsing of the tokens as continuation content.
This change should reference parse_thematic_break_block,
is_thematic_break/is_thematic_break_pattern, at_thematic_break_block, and
state.record_first_line_block to ensure the detector and parser agree before
mutating state.
---
Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1882-1901: Remove the redundant no-op assignments to
state.first_line = false inside the blocks that handle at_bullet_list_item and
at_order_list_item: locate the branches where at_bullet_list_item(p) and
at_order_list_item(p) call parse_bullet_list_item(p) / parse_order_list_item(p)
and return a ContinuationResult, and delete the state.first_line = false
statements (they're unreachable because the earlier guard returns FallThrough
when state.first_line is true); leave the other statements
(state.last_block_was_paragraph, p.state_mut().virtual_line_start, and the
returned ContinuationResult) intact.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
crates/biome_markdown_parser/src/syntax/list.rs (2)
1334-1336:state.first_line = falsein Phase 2 is redundant.The guarding condition at line 1315 is
if !state.first_line && ..., so by the time we reach line 1335 we already knowstate.first_lineisfalse. The assignment is a no-op. Tiny, but worth trimming for clarity.✏️ Proposed cleanup
state.record_blank(); - state.first_line = false; return (LoopAction::Continue, false);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1334 - 1336, The assignment state.first_line = false is redundant in the Phase 2 branch because the guard at the top (if !state.first_line && ...) ensures state.first_line is already false; remove the redundant line from the block that calls state.record_blank() and returns (LoopAction::Continue, false) in list.rs to simplify the code and avoid no-op assignments.
1993-2043: Custom loop inparse_list_item_block_contenthas no explicit progress guard.If every phase helper returns
FallThroughandparse_any_block_with_indent_code_policyreturnsAbsentwithout consuming a token (however unlikely in practice), the loop stalls forever. The outerBulletList/OrderedListare correctly usingParseNodeList, but this inner content loop is bespoke. Worth adding at minimum adebug_assert!that the parser has advanced, or a token-consumed sentinel, as a safety net.Based on learnings: "Use
ParseSeparatedListandParseNodeListfor parsing lists with error recovery to avoid infinite loops" (crates/biome_parser/CONTRIBUTING.md).🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1993 - 2043, The custom loop inside parse_list_item_block_content can stall if none of the helpers consume input; add a progress guard: track whether the parser advanced this iteration (e.g., a token_consumed boolean set when any helper or parse_any_block_with_indent_code_policy consumes) and if not, either break with an error node or trigger a debug_assert! to fail in debug builds. Update the loop around calls to handle_blank_lines, handle_first_line_marker_only, parse_first_line_blocks, check_continuation_indent, and parse_continuation_block (and the call to parse_any_block_with_indent_code_policy) to set the sentinel when they consume tokens, and bail out / assert when no progress is observed; alternatively replace this bespoke loop with ParseNodeList/ParseSeparatedList per CONTRIBUTING.md for robust recovery.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1343-1348: The unreachable! panic message is misleading: it's
checking the return of apply_blank_line_action but claims "classify_blank_line
always produces Break or Continue"; update the message to correctly reference
apply_blank_line_action (e.g., "apply_blank_line_action always produces Break or
Continue") or remove the panic and simply assert the invariant using
debug_assert! so the check refers to apply_blank_line_action and the symbols
involved (list_newline_is_blank_line, apply_blank_line_action, LoopAction::Break
| LoopAction::Continue, unreachable!) are updated accordingly.
- Around line 2022-2026: The match on parse_first_line_blocks(...) includes a
dead LoopAction::Break => break arm even though parse_first_line_blocks never
returns Break; remove that unreachable arm and either replace it with a
catch-all that calls unreachable!() (e.g., _ => unreachable!()) to document
invariants, or keep only the two valid arms (LoopAction::Continue => continue,
LoopAction::FallThrough => {}) and add a debug_assert! inside
parse_first_line_blocks to reserve a Break path for future use; reference
parse_first_line_blocks and LoopAction::Break when making the change.
---
Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1334-1336: The assignment state.first_line = false is redundant in
the Phase 2 branch because the guard at the top (if !state.first_line && ...)
ensures state.first_line is already false; remove the redundant line from the
block that calls state.record_blank() and returns (LoopAction::Continue, false)
in list.rs to simplify the code and avoid no-op assignments.
- Around line 1993-2043: The custom loop inside parse_list_item_block_content
can stall if none of the helpers consume input; add a progress guard: track
whether the parser advanced this iteration (e.g., a token_consumed boolean set
when any helper or parse_any_block_with_indent_code_policy consumes) and if not,
either break with an error node or trigger a debug_assert! to fail in debug
builds. Update the loop around calls to handle_blank_lines,
handle_first_line_marker_only, parse_first_line_blocks,
check_continuation_indent, and parse_continuation_block (and the call to
parse_any_block_with_indent_code_policy) to set the sentinel when they consume
tokens, and bail out / assert when no progress is observed; alternatively
replace this bespoke loop with ParseNodeList/ParseSeparatedList per
CONTRIBUTING.md for robust recovery.
| let is_blank = list_newline_is_blank_line(p); | ||
| let result = apply_blank_line_action(p, state, action, is_blank); | ||
| if !matches!(result, LoopAction::Break | LoopAction::Continue) { | ||
| unreachable!("classify_blank_line always produces Break or Continue"); | ||
| } | ||
| let newline_has_quote_prefix = quote_depth > 0 | ||
| && p.at(NEWLINE) | ||
| && (p.at_line_start() || p.has_preceding_line_break()) | ||
| && has_quote_prefix(p, quote_depth); | ||
| return (result, false); |
There was a problem hiding this comment.
Misleading unreachable! message — references the wrong function.
The message "classify_blank_line always produces Break or Continue" is attached to a check on the return value of apply_blank_line_action, not classify_blank_line. Since apply_blank_line_action maps every BlankLineAction variant to either Continue or Break, the branch is provably dead — but the message blames the wrong callee, which will confuse anyone who ever trips over it.
✏️ Suggested fix
- if !matches!(result, LoopAction::Break | LoopAction::Continue) {
- unreachable!("classify_blank_line always produces Break or Continue");
- }
+ debug_assert!(
+ matches!(result, LoopAction::Break | LoopAction::Continue),
+ "apply_blank_line_action always produces Break or Continue"
+ );🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1343 - 1348,
The unreachable! panic message is misleading: it's checking the return of
apply_blank_line_action but claims "classify_blank_line always produces Break or
Continue"; update the message to correctly reference apply_blank_line_action
(e.g., "apply_blank_line_action always produces Break or Continue") or remove
the panic and simply assert the invariant using debug_assert! so the check
refers to apply_blank_line_action and the symbols involved
(list_newline_is_blank_line, apply_blank_line_action, LoopAction::Break |
LoopAction::Continue, unreachable!) are updated accordingly.
| match parse_first_line_blocks(p, &mut state, spaces_after_marker) { | ||
| LoopAction::Continue => continue, | ||
| LoopAction::Break => break, | ||
| LoopAction::FallThrough => {} | ||
| } |
There was a problem hiding this comment.
Dead LoopAction::Break arm — parse_first_line_blocks never returns Break.
Tracing every return path in parse_first_line_blocks (lines 1557–1708): all early-exit paths return Continue; the fall-through at line 1707 returns FallThrough. No path produces Break, so the LoopAction::Break => break arm at line 2024 is unreachable dead code that misleads readers into thinking a break condition can originate there.
✏️ Suggested fix
- match parse_first_line_blocks(p, &mut state, spaces_after_marker) {
- LoopAction::Continue => continue,
- LoopAction::Break => break,
- LoopAction::FallThrough => {}
- }
+ if let LoopAction::Continue =
+ parse_first_line_blocks(p, &mut state, spaces_after_marker)
+ {
+ continue;
+ }Alternatively, add a debug_assert! inside parse_first_line_blocks if a Break path is intentionally reserved for the future.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| match parse_first_line_blocks(p, &mut state, spaces_after_marker) { | |
| LoopAction::Continue => continue, | |
| LoopAction::Break => break, | |
| LoopAction::FallThrough => {} | |
| } | |
| if let LoopAction::Continue = | |
| parse_first_line_blocks(p, &mut state, spaces_after_marker) | |
| { | |
| continue; | |
| } |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 2022 - 2026,
The match on parse_first_line_blocks(...) includes a dead LoopAction::Break =>
break arm even though parse_first_line_blocks never returns Break; remove that
unreachable arm and either replace it with a catch-all that calls unreachable!()
(e.g., _ => unreachable!()) to document invariants, or keep only the two valid
arms (LoopAction::Continue => continue, LoopAction::FallThrough => {}) and add a
debug_assert! inside parse_first_line_blocks to reserve a Break path for future
use; reference parse_first_line_blocks and LoopAction::Break when making the
change.
|
It's fine for these changes to go on |
…phased helpers Refactor `parse_list_item_block_content` in `syntax/list.rs` into a small orchestrator with focused phase helpers while preserving parser behavior. - add explicit loop/state scaffolding (`ListItemLoopState`, `LoopAction`, `ContinuationResult`, `VirtualLineRestore`) - extract blank-line handling, first-line marker-only handling, first-line block parsing, continuation- indent checks, and continuation block parsing - hoist `NestedListMarker` to module scope and keep restore semantics explicit for `virtual_line_start` - remove minor duplicated/redundant branches in blank-line action handling No intended behavior change; parser tests and markdown conformance continue to pass.
Reverts the parse_first_line_header_body() extraction (single-call helper). Adds a brief inline comment explaining why header::parse_hash_list() cannot be reused here (bump_remap needed for MD_TEXTUAL_LITERAL-lexed '#').
4c3405a to
4654c23
Compare
Now targeting main |
ematipico
left a comment
There was a problem hiding this comment.
Let's remove code that panics in production
Replace `unreachable!()` calls with safe exhaustive match arms to avoid panics on unpredictable user input. Use `// #region` / `// #endregion` comments instead of decorative separator lines.
Note
AI Assistance Disclosure: This PR was developed with assistance from Claude Code.
Summary
Refactor
parse_list_item_block_contentincrates/biome_markdown_parser/src/syntax/list.rsinto phased helpers with an orchestrator loop.This keeps parser behavior the same while improving readability and maintainability by:
ListItemLoopState,LoopAction,ContinuationResult,VirtualLineRestore)NestedListMarkerto module scopeAdded list_continuation_edge_cases.md and snapshot coverage for high-risk list-continuation paths (quote-only blanks, marker-only items, nested-list continuation restore, lazy continuation). Regression-only; no intended behavior change.
Test Plan
just test-crate biome_markdown_parserjust test-markdown-conformancejust fjust lAll pass locally with no intended behavior change.
Documentation
Not applicable (internal parser refactor, no user-facing behavior change).