Skip to content

refactor(markdown): split list item parser#9167

Merged
ematipico merged 8 commits intobiomejs:mainfrom
jfmcdowell:refactor/list-item-block-content
Feb 23, 2026
Merged

refactor(markdown): split list item parser#9167
ematipico merged 8 commits intobiomejs:mainfrom
jfmcdowell:refactor/list-item-block-content

Conversation

@jfmcdowell
Copy link
Contributor

Note

AI Assistance Disclosure: This PR was developed with assistance from Claude Code.

Summary

Refactor parse_list_item_block_content in crates/biome_markdown_parser/src/syntax/list.rs into phased helpers with an orchestrator loop.

This keeps parser behavior the same while improving readability and maintainability by:

  • introducing explicit loop/state scaffolding (ListItemLoopState, LoopAction, ContinuationResult, VirtualLineRestore)
  • extracting phase helpers for blank-line handling, first-line block handling, continuation-indent checks, and continuation block parsing
  • hoisting NestedListMarker to module scope
  • removing minor duplicated/redundant branches in blank-line action handling

Added list_continuation_edge_cases.md and snapshot coverage for high-risk list-continuation paths (quote-only blanks, marker-only items, nested-list continuation restore, lazy continuation). Regression-only; no intended behavior change.

Test Plan

  • just test-crate biome_markdown_parser
  • just test-markdown-conformance
  • just f
  • just l

All pass locally with no intended behavior change.

Documentation

Not applicable (internal parser refactor, no user-facing behavior change).

@changeset-bot
Copy link

changeset-bot bot commented Feb 21, 2026

⚠️ No Changeset found

Latest commit: de1b1ca

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

@github-actions github-actions bot added the A-Parser Area: parser label Feb 21, 2026
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Feb 21, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

This PR refactors list-item block-content parsing in the Markdown parser by replacing a single-pass loop with a multi‑phase, stateful workflow. It introduces internal types and enums (e.g. LoopAction, ContinuationResult, ListItemLoopState, NestedListMarker) and phases for blank-line classification, first-line block detection (fenced code, HTML block, ATX heading, blockquote, thematic break, nested list marker), quote-prefix handling, and indentation-based continuation including virtual-line restoration. Several internal helpers were made crate-visible and a new test file exercises list-continuation edge cases. Public API signatures are unchanged.

Possibly related PRs

Suggested reviewers

  • dyc3
  • ematipico
🚥 Pre-merge checks | ✅ 2
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title 'refactor(markdown): split list item parser' accurately describes the main change: refactoring the list item parser function into phased helpers.
Description check ✅ Passed The description is comprehensive and directly related to the changeset, explaining the refactor goals, specific improvements, test coverage, and AI assistance disclosure.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
crates/biome_markdown_parser/src/syntax/list.rs (2)

1344-1350: Permanently dead unreachable! guard — simplify.

apply_blank_line_action exhaustively matches all BlankLineAction variants and only ever returns Break or Continue, so the guard at line 1347 never fires. It also silently becomes load-bearing if FallThrough is ever added to apply_blank_line_action's return set (it would panic rather than propagate cleanly).

♻️ Proposed cleanup
-        let result = apply_blank_line_action(p, state, action, is_blank, false);
-        if !matches!(result, LoopAction::Break | LoopAction::Continue) {
-            unreachable!("classify_blank_line always produces Break or Continue");
-        }
-        return (result, false);
+        return (apply_blank_line_action(p, state, action, is_blank, false), false);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1344 - 1350,
The unreachable guard should be removed and the code simplified: call
classify_blank_line, list_newline_is_blank_line, and apply_blank_line_action as
before but do not assert that result matches LoopAction::Break |
LoopAction::Continue; instead directly return (result, false). Update the block
containing classify_blank_line, list_newline_is_blank_line, and
apply_blank_line_action to eliminate the unreachable! check so the function
returns the apply_blank_line_action result without a hard panic.

1435-1467: Dead _marker_line_break parameter.

_marker_line_break is never read inside the function and is always passed as false from the single call site. Consider dropping it — apply_blank_line_action_with_prefix has a live analogue, but this one doesn't need it.

♻️ Proposed cleanup
 fn apply_blank_line_action(
     p: &mut MarkdownParser,
     state: &mut ListItemLoopState,
     action: BlankLineAction,
     is_blank: bool,
-    _marker_line_break: bool,
 ) -> LoopAction {
-        let result = apply_blank_line_action(p, state, action, is_blank, false);
+        let result = apply_blank_line_action(p, state, action, is_blank);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1435 - 1467,
The parameter `_marker_line_break` on function apply_blank_line_action is unused
and always passed false, so remove it from the signature and all call sites;
update fn apply_blank_line_action(p: &mut MarkdownParser, state: &mut
ListItemLoopState, action: BlankLineAction, is_blank: bool) -> LoopAction and
delete the `_marker_line_break` argument where apply_blank_line_action(...) is
invoked, then run tests/formatting to ensure no remaining references to
`_marker_line_break`.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1797-1815: The code handling the ">" quote marker should mirror
parse_quote: start the MD_QUOTE node before consuming/remapping the marker, then
consume the marker, call skip_optional_marker_space() to handle spaces/tabs and
indented-code heuristics, and call record_quote_indent() to record the quote
indent into parser state before parsing the inner block; specifically, move the
let quote_m = p.start() before p.bump_remap(T![>]), replace the manual
single-space skip with a call to skip_optional_marker_space(p), and invoke
record_quote_indent(p, quote_m) (or the equivalent record_quote_indent call used
by parse_quote) prior to parse_quote_block_list(p) and completing quote_m to
MD_QUOTE so the CST, quote_indents, and indentation heuristics remain consistent
with parse_quote.
- Around line 1752-1770: The manual construction of MD_HASH_LIST/MD_HASH and
subsequent calls to parse_header_content and parse_trailing_hashes in this block
duplicates parse_hash_list() logic and risks divergence; extract the repeated
hash-list creation into a shared helper (eg. new parse_hash_list_helper(p) or
build_hash_list_nodes(p)) and call it here instead of re-implementing the
sequence, or if extraction is infeasible, add a clear explanatory comment
referencing parse_hash_list(), parse_header, with_virtual_line_start, at_header,
and skip_line_indent to justify the manual steps; ensure the helper encapsulates
the start/completion of hash_m/hash_list_m and then calls parse_header_content
and parse_trailing_hashes so header_m.complete(p, MD_HEADER) and
state.record_first_line_block() remain here.

---

Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1344-1350: The unreachable guard should be removed and the code
simplified: call classify_blank_line, list_newline_is_blank_line, and
apply_blank_line_action as before but do not assert that result matches
LoopAction::Break | LoopAction::Continue; instead directly return (result,
false). Update the block containing classify_blank_line,
list_newline_is_blank_line, and apply_blank_line_action to eliminate the
unreachable! check so the function returns the apply_blank_line_action result
without a hard panic.
- Around line 1435-1467: The parameter `_marker_line_break` on function
apply_blank_line_action is unused and always passed false, so remove it from the
signature and all call sites; update fn apply_blank_line_action(p: &mut
MarkdownParser, state: &mut ListItemLoopState, action: BlankLineAction,
is_blank: bool) -> LoopAction and delete the `_marker_line_break` argument where
apply_blank_line_action(...) is invoked, then run tests/formatting to ensure no
remaining references to `_marker_line_break`.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (4)
crates/biome_markdown_parser/src/syntax/list.rs (4)

1401-1413: Second if is redundant — else suffices.

After the if next_indent >= state.required_indent block returns, next_indent < state.required_indent is tautologically true; an else is clearer and avoids the redundant comparison.

♻️ Suggested simplification
             if next_indent >= state.required_indent {
                 consume_quote_prefix(p, quote_depth);
                 consume_blank_line(p);
                 if !state.first_line {
                     state.has_blank_line = true;
                 }
                 state.last_was_blank = true;
                 state.first_line = false;
                 return (LoopAction::Continue, line_has_quote_prefix);
             }
-            if next_indent < state.required_indent {
-                return (LoopAction::Break, line_has_quote_prefix);
-            }
+            return (LoopAction::Break, line_has_quote_prefix);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1401 - 1413,
The second conditional comparing next_indent < state.required_indent is
redundant because the previous branch returns when next_indent >=
state.required_indent; replace that second if with an else to simplify control
flow in the function in list.rs (the block that calls consume_quote_prefix,
consume_blank_line and returns LoopAction::Continue/Break) so it becomes an else
returning (LoopAction::Break, line_has_quote_prefix) instead of re-checking the
comparison.

1775-1832: Triple manual save/restore is fragile.

prev_virtual and prev_required are restored identically in all three exit paths (lines 1815–1816, 1824–1825, 1829–1830). A RAII-style guard or a small closure would make this non-leakable and DRY — especially since parse_quote itself (in quote.rs) doesn't have this pattern because it doesn't need to save/restore.

Not a blocker, but worth tracking if the function grows.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1775 - 1832,
The function parse_first_line_blockquote duplicates saving and restoring parser
state (prev_virtual and prev_required) in three exit paths; create an RAII-style
guard (e.g., a small struct Drop guard) or a closure that captures p and the
saved values and restores p.state_mut().virtual_line_start and
p.state_mut().list_item_required_indent on drop/return, then use it at the top
of parse_first_line_blockquote (replacing the manual prev_virtual/prev_required
saves and the three manual restores) so state is restored exactly once and
non-leakably even if early returns or panics occur.

1329-1334: Redundant has_quote_prefix guard.

consume_quote_prefix already calls has_quote_prefix internally and is a no-op when it returns false, so the outer if is unnecessary.

♻️ Suggested simplification
-            if has_quote_prefix(p, quote_depth) {
-                consume_quote_prefix(p, quote_depth);
-            }
+            consume_quote_prefix(p, quote_depth);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1329 - 1334,
Remove the redundant has_quote_prefix check and call consume_quote_prefix
unconditionally: the outer if around has_quote_prefix(p, quote_depth) can be
deleted since consume_quote_prefix(p, quote_depth) already guards internally and
is a no-op when no prefix exists; update the sequence around m.complete(p,
MD_NEWLINE) to directly call consume_quote_prefix(p, quote_depth) (referencing
functions has_quote_prefix and consume_quote_prefix to locate the logic).

1436-1442: Dead parameter _marker_line_break should be removed.

The parameter is never read (the _ prefix confirms this), and the call site always passes false. It inflates the signature and creates confusion with the analogous marker_line_break: bool in apply_blank_line_action_with_prefix where it is used.

♻️ Suggested fix
 fn apply_blank_line_action(
     p: &mut MarkdownParser,
     state: &mut ListItemLoopState,
     action: BlankLineAction,
     is_blank: bool,
-    _marker_line_break: bool,
 ) -> LoopAction {
-        let result = apply_blank_line_action(p, state, action, is_blank, false);
+        let result = apply_blank_line_action(p, state, action, is_blank);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1436 - 1442,
The function apply_blank_line_action currently has an unused parameter
`_marker_line_break: bool`; remove this parameter from its signature and update
all callers that pass the extraneous argument (they currently pass false) to
call apply_blank_line_action with one fewer argument. Keep the rest of the
signature and behavior intact; do not change apply_blank_line_action_with_prefix
(which still uses marker_line_break). Search for references to
apply_blank_line_action and adjust their argument lists, and run tests/compile
to ensure no remaining uses of the removed parameter remain.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1752-1768: The code manually starts/completes MD_HASH and
MD_HASH_LIST using hash_m/hash_list_m which duplicates the logic in
parse_hash_list() and risks divergence; replace the inline start/complete
sequence with a single call to parse_hash_list(p) (or extract the shared
start/complete behavior into a helper used by both parse_hash_list and this
code) and then call parse_header_content(p) and parse_trailing_hashes(p) as
before; if parse_hash_list cannot be used directly for a specific reason, add a
concise comment near parse_header_content/parse_trailing_hashes explaining why
the shared helper is unsuitable to prevent future drift (referencing
parse_hash_list, MD_HASH, MD_HASH_LIST, hash_m, hash_list_m, and header_m).

---

Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1401-1413: The second conditional comparing next_indent <
state.required_indent is redundant because the previous branch returns when
next_indent >= state.required_indent; replace that second if with an else to
simplify control flow in the function in list.rs (the block that calls
consume_quote_prefix, consume_blank_line and returns LoopAction::Continue/Break)
so it becomes an else returning (LoopAction::Break, line_has_quote_prefix)
instead of re-checking the comparison.
- Around line 1775-1832: The function parse_first_line_blockquote duplicates
saving and restoring parser state (prev_virtual and prev_required) in three exit
paths; create an RAII-style guard (e.g., a small struct Drop guard) or a closure
that captures p and the saved values and restores
p.state_mut().virtual_line_start and p.state_mut().list_item_required_indent on
drop/return, then use it at the top of parse_first_line_blockquote (replacing
the manual prev_virtual/prev_required saves and the three manual restores) so
state is restored exactly once and non-leakably even if early returns or panics
occur.
- Around line 1329-1334: Remove the redundant has_quote_prefix check and call
consume_quote_prefix unconditionally: the outer if around has_quote_prefix(p,
quote_depth) can be deleted since consume_quote_prefix(p, quote_depth) already
guards internally and is a no-op when no prefix exists; update the sequence
around m.complete(p, MD_NEWLINE) to directly call consume_quote_prefix(p,
quote_depth) (referencing functions has_quote_prefix and consume_quote_prefix to
locate the logic).
- Around line 1436-1442: The function apply_blank_line_action currently has an
unused parameter `_marker_line_break: bool`; remove this parameter from its
signature and update all callers that pass the extraneous argument (they
currently pass false) to call apply_blank_line_action with one fewer argument.
Keep the rest of the signature and behavior intact; do not change
apply_blank_line_action_with_prefix (which still uses marker_line_break). Search
for references to apply_blank_line_action and adjust their argument lists, and
run tests/compile to ensure no remaining uses of the removed parameter remain.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (3)
crates/biome_markdown_parser/src/syntax/list.rs (3)

1086-1214: Duplicated quote-prefix–stripping logic between two adjacent helpers.

quote_only_line_indent_at_current and next_quote_content_indent each contain a nearly identical for _ in 0..depth byte loop that consumes N levels of > markers. The duplication spans ~20 lines. Extracting a strip_quote_prefix_bytes(bytes, pos, end, depth) -> Option<usize> helper (returning the position after all markers, or None if the prefix is absent) would keep both in sync.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1086 - 1214,
Both functions quote_only_line_indent_at_current and next_quote_content_indent
duplicate the same for _ in 0..depth loop that consumes N levels of '>' quote
markers; extract that logic into a single helper strip_quote_prefix_bytes(bytes:
&[u8], mut pos: usize, end: usize, depth: usize) -> Option<usize> which
implements the identical behavior (handling up to 3 indentation columns with
spaces/tabs, advancing past each '>' and an optional single following space/tab)
and returns the new position after all markers or None if the prefix is absent,
then call this helper from both quote_only_line_indent_at_current and
next_quote_content_indent in place of the duplicated loop so both use the same
implementation and return semantics.

1743-1772: atx_heading_info carries a usize that's never used.

The lookahead returns Option<usize> (hash count), but line 1770 only calls .is_none(). The inner value is silently dropped. Change the return type to bool to match intent and avoid the misleading Some(hash_count) pattern.

♻️ Proposed simplification
-    let atx_heading_info = p.lookahead(|p| {
+    let is_atx_heading = p.lookahead(|p| {
         while p.at(MD_TEXTUAL_LITERAL) && is_whitespace_only(p.cur_text()) {
             p.bump(MD_TEXTUAL_LITERAL);
         }
         let is_hash =
             p.at(T![#]) || (p.at(MD_TEXTUAL_LITERAL) && p.cur_text().chars().all(|c| c == '#'));
         if !is_hash {
-            return None;
+            return false;
         }
         let text = p.cur_text();
         let hash_count = text.len();
         if !(1..=6).contains(&hash_count) {
-            return None;
+            return false;
         }
         p.bump(p.cur());
         if p.at(NEWLINE) || p.at(T![EOF]) {
-            return Some(hash_count);
+            return true;
         }
         if p.at(MD_TEXTUAL_LITERAL) {
             let t = p.cur_text();
             if t.starts_with(' ') || t.starts_with('\t') {
-                return Some(hash_count);
+                return true;
             }
         }
-        None
+        false
     });

-    if atx_heading_info.is_none() {
+    if !is_atx_heading {
         return false;
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1743 - 1772,
The lookahead closure assigned to atx_heading_info returns Option<usize> but the
usize is never used; change the closure to return bool instead: replace all
returns of Some(hash_count) with true and None with false, remove the unused
hash_count where not needed (or keep for validation but don't return it), and
update the post-check from if atx_heading_info.is_none() to if !atx_heading_info
(i.e., treat atx_heading_info as a bool). Keep the same checks inside the
p.lookahead closure (MD_TEXTUAL_LITERAL, T![#], is_whitespace_only, NEWLINE/EOF,
leading space/tab) but return true/false to reflect presence of an ATX heading.

1342-1351: unreachable!() guard is a latent production panic.

apply_blank_line_action can only return Break or Continue today, so the assertion is never triggered. But if someone adds a FallThrough arm later, this panics at runtime in both debug and release builds. Either drop the guard and return directly, or use debug_unreachable! if you want the invariant documented.

♻️ Simplification
-        let result = apply_blank_line_action(p, state, action, is_blank, false);
-        if !matches!(result, LoopAction::Break | LoopAction::Continue) {
-            unreachable!("classify_blank_line always produces Break or Continue");
-        }
-        return (result, false);
+        // apply_blank_line_action always returns Break or Continue
+        return (apply_blank_line_action(p, state, action, is_blank, false), false);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1342 - 1351,
The unreachable!() guard after calling apply_blank_line_action is a latent
runtime panic if new LoopAction variants are added; remove that guard and return
the computed result directly: after computing action, is_blank and result from
classify_blank_line, list_newline_is_blank_line and apply_blank_line_action,
delete the if !matches!(result, LoopAction::Break | LoopAction::Continue) {
unreachable!(...) } check and simply return (result, false). This change touches
the block that calls classify_blank_line, list_newline_is_blank_line and
apply_blank_line_action and eliminates the runtime panic risk.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1616-1630: Detected thematic break via is_thematic_break but
discarding parse_thematic_break_block() result can set state.first_line
incorrectly when parse returns Absent; modify the code to store the return value
of parse_thematic_break_block(p) (the ParseResult/BlockResult it returns), check
whether it actually produced a block (i.e., is not Absent) before calling
state.record_first_line_block() and returning LoopAction::Continue, and only
treat it as handled when parse_thematic_break_block succeeded; this ensures
parse_thematic_break_block, which gates on at_thematic_break_block / the
token-based is_thematic_break_pattern in thematic_break_block.rs, and the local
is_thematic_break_pattern stay consistent in list contexts and prevents
advancing first_line when nothing was emitted.

---

Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1086-1214: Both functions quote_only_line_indent_at_current and
next_quote_content_indent duplicate the same for _ in 0..depth loop that
consumes N levels of '>' quote markers; extract that logic into a single helper
strip_quote_prefix_bytes(bytes: &[u8], mut pos: usize, end: usize, depth: usize)
-> Option<usize> which implements the identical behavior (handling up to 3
indentation columns with spaces/tabs, advancing past each '>' and an optional
single following space/tab) and returns the new position after all markers or
None if the prefix is absent, then call this helper from both
quote_only_line_indent_at_current and next_quote_content_indent in place of the
duplicated loop so both use the same implementation and return semantics.
- Around line 1743-1772: The lookahead closure assigned to atx_heading_info
returns Option<usize> but the usize is never used; change the closure to return
bool instead: replace all returns of Some(hash_count) with true and None with
false, remove the unused hash_count where not needed (or keep for validation but
don't return it), and update the post-check from if atx_heading_info.is_none()
to if !atx_heading_info (i.e., treat atx_heading_info as a bool). Keep the same
checks inside the p.lookahead closure (MD_TEXTUAL_LITERAL, T![#],
is_whitespace_only, NEWLINE/EOF, leading space/tab) but return true/false to
reflect presence of an ATX heading.
- Around line 1342-1351: The unreachable!() guard after calling
apply_blank_line_action is a latent runtime panic if new LoopAction variants are
added; remove that guard and return the computed result directly: after
computing action, is_blank and result from classify_blank_line,
list_newline_is_blank_line and apply_blank_line_action, delete the if
!matches!(result, LoopAction::Break | LoopAction::Continue) { unreachable!(...)
} check and simply return (result, false). This change touches the block that
calls classify_blank_line, list_newline_is_blank_line and
apply_blank_line_action and eliminates the runtime panic risk.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
crates/biome_markdown_parser/src/syntax/list.rs (1)

1882-1901: state.first_line = false at lines 1884 and 1894 are no-ops.

The guard at line 1856 already returns FallThrough when state.first_line is true, so both assignments are unreachable in the truthy state. Easy to drop.

♻️ Proposed cleanup
         if at_bullet_list_item(p) {
             let _ = parse_bullet_list_item(p);
             state.last_block_was_paragraph = false;
-            state.first_line = false;
             p.state_mut().virtual_line_start = prev_virtual;
             return ContinuationResult { ... };
         }
         if at_order_list_item(p) {
             let _ = parse_order_list_item(p);
             state.last_block_was_paragraph = false;
-            state.first_line = false;
             p.state_mut().virtual_line_start = prev_virtual;
             return ContinuationResult { ... };
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1882 - 1901,
Remove the redundant no-op assignments to state.first_line = false inside the
blocks that handle at_bullet_list_item and at_order_list_item: locate the
branches where at_bullet_list_item(p) and at_order_list_item(p) call
parse_bullet_list_item(p) / parse_order_list_item(p) and return a
ContinuationResult, and delete the state.first_line = false statements (they're
unreachable because the earlier guard returns FallThrough when state.first_line
is true); leave the other statements (state.last_block_was_paragraph,
p.state_mut().virtual_line_start, and the returned ContinuationResult) intact.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1622-1626: Capture and inspect the return value of
parse_thematic_break_block instead of discarding it: call
parse_thematic_break_block(p) when is_thematic_break is true, then only call
state.record_first_line_block() and return LoopAction::Continue if the parser
result indicates a successful block (e.g., Present); if
parse_thematic_break_block returns Absent (or otherwise not-successful), do not
flip first_line and allow normal parsing of the tokens as continuation content.
This change should reference parse_thematic_break_block,
is_thematic_break/is_thematic_break_pattern, at_thematic_break_block, and
state.record_first_line_block to ensure the detector and parser agree before
mutating state.

---

Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1882-1901: Remove the redundant no-op assignments to
state.first_line = false inside the blocks that handle at_bullet_list_item and
at_order_list_item: locate the branches where at_bullet_list_item(p) and
at_order_list_item(p) call parse_bullet_list_item(p) / parse_order_list_item(p)
and return a ContinuationResult, and delete the state.first_line = false
statements (they're unreachable because the earlier guard returns FallThrough
when state.first_line is true); leave the other statements
(state.last_block_was_paragraph, p.state_mut().virtual_line_start, and the
returned ContinuationResult) intact.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
crates/biome_markdown_parser/src/syntax/list.rs (2)

1334-1336: state.first_line = false in Phase 2 is redundant.

The guarding condition at line 1315 is if !state.first_line && ..., so by the time we reach line 1335 we already know state.first_line is false. The assignment is a no-op. Tiny, but worth trimming for clarity.

✏️ Proposed cleanup
         state.record_blank();
-        state.first_line = false;
         return (LoopAction::Continue, false);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1334 - 1336,
The assignment state.first_line = false is redundant in the Phase 2 branch
because the guard at the top (if !state.first_line && ...) ensures
state.first_line is already false; remove the redundant line from the block that
calls state.record_blank() and returns (LoopAction::Continue, false) in list.rs
to simplify the code and avoid no-op assignments.

1993-2043: Custom loop in parse_list_item_block_content has no explicit progress guard.

If every phase helper returns FallThrough and parse_any_block_with_indent_code_policy returns Absent without consuming a token (however unlikely in practice), the loop stalls forever. The outer BulletList/OrderedList are correctly using ParseNodeList, but this inner content loop is bespoke. Worth adding at minimum a debug_assert! that the parser has advanced, or a token-consumed sentinel, as a safety net.

Based on learnings: "Use ParseSeparatedList and ParseNodeList for parsing lists with error recovery to avoid infinite loops" (crates/biome_parser/CONTRIBUTING.md).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1993 - 2043,
The custom loop inside parse_list_item_block_content can stall if none of the
helpers consume input; add a progress guard: track whether the parser advanced
this iteration (e.g., a token_consumed boolean set when any helper or
parse_any_block_with_indent_code_policy consumes) and if not, either break with
an error node or trigger a debug_assert! to fail in debug builds. Update the
loop around calls to handle_blank_lines, handle_first_line_marker_only,
parse_first_line_blocks, check_continuation_indent, and parse_continuation_block
(and the call to parse_any_block_with_indent_code_policy) to set the sentinel
when they consume tokens, and bail out / assert when no progress is observed;
alternatively replace this bespoke loop with ParseNodeList/ParseSeparatedList
per CONTRIBUTING.md for robust recovery.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1343-1348: The unreachable! panic message is misleading: it's
checking the return of apply_blank_line_action but claims "classify_blank_line
always produces Break or Continue"; update the message to correctly reference
apply_blank_line_action (e.g., "apply_blank_line_action always produces Break or
Continue") or remove the panic and simply assert the invariant using
debug_assert! so the check refers to apply_blank_line_action and the symbols
involved (list_newline_is_blank_line, apply_blank_line_action, LoopAction::Break
| LoopAction::Continue, unreachable!) are updated accordingly.
- Around line 2022-2026: The match on parse_first_line_blocks(...) includes a
dead LoopAction::Break => break arm even though parse_first_line_blocks never
returns Break; remove that unreachable arm and either replace it with a
catch-all that calls unreachable!() (e.g., _ => unreachable!()) to document
invariants, or keep only the two valid arms (LoopAction::Continue => continue,
LoopAction::FallThrough => {}) and add a debug_assert! inside
parse_first_line_blocks to reserve a Break path for future use; reference
parse_first_line_blocks and LoopAction::Break when making the change.

---

Nitpick comments:
In `@crates/biome_markdown_parser/src/syntax/list.rs`:
- Around line 1334-1336: The assignment state.first_line = false is redundant in
the Phase 2 branch because the guard at the top (if !state.first_line && ...)
ensures state.first_line is already false; remove the redundant line from the
block that calls state.record_blank() and returns (LoopAction::Continue, false)
in list.rs to simplify the code and avoid no-op assignments.
- Around line 1993-2043: The custom loop inside parse_list_item_block_content
can stall if none of the helpers consume input; add a progress guard: track
whether the parser advanced this iteration (e.g., a token_consumed boolean set
when any helper or parse_any_block_with_indent_code_policy consumes) and if not,
either break with an error node or trigger a debug_assert! to fail in debug
builds. Update the loop around calls to handle_blank_lines,
handle_first_line_marker_only, parse_first_line_blocks,
check_continuation_indent, and parse_continuation_block (and the call to
parse_any_block_with_indent_code_policy) to set the sentinel when they consume
tokens, and bail out / assert when no progress is observed; alternatively
replace this bespoke loop with ParseNodeList/ParseSeparatedList per
CONTRIBUTING.md for robust recovery.

Comment on lines +1343 to +1348
let is_blank = list_newline_is_blank_line(p);
let result = apply_blank_line_action(p, state, action, is_blank);
if !matches!(result, LoopAction::Break | LoopAction::Continue) {
unreachable!("classify_blank_line always produces Break or Continue");
}
let newline_has_quote_prefix = quote_depth > 0
&& p.at(NEWLINE)
&& (p.at_line_start() || p.has_preceding_line_break())
&& has_quote_prefix(p, quote_depth);
return (result, false);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Misleading unreachable! message — references the wrong function.

The message "classify_blank_line always produces Break or Continue" is attached to a check on the return value of apply_blank_line_action, not classify_blank_line. Since apply_blank_line_action maps every BlankLineAction variant to either Continue or Break, the branch is provably dead — but the message blames the wrong callee, which will confuse anyone who ever trips over it.

✏️ Suggested fix
-    if !matches!(result, LoopAction::Break | LoopAction::Continue) {
-        unreachable!("classify_blank_line always produces Break or Continue");
-    }
+    debug_assert!(
+        matches!(result, LoopAction::Break | LoopAction::Continue),
+        "apply_blank_line_action always produces Break or Continue"
+    );
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 1343 - 1348,
The unreachable! panic message is misleading: it's checking the return of
apply_blank_line_action but claims "classify_blank_line always produces Break or
Continue"; update the message to correctly reference apply_blank_line_action
(e.g., "apply_blank_line_action always produces Break or Continue") or remove
the panic and simply assert the invariant using debug_assert! so the check
refers to apply_blank_line_action and the symbols involved
(list_newline_is_blank_line, apply_blank_line_action, LoopAction::Break |
LoopAction::Continue, unreachable!) are updated accordingly.

Comment on lines +2022 to +2026
match parse_first_line_blocks(p, &mut state, spaces_after_marker) {
LoopAction::Continue => continue,
LoopAction::Break => break,
LoopAction::FallThrough => {}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Dead LoopAction::Break arm — parse_first_line_blocks never returns Break.

Tracing every return path in parse_first_line_blocks (lines 1557–1708): all early-exit paths return Continue; the fall-through at line 1707 returns FallThrough. No path produces Break, so the LoopAction::Break => break arm at line 2024 is unreachable dead code that misleads readers into thinking a break condition can originate there.

✏️ Suggested fix
-        match parse_first_line_blocks(p, &mut state, spaces_after_marker) {
-            LoopAction::Continue => continue,
-            LoopAction::Break => break,
-            LoopAction::FallThrough => {}
-        }
+        if let LoopAction::Continue =
+            parse_first_line_blocks(p, &mut state, spaces_after_marker)
+        {
+            continue;
+        }

Alternatively, add a debug_assert! inside parse_first_line_blocks if a Break path is intentionally reserved for the future.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
match parse_first_line_blocks(p, &mut state, spaces_after_marker) {
LoopAction::Continue => continue,
LoopAction::Break => break,
LoopAction::FallThrough => {}
}
if let LoopAction::Continue =
parse_first_line_blocks(p, &mut state, spaces_after_marker)
{
continue;
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/biome_markdown_parser/src/syntax/list.rs` around lines 2022 - 2026,
The match on parse_first_line_blocks(...) includes a dead LoopAction::Break =>
break arm even though parse_first_line_blocks never returns Break; remove that
unreachable arm and either replace it with a catch-all that calls unreachable!()
(e.g., _ => unreachable!()) to document invariants, or keep only the two valid
arms (LoopAction::Continue => continue, LoopAction::FallThrough => {}) and add a
debug_assert! inside parse_first_line_blocks to reserve a Break path for future
use; reference parse_first_line_blocks and LoopAction::Break when making the
change.

@dyc3
Copy link
Contributor

dyc3 commented Feb 21, 2026

It's fine for these changes to go on main since markdown isn't exposed publicly yet.

…phased helpers

Refactor `parse_list_item_block_content` in `syntax/list.rs` into a small orchestrator with focused phase
helpers while preserving parser behavior.

- add explicit loop/state scaffolding (`ListItemLoopState`, `LoopAction`, `ContinuationResult`,
  `VirtualLineRestore`)
- extract blank-line handling, first-line marker-only handling, first-line block parsing, continuation-
  indent checks, and continuation block parsing
- hoist `NestedListMarker` to module scope and keep restore semantics explicit for `virtual_line_start`
- remove minor duplicated/redundant branches in blank-line action handling

No intended behavior change; parser tests and markdown conformance continue to pass.
Reverts the parse_first_line_header_body() extraction (single-call helper).
Adds a brief inline comment explaining why header::parse_hash_list() cannot
be reused here (bump_remap needed for MD_TEXTUAL_LITERAL-lexed '#').
@jfmcdowell jfmcdowell force-pushed the refactor/list-item-block-content branch from 4c3405a to 4654c23 Compare February 22, 2026 00:44
@jfmcdowell jfmcdowell changed the base branch from next to main February 22, 2026 00:44
@jfmcdowell
Copy link
Contributor Author

It's fine for these changes to go on main since markdown isn't exposed publicly yet.

Now targeting main

Copy link
Member

@ematipico ematipico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's remove code that panics in production

Replace `unreachable!()` calls with safe exhaustive match arms to avoid
panics on unpredictable user input. Use `// #region` / `// #endregion`
comments instead of decorative separator lines.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-Parser Area: parser

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants