fix(md-formatter): remove last hard line break by tidefield · Pull Request #9856 · biomejs/biome

tidefield · 2026-04-08T08:35:53Z

Summary

By default, we keep all hard line breaks to preserve the original semantics. To match Prettier's behavior, this PR removes the last hard line break in a paragraph, since MD_INLINE_ITEM_LIST always ends with an MD_TEXTUAL `\n`.

See the AST in https://spec.commonmark.org/dingus/?text=foo%20%20%0Abaz%20%20%20%0A%0Afoo for a better demonstration.

I added more test coverage, which revealed a bug in the parser. I'll comment below and file an issue.

Test Plan

$ cargo test -p biome_markdown_formatter

Docs

changeset-bot · 2026-04-08T08:36:09Z

⚠️ No Changeset found

Latest commit: 39dc847

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


coderabbitai · 2026-04-08T08:38:34Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 8fc11f0b-1d87-4d61-aae9-ace97b1496d2

📥 Commits

Reviewing files that changed from the base of the PR and between 2d041a8 and 395d479.

⛔ Files ignored due to path filters (3)

crates/biome_markdown_formatter/tests/specs/markdown/hard_line.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-186.md.snap is excluded by !**/*.snap and included by **
crates/biome_markdown_formatter/tests/specs/prettier/markdown/spec/example-609.md.snap is excluded by !**/*.snap and included by **

📒 Files selected for processing (3)

crates/biome_markdown_formatter/src/markdown/auxiliary/hard_line.rs
crates/biome_markdown_formatter/tests/quick_test.rs
crates/biome_markdown_formatter/tests/specs/markdown/hard_line.md

✅ Files skipped from review due to trivial changes (1)

crates/biome_markdown_formatter/tests/specs/markdown/hard_line.md

🚧 Files skipped from review as they are similar to previous changes (2)

crates/biome_markdown_formatter/tests/quick_test.rs
crates/biome_markdown_formatter/src/markdown/auxiliary/hard_line.rs

Walkthrough

The formatter for Markdown hard lines now examines the surrounding syntax (including MarkdownSyntaxKind and Direction) to decide whether an MdHardLine is the last one in its paragraph. If there is no next sibling, or the next sibling is textual but trims to empty, the formatter emits format_removed(&token) followed by a single hard_line_break(), omitting the reinserted two-space marker. In all other cases it still reinserts exactly two spaces before the hard line break. Tests and fixtures were updated to cover backslash continuations, double-space hard breaks, and cases with or without intervening blank lines.
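The decision described in the walkthrough can be sketched in isolation. This is a minimal model, not the code in hard_line.rs: the `Inline` and `Emitted` enums and both functions are hypothetical stand-ins for Biome's syntax nodes and for the `format_removed` / `hard_line_break` calls. Treating a following hard line as "not last" is also an assumption of this sketch.

```rust
/// Hypothetical, simplified stand-in for the inline items of one paragraph.
#[derive(Debug, Clone, PartialEq)]
enum Inline {
    Text(String),
    HardLine,
}

/// What this toy formatter emits for a hard line.
#[derive(Debug, PartialEq)]
enum Emitted {
    /// Two-space marker followed by a line break.
    MarkerAndBreak,
    /// Line break only; the marker is dropped.
    BreakOnly,
}

/// A hard line counts as last when nothing follows it, or when the next
/// sibling is text that trims to empty (for example a lone "\n").
fn is_last_hard_line(items: &[Inline], index: usize) -> bool {
    match items.get(index + 1) {
        None => true,
        Some(Inline::Text(text)) => text.trim().is_empty(),
        // Assumption of this sketch: another hard line means "not last".
        Some(Inline::HardLine) => false,
    }
}

/// Chooses the output for the hard line at `index`.
fn format_hard_line(items: &[Inline], index: usize) -> Emitted {
    if is_last_hard_line(items, index) {
        Emitted::BreakOnly
    } else {
        Emitted::MarkerAndBreak
    }
}
```

In this model, a hard line followed by `Inline::Text("bar")` keeps its marker, while one followed only by `Inline::Text("\n")` or by nothing is reduced to a plain break.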

Suggested reviewers

dyc3
ematipico

🚥 Pre-merge checks | ✅ 2

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title directly describes the main change: removing the last hard line break in the Markdown formatter to match Prettier's behaviour.
Description check	✅ Passed	The description clearly explains the motivation (matching Prettier behaviour), references the technical reason (MD_INLINE_ITEM_LIST ending), and provides test instructions.


tidefield · 2026-04-08T08:39:47Z

 no hard line
 here

+foo  


This is incorrect but the bug is in the parser.

CST for this block is currently

6: MD_PARAGRAPH@43..115
  0: MD_INLINE_ITEM_LIST@43..115
    0: MD_TEXTUAL@43..46
      0: MD_TEXTUAL_LITERAL@43..46 "foo" [] []
    1: MD_HARD_LINE@46..49
      0: MD_HARD_LINE_LITERAL@46..49 "  \n" [] []
    2: MD_TEXTUAL@49..74
      0: MD_TEXTUAL_LITERAL@49..74 "bar with empty line after" [] []
    3: MD_HARD_LINE@74..77
      0: MD_HARD_LINE_LITERAL@74..77 "  \n" [] []
    4: MD_TEXTUAL@77..78
      0: MD_TEXTUAL_LITERAL@77..78 "\n" [] []
    5: MD_TEXTUAL@78..81
      0: MD_TEXTUAL_LITERAL@78..81 "foo" [] []
    6: MD_HARD_LINE@81..84
      0: MD_HARD_LINE_LITERAL@81..84 "  \n" [] []
    7: MD_TEXTUAL@84..112
      0: MD_TEXTUAL_LITERAL@84..112 "bar without empty line after" [] []
    8: MD_HARD_LINE@112..115
      0: MD_HARD_LINE_LITERAL@112..115 "  \n" [] []
  1: (empty)

The expected behavior is that there should be two MD_PARAGRAPH nodes delimited by the empty line.

See AST tab in https://spec.commonmark.org/dingus/?text=foo%20%20%20%20%20%20%20%0Abaz%0A%0Abackslash%5C%0Aform%0A%0Ano%20hard%20line%0Ahere%0A%0Afoo%20%20%0Abar%20with%20empty%20line%20after%20%20%0A%0Afoo%20%20%0Abar%20without%20empty%20line%20after%20%20.
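To make the expected paragraph boundary concrete, here is a small line-based sketch. It is not Biome's parser and ignores every other block construct (lists, code fences, headings); the `split_paragraphs` helper is hypothetical and only illustrates that a blank line should end one paragraph and start another.

```rust
/// Hypothetical helper: groups consecutive non-blank lines into paragraphs.
/// A line that is empty or contains only whitespace acts as a separator.
fn split_paragraphs(source: &str) -> Vec<Vec<&str>> {
    let mut paragraphs = Vec::new();
    let mut current: Vec<&str> = Vec::new();

    for line in source.lines() {
        if line.trim().is_empty() {
            // A blank line closes the paragraph collected so far, if any.
            if !current.is_empty() {
                paragraphs.push(std::mem::take(&mut current));
            }
        } else {
            current.push(line);
        }
    }

    // Flush the final paragraph when the input does not end with a blank line.
    if !current.is_empty() {
        paragraphs.push(current);
    }
    paragraphs
}
```

Applied to the input from this comment (`foo`, `bar with empty line after`, a blank line, `foo`, `bar without empty line after`, each text line with two trailing spaces), the sketch yields two groups of two lines each, matching the two MD_PARAGRAPH nodes the parser is expected to produce.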

@coderabbitai file an issue for this parser bug

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
gh issue create \
  --repo biomejs/biome \
  --title "biome_markdown_parser: paragraph containing hard line breaks incorrectly absorbs following empty line and paragraph" \
  --body "## Description

When a Markdown paragraph ends with a hard line break (\`  \n\`) and is followed by an empty line and another paragraph, the parser incorrectly merges both paragraphs into a single \`MD_PARAGRAPH\` node rather than producing two separate \`MD_PARAGRAPH\` nodes delimited by the empty line.

## Steps to Reproduce

Input:
\`\`\`md
foo  
bar with empty line after  

foo  
bar without empty line after  
\`\`\`

## Incorrect CST Output

\`\`\`
6: MD_PARAGRAPH@43..115
  0: MD_INLINE_ITEM_LIST@43..115
    0: MD_TEXTUAL@43..46
      0: MD_TEXTUAL_LITERAL@43..46 \"foo\" [] []
    1: MD_HARD_LINE@46..49
      0: MD_HARD_LINE_LITERAL@46..49 \"  \\n\" [] []
    2: MD_TEXTUAL@49..74
      0: MD_TEXTUAL_LITERAL@49..74 \"bar with empty line after\" [] []
    3: MD_HARD_LINE@74..77
      0: MD_HARD_LINE_LITERAL@74..77 \"  \\n\" [] []
    4: MD_TEXTUAL@77..78
      0: MD_TEXTUAL_LITERAL@77..78 \"\\n\" [] []   <-- empty line absorbed into paragraph
    5: MD_TEXTUAL@78..81
      0: MD_TEXTUAL_LITERAL@78..81 \"foo\" [] []
    6: MD_HARD_LINE@81..84
      0: MD_HARD_LINE_LITERAL@81..84 \"  \\n\" [] []
    7: MD_TEXTUAL@84..112
      0: MD_TEXTUAL_LITERAL@84..112 \"bar without empty line after\" [] []
    8: MD_HARD_LINE@112..115
      0: MD_HARD_LINE_LITERAL@112..115 \"  \\n\" [] []
  1: (empty)
\`\`\`

## Expected Behaviour

Two separate \`MD_PARAGRAPH\` nodes should be produced, as per the CommonMark spec. See the [CommonMark dingus AST](https://spec.commonmark.org/dingus/?text=foo%20%20%20%20%20%20%20%0Abaz%0A%0Abackslash%5C%0Aform%0A%0Ano%20hard%20line%0Ahere%0A%0Afoo%20%20%0Abar%20with%20empty%20line%20after%20%20%0A%0Afoo%20%20%0Abar%20without%20empty%20line%20after%20%20) for reference.

## Impact

This parser bug causes the formatter to incorrectly omit the blank line between the two paragraphs, as surfaced in PR `#9856`.

## Related

- Discovered in `#9856`
- Requested by `@tidefield`
"

Length of output: 44

@tidefield, done! Filed as #9857 — it covers the incorrect single-MD_PARAGRAPH parse when a hard-line-terminated paragraph is followed by an empty line and a second paragraph, with a link back to this PR.

🧠 Learnings used

Learnt from: ematipico
Repo: biomejs/biome PR: 7467
File: crates/biome_service/src/file_handlers/html.rs:456-466
Timestamp: 2025-09-13T16:16:06.459Z
Learning: In biome_formatter, consecutive LineMode::Hard elements are automatically collapsed to a single newline by the printer. The printer has a test "it_prints_consecutive_hard_lines_as_one" that demonstrates this behavior - multiple consecutive hard line breaks result in only one newline in the output, not extra blank lines.

Learnt from: tidefield
Repo: biomejs/biome PR: 9480
File: crates/biome_markdown_formatter/src/markdown/auxiliary/hard_line.rs:11-31
Timestamp: 2026-03-14T11:30:13.488Z
Learning: In `crates/biome_markdown_formatter/src/markdown/auxiliary/hard_line.rs`, `FormatMdHardLine::fmt_fields` is responsible for emitting `hard_line_break()` itself (not the caller). As of PR `#9480` (building on `#9331` by ematipico), the function branches on the token text: if it ends with `\` it emits `format_removed(&token)` + `text("\\", ...)` + `hard_line_break()`; otherwise it normalises to two trailing spaces via `format_removed(&token)` + `text("  ", ...)` + `hard_line_break()`. The callers in `inline_item_list.rs` and `paragraph.rs` do NOT add an extra `hard_line_break()`.

Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_formatter/CONTRIBUTING.md:0-0
Timestamp: 2026-02-05T16:31:32.222Z
Learning: Applies to crates/biome_formatter/**/tests/specs/**/* : Create test spec files in `tests/specs/<language>/` directory as input files for formatter testing; accept snapshots only after verifying correctness

Learnt from: jfmcdowell
Repo: biomejs/biome PR: 0
File: :0-0
Timestamp: 2026-03-30T19:37:09.297Z
Learning: In `crates/biome_markdown_parser/src/syntax/fenced_code_block.rs`, there is an intentional asymmetry between opening and closing fence indent stripping inside list items:
- **Opening fence**: `check_continuation_indent` (called before the fenced code block parser runs) already consumes the list continuation indent, so the opener only needs to emit the remaining 0–3 CommonMark-allowed fence spaces (`MAX_BLOCK_PREFIX_INDENT`). Stripping `list_item_required_indent` in the opener is NOT needed.
- **Closing fence**: processes the raw line without prior consumption, so it must strip `list_item_required_indent + MAX_BLOCK_PREFIX_INDENT` itself (the fix introduced in PR `#9730`).

Learnt from: jfmcdowell
Repo: biomejs/biome PR: 9779
File: crates/biome_markdown_parser/tests/spec_test.rs:143-157
Timestamp: 2026-04-03T00:16:33.809Z
Learning: In `crates/biome_markdown_parser/tests/spec_test.rs`, the `MdContinuationIndent` structural invariant (checking that children are `MD_INDENT_TOKEN`) intentionally does NOT assert that the `MD_INDENT_TOKEN_LIST` wrapper is present, because `has_bogus_nodes_or_empty_slots(&syntax)` runs earlier in the `ExpectedOutcome::Pass` branch and already catches a missing or empty wrapper child. The invariant is purposefully layered on top of that prior check.

Learnt from: tidefield
Repo: biomejs/biome PR: 9693
File: crates/biome_markdown_formatter/src/markdown/auxiliary/bullet.rs:24-24
Timestamp: 2026-04-02T09:34:17.898Z
Learning: In `crates/biome_markdown_formatter/` (and biome formatter crates generally), `format_verbatim_node` is only acceptable as initial scaffolding boilerplate when a formatter is first set up. Once active formatting features are being implemented, no new calls to `format_verbatim_node` should be introduced; new feature code must format the node's fields explicitly instead. Per ematipico: its use in feature code is considered a bug. Per tidefield: this applies to all new formatting feature work in biome_markdown_formatter.
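The hard-line learning from PR `#9480` above describes a branch on the token text. A minimal sketch of that branch follows, working on plain strings rather than Biome tokens. The `hard_line_output` function is hypothetical, and stripping the line ending before inspecting the marker is an assumption of this sketch, not something the learning states.

```rust
/// Hypothetical model of the marker normalisation: returns the text that
/// replaces a hard line token, including the line break itself.
fn hard_line_output(token_text: &str) -> &'static str {
    // Assumption: the token text may carry its line ending, so drop it
    // before looking at the marker.
    let marker = token_text.trim_end_matches(|c| c == '\n' || c == '\r');

    if marker.ends_with('\\') {
        // Backslash form is kept as a backslash followed by the break.
        "\\\n"
    } else {
        // Any run of trailing spaces is normalised to exactly two.
        "  \n"
    }
}
```

The point of the design is that the node itself owns the line break, so callers never need to append a second one.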

cc @jfmcdowell

codspeed-hq · 2026-04-08T08:43:09Z

Merging this PR will not alter performance

✅ 28 untouched benchmarks
⏩ 228 skipped benchmarks¹

Comparing tidefield:fix-9770 (39dc847) with main (b7134d9)

¹ 228 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.

tidefield · 2026-04-08T08:49:15Z

@coderabbitai

coderabbitai · 2026-04-08T08:49:32Z

Tip

For best results, initiate chat on the files or code changes.

@tidefield Hey! Looks like your message came through empty. What can I help you with? If you'd like me to review the PR, just say the word! 🐰

🧠 Learnings used

Learnt from: tidefield
Repo: biomejs/biome PR: 9480
File: crates/biome_markdown_formatter/src/markdown/auxiliary/hard_line.rs:11-31
Timestamp: 2026-03-14T11:30:13.488Z
Learning: In `crates/biome_markdown_formatter/src/markdown/auxiliary/hard_line.rs`, `FormatMdHardLine::fmt_fields` is responsible for emitting `hard_line_break()` itself (not the caller). As of PR `#9480` (building on `#9331` by ematipico), the function branches on the token text: if it ends with `\` it emits `format_removed(&token)` + `text("\\", ...)` + `hard_line_break()`; otherwise it normalises to two trailing spaces via `format_removed(&token)` + `text("  ", ...)` + `hard_line_break()`. The callers in `inline_item_list.rs` and `paragraph.rs` do NOT add an extra `hard_line_break()`.

Learnt from: jfmcdowell
Repo: biomejs/biome PR: 0
File: :0-0
Timestamp: 2026-03-30T19:37:09.297Z
Learning: In `crates/biome_markdown_parser/src/syntax/fenced_code_block.rs`, there is an intentional asymmetry between opening and closing fence indent stripping inside list items:
- **Opening fence**: `check_continuation_indent` (called before the fenced code block parser runs) already consumes the list continuation indent, so the opener only needs to emit the remaining 0–3 CommonMark-allowed fence spaces (`MAX_BLOCK_PREFIX_INDENT`). Stripping `list_item_required_indent` in the opener is NOT needed.
- **Closing fence**: processes the raw line without prior consumption, so it must strip `list_item_required_indent + MAX_BLOCK_PREFIX_INDENT` itself (the fix introduced in PR `#9730`).

Learnt from: dyc3
Repo: biomejs/biome PR: 8291
File: crates/biome_html_formatter/tests/specs/prettier/vue/html-vue/elastic-header.html:10-10
Timestamp: 2025-12-04T13:29:49.287Z
Learning: Files under `crates/biome_html_formatter/tests/specs/prettier` are test fixtures synced from Prettier and should not receive detailed code quality reviews (e.g., HTTP vs HTTPS, formatting suggestions, etc.). These files are test data meant to validate formatter behavior and should be preserved as-is.

Learnt from: ematipico
Repo: biomejs/biome PR: 7467
File: crates/biome_service/src/file_handlers/html.rs:456-466
Timestamp: 2025-09-13T16:16:06.459Z
Learning: In biome_formatter, consecutive LineMode::Hard elements are automatically collapsed to a single newline by the printer. The printer has a test "it_prints_consecutive_hard_lines_as_one" that demonstrates this behavior - multiple consecutive hard line breaks result in only one newline in the output, not extra blank lines.

Learnt from: ematipico
Repo: biomejs/biome PR: 7467
File: crates/biome_service/src/file_handlers/html.rs:456-466
Timestamp: 2025-09-13T16:16:06.459Z
Learning: The biome_formatter printer deduplicates consecutive LineMode::Hard elements by checking if the current line is already empty (line_width > 0) before printing a newline. This prevents multiple consecutive hard line breaks from creating extra blank lines in the output.

Learnt from: tidefield
Repo: biomejs/biome PR: 9693
File: crates/biome_markdown_formatter/src/markdown/auxiliary/bullet.rs:24-24
Timestamp: 2026-04-02T09:34:17.898Z
Learning: In `crates/biome_markdown_formatter/` (and biome formatter crates generally), `format_verbatim_node` is only acceptable as initial scaffolding boilerplate when a formatter is first set up. Once active formatting features are being implemented, no new calls to `format_verbatim_node` should be introduced — new feature code must format the node's fields explicitly instead. Per ematipico: its use in feature code is considered a bug. Per tidefield: this applies to all new formatting feature work in biome_markdown_formatter.

Learnt from: ematipico
Repo: biomejs/biome PR: 9584
File: .changeset/fix-vue-directive-expression-parsing.md:5-5
Timestamp: 2026-03-22T14:48:22.401Z
Learning: In biomejs/biome, the bug fix changeset format `Fixed [`#NUMBER`](issue link): ...` is only required when a GitHub issue exists for the bug. When no issue is filed, a free-form description of the fix is acceptable for the changeset entry.

Learnt from: CR
Repo: biomejs/biome PR: 0
File: CONTRIBUTING.md:0-0
Timestamp: 2026-04-06T21:31:01.848Z
Learning: Applies to crates/biome_formatter/**/*.rs : Follow technical guidelines and patterns documented in the biome_formatter CONTRIBUTING.md for formatter implementation and testing

Learnt from: CR
Repo: biomejs/biome PR: 0
File: crates/biome_analyze/CONTRIBUTING.md:0-0
Timestamp: 2026-04-07T12:58:59.307Z
Learning: Applies to crates/biome_analyze/**/biome_*_analyze/lib/src/lint/**/*.rs : Use `#` prefix to hide lines from code block output in rule documentation while keeping them in validation

Learnt from: jfmcdowell
Repo: biomejs/biome PR: 9779
File: crates/biome_markdown_parser/tests/spec_test.rs:143-157
Timestamp: 2026-04-03T00:16:33.809Z
Learning: In `crates/biome_markdown_parser/tests/spec_test.rs`, the `MdContinuationIndent` structural invariant (checking that children are `MD_INDENT_TOKEN`) intentionally does NOT assert that the `MD_INDENT_TOKEN_LIST` wrapper is present, because `has_bogus_nodes_or_empty_slots(&syntax)` runs earlier in the `ExpectedOutcome::Pass` branch and already catches a missing or empty wrapper child. The invariant is purposefully layered on top of that prior check.
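The two printer learnings above (collapsing consecutive `LineMode::Hard` elements by checking `line_width > 0`) can be illustrated with a toy printer. This is a sketch under stated assumptions, not the biome_formatter printer: the `Element` enum and `print_elements` function are hypothetical, and real printers track far more state.

```rust
/// Hypothetical, reduced set of printer elements.
#[derive(Debug, Clone, Copy)]
enum Element<'a> {
    Text(&'a str),
    HardLine,
}

/// Prints elements, emitting a newline for a hard line only when the
/// current line already has content.
fn print_elements(elements: &[Element<'_>]) -> String {
    let mut out = String::new();
    let mut line_width = 0usize;

    for element in elements {
        match element {
            Element::Text(text) => {
                out.push_str(text);
                line_width += text.chars().count();
            }
            Element::HardLine => {
                // An empty current line means a break was just printed,
                // so a second consecutive hard line is skipped.
                if line_width > 0 {
                    out.push('\n');
                    line_width = 0;
                }
            }
        }
    }
    out
}
```

With this rule, two hard lines in a row between two text elements produce a single newline, which is why the formatter can emit `hard_line_break()` for the last hard line without creating an extra blank line.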

tidefield · 2026-04-08T08:50:38Z

@coderabbitai review

ematipico

Just a comment to address

ematipico · 2026-04-08T11:34:59Z

 no hard line
 here

+foo  


cc @jfmcdowell

tidefield · 2026-04-08T13:51:35Z

@ematipico oops. I reverted the change.

tidefield force-pushed the fix-9770 branch from 23f5ad5 to 2d041a8 on April 8, 2026 08:36

github-actions bot added A-Formatter Area: formatter L-Markdown Language: Markdown labels Apr 8, 2026


coderabbitai bot mentioned this pull request Apr 8, 2026

biome_markdown_parser: paragraph containing hard line breaks incorrectly absorbs following empty line and paragraph #9857

Closed

Update hard_line.rs for better detection of last hard line break in paragraph

395d479

tidefield force-pushed the fix-9770 branch from 2d041a8 to 395d479 on April 8, 2026 08:48

ematipico approved these changes Apr 8, 2026


revert quick_test

39dc847

ematipico merged commit 66be88f into biomejs:main Apr 8, 2026
15 of 16 checks passed

This was referenced Apr 12, 2026

fix(markdown_parser): paragraph with trailing hard break absorbs following blank line #9931

Merged

fix(md): code info string, and fmt advancement #9979

Merged
