[parser] Fix indentation tracking after line continuations #23299

Closed
kar-ganap wants to merge 1 commit into astral-sh:main from kar-ganap:fix/issue-19301

Conversation

@kar-ganap
Contributor

@kar-ganap kar-ganap commented Feb 15, 2026

Summary

Fixes #19301.

Per the Python spec, the whitespace up to the first backslash determines the line's indentation, and continuation lines should not affect the indentation level. The lexer's `eat_indentation` loop did not handle this: after a `\` continuation, it accumulated the continuation line's whitespace into the indentation, causing spurious "Unexpected indentation" and "Expected a statement" syntax errors.

Reproduction

```python
if True:
    pass
    \
        print("1")
```

Before: `SyntaxError: Unexpected indentation` + `Expected a statement`
After: Parses correctly (matches CPython behavior)

Fix

After consuming `\` during indentation tracking in `eat_indentation`, skip the continuation line's whitespace with `eat_while(is_python_whitespace)` without accumulating it into `indentation`. Guard: only skip when `indentation != Indentation::root()` — when `\` is at column 0, let the loop continue so the next line's whitespace is accumulated normally (needed for the `else: \` pattern in RET505 fixtures).
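
A minimal sketch of the rule described above, written in Python for illustration only. It is not ruff's `eat_indentation` code: the name `measure_indentation` is hypothetical, and the model assumes spaces-only indentation and `\n` line endings (tabs, form feeds, and comments are ignored).

```python
def measure_indentation(text: str) -> int:
    """Return the indentation width of the logical line that starts `text`.

    Hypothetical, simplified model of the behavior described in this PR.
    """
    indentation = 0
    pos = 0
    while pos < len(text):
        if text[pos] == " ":
            # Ordinary leading whitespace counts toward the indentation.
            indentation += 1
            pos += 1
        elif text.startswith("\\\n", pos):
            # Backslash continuation: consume the backslash and the newline.
            pos += 2
            if indentation != 0:
                # Indentation is already fixed by the whitespace before the
                # backslash, so skip the continuation line's whitespace
                # without counting it.
                while pos < len(text) and text[pos] == " ":
                    pos += 1
            # At column 0 the loop simply continues, so the next line's
            # whitespace is counted normally.
        else:
            break
    return indentation


# Lines 3 and 4 of the reproduction: four spaces, a backslash, then an
# eight-space continuation line.
print(measure_indentation('    \\\n        print("1")\n'))  # 4
```

In this model the continuation line's eight spaces are ignored because the four spaces before the backslash already fixed the indentation, while a backslash at column 0 leaves the count open.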

Reference

Test plan

Add lexer test cases and a couple of inline parser tests.

ntBre previously requested changes Feb 16, 2026
Contributor

@ntBre ntBre left a comment

Can you add a regression test for this showing that the issue is now resolved?

@astral-sh-bot

astral-sh-bot bot commented Feb 16, 2026

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

✅ ecosystem check detected no linter changes.

Formatter (stable)

✅ ecosystem check detected no format changes.

Formatter (preview)

✅ ecosystem check detected no format changes.

@kar-ganap kar-ganap force-pushed the fix/issue-19301 branch 2 times, most recently from f65948a to 2457be3 Compare February 17, 2026 00:40
@dhruvmanila
Member

> Per the Python spec, the whitespace up to the first backslash determines the line's indentation, and continuation lines should not affect the indentation level.

Can you provide a reference to where this is specified? It'd be useful to add a link to it in the comments.

@dhruvmanila dhruvmanila added the parser Related to the parser label Feb 17, 2026
@dhruvmanila
Member

Reproduction from the issue:

This example does not raise an error on latest main: https://play.ruff.rs/ccc71f1d-f784-4b53-b253-0d220a765fd4 although the test case added does raise a syntax error.

@kar-ganap
Contributor Author

@dhruvmanila Thanks for the review!

Re: Python spec reference — The reference is in the Python Language Reference — Indentation. The original issue also links to the related CPython discussion: python/cpython#90249. I've updated the PR description to include the spec link.

Re: reproduction not erroring on main — Good catch, the reproduction in the original PR description was wrong (it showed a mid-expression x = \ continuation which is handled differently). The actual bug is when \ appears at the start of a line during indentation tracking, like the issue's test_1:

if True:
    pass
    \
        print("1")

This still errors on latest main:

$ ruff check test.py --isolated --output-format concise -q
test.py:3:1: SyntaxError: Unexpected indentation
test.py:5:1: SyntaxError: Expected a statement

I've updated the PR description with the correct reproduction. The regression test (backslash_continuation_indentation.py) tests this exact case and was correctly flagging the bug.

@ntBre ntBre dismissed their stale review February 17, 2026 20:37

Tests have been added

)));
}
indentation = Indentation::root();
// test_ok backslash_continuation_indentation
Member

Lexer tests are written using Rust code under the #[cfg(test)] section in this file which would create snapshots of the lexed tokens. Can you move this down below and add a few more edge cases?

Contributor Author

Done — added 12 lexer snapshot tests in the #[cfg(test)] section (4 scenarios × 3 EOL variants), following the existing helper + per-EOL pattern. Kept the inline tests as documentation for the fixture infrastructure.

@dhruvmanila dhruvmanila added the bug Something isn't working label Feb 18, 2026
@dhruvmanila dhruvmanila self-assigned this Feb 19, 2026
Comment on lines +3102 to +3105
fn backslash_continuation_indentation_eol(eol: &str) -> LexerOutput {
    let source = format!("if True:{eol} pass{eol} \\{eol} print(\"1\"){eol}");
    lex_source(&source)
}
Member

Sorry, but this is not what I meant. We don't need to test the various EOL cases, as that's already covered; the test cases should be specific to the behavior that's being changed.

I've updated the test cases locally and I've found that the following isn't raising a syntax error when it should:

if True:
    1
      \
    2

The indentation level at `1` is 4 spaces, while the logical line containing `2` is indented by 6 spaces (the whitespace before the backslash). So I think there's more that needs to be done. I need to take a closer look at the solution and at what CPython is doing.
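
For context, the way an indentation width turns into tokens can be sketched as follows. This follows the INDENT/DEDENT stack rule from the Python language reference; the function name `indentation_tokens` and the string token names are hypothetical stand-ins, not ruff's lexer API.

```python
from __future__ import annotations


def indentation_tokens(stack: list[int], indentation: int) -> list[str]:
    """Update `stack` in place for a new logical line; return emitted tokens.

    Hypothetical helper: `stack` holds the widths of the enclosing blocks,
    starting with 0 for the top level.
    """
    if indentation > stack[-1]:
        # Deeper than the current block: open a new indentation level.
        stack.append(indentation)
        return ["Indent"]

    tokens = []
    while indentation < stack[-1]:
        # Shallower: close blocks until a matching level is found.
        stack.pop()
        tokens.append("Dedent")

    if indentation != stack[-1]:
        # The width falls between two levels on the stack.
        raise IndentationError("unindent does not match any outer indentation level")
    return tokens


# Inside the `if True:` block the stack is [0, 4].
print(indentation_tokens([0, 4], 6))  # ['Indent']
print(indentation_tokens([0, 4], 4))  # []
```

A width of 6 against a stack of `[0, 4]` produces an `Indent`, which is not valid directly after a simple statement such as `1`, so the example above should be rejected.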

Member

@dhruvmanila dhruvmanila Feb 19, 2026

I'll look at it later today unless you can figure out what's wrong. I've put the PR into draft for the time being.

Contributor Author

I investigated the overindented case you raised:

if True:
    1
      \
    2

This does produce an error with our fix. The 6 spaces before \ freeze the indentation at 6, so the lexer emits an Indent from 4→6, and the parser rejects it as "Unexpected indentation":

invalid-syntax: Unexpected indentation
 --> test.py:3:1
  |
1 |   if True:
2 |       1
3 | /       \
4 | |     2
  | |____^

The issue exists on main (without this fix), where indentation = Indentation::root() resets indentation to 0, then re-accumulates 4 spaces from the continuation line — matching the block level and silently accepting the overindent. Our fix corrects this.
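
The difference between the two behaviors described above can be modeled in Python. This is a simplified sketch, not ruff's implementation: `leading_indentation` and its `freeze_at_backslash` flag are hypothetical, and only spaces and `\n` line endings are handled.

```python
def leading_indentation(text: str, freeze_at_backslash: bool) -> int:
    """Model the indentation measured for the logical line starting `text`.

    freeze_at_backslash=False models the behavior described for main
    (reset to the root level at a backslash, then count again).
    freeze_at_backslash=True models the behavior described for this fix.
    """
    indentation = 0
    pos = 0
    while pos < len(text):
        if text[pos] == " ":
            indentation += 1
            pos += 1
        elif text.startswith("\\\n", pos):
            pos += 2  # consume the backslash and the newline
            if freeze_at_backslash and indentation != 0:
                # Keep the width seen before the backslash and skip the
                # continuation line's whitespace.
                while pos < len(text) and text[pos] == " ":
                    pos += 1
            else:
                # Start counting again from the root level.
                indentation = 0
        else:
            break
    return indentation


# Lines 3 and 4 of the overindented example: six spaces, a backslash,
# then a four-space continuation line.
overindented = "      \\\n    2\n"
print(leading_indentation(overindented, freeze_at_backslash=False))  # 4
print(leading_indentation(overindented, freeze_at_backslash=True))   # 6
```

With the reset, the measured width equals the block level of 4 and no token is emitted; with the frozen width of 6, the lexer sees a deeper level than the enclosing block.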

I've added a backslash_continuation_overindented snapshot test that captures the token stream for this exact case (showing the two Indent tokens). Also simplified the tests per your feedback — removed the 12 EOL-variant tests and replaced them with 5 targeted tests:

  1. backslash_continuation_in_block — basic valid case
  2. backslash_continuation_at_column_zero — column 0 special case
  3. backslash_continuation_multiple — consecutive continuations
  4. backslash_continuation_overindented — your case (indentation frozen at 6, Indent 4→6 emitted)
  5. backslash_continuation_mismatch — indentation doesn't match stack, lexer error

Verified all cases match CPython behavior.

Member

I'm sure you didn't mean to, but you force-pushed some changes, which removed my changes in 879255d. Please don't make any further changes, to avoid conflicts.

@dhruvmanila dhruvmanila marked this pull request as draft February 19, 2026 04:59
@kar-ganap kar-ganap marked this pull request as ready for review February 19, 2026 07:06
@dhruvmanila
Member

Closing in favor of #23417

Labels

bug Something isn't working parser Related to the parser

Development

Successfully merging this pull request may close these issues.

Syntax errors related to indentation after backslashes

3 participants