fix(regex): LineContinuation produces empty code points sequence by Sysix · Pull Request #13458 · oxc-project/oxc

Sysix · 2025-08-30T23:37:32Z

https://tc39.es/ecma262/#sec-literals-string-literals

Note 2
<LF> and <CR> cannot appear in a string literal, except as part of a LineContinuation to produce the empty code points sequence. The proper way to include either in the String value of a string literal is to use an escape sequence such as \n or \u000A.

This implementation skips the code point.
Maybe return Option<Option<u32>> for parse_string_character?
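
To make the `Option<Option<u32>>` idea concrete, here is a minimal sketch. It is not the code in `parser_impl.rs`: the signature, the `chars`/`pos` parameters and the `collect_code_points` helper are assumptions for illustration, and every escape other than a LineContinuation is left undecoded. The outer `Option` says whether anything was consumed, the inner one whether a code point was produced.

```rust
/// Hypothetical stand-in for `parse_string_character`.
/// - `None`: end of input, nothing consumed.
/// - `Some(None)`: a LineContinuation was consumed; it yields no code point.
/// - `Some(Some(cp))`: one code point was consumed.
fn parse_string_character(chars: &[char], pos: &mut usize) -> Option<Option<u32>> {
    let c = chars.get(*pos).copied()?;
    if c == '\\' {
        match chars.get(*pos + 1).copied() {
            // backslash + <CR>, optionally followed by <LF>
            Some('\r') => {
                *pos += 2;
                if chars.get(*pos).copied() == Some('\n') {
                    *pos += 1;
                }
                return Some(None);
            }
            // backslash + <LF>, <LS> or <PS>
            Some('\n') | Some('\u{2028}') | Some('\u{2029}') => {
                *pos += 2;
                return Some(None);
            }
            // Simplification: other escapes are not decoded in this sketch.
            _ => {}
        }
    }
    *pos += 1;
    Some(Some(c as u32))
}

/// Hypothetical driver: collects code points, dropping empty sequences.
fn collect_code_points(src: &str) -> Vec<u32> {
    let chars: Vec<char> = src.chars().collect();
    let mut pos = 0;
    let mut out = Vec::new();
    while let Some(step) = parse_string_character(&chars, &mut pos) {
        if let Some(cp) = step {
            out.push(cp);
        }
    }
    out
}
```

With this shape the caller decides what to do with an empty sequence, instead of the parser silently skipping it. An unescaped <LS> or <PS> still comes back as an ordinary code point.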

Sysix · 2025-08-30T23:37:46Z

fix(regex): LineContinuation produces empty code points sequence #13458
main


codspeed-hq · 2025-08-30T23:43:54Z

CodSpeed Instrumentation Performance Report

Merging #13458 will not alter performance

Comparing 08-31-fix_regex_linecontinuation_produces_the_empty_code_points_sequence (1698435) with main (5b139aa)

Summary

✅ 37 untouched benchmarks

Copilot

Pull Request Overview

This PR fixes a bug in the regular expression parser where LineContinuation sequences (backslash followed by line terminators) were incorrectly producing code points instead of empty sequences as required by the ECMAScript specification.

Changed parse_line_terminator_sequence to return bool instead of Option<u32> to indicate detection without producing code points
Modified parse_string_character to recursively continue parsing when a LineContinuation is detected
Added detailed documentation explaining the ECMAScript specification requirement
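
The first two changes in the overview can be sketched roughly as follows. This is an illustration of the described approach, not the PR's diff: the `Reader` type and its `eat` helper are hypothetical, and escape sequences other than a LineContinuation are not handled.

```rust
/// Hypothetical minimal reader; the real parser's reader differs.
struct Reader {
    chars: Vec<char>,
    pos: usize,
}

impl Reader {
    fn new(src: &str) -> Self {
        Reader { chars: src.chars().collect(), pos: 0 }
    }

    /// Consumes `c` if it is the next character.
    fn eat(&mut self, c: char) -> bool {
        if self.chars.get(self.pos).copied() == Some(c) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    /// Returns `true` if a LineTerminatorSequence was consumed.
    /// It only reports detection and produces no code point.
    fn parse_line_terminator_sequence(&mut self) -> bool {
        if self.eat('\r') {
            // <CR><LF> counts as a single sequence.
            self.eat('\n');
            return true;
        }
        self.eat('\n') || self.eat('\u{2028}') || self.eat('\u{2029}')
    }

    /// Returns the next code point, skipping any LineContinuation.
    fn parse_string_character(&mut self) -> Option<u32> {
        let checkpoint = self.pos;
        if self.eat('\\') {
            if self.parse_line_terminator_sequence() {
                // LineContinuation contributes the empty sequence,
                // so continue with whatever follows it.
                return self.parse_string_character();
            }
            // Not a LineContinuation: rewind (escapes are out of scope here).
            self.pos = checkpoint;
        }
        let c = self.chars.get(self.pos).copied()?;
        self.pos += 1;
        Some(c as u32)
    }
}
```

Note that this shape drops the continuation entirely, so a caller cannot tell afterwards that one was present in the source.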


crates/oxc_regular_expression/src/parser/reader/string_literal_parser/parser_impl.rs

leaysgur

<LF> and <CR> cannot appear in a string literal, except as part of a LineContinuation to produce the empty code points sequence.

Currently, I think we can properly handle <CR> and <LF> when parsing LineContinuation. Simply skipping all of them doesn't seem appropriate. (And <PS> and <LS> should not be skipped)

The issue might be that the current AST only represents CodePoint, which makes it impossible to determine whether they originated from a LineContinuation.

So... what was the original purpose of this PR?
Was it inconvenient in some specific use case?
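
As a sketch of the AST limitation mentioned above: if the string-literal AST carried more than bare code points, a consumer could tell where a LineContinuation occurred. The `StringElement` enum and both helper functions below are hypothetical and do not exist in oxc; they only illustrate the kind of refactoring this would imply.

```rust
/// Hypothetical AST element that keeps the origin of each piece.
#[derive(Debug, Clone, PartialEq)]
enum StringElement {
    /// A code point that is part of the string value.
    CodePoint { value: u32, start: usize, end: usize },
    /// A backslash followed by a line terminator sequence; empty value.
    LineContinuation { start: usize, end: usize },
}

/// The string value: LineContinuation elements contribute nothing.
fn string_value(elements: &[StringElement]) -> Vec<u32> {
    elements
        .iter()
        .filter_map(|e| match e {
            StringElement::CodePoint { value, .. } => Some(*value),
            StringElement::LineContinuation { .. } => None,
        })
        .collect()
}

/// Lets a consumer (for example a lint rule) detect continuations.
fn has_line_continuation(elements: &[StringElement]) -> bool {
    elements
        .iter()
        .any(|e| matches!(e, StringElement::LineContinuation { .. }))
}
```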

Sysix · 2025-09-01T17:15:28Z

So... what was the original purpose of this PR?

I found it when working on #13365.
Copilot found it while trying to fix this failing test case:

oxc/crates/oxc_linter/src/rules/eslint/no_misleading_character_class.rs

Lines 582 to 585 in 9ca9dc5

    
    // (
    //     r#"new RegExp('[ \\u\\\r\nfe0f]')"#, // line continuation: backslash + <CR> + <LF>
    //     None
    // ),

But I guess this is not the root of my problem :)

The issue might be that the current AST only represents CodePoint, which makes it impossible to determine whether they originated from a LineContinuation.

This will then need more refactoring. :/
I guess the empty code points sequence does not apply to TemplateLiteral.

Note
TV excludes the code units of LineContinuation while TRV includes them. <CR><LF> and <CR> LineTerminatorSequences are normalized to <LF> for both TV and TRV. An explicit TemplateEscapeSequence is needed to include a <CR> or <CR><LF> sequence.
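
To illustrate the quoted note, here is a rough sketch of the two behaviours it describes: line terminator normalization (shared by TV and TRV) and LineContinuation removal (TV only). The function names are made up for this example, and all other escape sequences are deliberately left undecoded, so the second function is only an approximation of TV.

```rust
/// Normalizes <CR><LF> and lone <CR> to <LF>, as the note says happens
/// for both TV and TRV. Hypothetical helper name.
fn normalize_line_terminators(src: &str) -> String {
    src.replace("\r\n", "\n").replace('\r', "\n")
}

/// Approximates TRV: raw text with normalized line terminators.
fn template_raw_value(src: &str) -> String {
    normalize_line_terminators(src)
}

/// Approximates TV for the LineContinuation case only: continuations
/// are dropped, every other escape is kept as written (simplification).
fn template_value_without_continuations(src: &str) -> String {
    let normalized = normalize_line_terminators(src);
    let mut out = String::new();
    let mut chars = normalized.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.peek().copied() {
                // Backslash + line terminator: contributes nothing to TV.
                Some('\n') | Some('\u{2028}') | Some('\u{2029}') => {
                    chars.next();
                    continue;
                }
                // Any other escaped character: keep the pair untouched so
                // an escaped backslash is not misread as a continuation.
                Some(next) => {
                    out.push(c);
                    out.push(next);
                    chars.next();
                    continue;
                }
                None => {}
            }
        }
        out.push(c);
    }
    out
}
```

For the input `a`, backslash, <CR><LF>, `b`, <CR><LF>, `c`, the raw variant keeps the backslash and a normalized <LF>, while the other variant drops the continuation and keeps only the second line break.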

leaysgur · 2025-09-03T11:13:16Z

This will then need more refactoring. :/

Yes...🫠

Do you mind just leaving this as-is for now?

https://github.com/eslint/eslint/blob/a355a0e5b2e6a47cda099b31dc7d112cfb5c4315/tests/lib/rules/no-misleading-character-class.js#L2027C10-L2032C56

I've never seen usage like this, so I want to believe it won't cause any problems.

github-actions bot added the C-bug Category - Bug label Aug 30, 2025

Sysix changed the title ~~fix(regex): LineContinuation produces the empty code points sequence~~ fix(regex): LineContinuation produces empty code points sequence Aug 30, 2025

Sysix force-pushed the 08-31-fix_regex_linecontinuation_produces_the_empty_code_points_sequence branch from c26f50a to 096f603 Compare August 31, 2025 00:16

Sysix marked this pull request as ready for review August 31, 2025 00:19

Copilot AI review requested due to automatic review settings August 31, 2025 00:19

Sysix requested a review from leaysgur as a code owner August 31, 2025 00:19

Copilot AI reviewed Aug 31, 2025


fix(regex): LineContinuation produces the empty code points sequence

1698435

Sysix force-pushed the 08-31-fix_regex_linecontinuation_produces_the_empty_code_points_sequence branch from 096f603 to 1698435 Compare August 31, 2025 00:37

Boshen assigned leaysgur Aug 31, 2025


leaysgur reviewed Sep 1, 2025


Sysix closed this Sep 5, 2025

Sysix deleted the 08-31-fix_regex_linecontinuation_produces_the_empty_code_points_sequence branch September 5, 2025 12:54
