Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(find-and-replace) (whole word selection) Allow non-word to non-word boundaries #987

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

icecream17
Copy link
Contributor

@icecream17 icecream17 commented Apr 21, 2024

Identify the Bug

#986

Description of the Change

Edited 2024-6-22 x2

What is "whole word"?

There's a feature in find-and-replace called "whole word". For example, ctrl+f i will match every single i in the code:

//   v
function* countTo2() {
//          v      v      v
   for (let i = 0; i < 2; i++) {
//           v
      yield (i + 1)
   }
}

But if you turn on "whole word", it only matches the i that form a whole word. Since the i in function is only part of a word, not a whole word, turning on "whole word" will now select something more useful:

//   x (not selected!)
function* countTo2() {
//          v      v      v
   for (let i = 0; i < 2; i++) {
//           v
      yield (i + 1)
   }
}

Bug

How is whole-word detected? Basically we match a "word boundary": a transition from a non-word to word character, or from a word to non-word character.

But what if the selection is say, $wha?

let $wha = 2

Clearly, $wha is a whole word. Except it's not, because at the start we are going from non-word (space) to non-word (dollar), i.e. not a word boundary.

Change fix

Allow (non-word character --> non-word character) boundaries

Alternate Designs

Possible Drawbacks

"whole word" might still be inaccurate in cases like $ in $$; might break people's workflow

Verification Process

Release Notes

(find-and-replace) (whole word selection) Allow non-word to non-word boundaries

@savetheclocktower
Copy link
Contributor

Ooh. This looks promising, but I'd love to see some new specs so we can figure out exactly what this will and won't do.

@savetheclocktower
Copy link
Contributor

Also: Chrome has supported lookbehinds in regular expressions since version 62, so we could try that instead.

the non-capturing group still selects, apparently
@icecream17
Copy link
Contributor Author

icecream17 commented Apr 21, 2024

There's no testing for whether it works at the start or end of a file now that I think about it
Future me experimentation: https://regex101.com/r/Bt7r1y/2

Sanity check: (https://devina.io/redos-checker)
image

@icecream17
Copy link
Contributor Author

icecream17 commented Jun 22, 2024

Note to self, for faster further investigation: errors:

FindView
  finding
    when whole-word search is enabled
      it finds the whole words even when the word starts or ends with a non-word character
        Expected { start : { row : 0, column : 0 }, end : { row : 0, column : 0 } } to equal [ [ 2, 5 ], [ 2, 10 ] ].
          at jasmine.Spec.<anonymous> (find-view-spec.js:468:49)
        Expected { start : { row : 0, column : 0 }, end : { row : 0, column : 0 } } to equal [ [ 2, 0 ], [ 2, 6 ] ].
          at jasmine.Spec.<anonymous> (find-view-spec.js:472:49)
      it doesn't highlight the search inside words (non-word character at start)
        Expected object with length 0 to have length 1
          at jasmine.Spec.<anonymous> (find-view-spec.js:479:64)
      it doesn't highlight the search inside words (non-word character at end)
        Expected object with length 0 to have length 1
          at jasmine.Spec.<anonymous> (find-view-spec.js:485:61)
        Expected object with length 0 to have length 1
          at jasmine.Spec.<anonymous> (find-view-spec.js:486:64)
ProjectFindView (ripgrep=false)
  finding
    whole word
      it toggles whole word option via an event and finds files matching the pattern
        Expected /(?<=^|\W)wholeword(?=$|\W)/gim to equal /\bwholeword\b/gim.
          at jasmine.Spec.<anonymous> (project-find-view-spec.js:692:60)
      it toggles whole word option via a button and finds files matching the pattern
        Expected /(?<=^|\W)wholeword(?=$|\W)/gim to equal /\bwholeword\b/gim.
          at jasmine.Spec.<anonymous> (project-find-view-spec.js:702:60)
ProjectFindView (ripgrep=true)
  finding
    whole word
      it toggles whole word option via an event and finds files matching the pattern
        Expected /(?<=^|\W)wholeword(?=$|\W)/gim to equal /\bwholeword\b/gim.
          at jasmine.Spec.<anonymous> (project-find-view-spec.js:692:60)
      it toggles whole word option via a button and finds files matching the pattern
        Expected /(?<=^|\W)wholeword(?=$|\W)/gim to equal /\bwholeword\b/gim.
          at jasmine.Spec.<anonymous> (project-find-view-spec.js:702:60)

The last 4 are easy to fix, but I'm very concerned about the first two. The result changed from an array to an object, whose match starts at 0? So the first thing I need to do is actually understand what the code is doing

@icecream17
Copy link
Contributor Author

Apparently, the word boundary \b thing matches something of zero-length: the position between a word and non-word character.

And importantly, a word boundary can be of either the form word nonword OR nonword word

So what this pr is trying to do, is allow nonword nonword at the start or end of find-selection. In other words, disable the "word" in "word selection" if there are non-word characters. This is potentially rather unintuitive and might break someone's use case, but for now the sample size is 1 (me).

@icecream17
Copy link
Contributor Author

image

Actually, I think the purpose of "whole word" search is for there to specifically be non-word characters surrounding the selection, i.e. the selection itself forms a "word" (which may or may not include non-word characters). Therefore, simulating \b results in "incorrect" matches like in the above picture, and ignores "correct" matches like on line 1. That means what the current code tries to do is correct.

This is kinda breaking so feedback from anyone who uses this option would be nice

@icecream17 icecream17 changed the title allow non word characters at the start or end of ctrl+f whole word se… [find-and-replace] look for non-word instead of word boundaries for whole word search Jun 22, 2024
@icecream17
Copy link
Contributor Author

icecream17 commented Jun 23, 2024

OH the code is doing exactly what it does, it's just the tests are wrong
image

-word is always preceded by a non-word character so Of course the test fails to find any match


Edit so I don't ping: wow, the find-and-replace tests just happen to be the longest package tests...

@icecream17 icecream17 changed the title [find-and-replace] look for non-word instead of word boundaries for whole word search (find-and-replace) (whole word selection) Allow non-word to non-word boundaries Jun 23, 2024
@icecream17 icecream17 marked this pull request as draft June 23, 2024 02:00
@icecream17 icecream17 marked this pull request as ready for review June 23, 2024 02:24
@icecream17
Copy link
Contributor Author

icecream17 commented Jun 23, 2024

The (1) test failures now are equally baffling...

FindView
  finding
    when whole-word search is enabled
      it finds the whole words even when the word starts or ends with a non-word character
// regex101 shows 8 matches but here it's 0?

        Expected object with length 0 to have length 7
          at jasmine.Spec.<anonymous> (find-view-spec.js:467:61)
        Expected object with length 0 to have length 1
          at jasmine.Spec.<anonymous> (find-view-spec.js:468:64)

// a test in between passed

// this test is basically the same but it's not? The return of the object instead of array....

        Expected { start : { row : 4, column : 0 }, end : { row : 4, column : 6 } } to equal [ [ 2, 0 ], [ 2, 6 ] ].
          at jasmine.Spec.<anonymous> (find-view-spec.js:476:49)

Edit: Oh, I forgot to set the cursor position and to run the command for the first test

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants