Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Global flag not respected on regex pattern #72

Closed
CampbellMG opened this issue Dec 19, 2019 · 26 comments
Closed

Global flag not respected on regex pattern #72

CampbellMG opened this issue Dec 19, 2019 · 26 comments
Milestone

Comments

@CampbellMG
Copy link

CampbellMG commented Dec 19, 2019

If I utilise a regex with the global flag enabled, like this:

<ParsedText parse={[{pattern: /a/g, renderText: () => 'z'}]}> aaaa </ParsedText>

I would expect the output to be: zzzz

Whereas, I would expect this:

<ParsedText parse={[{pattern: /a/, renderText: () => 'z'}]}> aaaa </ParsedText>

to output: zaaa as the global flag is removed and is only matching the first instance.

At the moment the second example will output zzzz as the parser will continue matching the pattern against any remaining text.

While I understand there may be people depending on this behaviour is there any workaround to ensure only the first match is rendered?

@cltsang
Copy link

cltsang commented Jun 5, 2020

#47 fixes this issue.

@fbartho fbartho added this to the 0.0.22 milestone Jun 10, 2020
@fbartho
Copy link
Contributor

fbartho commented Jun 10, 2020

@CampbellMG This actually would be pretty heavy a breaking change to the library.

Everyone who hasn't been using the global flag would have to update their regexes.

Basically, the global-flag is implicit & assumed to be true by the current implementation.

Would a good compromise be a separate setting on the parse command that switches on "non-exhaustive" mode?

<ParsedText parse={[{pattern: /a/, numberOfMatches: 1, renderText: () => 'z'}]}> aaaa </ParsedText>

This library has been a 0 version forever, so we probably want to avoid breaking changes like this one.

Alternatively, for your use-case, you could use a capture-group to collect the original text, and "consume" it.

<ParsedText parse={[{pattern: /a(.+$)/, renderText: (text, matches) => 'z' + matches[0]}]}> aaaa </ParsedText>

(Note: I haven't tested this, just guessing)

@CampbellMG
Copy link
Author

CampbellMG commented Jun 11, 2020

The second solution won't quite work in my app as the regex passed to the component is user-defined which makes it tricky to modify.

Your suggestion of a non-exhaustive switch would work perfectly, although in your example it looks likes you are suggesting a numberOfMatches option. How would this work if I don't know the number of occurrences?

I'm still relatively new to open source but would love to put up a PR with this change if you'd like the help.

@lizzieshipwreck
Copy link

lizzieshipwreck commented Jun 11, 2020

Hi - not sure if this is the right thread to ask, but I saw recent activity so I jumped on it, and seems like my issue might be related somehow? I'm currently not getting all matches returned for a regex pattern like the following: /@Bob|@Alice|@Steve/gi - where the number of names can be any length. No matter how many names are in the regex, it only matches the first 2 names in the text string. Any idea what's going on?

@sh-helen
Copy link

sh-helen commented Jun 11, 2020

These is a confusing problem with global flag right now.
Example: lol lol lol lol lol lol lol lol
Regex pattern: new RegExp('lol', 'gi')

Since lib is making substrings on each iteration and extracts matched string part by part https://github.com/taskrabbit/react-native-parsed-text/blob/master/src/lib/TextExtraction.js#L39
match.index won't be correct, since every exec updates the lastIndex property of the regular expression object itself. Normally, on each iteration it should be zero in this case, but because of side effect of exec and those shifts it won't parse the expression correctly, you can check it with this example.

I assume smth should be noted in the readme, i.e note for updating package if it's fixed in [email protected]

sh-helen referenced this issue Jun 11, 2020
Alternative implementation to #47
@fbartho
Copy link
Contributor

fbartho commented Jun 11, 2020

@CampbellMG -- that was a slightly different version of the API I was thinking about.

-- "non-exhaustive" should be relatively easy to implement. nonExhaustive: true or should we call it something else?

@fbartho
Copy link
Contributor

fbartho commented Jun 11, 2020

@lizzieshipwreck @sh-helen thanks for the concrete examples! I can add those to the unit tests now and see if it's behaving as you expect.

@fbartho
Copy link
Contributor

fbartho commented Jun 11, 2020

@sh-helen
Would you mind further clarifying your "expected output"? -- I haven't been using this library much where I cared about the values of the indexes, so I'm not fully sure what behavior is correct or unexpected.

This is one of the reasons why I made a release branch, and shipped a beta build -- matchIndex behavior doesn't appear to have unit tests (or my changes didn't break or fix them) and since I haven't been a consumer of that functionality, I can only go by the API description and a guess as to what the output should be.

Additionally, it looks like in the past, previous maintainers of this library chose not to maintain a Changelog. I could start doing that, but it wouldn't have historical data.

@sh-helen
Copy link

sh-helen commented Jun 11, 2020

@fbartho Sure
I think I can share a fiddle with copy-paste of Extraction class in order to play with it, https://jsfiddle.net/L28s0kdn/5/,
so as you can see the output in the console is 6, but I have 8 matches, and if you remove g flag you will exactly see 8 matches

Having changelog would be really great!

@fbartho
Copy link
Contributor

fbartho commented Jun 11, 2020

@sh-helen Thanks! -- The more I think about this, the more I don't know what the global flag was expected to do in the original code.

  • Should a global flag mean that it matches against the "original whole text" that is passed in?
  • I think the global flag used to be completely broken.

Not sure where this leaves us as to what we want to happen.

@fbartho
Copy link
Contributor

fbartho commented Jun 11, 2020

@sh-helen I have a fix here! https://jsfiddle.net/91v5dyhx/1/

Let me know if this does what you would have expected? /cc @CampbellMG

@fbartho
Copy link
Contributor

fbartho commented Jun 11, 2020

I've published v0.0.22-beta.3 with this code! (I messed up when publishing beta.2)

@sh-helen
Copy link

@fbartho yep, it separates words correct, but for some reason it takes whitespace into account as a separate part also, textLeft = textLeft.substr(matches.index + matches[0].length).trim(); trim should fix it there, but needs to be checked whether it may broke smth

@fbartho
Copy link
Contributor

fbartho commented Jun 12, 2020

@sh-helen Do you have a test case that demonstrates the problem? Happy to see what I can do!

@sh-helen
Copy link

@fbartho double checked it again with my project, it seems that it works correct now, thanks for your help!

@CampbellMG
Copy link
Author

Let me know if this does what you would have expected?

Works perfectly, thanks @fbartho!

@fbartho
Copy link
Contributor

fbartho commented Jun 15, 2020

Oh great! So we don’t need the suggested nonExhaustive flag then? Or: is that still a feature request?

@CampbellMG

@CampbellMG
Copy link
Author

CampbellMG commented Jun 16, 2020

Sorry, that was confusing. It works against the other use case, however, the nonExhaustive flag would still be required for mine.

Running my original test case against your fiddle still outputs all matches:

image

@fbartho
Copy link
Contributor

fbartho commented Jun 16, 2020

Thanks for clarifying! Yeah, I figured we’d still need to add nonExhaustive for your original request. @CampbellMG

I’ll take a look tomorrow!

@fbartho
Copy link
Contributor

fbartho commented Jun 18, 2020

@CampbellMG -- I started implementing something, but I think I confused myself, and actually don't have a good enough example to make sure I actually got what you expected to happen!

Would you mind providing a sample of your non-global pattern, and how many matches you would expect to have for a specific sample text?

@CampbellMG
Copy link
Author

Is the example from the original question okay?

Essentially, if I have the text aaaa and the pattern /a/ I would expect one match (the first letter). If I were to disable the nonExhaustive flag, the current behaviour would apply which returns 4 matches (one for each letter).

@fbartho
Copy link
Contributor

fbartho commented Jun 18, 2020

@CampbellMG -- I think that works! I implemented most of this in this PR: #80

I'll add a test now and see

@fbartho
Copy link
Contributor

fbartho commented Jun 18, 2020

@CampbellMG
Copy link
Author

That's phenomenal, thank you for that.

So by the looks of it, we set a nonExhaustiveModeMaxMatchCount rather than a flag?

@fbartho
Copy link
Contributor

fbartho commented Jun 18, 2020

Yeah, that’s right. It felt more flexible than just a Boolean, even if you didn’t directly need that behavior.

@fbartho
Copy link
Contributor

fbartho commented Jun 24, 2020

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants