Brace pairing check in ec_glob.c incorrect? #50

Closed
craigbarnes opened this issue Apr 19, 2019 · 8 comments

@craigbarnes

I was just reading the code in ec_glob.c and noticed that this loop doesn't seem to do what the variable name suggests. Checking for equal numbers of opening and closing braces doesn't necessarily mean they're "paired". It's not immediately obvious whether or how this affects the correctness of the code, but I thought I'd point it out in case it's a bug.
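To make the distinction concrete, here is a minimal sketch contrasting a count-based check with a depth-tracking one. It is not the code from ec_glob.c: both function names are hypothetical, and backslash escapes are ignored for brevity.

```c
#include <stdbool.h>

/* Hypothetical count-based check: true when '{' and '}' occur equally
 * often, regardless of the order in which they appear. */
static bool braces_equal_count(const char *s)
{
    int n_open = 0, n_close = 0;
    for (; *s != '\0'; s++) {
        if (*s == '{') {
            n_open++;
        } else if (*s == '}') {
            n_close++;
        }
    }
    return n_open == n_close;
}

/* Hypothetical depth-tracking check: true only when every '}' closes an
 * earlier '{' and no '{' is left open at the end of the string. */
static bool braces_paired(const char *s)
{
    int depth = 0;
    for (; *s != '\0'; s++) {
        if (*s == '{') {
            depth++;
        } else if (*s == '}') {
            if (depth == 0) {
                return false; /* closing brace with nothing to close */
            }
            depth--;
        }
    }
    return depth == 0;
}
```

For a string such as }{ the first function reports true while the second reports false, which is the gap described above.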

@craigbarnes
Author

craigbarnes commented Apr 19, 2019

It seems that if the braces are (incorrectly) determined to be "paired", all braces in the pattern get escaped and only represent themselves. That seems to prevent e.g. }*.{x,y} from matching }file.y. It also seems like it could cause false matches (and signed integer overflow?) in the case of something like }foo{a,b, which has equal numbers of opening and closing braces that aren't actually paired.
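As a rough illustration of what "all braces get escaped" could look like, here is a sketch that assumes backslash is the escape character. The helper name is hypothetical and this is not how ec_glob.c does it. It also ignores braces that are already escaped in the input, so it is only meant to show the idea.

```c
#include <stddef.h>

/* Hypothetical helper: copies pat into out, placing a backslash before
 * every '{' and '}' so that each brace only represents itself.
 * Returns 0 on success, or -1 if out is too small for the result. */
static int escape_all_braces(const char *pat, char *out, size_t outsz)
{
    size_t j = 0;
    for (size_t i = 0; pat[i] != '\0'; i++) {
        int is_brace = (pat[i] == '{' || pat[i] == '}');
        size_t needed = is_brace ? 2 : 1;
        /* Always leave room for the terminating NUL byte. */
        if (j + needed >= outsz) {
            return -1;
        }
        if (is_brace) {
            out[j++] = '\\';
        }
        out[j++] = pat[i];
    }
    if (outsz == 0) {
        return -1;
    }
    out[j] = '\0';
    return 0;
}
```

With this sketch, }foo{a,b becomes \}foo\{a,b, so neither brace can open or close a group in whatever is generated from it.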

@xuhdev
Member

xuhdev commented Apr 22, 2019

Thanks for your many bug reports! How does #54 look to you?

@craigbarnes
Author

craigbarnes commented Apr 22, 2019

It seems like #54 should fix the integer overflow and false matches, but it still seems to make the escaping more aggressive than necessary.

The approach I took is to find the last index at which there is a paired, closing brace and beyond that point just escape all braces. Closing braces before that point also get escaped (represent themselves) if brace_level == 0 (i.e. there's no opening brace to close).
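Roughly, that approach can be sketched as below. This is a simplified illustration rather than my actual code: the function names are made up for this comment, and backslash escapes are ignored.

```c
/* Returns the index of the last '}' that closes an earlier '{',
 * or -1 if the pattern contains no such brace. */
static long last_paired_close(const char *pat)
{
    long last = -1;
    int brace_level = 0;
    for (long i = 0; pat[i] != '\0'; i++) {
        if (pat[i] == '{') {
            brace_level++;
        } else if (pat[i] == '}' && brace_level > 0) {
            brace_level--;
            last = i;
        }
    }
    return last;
}

/* Sets literal[i] to 1 where the brace at pat[i] should be escaped
 * (represent itself) and to 0 everywhere else. The literal array must
 * have room for at least strlen(pat) entries. */
static void mark_literal_braces(const char *pat, unsigned char *literal)
{
    long last = last_paired_close(pat);
    int brace_level = 0;
    for (long i = 0; pat[i] != '\0'; i++) {
        literal[i] = 0;
        if (pat[i] != '{' && pat[i] != '}') {
            continue;
        }
        if (i > last) {
            literal[i] = 1; /* beyond the last paired closing brace */
        } else if (pat[i] == '{') {
            brace_level++;
        } else if (brace_level == 0) {
            literal[i] = 1; /* no opening brace to close */
        } else {
            brace_level--;
        }
    }
}
```

For }*.{x,y} this marks only the leading } as literal and leaves {x,y} as a group. For }foo{a,b there is no paired closing brace at all, so both braces are marked literal.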

The over-escaping issue is more of a question of semantics though and not necessarily a bug.

I haven't started using the editorconfig test cases yet because they're rather hard to separate from CMake and seem to require an implementation that retains all of the parsed properties as a dictionary instead of just cherry-picking the ones it's interested in (so it can use a struct instead). Otherwise, I'd start contributing some test cases to cover things like this.

@craigbarnes
Author

craigbarnes commented Apr 23, 2019

Scratch what I said about integer overflow. The incorrect pairing check doesn't actually lead to overflow -- it just allows brace_level to possibly go negative, which seems like it would be harmless. I guess the real issue was just the possibility of emitting unpaired parentheses in the generated regex, which #54 seems to also solve.

@craigbarnes
Author

craigbarnes commented Apr 23, 2019

I've been thinking about this some more and I'm starting to think your approach of "over-escaping" might actually be the best thing to do. Either that or just make patterns with invalid brace pairing match nothing (always return false).

I just noticed that my brace pairing check allows through unclosed braces if they are nested but the inner level is closed, e.g. {a,b{c,d{e}. It's not really obvious what the semantics should be in this case. The regex equivalent just causes regcomp() to return a REG_EPAREN error code. I guess if people supply badly formed patterns they can't really expect it to match anything.
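The regcomp() behaviour can be checked with a small sketch like the following. The wrapper name is hypothetical, and the regex (a|b(c|d(e) is just a hand-written analogue of the glob above, assuming each brace group maps to a parenthesised group.

```c
#include <regex.h>

/* Hypothetical wrapper: compiles re as a POSIX extended regex and
 * returns the result code from regcomp() (0 on success, otherwise an
 * error code such as REG_EPAREN). */
static int try_compile(const char *re)
{
    regex_t preg;
    int rc = regcomp(&preg, re, REG_EXTENDED | REG_NOSUB);
    if (rc == 0) {
        regfree(&preg); /* only free a successfully compiled regex */
    }
    return rc;
}
```

Calling try_compile("(a|b(c|d(e)") should return a nonzero error code, whereas the fully closed "(a|b(c|d(e)))" should compile and return 0.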

tl;dr I think your PR #54 seems fine. My only remaining question is -- should badly paired braces just cause the pattern to match nothing at all instead?

rakus added a commit to rakus/vimscript-editorconfig that referenced this issue Apr 23, 2019
@xuhdev
Member

xuhdev commented Apr 26, 2019

I'm personally leaning toward not making incompatible changes unless there is a compelling reason. In this case, that would mean unpaired curly braces keep matching themselves literally. What do you think?

@craigbarnes
Author

craigbarnes commented Apr 27, 2019

If someone authors a pattern with unpaired braces by mistake, the net result is probably that it just doesn't match what they wanted it to (or anything at all). So long as there are no exploits in the implementation, either behaviour seems fine to me. I'll just copy whatever upstream decides on.

@xuhdev
Member

xuhdev commented May 4, 2019

Fixed by #54

@xuhdev xuhdev closed this as completed May 4, 2019