Extract parts of parse_next_token into new method #144
It was hard to figure out what was going on with that much indentation and vertical scrolling. Extracting the search code into a new method helped me understand it a bit better. It also means we can use `return` instead of `continue` in the for loop.
};
if refs_regex.is_none() && does_something {
    search_cache.insert(match_pat, Some(regions.clone()));
}
What's the reason for only inserting it into the cache if `does_something` is true? I kept the current logic, but I'm not sure I understand.
I would guess that sometimes people write syntax definitions with silly rules like a regex that can match nothing (and doesn't change the context stack):
- match: a*
  scope: some.scope
and if it cached places where it matched nothing, it'd never find the places where it matches something, i.e. it needs to repeat the match at a different position to find a real non-zero length match.
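The zero-length match problem described above can be illustrated with a small sketch. It uses only the standard library; `find_a_star` is a hypothetical stand-in for searching the regex `a*` from a given offset, not a function from the codebase under review.

```rust
/// Hypothetical stand-in for searching the regex `a*` from byte offset `start`.
/// Like a regex engine, it always succeeds at `start`, possibly with an empty
/// match, because `a*` accepts zero repetitions.
fn find_a_star(text: &str, start: usize) -> (usize, usize) {
    let len = text.as_bytes()[start..]
        .iter()
        .take_while(|&&b| b == b'a')
        .count();
    (start, start + len)
}

fn main() {
    let line = "xxaa";

    // Searching from the start of the line yields an empty match at offset 0.
    let first = find_a_star(line, 0);
    assert_eq!(first, (0, 0));

    // Only a repeated search from a later position finds the real match.
    let later = find_a_star(line, 2);
    assert_eq!(later, (2, 4));

    println!("first = {:?}, later = {:?}", first, later);
}
```

If the empty match at offset 0 were the only result ever consulted for this pattern, the non-empty match at offsets 2 to 4 would never be seen, which is the concern raised here.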
It makes sense not to use that match, but caching it should be ok. The way it works is that if there's a cached match, it is only used if the start position of it is >= the current position where we're trying to search. Otherwise, we search again using that pattern.
Just now, I tried removing `&& does_something` from the condition and both tests and syntest still pass. But maybe there's no test for this case (with caching)?
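The cache reuse rule described in this comment could be sketched roughly as follows. This is a simplified illustration, not the code under review: `Region`, `search`, and `cached_search` are hypothetical names, a plain substring search stands in for the regex engine, and the early return for a cached `None` is an assumption about how a cached "no match" would be treated.

```rust
use std::collections::HashMap;

/// A match as (start, end) byte offsets. Hypothetical simplification.
type Region = (usize, usize);

/// Hypothetical stand-in for a regex search: finds `needle` at or after `start`.
/// Assumes `start` falls on a character boundary (the example is ASCII only).
fn search(line: &str, needle: &str, start: usize) -> Option<Region> {
    line[start..]
        .find(needle)
        .map(|i| (start + i, start + i + needle.len()))
}

/// Returns the match at or after `start` and whether a fresh search was run.
fn cached_search(
    cache: &mut HashMap<String, Option<Region>>,
    line: &str,
    needle: &str,
    start: usize,
) -> (Option<Region>, bool) {
    match cache.get(needle) {
        // Assumption: a cached "no match" from an earlier position stays valid.
        Some(None) => return (None, false),
        // A cached match is reused only if it starts at or after `start`.
        Some(Some(region)) if region.0 >= start => return (Some(*region), false),
        // No entry, or the cached match lies behind us: search again.
        _ => {}
    }
    let found = search(line, needle, start);
    cache.insert(needle.to_string(), found);
    (found, true)
}

fn main() {
    let mut cache = HashMap::new();
    let line = "ab ab";
    // Nothing cached yet: a search runs and its result is stored.
    assert_eq!(cached_search(&mut cache, line, "ab", 0), (Some((0, 2)), true));
    // Cached start 0 >= position 0: reused without searching.
    assert_eq!(cached_search(&mut cache, line, "ab", 0), (Some((0, 2)), false));
    // Cached start 0 < position 1: the pattern is searched again.
    assert_eq!(cached_search(&mut cache, line, "ab", 1), (Some((3, 5)), true));
    // Cached start 3 >= position 2: reused.
    assert_eq!(cached_search(&mut cache, line, "ab", 2), (Some((3, 5)), false));
}
```

Under this rule a stale cached match behind the current position is never returned, which is why caching an unused match looks harmless on its own.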
I think the real reason I wrote it like that may have been just caution and being extra safe, but I think I might have come up with a rare case it is necessary.
The cache hit code doesn't have the logic for the `does_something` check, so if a pattern uses a lookahead to match an empty string a few characters ahead, then next time around we hit it in the cache and it is the closest, we'll choose it, then get stuck in a loop choosing it again and again.
Edit: NVM not an infinite loop since there's extra protections against that, but maybe at least extra pushes and pops, although maybe those are fine.
On second look maybe the `does_something` check is unnecessary now. I think I might have added it before the more general infinite loop protection, and Sublime may just have the more general protection.
Yeah. But I think how this works might change with the fix for #127 anyway (maybe). So let's defer making this change for now.
(I've run syntest before and after these changes and it's unchanged.)
Looks good, thanks! Gonna wait for a bit more discussion on `does_something` before merging, but not too long.
Actually can you run the jQuery benchmark on old and new and make sure this doesn't regress performance? And if it does regress performance a
I got this result for a run:
But the +/- numbers there don't make me confident that I'll get a meaningful result. I'm looking into converting the benchmarks to use criterion.rs instead :).
I benchmarked this with criterion and it doesn't show a performance change. Haven't had time to raise a PR for using criterion, will do tomorrow.