Extract parts of parse_next_token into new method #144

Merged 1 commit into master from extract-search-method on Apr 20, 2018

robinst (Collaborator) commented Apr 18, 2018

It was hard to figure out what was going on with that much indentation
and vertical scrolling. Extracting the search code into a new method
helped me understand it a bit better. It also means we can use `return`
instead of `continue` in the for loop.
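
To illustrate the `return` versus `continue` point, here is a minimal sketch of the same refactoring on a toy problem. It is not the actual `parse_next_token` code: `Pattern`, `best_match_inline`, `best_match_extracted`, and `search` are hypothetical names, and a plain substring search stands in for the regex search.

```rust
/// Hypothetical stand-in for a syntax pattern: a plain substring to look for.
struct Pattern {
    needle: &'static str,
}

/// Before: the search logic lives inside the loop, so every early exit is a `continue`.
fn best_match_inline(patterns: &[Pattern], line: &str, start: usize) -> Option<(usize, usize)> {
    let mut best: Option<(usize, usize)> = None;
    for (index, pat) in patterns.iter().enumerate() {
        if pat.needle.is_empty() {
            continue; // nothing to search for, try the next pattern
        }
        let found = match line[start..].find(pat.needle) {
            Some(offset) => start + offset,
            None => continue, // no match, try the next pattern
        };
        if best.map_or(true, |(_, pos)| found < pos) {
            best = Some((index, found));
        }
    }
    best
}

/// After: the search is its own function, so each early exit becomes a `return`.
fn search(pat: &Pattern, line: &str, start: usize) -> Option<usize> {
    if pat.needle.is_empty() {
        return None;
    }
    let offset = match line[start..].find(pat.needle) {
        Some(offset) => offset,
        None => return None,
    };
    Some(start + offset)
}

/// The loop now only picks the closest match; it returns (pattern index, position).
fn best_match_extracted(patterns: &[Pattern], line: &str, start: usize) -> Option<(usize, usize)> {
    let mut best: Option<(usize, usize)> = None;
    for (index, pat) in patterns.iter().enumerate() {
        if let Some(found) = search(pat, line, start) {
            if best.map_or(true, |(_, pos)| found < pos) {
                best = Some((index, found));
            }
        }
    }
    best
}

fn main() {
    let patterns = [
        Pattern { needle: "b" },
        Pattern { needle: "" },
        Pattern { needle: "a" },
    ];
    let before = best_match_inline(&patterns, "xxab", 0);
    let after = best_match_extracted(&patterns, "xxab", 0);
    assert_eq!(before, after); // the refactoring does not change behaviour
    println!("{:?}", after);
}
```
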
};
if refs_regex.is_none() && does_something {
    search_cache.insert(match_pat, Some(regions.clone()));
}
Collaborator Author

What's the reason for only inserting it into the cache if `does_something` is true? I kept the current logic, but I'm not sure I understand.

Collaborator

I would guess that sometimes people write syntax definitions with silly rules like a regex that can match nothing (and doesn't change the context stack):

- match: a*
  scope: some.scope

and if it cached places where it matched nothing, it'd never find the places where it matches something, i.e. it needs to repeat the match at a different position to find a real non-zero length match.
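
A small sketch of why such a rule has to be retried at later positions. This is not syntect's code: `match_a_star` is a hypothetical hand-written stand-in for running the regex `a*` from a given byte offset, and `first_non_empty_match` is a hypothetical helper that retries it position by position.

```rust
/// Leftmost match of the pattern `a*` when searching from `start`, as (start, end)
/// byte offsets. `a*` can match the empty string, so it always matches at `start`.
fn match_a_star(line: &str, start: usize) -> (usize, usize) {
    let run = line.as_bytes()[start..]
        .iter()
        .take_while(|&&b| b == b'a')
        .count();
    (start, start + run)
}

/// Retry the search at each position until a match of non-zero length is found.
fn first_non_empty_match(line: &str) -> Option<(usize, usize)> {
    (0..line.len())
        .map(|start| match_a_star(line, start))
        .find(|&(s, e)| e > s)
}

fn main() {
    let line = "xxaa";
    for start in 0..line.len() {
        let (s, e) = match_a_star(line, start);
        println!("search from {}: match {}..{} (length {})", start, s, e, e - s);
    }
    println!("first non-empty match: {:?}", first_non_empty_match(line));
}
```

Searching `"xxaa"` from offsets 0 and 1 only yields empty matches; the real match on `aa` appears once the search starts at offset 2.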

Collaborator Author

It makes sense not to use that match, but caching it should be ok. The way it works is that if there's a cached match, it is only used if the start position of it is >= the current position where we're trying to search. Otherwise, we search again using that pattern.

Just now, I tried removing `&& does_something` from the condition and both tests and syntest still pass. But maybe there's no test for this case (with caching)?
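
The cache rule described above (reuse a cached match only if it starts at or after the current position, otherwise search again) can be sketched roughly as follows. This is an illustration under assumptions, not the real `search_cache`: `SearchCache`, `find`, the `usize` pattern id used as the key, and the substring search are all hypothetical stand-ins.

```rust
use std::collections::HashMap;

/// A match as (start, end) byte offsets in the line.
type Region = (usize, usize);

/// Hypothetical per-line cache keyed by a pattern id.
#[derive(Default)]
struct SearchCache {
    entries: HashMap<usize, Region>,
    /// Counts how often a real search ran, to make cache hits visible.
    searches: usize,
}

impl SearchCache {
    fn find(&mut self, pattern_id: usize, needle: &str, line: &str, start: usize) -> Option<Region> {
        // A cached match is only valid if it starts at or after the current position.
        if let Some(&region) = self.entries.get(&pattern_id) {
            if region.0 >= start {
                return Some(region);
            }
        }
        // Otherwise the cached match lies behind us: search again from `start`.
        self.searches += 1;
        let region = line[start..]
            .find(needle)
            .map(|offset| (start + offset, start + offset + needle.len()))?;
        self.entries.insert(pattern_id, region);
        Some(region)
    }
}

fn main() {
    let line = "xxabxxab";
    let mut cache = SearchCache::default();
    println!("{:?}", cache.find(0, "ab", line, 0)); // real search
    println!("{:?}", cache.find(0, "ab", line, 1)); // cache hit, match is still ahead
    println!("{:?}", cache.find(0, "ab", line, 3)); // cached match is behind, search again
    println!("searches run: {}", cache.searches);
}
```
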

trishume (Owner) commented Apr 19, 2018

I think the real reason I wrote it like that may have been just caution and being extra safe, but I think I might have come up with a rare case it is necessary.

The cache-hit code doesn't have the logic for the `does_something` check. So if a pattern uses a lookahead to match an empty string a few characters ahead, the next time around we find it in the cache, and if it is the closest match we'll choose it, then get stuck in a loop choosing it again and again.

Edit: NVM not an infinite loop since there's extra protections against that, but maybe at least extra pushes and pops, although maybe those are fine.

Owner

On second look, maybe the `does_something` check is unnecessary now. I think I might have added it before the more general infinite loop protection, and Sublime may just have the more general protection.

robinst (Collaborator, Author) commented Apr 19, 2018

Yeah. But I think how this works might change with the fix for #127 anyway (maybe). So let's defer making this change for now.

robinst (Collaborator, Author) commented Apr 18, 2018

(I've run syntest before and after these changes and it's unchanged.)

trishume (Owner) left a comment

Looks good, thanks! Gonna wait for a bit more discussion on `does_something` before merging, but not too long.

trishume (Owner) commented

Actually, can you run the jQuery benchmark on old and new and make sure this doesn't regress performance? And if it does regress performance, an `#[inline]` might fix it.

robinst (Collaborator, Author) commented Apr 19, 2018

I got this result for a run:

test bench_highlighting_jquery  ... bench: 798,202,387 ns/iter (+/- 366,799,381)

But the +/- numbers there don't make me confident that I'll get a meaningful result. I'm looking into converting the benchmarks to use criterion.rs instead :).
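
For a sense of scale, the reported spread can be put in relation to the reported time per iteration. A small sketch of that arithmetic, using the two numbers from the output above (`relative_spread` is a hypothetical helper name):

```rust
/// Ratio of the reported "+/-" spread to the reported time per iteration.
fn relative_spread(ns_per_iter: f64, spread_ns: f64) -> f64 {
    spread_ns / ns_per_iter
}

fn main() {
    // Numbers taken from the bench_highlighting_jquery line above.
    let ratio = relative_spread(798_202_387.0, 366_799_381.0);
    println!("spread is {:.1}% of the reported time", ratio * 100.0);
}
```

The spread comes out at roughly 46% of the measured time, so a regression of a few percent would be lost in the noise of a single run.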

robinst (Collaborator, Author) commented Apr 19, 2018

I benchmarked this with criterion and it doesn't show a performance change.

Haven't had time to raise a PR for using criterion, will do tomorrow.

@trishume trishume merged commit 9c28e45 into master Apr 20, 2018
@trishume trishume deleted the extract-search-method branch April 20, 2018 06:00