Extract parts of parse_next_token into new method #144

robinst · 2018-04-18T07:49:29Z

It was hard to figure out what was going on with that much indentation and vertical scrolling. Extracting the search code into a new method helped me understand it a bit better. It also means we can use `return` instead of `continue` in the for loop.
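As a rough illustration of the refactoring described above (not the actual syntect code), the sketch below contrasts a loop that skips with `continue` against one that delegates to a helper which can `return` early. `Pattern`, `first_match_inline`, `search` and `first_match_extracted` are hypothetical names, and a literal `str::find` stands in for the regex search.

```rust
/// Hypothetical pattern: a literal to look for, plus an enabled flag.
struct Pattern {
    text: &'static str,
    enabled: bool,
}

/// Before: the search logic sits inside the loop and skips with `continue`.
fn first_match_inline(line: &str, patterns: &[Pattern]) -> Option<(usize, usize)> {
    let mut best: Option<(usize, usize)> = None;
    for pat in patterns {
        if !pat.enabled {
            continue;
        }
        let start = match line.find(pat.text) {
            Some(s) => s,
            None => continue,
        };
        if best.map_or(true, |(b, _)| start < b) {
            best = Some((start, start + pat.text.len()));
        }
    }
    best
}

/// After: the per-pattern search is its own function and can `return` early.
fn search(line: &str, pat: &Pattern) -> Option<(usize, usize)> {
    if !pat.enabled {
        return None;
    }
    let start = line.find(pat.text)?;
    Some((start, start + pat.text.len()))
}

/// The loop now only picks the match that starts earliest.
fn first_match_extracted(line: &str, patterns: &[Pattern]) -> Option<(usize, usize)> {
    let mut best: Option<(usize, usize)> = None;
    for pat in patterns {
        if let Some((start, end)) = search(line, pat) {
            if best.map_or(true, |(b, _)| start < b) {
                best = Some((start, end));
            }
        }
    }
    best
}
```

Both versions return the same result; the second just keeps the loop body short enough to read without scrolling.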

robinst · 2018-04-18T07:50:40Z

src/parsing/parser.rs

+            };
+            if refs_regex.is_none() && does_something {
+                search_cache.insert(match_pat, Some(regions.clone()));
+            }


What's the reason for only inserting it into the cache if `does_something` is true? I kept the current logic, but I'm not sure I understand it.

keith-hall

I would guess that sometimes people write syntax definitions with silly rules, like a regex that can match nothing (and doesn't change the context stack):

    - match: a*
      scope: some.scope

and if it cached places where it matched nothing, it'd never find the places where it matches something, i.e. it needs to repeat the match at a different position to find a real non-zero length match.
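To pin down the guess above, here is a minimal sketch of what a check like `does_something` could express: a match is worth acting on if it consumes text or changes the context stack. This is an assumption based on the comment, not a copy of the syntect source; `StackOp` and the function signature are hypothetical.

```rust
/// Hypothetical stand-in for what a match rule does to the context stack.
#[derive(Clone, Copy, PartialEq, Debug)]
enum StackOp {
    None,
    Push,
    Pop,
}

/// Assumed meaning of `does_something`: the match either changes the
/// context stack or has non-zero length.
fn does_something(op: StackOp, match_start: usize, match_end: usize) -> bool {
    op != StackOp::None || match_end > match_start
}
```

Under this reading, a rule like `a*` that matches the empty string and has no stack operation is the only case that gets filtered out.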

robinst

It makes sense not to use that match, but caching it should be OK. The way it works is that a cached match is only used if its start position is >= the current position where we're trying to search. Otherwise, we search again using that pattern.

Just now, I tried removing `&& does_something` from the condition, and both the tests and syntest still pass. But maybe there's no test for this case (with caching)?
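The cache rule described here can be sketched as follows. This is a simplified model under stated assumptions, not the real implementation: `search_cached` is a hypothetical function, the cache is keyed by the pattern text, and a literal `str::find` stands in for the regex search that produces regions.

```rust
use std::collections::HashMap;

/// Looks for `needle` in `line` at or after byte offset `start`.
/// A cached match is reused only if it starts at or after `start`;
/// otherwise the search runs again and the cache entry is replaced.
/// Returns the match (if any) and whether the cache was used.
fn search_cached(
    cache: &mut HashMap<String, (usize, usize)>,
    line: &str,
    needle: &str,
    start: usize,
) -> (Option<(usize, usize)>, bool) {
    if let Some(&(s, e)) = cache.get(needle) {
        if s >= start {
            // Still ahead of (or at) the current position: safe to reuse.
            return (Some((s, e)), true);
        }
        // Stale entry: it starts before the current position.
    }
    let found = line[start..]
        .find(needle)
        .map(|i| (start + i, start + i + needle.len()));
    if let Some(m) = found {
        cache.insert(needle.to_string(), m);
    }
    (found, false)
}
```

Because a stale entry is never used, caching a match that ends up not being chosen does not by itself hide later matches of the same pattern.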

trishume

I think the real reason I wrote it like that may have been just caution and being extra safe, but I think I might have come up with a rare case where it is necessary.

The cache hit code doesn't have the logic for the `does_something` check. So if a pattern uses a lookahead to match an empty string a few characters ahead, then the next time around we hit it in the cache, and if it is the closest match we'll choose it, then get stuck in a loop choosing it again and again.

Edit: NVM not an infinite loop since there's extra protections against that, but maybe at least extra pushes and pops, although maybe those are fine.
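For readers unfamiliar with this kind of protection, here is a generic sketch of a progress guard for empty matches. It is not syntect's actual mechanism, which this thread does not spell out; `scan` and the `find` callback are hypothetical, and `find` is assumed to return a match starting at or after the given position, on a character boundary.

```rust
/// Repeatedly asks `find` for the next match at or after `pos` and collects
/// the chosen matches. The guard at the bottom guarantees progress even when
/// a match is empty.
fn scan<F>(line: &str, find: F) -> Vec<(usize, usize)>
where
    F: Fn(&str, usize) -> Option<(usize, usize)>,
{
    let mut pos = 0;
    let mut chosen = Vec::new();
    while pos <= line.len() {
        let (start, end) = match find(line, pos) {
            Some(m) => m,
            None => break,
        };
        chosen.push((start, end));
        pos = if end > pos {
            end
        } else {
            // Empty match exactly at `pos`: step over one character so the
            // same match cannot be chosen forever.
            pos + line[pos..].chars().next().map_or(1, |c| c.len_utf8())
        };
    }
    chosen
}
```

With a `find` that returns an empty match at the next `x`, each empty match is chosen twice (once when first reached, once more from that same position) before the guard moves on. The loop terminates, but it does some redundant work along the way.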

trishume

On second look, maybe the `does_something` check is unnecessary now. I think I might have added it before the more general infinite loop protection, and Sublime may just have the more general protection.

robinst

Yeah. But I think how this works might change with the fix for #127 anyway (maybe). So let's defer making this change for now.

robinst · 2018-04-18T07:53:15Z

(I've run syntest before and after these changes and it's unchanged.)

trishume

Looks good, thanks! Gonna wait for a bit more discussion on `does_something` before merging, but not too long.

trishume · 2018-04-19T02:13:01Z

Actually, can you run the jQuery benchmark on old and new and make sure this doesn't regress performance? If it does, an `#[inline]` might fix it.
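For reference, this is roughly what the suggestion looks like on a helper split out of a hot loop. The function names are hypothetical and unrelated to the actual parser code. Note that `#[inline]` is only a hint; within a single crate the compiler may already inline a small function without it.

```rust
/// Hypothetical helper extracted from a loop body. The attribute asks the
/// compiler to consider inlining it at call sites.
#[inline]
fn is_closer(candidate_start: usize, best_start: Option<usize>) -> bool {
    best_start.map_or(true, |b| candidate_start < b)
}

/// Returns the smallest start offset, calling the helper once per element.
fn closest_start(starts: &[usize]) -> Option<usize> {
    let mut best = None;
    for &s in starts {
        if is_closer(s, best) {
            best = Some(s);
        }
    }
    best
}
```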

robinst · 2018-04-19T03:55:37Z

I got this result for a run:

test bench_highlighting_jquery  ... bench: 798,202,387 ns/iter (+/- 366,799,381)

But the +/- numbers there don't make me confident that I'll get a meaningful result. I'm looking into converting the benchmarks to use criterion.rs instead :).
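To show why a single noisy run is hard to trust, here is a rough std-only sketch of repeated sampling. It is not how criterion.rs works internally (criterion adds warm-up, outlier analysis and statistical comparison between runs); `sample_timings` is a hypothetical helper.

```rust
use std::time::{Duration, Instant};

/// Runs `f` `samples` times and returns the (median, min, max) wall-clock
/// durations, so the spread is visible alongside the typical value.
fn sample_timings<F: FnMut()>(samples: usize, mut f: F) -> (Duration, Duration, Duration) {
    assert!(samples > 0, "need at least one sample");
    let mut times: Vec<Duration> = (0..samples)
        .map(|_| {
            let started = Instant::now();
            f();
            started.elapsed()
        })
        .collect();
    times.sort();
    (times[times.len() / 2], times[0], times[times.len() - 1])
}
```

Comparing medians from several samples of the old and new code is less sensitive to one-off spikes than comparing two single runs.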

robinst · 2018-04-19T08:57:14Z

I benchmarked this with criterion and it doesn't show a performance change.

Haven't had time to raise a PR for using criterion, will do tomorrow.

robinst mentioned this pull request Apr 18, 2018

syntect handles push & immediately pop non-consuming patterns differently to ST #127

Closed

trishume approved these changes Apr 19, 2018

trishume merged commit 9c28e45 into master Apr 20, 2018

trishume deleted the extract-search-method branch April 20, 2018 06:00
