Regexp compilation taking a long time #333
Comments
Interesting - do you have any more details you can share? I went back and forth on this for a while, including doing benchmarking, especially because of this huge regexp, which is large enough that it adds a few seconds to the test suite. However, at runtime I found there wasn't a significant difference, which I assumed was due to Go internally caching some things that it didn't do when the tests were being run in parallel. Moving all those regexp compiles out into variables, on the other hand, would mean their compilation cost has to be paid regardless of whether the related function is actually used (to be clear: moving that big regexp out meant ...). I had also trialed a JIT-based approach, but again found no significant difference in my benchmarks.
So this came from profiling Scorecard's cron, which repeatedly calls
Which is a valid concern for the common case.
I'm assuming that is the total time across all calls rather than the time of a single call.
That would be the preferable option - as I said, when I tried this I didn't see a difference in my benchmarking significant enough to justify the readability cost, but I didn't know (it is also entirely possible I mucked up my benchmarking 🤷)
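For anyone wanting to reproduce the comparison, here is a rough sketch of how the two approaches could be benchmarked side by side. The pattern and the function names `matchCompileEachCall` and `matchPrecompiled` are made up for illustration and are not taken from osv-scanner; the numbers will vary by machine, so treat the output as indicative only.

```go
package main

import (
	"fmt"
	"regexp"
	"testing"
)

// pattern is a hypothetical stand-in, not one of osv-scanner's regexps.
const pattern = `^(\d+)\.(\d+)\.(\d+)(-[0-9A-Za-z.-]+)?$`

// precompiled is compiled once, when the package is initialized.
var precompiled = regexp.MustCompile(pattern)

// matchCompileEachCall pays the compilation cost on every call.
func matchCompileEachCall(s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// matchPrecompiled reuses the already compiled regexp.
func matchPrecompiled(s string) bool {
	return precompiled.MatchString(s)
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	perCall := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			matchCompileEachCall("1.2.3-beta.1")
		}
	})
	reused := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			matchPrecompiled("1.2.3-beta.1")
		}
	})

	fmt.Println("compile per call:", perCall)
	fmt.Println("precompiled:     ", reused)
}
```

A single-process micro-benchmark like this only measures the per-call cost; it says nothing about the startup cost paid by callers that never reach the function.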
Correct, the timings were aggregates over several 10 - 20 min samples of
I do think sync.Once helps with readability, but it is still an extra burden over the current code (didn't bother writing the array logic, but could be done here).
Adapted from https://stackoverflow.com/a/72185182
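A minimal sketch of what the `sync.Once` approach could look like for a single regexp. The pattern and the names `cachedVersionRe` and `extractMajor` are hypothetical and only illustrate the idea; this is not the snippet from the linked answer or the code in osv-scanner.

```go
package main

import (
	"fmt"
	"regexp"
	"sync"
)

var (
	versionReOnce sync.Once
	versionRe     *regexp.Regexp
)

// cachedVersionRe compiles the (hypothetical) pattern on first use only,
// so callers that never need it do not pay the compilation cost.
func cachedVersionRe() *regexp.Regexp {
	versionReOnce.Do(func() {
		versionRe = regexp.MustCompile(`^(\d+)\.(\d+)\.(\d+)$`)
	})
	return versionRe
}

// extractMajor returns the major component, or "" if s does not match.
func extractMajor(s string) string {
	m := cachedVersionRe().FindStringSubmatch(s)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	fmt.Println(extractMajor("1.2.3"))
	// Both calls return the same compiled instance.
	fmt.Println(cachedVersionRe() == cachedVersionRe())
}
```

The extra accessor function and the pair of package-level variables are the readability burden mentioned above; in exchange, compilation is deferred until the first call and is safe under concurrent use.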
Fixes #333. I measured the time taken to compile the regexps at startup and it's unnoticeable for these regexps, so I don't think we need the added complexity of `sync.Once`.
It was noted that in some workloads regexp compilation is taking up to 60% of osv-scanner's runtime.
Especially this line:
osv-scanner/pkg/lockfile/parse-yarn-lock.go
Line 114 in 26c9df8
A solution would be to store the result of the compilation, rather than compiling every time.
FYI @spencerschrock
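As a rough illustration of storing the compiled result, the regexp can be hoisted into a package-level variable so it is compiled once at program start instead of on every call. The pattern and the names `keyValueRe` and `parseLine` below are assumptions made for this sketch; they are not the actual code from `parse-yarn-lock.go`.

```go
package main

import (
	"fmt"
	"regexp"
)

// keyValueRe is compiled once when the package is initialized.
// The pattern is a hypothetical stand-in for a lockfile line matcher.
var keyValueRe = regexp.MustCompile(`^\s*([A-Za-z0-9_-]+)\s+"?([^"]*)"?$`)

// parseLine splits a line such as `  version "1.2.3"` into key and value,
// reusing the precompiled regexp on every call.
func parseLine(line string) (key, value string, ok bool) {
	m := keyValueRe.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	key, value, ok := parseLine(`  version "1.2.3"`)
	fmt.Println(key, value, ok)
}
```

The trade-off is that the compilation cost is paid at startup even if `parseLine` is never called.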