Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: remove redundant calls from PreFetch #1456

Merged
merged 2 commits into from
Dec 19, 2024

Conversation

michaelkedar
Copy link
Member

@michaelkedar michaelkedar commented Dec 18, 2024

In every DependencyClient we have, MatchingVersions() is just a call to Versions() plus semver matching, which is computationally expensive. It also was being called on every edge of the pre-fetched graphs.

Removed it to reduce CPU usage and hopefully improve performance with Maven resolution / Guided Remediation.

I've also skipped fetching things with MavenDependencyOrigin set (e.g. dependencyManagement dependencies) since there's potentially hundreds of them.

Copy link
Contributor

@cuixq cuixq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Michael! Do you know how much this could improve the performance?

@codecov-commenter
Copy link

codecov-commenter commented Dec 18, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 67.25%. Comparing base (3653a1d) to head (7afd12d).

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1456      +/-   ##
==========================================
- Coverage   67.27%   67.25%   -0.02%     
==========================================
  Files         192      192              
  Lines       18144    18133      -11     
==========================================
- Hits        12207    12196      -11     
  Misses       5291     5291              
  Partials      646      646              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@michaelkedar
Copy link
Member Author

I'm not exactly sure - it would help CPU usage and the number of goroutines running, but I'm not sure how this translates to run time.

@spencerschrock
Copy link
Contributor

spencerschrock commented Dec 18, 2024

Do you know how much this could improve the performance?
but I'm not sure how this translates to run time.

For my specific use case, on my specific machine (sample repo I was using to debug), performance is:

v1.9.0: ~2 min
v1.9.1: ~12 min
#1436: ~8 min
This PR: ~8min
cherry picking both PRs: ~8min

@michaelkedar michaelkedar changed the title fix: remove MatchingVersions calls from PreFetch fix: remove redundant calls from PreFetch Dec 19, 2024
@michaelkedar michaelkedar requested a review from cuixq December 19, 2024 00:51
@michaelkedar
Copy link
Member Author

I added another change to this PR that should hopefully reduce the number of calls for Maven specifically.

@michaelkedar michaelkedar merged commit 36bf2ee into google:main Dec 19, 2024
15 checks passed
cuixq pushed a commit to cuixq/osv-scanner that referenced this pull request Dec 19, 2024
In every `DependencyClient` we have, `MatchingVersions()` is just a call
to `Versions()` plus semver matching, which is computationally
expensive. It also was being called on every edge of the pre-fetched
graphs.

Removed it to reduce CPU usage and hopefully improve performance with
Maven resolution / Guided Remediation.

I've also skipped fetching things with `MavenDependencyOrigin` set (e.g.
dependencyManagement dependencies) since there's potentially hundreds of
them.
cuixq added a commit that referenced this pull request Dec 19, 2024
This PR cherry-pick two fixes to v1:
 - #1436
 - #1456

---------

Co-authored-by: Michael Kedar <[email protected]>
@spencerschrock
Copy link
Contributor

I added another change to this PR that should hopefully reduce the number of calls for Maven specifically.

Much faster, even more than 1.9.0! Thanks!

@spencerschrock
Copy link
Contributor

I will note I'm getting more extraction errors on some repos (such as https://github.com/SpringCloud/spring-cloud-gray) though. I'm not a big Maven dev, so I'm not sure which is valid/right.

Scanning dir /tmp/spring-cloud/
Error during extraction: (extracting as vcs/gitrepo) worktree not available in a bare repository
Error during extraction: (extracting as java/pomxml) failed resolving {Maven:cn.springcloud.gray:spring-cloud-gray-client[Concrete:${revision}] {}}: failed parsing version constraint 'Maven:org.springframework:spring-jcl[Requirement:]': invalid `>` in `>=0.0.0`
Error during extraction: (extracting as java/pomxml) failed resolving {Maven:cn.springcloud.gray:spring-cloud-gray-client-netflix[Concrete:${revision}] {}}: failed parsing version constraint 'Maven:org.springframework:spring-jcl[Requirement:]': invalid `>` in `>=0.0.0`
Scanned /tmp/spring-cloud/spring-cloud-gray-code-component/pom.xml file and found 1 package
Scanned /tmp/spring-cloud/spring-cloud-gray-core/pom.xml file and found 1 package
...

@michaelkedar
Copy link
Member Author

Hmm those errors seems like it would have been introduced in v1.9.1 - might need to look into that further.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants