review: perf: avoid collection copy in CtPackageImpl #4848

Merged (8 commits) on Aug 23, 2022


SirYwell
Collaborator

I saw that when running this test, most of the time is spent in CtMethodImpl#getTopDefinitions(). I did some profiling and encountered this:

[profiler screenshot]

That's an immense amount of time spent copying elements just to check whether the set is empty. As a solution, I propose introducing a CtPackage#hasTypes() method. This way, we can ask the ElementNameMap directly instead of doing costly copying.

In my first measurements, this reduced the runtime of the mentioned test from ~20s to ~16s.

(I rebased after that, and the test now takes >1m on current master on my machine. With this patch it is still faster, but I have a second PR ready to get back to the ~16s.)
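
To illustrate the cost described above, here is a minimal sketch, assuming a package that stores its types in a name-keyed map and returns a defensive copy from its getter. `PackageSketch` and its members are hypothetical stand-ins, not Spoon's CtPackageImpl or ElementNameMap, and the sketch only covers types, not sub-packages.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a package that keeps its types in a name-keyed map.
// This is NOT Spoon's CtPackageImpl; all names are for illustration only.
final class PackageSketch {
    private final Map<String, Object> typesByName = new LinkedHashMap<>();

    void addType(String simpleName, Object type) {
        typesByName.put(simpleName, type);
    }

    // Defensive copy: allocates a new set and visits every entry, so O(n) per call.
    Set<Object> getTypes() {
        return new LinkedHashSet<>(typesByName.values());
    }

    // Asks the backing map directly: no allocation, constant time.
    boolean hasTypes() {
        return !typesByName.isEmpty();
    }

    // Slow variant: pays for a full copy just to test emptiness.
    boolean isEmptyByCopy() {
        return getTypes().isEmpty();
    }

    // Fast variant: same answer without copying anything.
    boolean isEmptyDirect() {
        return !hasTypes();
    }
}
```

Both variants return the same result; the difference is that the copying variant does work proportional to the number of types on every call, which adds up when the check sits on a hot path.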

@SirYwell changed the title from "wip: perf: avoid collection copy in CtPackageImpl" to "review: perf: avoid collection copy in CtPackageImpl" on Aug 19, 2022
@@ -242,7 +242,12 @@ public boolean hasPackageInfo() {

     @Override
     public boolean isEmpty() {
-        return getPackages().isEmpty() && getTypes().isEmpty();
+        return getPackages().isEmpty() && !hasTypes();
Collaborator

Doesn't getPackages() have precisely the same problem (allocates a new LinkedHashSet on each invocation)? Perhaps kill two birds with one stone in this PR?

Collaborator Author

Yes, getPackages() has the same problem, but as it didn't really show up in my profiles (as it's called far less), I didn't include an extra method for the empty check in this PR. If you want, I can do so though.

Collaborator

It's probably not a performance problem, because the number of packages is typically far smaller than the number of types in any given package.

I think from an API standpoint it makes sense to have both. Since there are methods getTypes() and getPackages(), I probably would expect a hasTypes() to be accompanied by a hasPackages(). What do you think?

Comment on lines 165 to 168
/**
 * @return true if the package contains any types.
 * @see #getTypes()
 */
Collaborator

Now that I think about it, we should probably add a requirement here that this is O(1). That's really the reason for it to exist in the first place.

On that note, maybe we should add notes to getTypes() and getPackages() that they may be linear in the number of types/packages, and that one should use isEmpty(), hasPackages() (if we add it) or hasTypes() to check for "emptiness".
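
One way the documentation suggested above could look is sketched here, assuming a simplified interface with string-named members. `PackageView` and `MapBackedPackage` are hypothetical, and the Javadoc wording is illustrative rather than the text that was merged.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical interface; names and Javadoc wording are illustrative only.
interface PackageView {
    /**
     * Returns a copy of the names of the direct sub-packages.
     * May take time linear in the number of sub-packages. Prefer
     * {@link #hasPackages()} or {@link #isEmpty()} for emptiness checks.
     */
    Set<String> getPackages();

    /**
     * Returns a copy of the names of the types in this package.
     * May take time linear in the number of types. Prefer
     * {@link #hasTypes()} or {@link #isEmpty()} for emptiness checks.
     */
    Set<String> getTypes();

    /** @return true if there is at least one sub-package; expected to be O(1). */
    boolean hasPackages();

    /** @return true if there is at least one type; expected to be O(1). */
    boolean hasTypes();

    /** @return true if there are neither sub-packages nor types; O(1) if both checks are. */
    default boolean isEmpty() {
        return !hasPackages() && !hasTypes();
    }
}

// Hypothetical map-backed implementation that satisfies the documented costs.
final class MapBackedPackage implements PackageView {
    private final Map<String, Object> packages = new LinkedHashMap<>();
    private final Map<String, Object> types = new LinkedHashMap<>();

    void addPackage(String name) {
        packages.put(name, new Object());
    }

    void addType(String name) {
        types.put(name, new Object());
    }

    @Override
    public Set<String> getPackages() {
        return new LinkedHashSet<>(packages.keySet()); // O(n) copy
    }

    @Override
    public Set<String> getTypes() {
        return new LinkedHashSet<>(types.keySet()); // O(n) copy
    }

    @Override
    public boolean hasPackages() {
        return !packages.isEmpty(); // O(1)
    }

    @Override
    public boolean hasTypes() {
        return !types.isEmpty(); // O(1)
    }
}
```

Putting the cost in the Javadoc makes the contract visible to callers, so an implementation that later switches to a copying emptiness check would be violating documented behavior rather than an unstated assumption.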

@SirYwell
Collaborator Author

I added some runtime behavior documentation and the hasPackages() method.

Collaborator

@slarse left a comment

Thanks @SirYwell!

@slarse slarse merged commit 844beaa into INRIA:master Aug 23, 2022
@SirYwell SirYwell deleted the perf/PackageFactory branch August 23, 2022 19:18