Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OAK-11556 : removed usage of Guava's Iterables.getFirst with oak-commons API #2146

Merged
merged 2 commits into from
Mar 7, 2025

Conversation

rishabhdaim
Copy link
Contributor

@rishabhdaim rishabhdaim commented Mar 7, 2025

No description provided.

Copy link

github-actions bot commented Mar 7, 2025

Commit-Check ✔️

@@ -89,7 +90,7 @@ public static PropertyState createProperty(
String name, Iterable<Value> values)
throws RepositoryException {
int type = PropertyType.STRING;
Value first = Iterables.getFirst(values, null);
Value first = StreamUtils.toStream(values).findFirst().orElse(null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be introducing a performance regression. Iterables.getFirst does not allocate any object, while StreamUtils.toStream() creates an object internally. And the code path of Iterables.getFirst is much shorter, so potentially faster than the Guava equivalent. This method, createProperty(), is likely performance critical, so I'm worried this could significantly degrade performance of some parts of Oak.
In general, I'm concerned about all these patches removing Guava. We have tests for correctness (and even with those tests, these patches have broken the build several times). But we do not have any performance regression tests, so these patches may be degrading the performance without us realising.

Copy link
Contributor Author

@rishabhdaim rishabhdaim Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @nfsantos for the review.

I measured the performance of 3 implementations (using JMH) for:

  1. Guava's
  2. Stream
  3. Apache commons

Please find the results below:

Code is:

`@Benchmark
public void guava(Blackhole bh) {
   bh.consume(Iterables.getFirst(itr, bh));
}

@Benchmark
public void apache(Blackhole bh) {
    bh.consume(org.apache.commons.collections4.IterableUtils.first(itr));
}

@Benchmark
public void stream(Blackhole bh) {
    bh.consume(StreamSupport.stream(itr.spliterator(), false).findFirst().orElse(0));
}`

Results

Benchmark (size) Mode Cnt Score Error Units
IterableUtils.apache 1 avgt 4 0.001 ± 0.001 us/op
IterableUtils.apache 100 avgt 4 0.001 ± 0.001 us/op
IterableUtils.apache 10000 avgt 4 0.001 ± 0.001 us/op
IterableUtils.guava 1 avgt 4 0.001 ± 0.001 us/op
IterableUtils.guava 100 avgt 4 0.001 ± 0.001 us/op
IterableUtils.guava 10000 avgt 4 0.001 ± 0.001 us/op
IterableUtils.stream 1 avgt 4 0.008 ± 0.003 us/op
IterableUtils.stream 100 avgt 4 0.007 ± 0.002 us/op
IterableUtils.stream 10000 avgt 4 0.007 ± 0.002 us/op

Although the performance with Stream is a bit slow for getting first, that is not going to have a very drastic impact on overall OAK performance.

But, we do have a better alternative in commons-collections4, I think we can go with that as well.

Will update the PR to use Apache commons-collections4.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rishabhdaim, thanks for micro benchmark results. Could you reformat them so they are easier to read?

If I understood correctly, the longest iteration repeated the operation 10000 times, right? This is not much, it would be better to run for at least a few seconds to give time for the JIT to quick in.

But sounds good to use commons-collections4 if its performance is equivalent to Guava's.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added PR #2150 to create Iterables.getFirst() replacement in oak-commons. Once approved/merged, will use that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nfsantos Even after increasing the param size to 1000000, the results remain the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Benchmark (size) Mode Cnt Score Error Units
IterableUtils.apache 1 avgt 4 0.001 ± 0.001 us/op
IterableUtils.apache 100 avgt 4 0.001 ± 0.001 us/op
IterableUtils.apache 10000 avgt 4 0.001 ± 0.001 us/op
IterableUtils.apache 1000000 avgt 4 0.001 ± 0.001 us/op
IterableUtils.guava 1 avgt 4 0.001 ± 0.001 us/op
IterableUtils.guava 100 avgt 4 0.001 ± 0.001 us/op
IterableUtils.guava 10000 avgt 4 0.001 ± 0.001 us/op
IterableUtils.guava 1000000 avgt 4 0.001 ± 0.001 us/op
IterableUtils.stream 1 avgt 4 0.007 ± 0.002 us/op
IterableUtils.stream 100 avgt 4 0.007 ± 0.001 us/op
IterableUtils.stream 10000 avgt 4 0.007 ± 0.001 us/op
IterableUtils.stream 1000000 avgt 4 0.007 ± 0.001 us/op

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

review comments addressed in 11a7aca and now using new oak-commons util.

@rishabhdaim rishabhdaim changed the title OAK-11556 : removed usage of Guava's Iterables.getFirst with Streams.… OAK-11556 : removed usage of Guava's Iterables.getFirst with oak-commons API Mar 7, 2025
@rishabhdaim rishabhdaim merged commit e27b02a into trunk Mar 7, 2025
6 checks passed
@rishabhdaim rishabhdaim deleted the OAK-11556 branch March 7, 2025 15:14
daniancu pushed a commit to daniancu/jackrabbit-oak that referenced this pull request Mar 13, 2025
…ons API (apache#2146)

* OAK-11556 : removed usage of Guava's Iterables.getFirst with Streams.findFirst()

* OAK-11556 : removed usage of Guava's Iterables.getFirst with oak-commons

---------

Co-authored-by: Rishabh Kumar <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants