Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use JDK UUID#fromString for Java >= 15 #15

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

andrebrait
Copy link

Explanation on the README

@andrebrait
Copy link
Author

Java 8

Benchmark                                                Mode  Cnt         Score          Error  Units
UUIDBenchmark.benchmarkFastParserToString               thrpt    3  29072210.194 ±  1272384.932  ops/s
UUIDBenchmark.benchmarkUUIDToString                     thrpt    3   3919194.916 ±  3994770.461  ops/s

Benchmark                                                Mode  Cnt         Score          Error  Units
UUIDBenchmark.benchmarkUUIDFromCharSequenceFastParser   thrpt    3  26652839.180 ± 42300562.510  ops/s
UUIDBenchmark.benchmarkUUIDFromFastParser               thrpt    3  28014259.927 ± 14211646.057  ops/s
UUIDBenchmark.benchmarkUUIDFromString                   thrpt    3   2140616.212 ±  3066552.678  ops/s

Java 17

Benchmark                                                Mode  Cnt         Score          Error  Units
UUIDBenchmark.benchmarkFastParserToString               thrpt    3  45713487.632 ±  7374721.402  ops/s
UUIDBenchmark.benchmarkUUIDToString                     thrpt    3  46517026.066 ± 25676934.756  ops/s

Benchmark                                                Mode  Cnt         Score          Error  Units
UUIDBenchmark.benchmarkUUIDFromCharSequenceFastParser   thrpt    3  35336973.563 ±  2908998.623  ops/s
UUIDBenchmark.benchmarkUUIDFromFastParser               thrpt    3  33821627.866 ±  3319481.493  ops/s
UUIDBenchmark.benchmarkUUIDFromString                   thrpt    3  32269917.127 ± 27127704.981  ops/s

@andrebrait
Copy link
Author

Java 17 (before this change):

Benchmark                                               Mode  Cnt         Score         Error  Units
UUIDBenchmark.benchmarkFastParserToString              thrpt    3  44798986.151 ± 8939223.200  ops/s
UUIDBenchmark.benchmarkUUIDToString                    thrpt    3  44832192.867 ± 7551885.052  ops/s

Benchmark                                               Mode  Cnt         Score         Error  Units
UUIDBenchmark.benchmarkUUIDFromCharSequenceFastParser  thrpt    3  31220499.311 ± 3755386.722  ops/s
UUIDBenchmark.benchmarkUUIDFromFastParser              thrpt    3  31234004.484 ± 2334153.214  ops/s
UUIDBenchmark.benchmarkUUIDFromString                  thrpt    3  35230981.215 ± 9501357.140  ops/s

@KevinAtSesam
Copy link

I would second this. Comparing the actual Open JDK 17 implementation with the FastUUID implementation, I see only small differences. As such, I would recommend people to use the build-in version instead.

Perhaps you should go even one step further, and you should declare this utility 'deprecated' for OpenJDK 17 onward.

@andrebrait
Copy link
Author

@KevinAtSesam deprecated for OpenJDK >= 15, though for the foreseeable future, I think projects will still be targeting Java 8 or 11, so this is still a pretty useful library.

@andrebrait
Copy link
Author

@KevinAtSesam also, this library can handle any CharSequence whereas with JDK you'd only be able to handle String, so you'd either have to convert the CharSequence to a String, meaning copying its contents (if it's not a String already).

I mean, pretty minor use-case, though, but still a difference.

@KevinAtSesam
Copy link

Sounds good to me. Just glad to see that Jdk 17 made such an improvement.

Copy link
Owner

@jchambers jchambers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the contribution! I have a couple questions and suggestions, but agree that this is a good thing to be doing in general.

One additional request that doesn't fit neatly on any single line of code: since we're making Java 15 the cutoff for some of these features, could you please share benchmarks form Java 15? (EDIT: to be clear, I really appreciate the benchmark results from Java 17, but matching the benchmarks to the cutoff version would be great for posterity!)

Thanks very much!

<scope>test</scope>
</dependency>

<dependency>
<groupId>org.openjdk.jmh</groupId>
<artifactId>jmh-core</artifactId>
<version>1.21</version>
<version>1.32</version>
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: we should probably pull this out into a property so we don't have to change the version in three different places. Let's definitely leave that for a separate effort, though.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can do that, no problem. Will extract when I get the chance to open it again on IntelliJ.

src/benchmark/java/com/eatthepath/UUIDBenchmark.java Outdated Show resolved Hide resolved
* @throws IllegalArgumentException if the given string does not conform to the string representation as
* described in {@link UUID#toString()}
*/
public static UUID parseUUID(final String uuidString) {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm missing something: why do we need this if we also have parseUUID(CharSequence), which also checks if the CharSequence is a String?

Copy link
Author

@andrebrait andrebrait Nov 25, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because JDK's method only works with String.

So we can:

  1. Change the signature of our method to parseUUID(String) and possibly break compatibility;
  2. Call CharSequence#toString() but that will lead to memory copies in any non-String objects that implement CharSequence;
  3. Add a method that uses String (so it can be passed to JDK's in case we're running on >= 15) and keep the CharSequence there too and shared the common code path between the two (for JDKs prior to 15, but CharSequence would always go there since we don't want to copy memory);
  4. Number 3 + we check if the CharSequence is a String so we can pass it to JDK's, but we keep using our implementation in case it's not (since it performs closely anyway).

instanceof has basically no cost on modern JVMs anyway, so I chose to go with 4.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah—I think I see where you're going here. If I can reword it to make sure we're on the same page, I think you're saying that you want to add a String-specific variant because that will lower the cost of passing through to UUID#fromString under Java 15 and newer.

If I'm understanding that right, I'd gently push back that I think it's okay to continue using fast-uuid's parser even under Java 15. My rationale is that if somebody is working with non-String CharSequences, they probably have some specific thing they're doing that means they want to avoid String conversion in the first place, and using the sliiiiightly slower FastUuid#parseUUID(CharSequence) will still be faster for those users than converting to a String, then using the slightly faster UUID#fromString(String).

As you say, using instanceof in FastUuid#parseUUID(CharSequence) is a negligible cost, so most users would still get the benefits of using the built-in parser under Java 15 and newer, and it's only the oddballs (like me!) who have a CharSequence-specific use case that would be staying in the fast-uuid space.

Does that make sense? Am I missing your point?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's more of a compile-time optimization and maybe being able to deal with JVMs where instanceof isn't super-duper cheap.

If you upgrade from existing code to this, your compiler will link your calls that use String-typed references to the new method. So

String uuidString = someMethodThatReturnsString();
// this will now link directly to the String method instead of the CharSequence one without the user having to change any code
UUID id = FastUUID.parseUUID(uuidString);

CharSequence uuidChars = someMethodThatReturnsCharSequence();
// this will still link to the CharSequence method
UUID idFromCharSequence = FastUUID.parseUUID(uuidChars);

If uuidChars happens to actually be a String, we can still leverage the speed boost by checking that and calling JDK's UUID#fromString.

Providing the overload with String allows us to skip that step without the user changing any code because upon compilation, they'd link to the new method if the reference is String-typed.

A CharSequence reference to a CharSequence non-String object will still just use the existing parser just fine. There is no conversion involved.

I see this as a win-win change. The only downside is that FastUUID.parseUUID(null) would not compile anymore. You'd have to cast the null to either CharSequence or String.

For someone using fast-uuid indirectly, the type-check to String inside the CharSequence overload still allows for the performance boost in Java 15 even without recompilation.

@andrebrait
Copy link
Author

Thank you for the contribution! I have a couple questions and suggestions, but agree that this is a good thing to be doing in general.

One additional request that doesn't fit neatly on any single line of code: since we're making Java 15 the cutoff for some of these features, could you please share benchmarks form Java 15? (EDIT: to be clear, I really appreciate the benchmark results from Java 17, but matching the benchmarks to the cutoff version would be great for posterity!)

Thanks very much!

Sure. Will get benchmarks from 15 later today.

README.md Show resolved Hide resolved
README.md Outdated Show resolved Hide resolved
@andrebrait
Copy link
Author

@jchambers

Java 15 (AdoptOpenJDK HotSpot)

Benchmark                                   Mode  Cnt         Score         Error  Units
UUIDBenchmark.benchmarkFastParserToString  thrpt    3  31602194.038 ± 3833604.315  ops/s
UUIDBenchmark.benchmarkUUIDFromFastParser  thrpt    3  26991706.481 ± 3713708.753  ops/s
UUIDBenchmark.benchmarkUUIDFromString      thrpt    3  33169614.473 ± 4664825.488  ops/s
UUIDBenchmark.benchmarkUUIDToString        thrpt    3  42468946.506 ±  718697.780  ops/s

@andrebrait andrebrait requested a review from jchambers December 7, 2021 20:50
@jchambers
Copy link
Owner

Thank you for the Java 15 benchmarks! Let's resolve the conversation about the String/CharSequence thing, then figure out next steps from there.

@andrebrait
Copy link
Author

@jchambers so, any news on that?

@jchambers
Copy link
Owner

Oops—sorry. This got lost in the holiday shuffle. I'll get back to this shortly.

@andrebrait
Copy link
Author

@jchambers hi there :) wanna figure out the situation with the CharSequence stuff?

I can also add the GCProfiler to the benchmark methods so we can have the garbage data as well, but IIRC none of the methods create much garbage anyway.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants