[Performance] Use a faster serialization protocol within security plugin #2780

@parasjain1

Problem

The security plugin uses JDK serialization to serialize and deserialize various headers, and this is slow.

Proposal

This is a proposal to change the implementation of Base64Helper::serializeObject and Base64Helper::deserializeObject to use a faster serialization protocol. I explored Fast Serialization (FST), Protostuff, Kryo, Avro, and OpenSearch's custom serialization as alternatives to JDK serialization and ran a few benchmarks. The results are attached below.
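
For context, the baseline being replaced can be sketched as JDK object serialization wrapped in Base64. This is only an illustrative sketch using JDK classes; the class name `JdkBase64Sketch` is hypothetical and the code is not the plugin's actual Base64Helper implementation.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.net.InetSocketAddress;
import java.util.Base64;

// Illustrative baseline only: JDK serialization wrapped in Base64.
// JdkBase64Sketch is a hypothetical name, not the plugin's Base64Helper.
final class JdkBase64Sketch {

    // Serialize with ObjectOutputStream, then Base64-encode so the result fits in a header.
    static String serializeObject(Serializable object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(object);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    // Base64-decode, then rebuild the object graph with ObjectInputStream.
    static Object deserializeObject(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(raw))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        InetSocketAddress address = new InetSocketAddress("127.0.0.1", 9300);
        String header = serializeObject(address);
        System.out.println(header.length() + " chars, round trip: " + deserializeObject(header));
    }
}
```

JDK serialization writes class descriptors along with the field data, which is part of why the encoded headers are comparatively large and slow to produce.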

Benchmarking Environment

  • Framework: JMH, 1000 warm-up iterations, 30000 test iterations
  • EC2 instance type: c5.2xlarge
  • JDK: Corretto JDK 11
  • OS: Amazon Linux 2 x86_64

Each cell shows Avg Time ± Error in ns/op; the value in parentheses is the Diff % relative to Java (JDK) serialization. Empty cells have no reported result.

Deserialize

| Library | User | InetSocketAddress | SourceFieldContext |
| --- | --- | --- | --- |
| Java | 26062.709 ± 847.012 | 9732.072 ± 309.654 | 7892.943 ± 333.835 |
| FST | 4299.802 ± 251.09 (-83.50209) | 3957.335 ± 287.201 (-59.33718) | 2168.463 ± 66.373 (-72.52656) |
| FST (Pre) | 3674.455 ± 133.466 (-85.90148) | 3417.478 ± 134.756 (-64.88437) | 868.976 ± 48.215 (-88.99047) |
| Proto | 808.423 ± 40.851 (-96.89816) | | 1003.155 ± 29.785 (-87.29048) |
| Custom (OpenSearch) | 834.74 ± 56.749 (-96.79719) | | 834.987 ± 30.013 (-89.42109) |
| Kryo (Pre) | | 1274.085 ± 20.928 (-86.90839) | |

Serialize

| Library | User | InetSocketAddress | SourceFieldContext |
| --- | --- | --- | --- |
| Java | 10370.249 ± 319.919 | 4749.54 ± 168.423 | 4023.138 ± 146.527 |
| FST | 3104.632 ± 161.298 (-70.06213) | 2578.204 ± 115.172 (-45.71676) | 1427.189 ± 63.018 (-64.52548) |
| FST (Pre) | 2899.691 ± 131.584 (-72.03837) | 2368.224 ± 101.214 (-50.13782) | 756.986 ± 38.476 (-81.18419) |
| Proto | 1423.777 ± 59.772 (-86.27056) | | 1138.412 ± 70.829 (-71.70338) |
| Custom (OpenSearch) | 1115.154 ± 69.707 (-89.2466) | | 1123.486 ± 37.035 (-72.07439) |
| Kryo (Pre) | | 1544.436 ± 55.018 (-67.48241) | |

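
The Diff % values are consistent with the relative change of each library's average time against the Java baseline for the same type and operation. A minimal sketch of that arithmetic follows; the class and method names are hypothetical and the inputs are taken from the benchmark figures above.

```java
// Hypothetical helper that reproduces the Diff % arithmetic from the benchmark results.
final class DiffPercentSketch {

    // Relative change of a candidate's average time against the JDK baseline, in percent.
    // Negative values mean the candidate is faster than the baseline.
    static double diffPercent(double baselineNsPerOp, double candidateNsPerOp) {
        return (candidateNsPerOp - baselineNsPerOp) / baselineNsPerOp * 100.0;
    }

    public static void main(String[] args) {
        // FST vs Java, deserializing User
        System.out.printf("%.5f%n", diffPercent(26062.709, 4299.802));
        // Custom (OpenSearch) vs Java, serializing User
        System.out.printf("%.5f%n", diffPercent(10370.249, 1115.154));
    }
}
```
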
  • Though FST is highly performant and the simplest to use of all the options, it has shortcomings. It no longer seems to be actively maintained (last commit made 2 years ago, 102 open issues) and it has a history of breaking changes even in minor version upgrades.
  • Protostuff is also highly performant, but it needs explicit handling for certain classes such as InetSocketAddress, by writing Delegates. It doesn't seem to be actively maintained either; the last commit was 1 year ago.
  • Kryo does not work out of the box. It does not work with classes that lack a zero-arg constructor, so we would have to write serializers. For complex objects, e.g. java.util.Collections$SynchronizedMap, we would also have to register separate serializers. There is a repo, kryo-serializers, that has many such serializers we could use. Given that we already have a highly optimised custom serialization framework (StreamOutput, StreamInput) within OpenSearch, expending effort to integrate another library seems unnecessary.
  • Custom serialization using OpenSearch's BytesStreamOutput and BytesStreamInput classes is a promising approach. It is also highly performant. Classes defined within the security plugin, such as User and SourceFieldsContext, can implement the Writeable interface. For classes such as InetSocketAddress, which we cannot change, we'll have to add writers and readers to the StreamOutput and StreamInput classes so that the writeGenericValue and readGenericValue methods can handle them. This is in line with how OpenSearch deals with third-party classes today. [source code]
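
The custom serialization idea from the last bullet can be sketched with plain JDK streams. Everything here is an assumption made for illustration: `SketchWriteable` and `SketchUser` are hypothetical stand-ins for OpenSearch's Writeable and the plugin's User, `DataOutputStream`/`DataInputStream` stand in for StreamOutput/StreamInput, and the type tags are arbitrary. Plugin-owned classes write their own fields, while a third-party class such as InetSocketAddress is handled by a dedicated branch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Base64;
import java.util.List;

final class CustomSerializationSketch {

    // Hypothetical stand-in for OpenSearch's Writeable; JDK streams keep the sketch self-contained.
    interface SketchWriteable {
        void writeTo(DataOutputStream out) throws IOException;
    }

    // Hypothetical, simplified user type; the plugin's real User class has more fields.
    static final class SketchUser implements SketchWriteable {
        final String name;
        final List<String> roles;

        SketchUser(String name, List<String> roles) {
            this.name = name;
            this.roles = new ArrayList<>(roles);
        }

        // Reading constructor: consumes fields in exactly the order writeTo emits them.
        SketchUser(DataInputStream in) throws IOException {
            this.name = in.readUTF();
            int count = in.readInt();
            this.roles = new ArrayList<>(count);
            for (int i = 0; i < count; i++) {
                roles.add(in.readUTF());
            }
        }

        @Override
        public void writeTo(DataOutputStream out) throws IOException {
            out.writeUTF(name);
            out.writeInt(roles.size());
            for (String role : roles) {
                out.writeUTF(role);
            }
        }
    }

    // Arbitrary type tags chosen for this sketch.
    private static final byte TYPE_USER = 1;
    private static final byte TYPE_INET_SOCKET_ADDRESS = 2;

    static String serializeObject(Object object) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            if (object instanceof SketchUser) {
                out.writeByte(TYPE_USER);
                ((SketchUser) object).writeTo(out);
            } else if (object instanceof InetSocketAddress) {
                // Third-party class: written by a dedicated branch instead of Writeable.
                InetSocketAddress address = (InetSocketAddress) object;
                if (address.isUnresolved()) {
                    throw new IllegalArgumentException("unresolved addresses are not handled in this sketch");
                }
                byte[] ip = address.getAddress().getAddress();
                out.writeByte(TYPE_INET_SOCKET_ADDRESS);
                out.writeByte(ip.length);
                out.write(ip);
                out.writeInt(address.getPort());
            } else {
                // Unsupported classes fail to serialize rather than falling back silently.
                throw new IllegalArgumentException("unsupported class: " + object.getClass().getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Base64.getEncoder().encodeToString(bytes.toByteArray());
    }

    static Object deserializeObject(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(raw))) {
            byte type = in.readByte();
            if (type == TYPE_USER) {
                return new SketchUser(in);
            } else if (type == TYPE_INET_SOCKET_ADDRESS) {
                byte[] ip = new byte[in.readUnsignedByte()];
                in.readFully(ip);
                return new InetSocketAddress(InetAddress.getByAddress(ip), in.readInt());
            }
            throw new IllegalArgumentException("unknown type tag: " + type);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because only field values and a one-byte tag are written, no class metadata travels with the payload, and only explicitly supported classes can be read back.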

To conclude, we propose to use custom serialization for headers in security plugin.

Solution

This change is proposed to be introduced with OS 3.0, with no intention to backport it. We can break the solution down into the following action items:

  • Change OpenSearch's StreamOutput and StreamInput classes to add writers and readers, respectively, for third-party classes directly involved in serialization within the security plugin. [will update the list below]
    • InetSocketAddress
  • Re-implement the Base64Helper::serializeObject and Base64Helper::deserializeObject methods to use custom serialization.
  • Handle communication between old and new nodes during a version upgrade.
  • Introduce safe class checks for the alternative (de)serialization implementation (this may no longer be needed, as unsupported classes will fail to be serialized).
  • End-to-end testing, especially of the version upgrade scenario.
  • Run OSB tests to see how throughput and latency change (I am exploring different workloads where the impact would be more pronounced, since the tests performed so far show high variance).
  • Finalise the OS version in which the change will be released (the version will be used in the version upgrade handling logic to identify old nodes).
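
One possible shape for the old/new node handling in the list above is sketched here. It is an assumption, not the plugin's design: the sender picks the format from the receiver's version, and the receiver recognises JDK payloads by the stream magic number 0xACED that every JDK object stream starts with. The version constant and all names are hypothetical, and the sketch assumes custom payloads never begin with those two bytes.

```java
import java.io.ObjectStreamConstants;

final class VersionAwareFormatSketch {

    enum Format { JDK, CUSTOM }

    // Hypothetical numeric id of the first release that understands the custom format.
    static final int FIRST_CUSTOM_FORMAT_VERSION = 3_000_000;

    // Sender side: fall back to JDK serialization when the receiving node is older.
    static Format formatFor(int receiverVersion) {
        return receiverVersion >= FIRST_CUSTOM_FORMAT_VERSION ? Format.CUSTOM : Format.JDK;
    }

    // Receiver side: JDK object streams always begin with the magic number 0xACED.
    // Assumes the custom format never starts with these two bytes.
    static Format detectFormat(byte[] payload) {
        if (payload.length >= 2) {
            int firstTwoBytes = ((payload[0] & 0xFF) << 8) | (payload[1] & 0xFF);
            if (firstTwoBytes == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)) {
                return Format.JDK;
            }
        }
        return Format.CUSTOM;
    }
}
```

Detecting the format on the receiving side keeps deserialization independent of what the receiver believes about the sender's version, which may help during a rolling upgrade.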

I've raised an initial draft PR for serialization using Protostuff and am working towards testing the version upgrade scenario (from OS 2.5 to OS 2.7). For testing purposes, the change is currently assumed to be introduced as part of the OS 2.7 release; we may need to bump this version.
I will raise another PR with custom serialization.

Next Steps

  • Review the benchmarks and, if needed, explore other potential alternatives.
