add CacheSchemas option to Serializer #1151
Merged
This adds code to the Serializer that caches the SchemaInfo for each FileDescriptor, which improves performance by over an order of magnitude by caching the string manipulation bottleneck (see flamegraph here). To do this, some refactoring was also needed to isolate the "cacheable" code (getting SchemaInfo from a proto).

Benchmark results:
This is analogous to the approach taken in #1128; the main difference is that this caching is gated behind a config option that defaults to false (i.e. no behavior change by default). I believe this optimization suits most use cases, but there is a specific scenario where it is not appropriate, which is why it is gated behind a config:
Since this caching can be enabled for most users (users generally generate one set of protobuf bindings and create messages from it, so there won't be multiple versions of protobufs within one run of the application), I think it's appropriate to upstream this into the confluent-kafka-go library.
Test: ran existing tests with config enabled; added benchmarks
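
For readers skimming the description, the gated caching idea can be sketched roughly as below. This is an illustrative sketch only, not the code in this PR: `schemaInfo`, `fileDescriptor`, `cachingSerializer`, and the choice to key the cache on the descriptor's path are all assumptions made for the example, and the "expensive" step is a placeholder for the real string manipulation.

```go
package main

import (
	"strings"
	"sync"
	"sync/atomic"
)

// schemaInfo is a hypothetical stand-in for the serializer's SchemaInfo.
type schemaInfo struct {
	Schema string
}

// fileDescriptor is a hypothetical stand-in for a protobuf FileDescriptor.
type fileDescriptor struct {
	Path   string // assumed cache key for this sketch
	Source string
}

// cachingSerializer is a hypothetical serializer with an opt-in schema cache.
type cachingSerializer struct {
	cacheSchemas bool     // mirrors the idea of the CacheSchemas option; defaults to false
	cache        sync.Map // fileDescriptor.Path -> schemaInfo
	builds       int64    // number of times the expensive path ran
}

// buildSchemaInfo is the isolated, cacheable step. The body is a placeholder
// for the real string manipulation.
func (s *cachingSerializer) buildSchemaInfo(fd fileDescriptor) schemaInfo {
	atomic.AddInt64(&s.builds, 1)
	return schemaInfo{Schema: strings.ToUpper(fd.Source)}
}

// schemaInfoFor returns the schemaInfo for fd, consulting the cache only
// when cacheSchemas is enabled.
func (s *cachingSerializer) schemaInfoFor(fd fileDescriptor) schemaInfo {
	if !s.cacheSchemas {
		return s.buildSchemaInfo(fd)
	}
	if v, ok := s.cache.Load(fd.Path); ok {
		return v.(schemaInfo)
	}
	info := s.buildSchemaInfo(fd)
	// LoadOrStore keeps whichever value was stored first under concurrency.
	actual, _ := s.cache.LoadOrStore(fd.Path, info)
	return actual.(schemaInfo)
}

// buildCount reports how many times the expensive path ran.
func (s *cachingSerializer) buildCount() int64 {
	return atomic.LoadInt64(&s.builds)
}
```

In this sketch, two different descriptors that share a path would share one cache entry, which is the kind of trade-off that makes an opt-in flag a reasonable design: with the flag left at its zero value, every call rebuilds the schema info exactly as before.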