-
Notifications
You must be signed in to change notification settings - Fork 498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Internal] Binary Encoding: Adds Binary Encoding Support for Point Operations #4652
Merged
microsoft-github-policy-service
merged 37 commits into
master
from
users/dkunda/4644_binary_encoding_for_point_ops
Oct 23, 2024
Merged
[Internal] Binary Encoding: Adds Binary Encoding Support for Point Operations #4652
microsoft-github-policy-service
merged 37 commits into
master
from
users/dkunda/4644_binary_encoding_for_point_ops
Oct 23, 2024
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
kundadebdatta
force-pushed
the
users/dkunda/4644_binary_encoding_for_point_ops
branch
from
August 23, 2024 06:46
7373fbc
to
5173b24
Compare
Microsoft.Azure.Cosmos/src/RequestOptions/ItemRequestOptions.cs
Outdated
Show resolved
Hide resolved
Microsoft.Azure.Cosmos/src/Resource/Container/ContainerCore.Items.cs
Outdated
Show resolved
Hide resolved
Microsoft.Azure.Cosmos/src/Resource/Container/ContainerCore.Items.cs
Outdated
Show resolved
Hide resolved
Microsoft.Azure.Cosmos/src/Serializer/CosmosSerializationUtil.cs
Outdated
Show resolved
Hide resolved
Microsoft.Azure.Cosmos/src/Serializer/CosmosSerializationUtil.cs
Outdated
Show resolved
Hide resolved
Microsoft.Azure.Cosmos/src/Serializer/CosmosJsonDotNetSerializer.cs
Outdated
Show resolved
Hide resolved
kundadebdatta
changed the title
Users/dkunda/4644 binary encoding for point ops
[Internal] Binary Encoding: Adds Binary Encoding Support for Point Operations
Aug 31, 2024
Code changes to update STJ Serializer Adding more tests Fixing item emulator test. Minor cosmetic changes. Adding more tests. Fixing the cdb to newtonsoft serializer. Code changes to fix ns reader. Adding more tests. Minor refactoring. Optimizing some of the serialization code. Code changes to change serializer for patch operations. Modularizating the codebase. Adding summary for serialization utils. Code changest to add changes and tests for patch operation. Adding conversation logic in patch op. Code changes to add tests for patch operation. Code changes to refactor binary conversation logic. Some refactor. Added requred unit tests. Code changes to orginaze serializer and de-serializer. Modified default json serializer. Provide option to request binary from item request options. Code changes to add binary serializer for non stream apis. Changes in request invocation handler. remove unnecessary using Further optimizations. Code changes to refactor serialization and de-serialization logic.
kundadebdatta
force-pushed
the
users/dkunda/4644_binary_encoding_for_point_ops
branch
from
September 18, 2024 19:36
094faa0
to
2cdf16b
Compare
kundadebdatta
requested review from
khdang,
sboshra,
adityasa,
neildsh,
kirankumarkolli,
kirillg and
Pilchie
as code owners
September 18, 2024 21:38
kundadebdatta
commented
Sep 18, 2024
kundadebdatta
commented
Sep 18, 2024
kundadebdatta
added
BinaryEncoding
binary encoding in .NET sdk
and removed
Do Not Review
Marks a PR in "work in progress" state.
labels
Sep 18, 2024
Microsoft.Azure.Cosmos/src/Serializer/CosmosSerializationUtil.cs
Outdated
Show resolved
Hide resolved
Microsoft.Azure.Cosmos/src/Serializer/CosmosSerializationUtil.cs
Outdated
Show resolved
Hide resolved
kirankumarkolli
previously approved these changes
Oct 18, 2024
Microsoft.Azure.Cosmos/src/Serializer/CosmosBufferedStreamWrapper.cs
Outdated
Show resolved
Hide resolved
FabianMeiswinkel
previously approved these changes
Oct 22, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM except for a few minor comments
adityasa
reviewed
Oct 23, 2024
adityasa
reviewed
Oct 23, 2024
adityasa
reviewed
Oct 23, 2024
Microsoft.Azure.Cosmos/src/Serializer/CosmosBufferedStreamWrapper.cs
Outdated
Show resolved
Hide resolved
adityasa
reviewed
Oct 23, 2024
kundadebdatta
dismissed stale reviews from FabianMeiswinkel and kirankumarkolli
via
October 23, 2024 21:53
3245664
kirankumarkolli
approved these changes
Oct 23, 2024
FabianMeiswinkel
approved these changes
Oct 23, 2024
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
microsoft-github-policy-service
bot
deleted the
users/dkunda/4644_binary_encoding_for_point_ops
branch
October 23, 2024 23:14
1 task
kirankumarkolli
pushed a commit
that referenced
this pull request
Nov 25, 2024
# Pull Request Template ## Description Recently during our v3 sdk CI rolling runs, we observed some performance regressions on the `ItemStreamAsync()` APIs. They regressed beyond 5%. ![image](https://github.com/user-attachments/assets/66cc4f01-2ec6-47e0-b885-5ad74e02bb63) Upon doing further investigation, we figured out that during the non-binary flow, we end up converting the incoming stream into `CloneableStream` which might be the reason for this regression. Please note that the reason this was not caught during the [original version of the binary encoding PR](#4652) was that the performance test used to capture the benchmark for the original PR, was targeted a real cosmos container, where for the CI runs, we use our mocked containers. This PR skips `CloneableStream` conversation for non-binary encoding flow. With the above change in place, our CI builds started passing: ![image](https://github.com/user-attachments/assets/8293a6e5-6fbc-4953-9de0-37162a081194) ## Type of change Please delete options that are not relevant. - [x] Bug fix (non-breaking change which fixes an issue) ## Closing issues To automatically close an issue: closes #IssueNumber
kundadebdatta
added a commit
that referenced
this pull request
Jan 8, 2025
# Pull Request Template ## Description Recently during our v3 sdk CI rolling runs, we observed some performance regressions on the `ItemStreamAsync()` APIs. They regressed beyond 5%. ![image](https://github.com/user-attachments/assets/66cc4f01-2ec6-47e0-b885-5ad74e02bb63) Upon doing further investigation, we figured out that during the non-binary flow, we end up converting the incoming stream into `CloneableStream` which might be the reason for this regression. Please note that the reason this was not caught during the [original version of the binary encoding PR](#4652) was that the performance test used to capture the benchmark for the original PR, was targeted a real cosmos container, where for the CI runs, we use our mocked containers. This PR skips `CloneableStream` conversation for non-binary encoding flow. With the above change in place, our CI builds started passing: ![image](https://github.com/user-attachments/assets/8293a6e5-6fbc-4953-9de0-37162a081194) ## Type of change Please delete options that are not relevant. - [x] Bug fix (non-breaking change which fixes an issue) ## Closing issues To automatically close an issue: closes #IssueNumber
2 tasks
kirankumarkolli
pushed a commit
that referenced
this pull request
Jan 8, 2025
…#4953) # Pull Request Template ## Description Recently during our v3 sdk CI rolling runs, we observed some performance regressions on the `ItemStreamAsync()` APIs. They regressed beyond 5%. ![image](https://github.com/user-attachments/assets/66cc4f01-2ec6-47e0-b885-5ad74e02bb63) Upon doing further investigation, we figured out that during the non-binary flow, we end up converting the incoming stream into `CloneableStream` which might be the reason for this regression. Please note that the reason this was not caught during the [original version of the binary encoding PR](#4652) was that the performance test used to capture the benchmark for the original PR, was targeted a real cosmos container, where for the CI runs, we use our mocked containers. This PR skips `CloneableStream` conversation for non-binary encoding flow. With the above change in place, our CI builds started passing: ![image](https://github.com/user-attachments/assets/8293a6e5-6fbc-4953-9de0-37162a081194) ## Type of change Please delete options that are not relevant. - [x] Bug fix (non-breaking change which fixes an issue) ## Closing issues To automatically close an issue: closes #IssueNumber # Pull Request Template ## Description Please include a summary of the change and which issue is fixed. Include samples if adding new API, and include relevant motivation and context. List any dependencies that are required for this change. ## Type of change Please delete options that are not relevant. - [x] Bug fix (non-breaking change which fixes an issue) ## Closing issues To automatically close an issue: closes #IssueNumber
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Pull Request Template
Description
This PR introduces binary encoding support on request and responses for different Point operations.
What is Binary Encoding?
As the name suggests, binary encoding is a encoding mechanism through which the request payload will be encoded to binary first and sent to backend for processing. Decoding to Text will happen on the response path. The biggest benefit of binary encoding is to reduce cost on backend storage which helps to reduce the overall COGS.
Scope
The point operations that are currently in and out of scope for binary encoding are given below in tabular format:
CreateItemAsync()
PatchItemAsync()
CreateItemStreamAsync()
PatchItemStreamAsync()
ReadItemAsync()
TransactionalBatches
ReadItemStreamAsync()
Bulk APIs
UpsertItemAsync()
UpsertItemStreamAsync()
RepalceItemAsync()
ReplaceItemStreamAsync()
DeleteItemAsync()
DeleteItemStreamAsync()
How to Enable Binary Encoding?
This PR introduces a new environment variable
AZURE_COSMOS_BINARY_ENCODING_ENABLED
to opt-in or opt-out the binary encoding feature on demand. Setting this environment variable toTrue
will enable Binary encoding support.How Binary Encoding has been Achieved?
The binary encoding in the .NET SDK has been divided into two parts which are applicable differently for
ItemAsync()
andItemStreamAsync()
apis. The details are given below:ItemAsync()
APIs: Currently theCosmosJsonDotNetSerializer
has been refactored to read and write the binary bits directly into the stream. This reduces any conversion of the text stream to binary and vice versa and makes the serialization and de-serialization process even faster.ItemStreamAsync()
APIs: For these APIs, there are literally no serializes involved and the stream is returned directly to the caller. Therefore, this flow converts a Text stream into Binary and does the opposite on the response path. Conversion is a little bit costlier operation, in comparison with directly writing the binary stream using the serializer. Note that, irrespective of the binary encoding feature enabled or disabled, the output stream will always be in Text format, unless otherwise requested explicitly.Are There Any Way to Request Binary Bits on Response?
The answer is yes. We introduced a new internal request option:
EnableBinaryResponseOnPointOperations
in theItemRequestOptions
, and setting this flag toTrue
will not do any Text conversation, and will return the raw binary bits to the caller. However, please note that this option is applicable only for theItemStreamAsync()
APIs and will be helpful for some of the internal teams.Flow Diagrams
To understand the changes better, please take a look at the flow diagrams below for both
ItemAsync()
andItemStreamAsync()
APIs.Flow Diagram for
ItemAsync()
APIs that are in Scope per the Above Table:Flow Diagram for
ItemStreamAsync()
APIs that are in Scope per the Above Table:Performance Testing
Below are the comparison results for the perf testing done on the master branch and the current feature branch with binary encoding disabled:
Benchmark Results with No Binary Encoding on master branch:
Benchmark Results with Binary Encoding Disabled on feature branch:
Benchmark results comparison in terms of percentage between
master
andfeature
branch:Type of change
Please delete options that are not relevant.
Closing issues
To automatically close an issue: closes #4644