[Internal] Binary Encoding: Adds Binary Encoding Support for Point Operations (#4652)

# Pull Request Template

# Description

This PR introduces binary encoding support on requests and responses for different Point operations.

## What is Binary Encoding?

As the name suggests, binary encoding is an encoding mechanism through which the request payload is first encoded to binary and then sent to the backend for processing. Decoding to Text happens on the response path. The biggest benefit of binary encoding is reduced backend storage cost, which helps reduce the overall COGS.

## Scope

The point operations that are currently in and out of scope for binary encoding are listed in the table below:

| Operations Currently in Scope | Operations Currently Out of Scope | Reason for Out of Scope |
| --- | --- | --- |
| `CreateItemAsync()` | `PatchItemAsync()` | Operation Currently not Supported in BE |
| `CreateItemStreamAsync()` | `PatchItemStreamAsync()` | Operation Currently not Supported in BE |
| `ReadItemAsync()` | `TransactionalBatches` | Operation Currently not Supported in BE |
| `ReadItemStreamAsync()` | `Bulk APIs` | Operation Currently not Supported in BE |
| `UpsertItemAsync()` | | |
| `UpsertItemStreamAsync()` | | |
| `ReplaceItemAsync()` | | |
| `ReplaceItemStreamAsync()` | | |
| `DeleteItemAsync()` | | |
| `DeleteItemStreamAsync()` | | |

## How to Enable Binary Encoding?

This PR introduces a new environment variable, `AZURE_COSMOS_BINARY_ENCODING_ENABLED`, to opt in to or out of the binary encoding feature on demand. Setting this environment variable to `True` enables binary encoding support.

## How Binary Encoding has been Achieved?

The binary encoding in the .NET SDK has been divided into two parts, which apply differently to the `ItemAsync()` and `ItemStreamAsync()` APIs. The details are given below:

- **`ItemAsync()` APIs:** The `CosmosJsonDotNetSerializer` has been refactored to read and write the binary bits directly from and to the stream. This avoids converting the text stream to binary and vice versa, and makes serialization and de-serialization faster.

- **`ItemStreamAsync()` APIs:** For these APIs, no serializers are involved and the stream is returned directly to the caller. Therefore, this flow converts a Text stream into Binary on the request path and does the opposite on the response path. This conversion is somewhat costlier than writing the binary stream directly with the serializer.

Note that, irrespective of whether the binary encoding feature is enabled or disabled, the output stream will always be in Text format, unless explicitly requested otherwise.

## Is There Any Way to Request Binary Bits on Response?

The answer is yes. We introduced a new internal request option, `EnableBinaryResponseOnPointOperations`, in the `ItemRequestOptions`. Setting this flag to `True` skips the Text conversion and returns the raw binary bits to the caller. However, please note that this option applies only to the `ItemStreamAsync()` APIs and will be helpful for some of the internal teams.

## Flow Diagrams

To understand the changes better, please take a look at the flow diagrams below for both `ItemAsync()` and `ItemStreamAsync()` APIs.
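The stream-API behaviour described in the sections above (an opt-in environment variable, text-to-binary conversion on the request path, and conversion back to text on the response path unless raw binary is requested) can be illustrated with a small model. The sketch below is written in Python rather than C# and is not the SDK's code: the one-byte `BINARY_MARKER`, all helper names, and the case-insensitive parsing of the flag are assumptions made purely for illustration. Only the variable name `AZURE_COSMOS_BINARY_ENCODING_ENABLED` and the described effect of `EnableBinaryResponseOnPointOperations` come from this description.

```python
import json
import os

# Environment variable name taken from this PR description.
ENV_FLAG = "AZURE_COSMOS_BINARY_ENCODING_ENABLED"

# Hypothetical one-byte prefix standing in for a real binary format.
BINARY_MARKER = b"\x80"


def is_binary_encoding_enabled(environ=os.environ) -> bool:
    """Assumed parsing: any casing of 'true' opts in; anything else opts out."""
    return environ.get(ENV_FLAG, "").strip().lower() == "true"


def is_binary(payload: bytes) -> bool:
    """Toy check: a payload counts as binary if it starts with the marker."""
    return payload.startswith(BINARY_MARKER)


def to_binary(text_payload: bytes) -> bytes:
    """Toy text -> binary conversion (just prepends the marker)."""
    return BINARY_MARKER + text_payload


def to_text(binary_payload: bytes) -> bytes:
    """Toy binary -> text conversion (strips the marker)."""
    return binary_payload[len(BINARY_MARKER):]


def prepare_request(payload: bytes, environ=os.environ) -> bytes:
    """Request path: convert text input to binary only when the feature is on."""
    if is_binary_encoding_enabled(environ) and not is_binary(payload):
        return to_binary(payload)
    return payload


def finish_response(payload: bytes, enable_binary_response: bool = False) -> bytes:
    """Response path: hand back text unless raw binary was explicitly requested."""
    if is_binary(payload) and not enable_binary_response:
        return to_text(payload)
    return payload
```

In this model, calling `prepare_request(json.dumps({"id": "1"}).encode(), {ENV_FLAG: "True"})` yields a marker-prefixed payload, `finish_response` on that payload returns the original text bytes, and passing `enable_binary_response=True` returns the payload untouched, mirroring the opt-in for raw binary described above.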
**Flow Diagram for `ItemAsync()` APIs that are in Scope per the Above Table:**

```mermaid
flowchart TD
    A[All 'ItemAsync' APIs in Scope] -->|SerializerCore.ToStream| B{Select <br> Serializer}
    B -->|One| C[CosmosJsonDotNetSerializer]
    B -->|Two| D[CosmosSystemTextJsonSerializer]
    B -->|Three| E[Any Custom <br> Serializer]
    C -->|Serialize to <br> Binary Stream| F[ContainerCore<br>.ProcessItemStreamAsync]
    D -->|Serialize to <br> Text Stream| F[ContainerCore<br>.ProcessItemStreamAsync]
    E -->|Stream may or <br> may not be <br> Serialized to Binary| F[ContainerCore<br>.ProcessItemStreamAsync]
    F --> G{Is Input <br> Stream in <br> Binary ?}
    G -->|True| I[ProcessResourceOperationStreamAsync]
    G -->|False| H[Convert Input Text <br> Stream to <br> Binary Stream]
    H --> I
    I --> |SendAsync| J[RequestInvokerHandler]
    J --> |Sets following headers to request response in binary format: x-ms-cosmos-supported-serialization-formats = CosmosBinary x-ms-documentdb-content-serialization-format = CosmosBinary| K[TransportHandler]
    K --> |Binary Response <br> Stream|L[ContainerCore<br>.ProcessItemStreamAsync]
    L --> |Note: No explicit conversion to binary stream happens because we let the serializer directly de-serialize the binary stream into text. SerializerCore.FromStream| M{Select <br> Serializer}
    M -->|One| N[CosmosJsonDotNetSerializer]
    M -->|Two| O[CosmosSystemTextJsonSerializer]
    M -->|Three| P[Any Custom <br> Serializer]
    N -->|De-Serialize to <br> Text Stream| Q[Container<br>.ItemAsync Response]
    O -->|De-Serialize to <br> Text Stream| Q[Container<br>.ItemAsync Response]
    P -->|Stream may or <br> may not be <br> De-Serialized to Text| Q[Container<br>.ItemAsync Response <br> in Text]
```

**Flow Diagram for `ItemStreamAsync()` APIs that are in Scope per the Above Table:**

```mermaid
flowchart TD
    A[All 'ItemStreamAsync' APIs in Scope]
    A -->|Stream may or <br> may not be <br> Serialized to Binary| F[ContainerCore<br>.ProcessItemStreamAsync]
    F --> G{Is Input <br> Stream in <br> Binary ?}
    G -->|True| I[ProcessResourceOperationStreamAsync]
    G -->|False| H[Convert Input Text <br> Stream to <br> Binary Stream]
    H --> I
    I --> |SendAsync| J[RequestInvokerHandler]
    J --> |Sets following headers to get binary response: x-ms-cosmos-supported-serialization-formats = CosmosBinary x-ms-documentdb-content-serialization-format = CosmosBinary| K[TransportHandler]
    K --> |Binary Response <br> Stream|L[ContainerCore<br>.ProcessItemStreamAsync]
    L --> M{Is Response <br> Stream in <br> Binary ?}
    M -->|Yes| N[CosmosSerializationUtil]
    M -->|No| Q
    N -->|Convert Binary Stream to <br> Text Stream| Q[Container<br>.ItemAsync Response]
```

## Performance Testing

Below are the comparison results for the perf testing done on the master branch and the current feature branch with binary encoding disabled:

``` ini
BenchmarkDotNet=v0.13.5, OS=ubuntu 20.04
Intel Xeon Platinum 8272CL CPU 2.60GHz, 1 CPU, 16 logical and 8 physical cores
.NET SDK=6.0.427
  [Host]  : .NET 6.0.35 (6.0.3524.45918), X64 RyuJIT AVX2
  LongRun : .NET 6.0.35 (6.0.3524.45918), X64 RyuJIT AVX2

Job=LongRun  IterationCount=100  LaunchCount=3
RunStrategy=Throughput  WarmupCount=15
```

**Benchmark Results with No Binary Encoding on master branch:**

![image](https://github.com/user-attachments/assets/3fa3407d-1fbd-45f9-9be0-e0257fc055da)

**Benchmark Results with Binary Encoding Disabled on feature branch:**

![image](https://github.com/user-attachments/assets/dbac9e9a-ba1f-48e7-b231-1fe318967ab5)

Benchmark results comparison in terms of percentage between `master` and `feature` branch:

![image](https://github.com/user-attachments/assets/6112d018-0801-4384-bf11-4c093da0ef23)

## Type of change

Please delete options that are not relevant.

- [x] New feature (non-breaking change which adds functionality)

## Closing issues

To automatically close an issue: closes #4644