For CosmosDB bulk api added support for splitting of batch based on size. #23987
sajeetharan merged 8 commits into Azure:main from
API change check: APIView has identified API level changes in this PR and created the following API reviews.
sdk/cosmosdb/cosmos/CHANGELOG.md
This should be added under unreleased 3.17.3
Ack. Will remove this.
I took this constant from the Java and .NET SDKs. It is slightly lower than 2 MB, but it seems to be the value used by both SDKs.
50c8816 to 99e04a4
witemple-msft left a comment
The approach seems sound to me, but I think there are a couple of bugs that are worth fixing.
 * @hidden
 */
export function splitBatchBasedOnBodySize(originalBatch: Batch): Batch[] {
  if (originalBatch?.operations === undefined && originalBatch.operations.length < 1) return [];
Should this be an OR rather than AND? If no operations are defined OR the operations are of length zero? This condition seems like it will have the opposite of the intended effect because if originalBatch?.operations is undefined then the next condition must throw an error.
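A minimal sketch of the guard this comment asks for, using OR so that a missing `operations` array short-circuits before its length is read. The `Operation` and `Batch` shapes and the `isEmptyBatch` helper are hypothetical stand-ins for illustration, not the SDK's actual definitions.

```typescript
// Hypothetical minimal shapes; the real SDK types carry more fields.
interface Operation {
  operationType: string;
  resourceBody?: unknown;
}

interface Batch {
  operations?: Operation[];
}

// Returns true when there is nothing to split: the batch is missing,
// has no operations array, or the array is empty.
function isEmptyBatch(batch: Batch | undefined): boolean {
  const operations = batch?.operations;
  // OR short-circuits, so `.length` is only read when the array exists.
  return operations === undefined || operations.length < 1;
}
```

With this form, `isEmptyBatch(undefined)` and `isEmptyBatch({})` both return `true` instead of throwing.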
 * @hidden
 */
export function calculateObjectSizeInBytes(obj: unknown): number {
  return new TextEncoder().encode(bodyFromData(obj as any)).length;
Just leaving a note that this feels expensive. You're basically encoding the body into a buffer, which the request pipeline must do anyway, for the sole purpose of measuring the body's length. I'm not sure there's a great alternative to doing this, to be honest, but it stands out to me as a costly operation.
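To make the cost concrete, here is a rough sketch of the measurement being discussed. It assumes `JSON.stringify` as a stand-in for the SDK's internal `bodyFromData`, and `sizeInBytes` is a hypothetical name. The encode step is needed because the UTF-8 byte count can differ from the string's `length` when the body contains non-ASCII characters.

```typescript
// Assumption: JSON.stringify approximates how the request body is serialized.
function sizeInBytes(obj: unknown): number {
  // JSON.stringify returns undefined at runtime for inputs such as `undefined`.
  const serialized: string = JSON.stringify(obj) ?? "";
  // Encoding allocates a Uint8Array of the full body just to read its length.
  return new TextEncoder().encode(serialized).length;
}

const ascii = sizeInBytes({ a: "e" });
const accented = sizeInBytes({ a: "é" }); // "é" takes two bytes in UTF-8
console.log(ascii, accented);
```

Both the serialization and the encoding are repeated later when the request is actually sent, which is the overhead the comment points at.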
Packages impacted by this PR
@azure/cosmos
Issues associated with this PR
#23923
Describe the problem that is addressed by this PR
The CosmosDB Items.bulk api doesn't honour the 2 MB cap imposed on a single batch request. With these changes, if the size of a batch (the cumulative size of its operations) exceeds 2 MB, it is split into smaller batches before sending.
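The splitting described above can be sketched as a greedy pass over the operations. This is an illustration under stated assumptions rather than the code merged in this PR: the `Operation` and `Batch` shapes, `splitBySize`, `sizeOf`, and the `ASSUMED_MAX_BATCH_BYTES` placeholder are all hypothetical, and the SDK uses its own constant that is slightly below 2 MB.

```typescript
// Hypothetical minimal shapes; the real SDK types carry more fields.
interface Operation {
  operationType: string;
  resourceBody?: unknown;
}

interface Batch {
  operations: Operation[];
}

// Placeholder limit for illustration only, not the SDK's constant.
const ASSUMED_MAX_BATCH_BYTES = 2 * 1024 * 1024;

// Assumption: the serialized JSON size approximates the operation's wire size.
function sizeOf(op: Operation): number {
  return new TextEncoder().encode(JSON.stringify(op)).length;
}

// Greedily packs operations, in order, into batches whose cumulative
// size stays within maxBytes.
function splitBySize(batch: Batch, maxBytes: number = ASSUMED_MAX_BATCH_BYTES): Batch[] {
  const result: Batch[] = [];
  let current: Operation[] = [];
  let currentBytes = 0;

  for (const op of batch.operations) {
    const opBytes = sizeOf(op);
    // Close the current batch if adding this operation would exceed the limit.
    if (current.length > 0 && currentBytes + opBytes > maxBytes) {
      result.push({ operations: current });
      current = [];
      currentBytes = 0;
    }
    current.push(op);
    currentBytes += opBytes;
  }

  if (current.length > 0) {
    result.push({ operations: current });
  }
  return result;
}
```

In this sketch, an operation that is larger than the limit on its own still gets a batch to itself, so the service rather than the client rejects it. Operation order is preserved across the resulting batches.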
What are the possible designs available to address the problem? If there are more than one possible design, why was the one in this PR chosen?
Are there test cases added in this PR? (If not, why?)
Yes
Provide a list of related PRs (if any)
Command used to generate this PR: (Applicable only to SDK release request PRs)
Checklists