Merged
analogrelay
commented
Feb 10, 2026
tvaron3
reviewed
Feb 10, 2026
…ring guarantee
- Replace point-read strategy description with query-based approach (V1: LPK grouping)
- Remove originalIndex from indexedItem type
- Explicitly document that returned item order is unspecified
- Resolve open question #3: Go SDK does not guarantee input order
- Update result matching section to reflect no-ordering contract
Replace executeReadManyWithPointReads with executeReadManyWithQueries, which groups items by logical partition key and issues parameterized SQL queries instead of N individual point reads.

New files:
- cosmos_query_builder.go: three query shapes (ID-only IN, PK+ID IN, OR-of-conjunctions) with support for nested/non-identifier PK paths, null PKs, and hierarchical partition keys
- cosmos_query_builder_test.go: 11 unit tests covering all query shapes

Changes:
- cosmos_container_read_many.go: delete point-read strategy, add executeReadManyWithQueries with concurrent chunk execution, continuation token pagination, and PK-header-based routing
- cosmos_container.go: wire new strategy, add empty-ID validation
- emulator_cosmos_read_many_items_test.go: add multi-PK integration test, update assertions to not depend on item ordering

The architecture is ready for EPK hashing (V2) to coalesce groups that map to the same physical partition range.
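To make the logical-partition-key grouping concrete, here is a minimal sketch of the idea, not the code in this PR. The `itemRef` and `queryParam` types, the `groupByPK` and `buildIDInQuery` helpers, and the exact SQL text are all assumptions for illustration; only the ID-only IN shape is shown.

```go
package main

import (
	"fmt"
	"strings"
)

// itemRef is a hypothetical stand-in for an item identity.
// PK holds an already-serialised partition key value.
type itemRef struct {
	ID string
	PK string
}

// queryParam is a hypothetical name/value pair for a parameterized query.
type queryParam struct {
	Name  string
	Value string
}

// groupByPK collects item IDs under their serialised partition key.
func groupByPK(items []itemRef) map[string][]string {
	groups := make(map[string][]string)
	for _, it := range items {
		groups[it.PK] = append(groups[it.PK], it.ID)
	}
	return groups
}

// buildIDInQuery builds an ID-only IN query with one parameter per ID.
// The SQL text is an assumed shape, not the PR's exact output.
func buildIDInQuery(ids []string) (string, []queryParam) {
	names := make([]string, len(ids))
	params := make([]queryParam, len(ids))
	for i, id := range ids {
		names[i] = fmt.Sprintf("@id%d", i)
		params[i] = queryParam{Name: names[i], Value: id}
	}
	return "SELECT * FROM c WHERE c.id IN (" + strings.Join(names, ", ") + ")", params
}

func main() {
	items := []itemRef{{"a", "tenant1"}, {"b", "tenant1"}, {"c", "tenant2"}}
	for pk, ids := range groupByPK(items) {
		query, params := buildIDInQuery(ids)
		fmt.Println(pk, query, len(params))
	}
}
```

Each group then needs a single query routed to its partition key, rather than one request per item.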
Extract five focused functions from the monolithic method:
- groupItemsByLogicalPK: groups ItemIdentity values by serialised PK
- buildQueryChunks: splits groups into ≤1000-item chunks with SQL
- executeQueryChunks: concurrent goroutine pool over chunks
- executeOneChunk: single chunk query with continuation paging
- collectChunkResults: merges per-chunk results into response

Also promotes queryChunk and chunkResult to package-level types.
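The chunk-splitting step can be sketched roughly as follows. This is an illustration under assumptions: `splitIntoChunks` is a hypothetical helper operating on plain ID strings, whereas the real buildQueryChunks also attaches SQL to each chunk.

```go
package main

import "fmt"

// maxItemsPerChunk mirrors the ≤1000-item limit named in the commit message.
const maxItemsPerChunk = 1000

// splitIntoChunks (hypothetical) slices ids into consecutive groups of at
// most size elements. The returned slices share the input's backing array.
func splitIntoChunks(ids []string, size int) [][]string {
	if size <= 0 {
		return nil
	}
	var chunks [][]string
	for start := 0; start < len(ids); start += size {
		end := start + size
		if end > len(ids) {
			end = len(ids)
		}
		chunks = append(chunks, ids[start:end])
	}
	return chunks
}

func main() {
	ids := make([]string, 2500)
	for i := range ids {
		ids[i] = fmt.Sprintf("item-%d", i)
	}
	for i, c := range splitIntoChunks(ids, maxItemsPerChunk) {
		fmt.Printf("chunk %d: %d items\n", i, len(c))
	}
}
```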
…titionKey Add the EPK struct and method signature on PartitionKey in preparation for V2 physical-range grouping. The method panics until the MurmurHash implementation is wired in.
Add TestComputeEffectivePartitionKey_Baseline which enumerates all 29 cases across 4 XML baseline files (Singletons, Numbers, Strings, Lists) and tests both V1 and V2 hash versions (58 subtests total). Tests currently panic because computeEffectivePartitionKey is stubbed.
Remove category-based parsing in favor of a single JSON-aware parser. Inputs are either 'UNDEFINED' or valid JSON. Post-unmarshal, magic strings NaN/-Infinity/Infinity are converted to float64 equivalents.
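A rough sketch of what such a JSON-aware parser could look like is below. The names `parseBaselineValue`, `convertMagic`, and `undefinedValue` are hypothetical, and the recursion into lists is an assumption; the commit message only states that the magic strings are converted after unmarshalling.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// undefinedValue is a hypothetical marker for the literal input "UNDEFINED".
type undefinedValue struct{}

// parseBaselineValue treats raw as either "UNDEFINED" or valid JSON.
func parseBaselineValue(raw string) (any, error) {
	if raw == "UNDEFINED" {
		return undefinedValue{}, nil
	}
	var v any
	if err := json.Unmarshal([]byte(raw), &v); err != nil {
		return nil, fmt.Errorf("parse %q: %w", raw, err)
	}
	return convertMagic(v), nil
}

// convertMagic replaces the strings NaN, Infinity and -Infinity with their
// float64 equivalents. Descending into lists is an assumption made here.
func convertMagic(v any) any {
	switch t := v.(type) {
	case string:
		switch t {
		case "NaN":
			return math.NaN()
		case "Infinity":
			return math.Inf(1)
		case "-Infinity":
			return math.Inf(-1)
		}
	case []any:
		for i := range t {
			t[i] = convertMagic(t[i])
		}
	}
	return v
}

func main() {
	for _, raw := range []string{"UNDEFINED", `"Infinity"`, `[1, "NaN"]`, `"abc"`} {
		v, err := parseBaselineValue(raw)
		fmt.Printf("%s -> %v (err: %v)\n", raw, v, err)
	}
}
```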
- Add internal/epk package with MurmurHash3 (32+128 bit), hash input encoding, and EPK computation for V1, V2 Hash, and V2 MultiHash
- 58 baseline tests from Python SDK covering singletons, numbers, strings, and hierarchical partition keys
- Move mock_query_engine.go to internal/mock/ package
- Add computeEffectivePartitionKey method on PartitionKey
- Smoke test in partition_key_test.go verifying method delegation
…rouping in ReadMany
- Group items by physical partition range using EPK hashing instead of logical partition key, issuing fewer queries when multiple logical PKs map to the same physical range.
- Route queries with x-ms-documentdb-partitionkeyrangeid header instead of x-ms-documentdb-partitionkey.
- Replace indexedItem with ItemIdentity throughout.
- Add full query text and parameter assertions in all query builder tests.
- Add unit tests for EPK-based grouping and range lookup.
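For readers unfamiliar with range lookup, the sketch below shows one way to map an EPK to a physical range and group items by range ID. It assumes ranges are sorted, contiguous, and expressed as hex strings that compare lexicographically; `pkRange`, `findRange`, `groupByRange`, and the sample boundaries are hypothetical, and the EPK values are taken as already computed.

```go
package main

import (
	"fmt"
	"sort"
)

// pkRange is a hypothetical physical partition key range:
// MinInclusive <= epk < MaxExclusive.
type pkRange struct {
	ID           string
	MinInclusive string
	MaxExclusive string
}

// findRange binary-searches ranges (sorted by MinInclusive, non-overlapping)
// for the one that contains epk.
func findRange(ranges []pkRange, epk string) (pkRange, bool) {
	i := sort.Search(len(ranges), func(i int) bool {
		return ranges[i].MaxExclusive > epk
	})
	if i < len(ranges) && ranges[i].MinInclusive <= epk {
		return ranges[i], true
	}
	return pkRange{}, false
}

// groupByRange buckets item IDs by the ID of the range owning their EPK.
// Items whose EPK falls outside every range are skipped in this sketch.
func groupByRange(ranges []pkRange, epkByItem map[string]string) map[string][]string {
	groups := make(map[string][]string)
	for id, epk := range epkByItem {
		if r, ok := findRange(ranges, epk); ok {
			groups[r.ID] = append(groups[r.ID], id)
		}
	}
	return groups
}

func main() {
	ranges := []pkRange{{"0", "", "80"}, {"1", "80", "FF"}}
	epks := map[string]string{"a": "3A", "b": "9B", "c": "05"}
	for rangeID, ids := range groupByRange(ranges, epks) {
		fmt.Printf("range %s: %d items\n", rangeID, len(ids))
	}
}
```

Items "a" and "c" land in the same range here even if they have different logical partition keys, which is where the query savings come from.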
analogrelay
commented
Feb 11, 2026
sdk/data/azcosmos/internal/epk/testdata/PartitionKeyHashBaselineTest.Lists.xml
Contributor
Pull request overview
This draft PR advances a Go-native implementation of Cosmos DB ReadMany by adding effective partition key (EPK) hashing utilities and switching the default ReadMany path to a query-based strategy with a small query builder and supporting tests/mocks.
Changes:
- Add `internal/epk` package (MurmurHash-based) and wire `PartitionKey` to compute EPKs.
- Implement query-based ReadMany execution that groups items by physical partition key range and executes per-range parameterized queries concurrently.
- Introduce a read-many query builder, mock query engine updates, and additional unit/emulator tests.
Reviewed changes
Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| sdk/data/azcosmos/partition_key.go | Adds internal EPK computation method on PartitionKey. |
| sdk/data/azcosmos/partition_key_test.go | Adds test coverage for EPK computation outputs. |
| sdk/data/azcosmos/internal/epk/epk.go | Implements EPK hashing logic (V1/V2 + multihash). |
| sdk/data/azcosmos/internal/epk/epk_test.go | Baseline-style tests validating hashing outputs against XML fixtures. |
| sdk/data/azcosmos/internal/epk/testdata/*.xml | Adds baseline vectors for EPK hashing verification. |
| sdk/data/azcosmos/cosmos_query_builder.go | Adds query builder for parameterized ReadMany query shapes. |
| sdk/data/azcosmos/cosmos_query_builder_test.go | Adds tests for query shape generation. |
| sdk/data/azcosmos/cosmos_container_read_many.go | Switches default ReadMany to query-based execution and adds EPK→range grouping + concurrency. |
| sdk/data/azcosmos/cosmos_container_read_many_test.go | Adds unit tests for range lookup, grouping, and chunking. |
| sdk/data/azcosmos/internal/mock/mock_query_engine.go | Moves/introduces mock query engine + adds a mock read-many pipeline. |
| sdk/data/azcosmos/emulator_cosmos_read_many_items_test.go | Updates tests for unordered returns and adds multi-logical-PK emulator coverage. |
| sdk/data/azcosmos/cosmos_container_query_engine_test.go | Updates tests to use the new internal/mock query engine package. |
| sdk/data/azcosmos/cosmos_container.go | Adds empty-ID validation and routes default ReadMany to the new query-based implementation. |
…argeting a single PKRange
…FINED The query builder generated IS_DEFINED(field) = false for null partition keys, which fails to match documents where the PK field exists with a null value. Changed to IS_NULL(field) which correctly matches these documents.
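The difference can be illustrated with a tiny predicate helper. This is a sketch only: `pkPredicate` and its signature are hypothetical and do not reflect the query builder's real API; a nil value stands in for a null partition key.

```go
package main

import "fmt"

// pkPredicate (hypothetical) renders the filter for one partition key field.
// A nil value means the item's PK is JSON null, which IS_NULL matches;
// IS_DEFINED(field) = false would only match documents lacking the field.
func pkPredicate(field string, value any, param string) string {
	if value == nil {
		return fmt.Sprintf("IS_NULL(%s)", field)
	}
	return fmt.Sprintf("%s = %s", field, param)
}

func main() {
	fmt.Println(pkPredicate("c.tenant", nil, "@pk0"))       // IS_NULL(c.tenant)
	fmt.Println(pkPredicate("c.tenant", "contoso", "@pk0")) // c.tenant = @pk0
}
```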
Add TestExecuteQueryChunks_CancelledContext to verify that context cancellation mid-execution propagates through collectChunkResults as an operation-level error. Add comment above executeOneChunk call site clarifying the cancellation error flow.
Use context.WithCancel to create a child context in executeQueryChunks. When a chunk fails, cancelChunks() propagates cancellation into in-flight HTTP requests, not just between pagination pages. Removes the done channel parameter from executeOneChunk.
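The cancellation flow can be sketched as below. It is a simplification under assumptions: `runChunks` and `demo` are hypothetical, and one goroutine is started per chunk instead of the bounded pool the PR describes. The point is only that the first failure cancels the shared child context, so work blocked on that context returns promptly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
)

// chunkResult is a simplified per-chunk outcome.
type chunkResult struct {
	Index int
	Err   error
}

// runChunks runs n chunks concurrently under a child context and cancels
// that context as soon as any chunk fails.
func runChunks(ctx context.Context, n int, run func(ctx context.Context, i int) error) []chunkResult {
	chunkCtx, cancelChunks := context.WithCancel(ctx)
	defer cancelChunks()

	results := make([]chunkResult, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			err := run(chunkCtx, i)
			if err != nil {
				cancelChunks() // unblocks in-flight work in sibling chunks
			}
			results[i] = chunkResult{Index: i, Err: err} // each goroutine owns one slot
		}(i)
	}
	wg.Wait()
	return results
}

// demo fails chunk 0 immediately; the others wait for cancellation.
func demo() []chunkResult {
	return runChunks(context.Background(), 3, func(ctx context.Context, i int) error {
		if i == 0 {
			return errors.New("boom")
		}
		<-ctx.Done() // stands in for an HTTP request honouring ctx
		return ctx.Err()
	})
}

func main() {
	for _, r := range demo() {
		fmt.Printf("chunk %d: %v\n", r.Index, r.Err)
	}
}
```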
simorenoh
approved these changes
Feb 13, 2026
Member
Should we add a Changelog entry?
simorenoh
approved these changes
Feb 24, 2026
I had Copilot start by writing a design spec, which I've deleted from the branch now that the implementation is complete. See here if you want to read it.
Replaces the per-item point-read strategy in `ReadManyItems` with a query-based approach that groups items by physical partition range using Effective Partition Key (EPK) hashing. This issues far fewer HTTP round-trips and reduces RU cost, especially when many items share the same physical partition.

What changed

New: EPK hashing (`internal/epk/epk.go`)
- … `PartitionKeyHashBaselineTest.*.xml`).

New: Query builder (`cosmos_query_builder.go`)
- … `/id` and every PK value equals the item ID.

Replaced: ReadMany strategy (`cosmos_container_read_many.go`)
- … `executeReadManyWithPointReads` (N individual point reads).
- … `executeReadManyWithQueries` which:
  - … `x-ms-documentdb-partitionkeyrangeid` header routing.

API contract note: The order of items in the response is unspecified. Callers must not rely on it matching the input order. Missing items (not found) are silently omitted.
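Because response order is unspecified and missing items are omitted, callers need to match results back to requests themselves. A minimal caller-side sketch follows, assuming each returned item is raw JSON with an `id` field; `indexByID` is a hypothetical helper and not part of the SDK.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// indexByID (hypothetical) maps each returned document's "id" to its raw JSON.
func indexByID(items [][]byte) (map[string][]byte, error) {
	byID := make(map[string][]byte, len(items))
	for _, raw := range items {
		var doc struct {
			ID string `json:"id"`
		}
		if err := json.Unmarshal(raw, &doc); err != nil {
			return nil, fmt.Errorf("decode item: %w", err)
		}
		byID[doc.ID] = raw
	}
	return byID, nil
}

func main() {
	requested := []string{"a", "b", "c"}
	// Simulated response: out of order, and "b" was not found.
	returned := [][]byte{[]byte(`{"id":"c"}`), []byte(`{"id":"a"}`)}

	byID, err := indexByID(returned)
	if err != nil {
		panic(err)
	}
	for _, id := range requested {
		if raw, ok := byID[id]; ok {
			fmt.Printf("%s -> %s\n", id, raw)
		} else {
			fmt.Printf("%s -> not found\n", id)
		}
	}
}
```

If the same ID can appear under different partition keys, the map key would need to include the partition key value as well.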
Testing
- `cosmos_query_builder_test.go`, `cosmos_container_read_many_test.go`: Full query text and parameter assertions for all query shapes, EPK range lookup, physical range grouping, and chunk splitting.
- `internal/epk/epk_test.go`: Validated against official Cosmos DB hash baseline XML test data.
- `emulator_cosmos_read_many_items_test.go`: Existing tests updated and passing.

Files changed
- `internal/epk/epk.go`
- `internal/epk/epk_test.go`
- `internal/epk/testdata/*.xml`
- `cosmos_query_builder.go`
- `cosmos_query_builder_test.go`
- `cosmos_container_read_many.go`
- `cosmos_container_read_many_test.go`
- `cosmos_container.go`
- `partition_key.go`: `computeEffectivePartitionKey` method
- `partition_key_test.go`
- `emulator_cosmos_read_many_items_test.go`