azcosmos: Go-native ReadMany#26007

Merged
analogrelay merged 18 commits into main from ashleyst/native-read-many on Feb 24, 2026

Member

@analogrelay analogrelay commented Feb 10, 2026

I had Copilot start by writing a design spec, which I've deleted from the branch now that the implementation is complete. See here if you want to read it.

Replaces the per-item point-read strategy in ReadManyItems with a query-based approach that groups items by physical partition range using Effective Partition Key (EPK) hashing. This issues far fewer HTTP round-trips and reduces RU cost, especially when many items share the same physical partition.

What changed

New: EPK hashing (internal/epk/epk.go)

  • Implements MurmurHash3-based Effective Partition Key computation (V1, V2 Hash, V2 MultiHash) matching the Cosmos DB backend algorithm.
  • Validated against official baseline test data (PartitionKeyHashBaselineTest.*.xml).
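
For readers unfamiliar with the hash family, here is a minimal sketch of the standard MurmurHash3 x86 32-bit function. It is not the code in internal/epk/epk.go: the function name is hypothetical, and the partition-key value encoding, seeds, and the 128-bit variant that the EPK computation also needs are deliberately left out.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// murmur3x86_32 is the generic MurmurHash3 x86 32-bit hash.
// The name is illustrative; it is not an identifier from the PR.
func murmur3x86_32(data []byte, seed uint32) uint32 {
	const (
		c1 uint32 = 0xcc9e2d51
		c2 uint32 = 0x1b873593
	)
	h := seed
	n := len(data)

	// Body: mix each full 4-byte little-endian block into the state.
	i := 0
	for ; i+4 <= n; i += 4 {
		k := binary.LittleEndian.Uint32(data[i:])
		k *= c1
		k = bits.RotateLeft32(k, 15)
		k *= c2
		h ^= k
		h = bits.RotateLeft32(h, 13)
		h = h*5 + 0xe6546b64
	}

	// Tail: fold in the remaining 1 to 3 bytes, if any.
	var k uint32
	switch n - i {
	case 3:
		k ^= uint32(data[i+2]) << 16
		fallthrough
	case 2:
		k ^= uint32(data[i+1]) << 8
		fallthrough
	case 1:
		k ^= uint32(data[i])
		k *= c1
		k = bits.RotateLeft32(k, 15)
		k *= c2
		h ^= k
	}

	// Finalization: mix in the length and avalanche the bits.
	h ^= uint32(n)
	h ^= h >> 16
	h *= 0x85ebca6b
	h ^= h >> 13
	h *= 0xc2b2ae35
	h ^= h >> 16
	return h
}

func main() {
	fmt.Printf("%08x\n", murmur3x86_32([]byte("foo"), 0))
}
```

A real EPK computation first serializes the partition key components into bytes in a backend-defined way; only the hashing core is shown here.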

New: Query builder (cosmos_query_builder.go)

  • Builds parameterized SQL queries for read-many operations using three shapes:
    • ID-only IN — when PK path is /id and every PK value equals the item ID.
    • PK + ID IN — when all items in a chunk share the same logical PK.
    • OR-of-conjunctions — general case with multiple logical PKs in the same physical range.
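
The three shapes can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the contents of cosmos_query_builder.go: the `item` and `param` types, the function name, the parameter names, and a single string-valued partition key stored under a property called `pk` are all hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// item is a hypothetical (id, partition key) pair.
type item struct{ ID, PK string }

// param is a hypothetical named query parameter.
type param struct{ Name, Value string }

// buildReadManyQuery picks one of the three shapes for a chunk of items.
// pkIsID reports whether the container's partition key path is /id.
func buildReadManyQuery(items []item, pkIsID bool) (string, []param) {
	if len(items) == 0 {
		return "", nil
	}
	idOnly, samePK := pkIsID, true
	for _, it := range items {
		idOnly = idOnly && it.PK == it.ID
		samePK = samePK && it.PK == items[0].PK
	}
	var params []param
	// idList emits "@id0, @id1, ..." and records one parameter per item ID.
	idList := func() string {
		names := make([]string, len(items))
		for i, it := range items {
			names[i] = fmt.Sprintf("@id%d", i)
			params = append(params, param{names[i], it.ID})
		}
		return strings.Join(names, ", ")
	}
	switch {
	case idOnly: // shape 1: ID-only IN
		list := idList()
		return "SELECT * FROM c WHERE c.id IN (" + list + ")", params
	case samePK: // shape 2: PK + ID IN
		list := idList()
		params = append(params, param{"@pk", items[0].PK})
		return "SELECT * FROM c WHERE c.pk = @pk AND c.id IN (" + list + ")", params
	default: // shape 3: OR-of-conjunctions
		clauses := make([]string, len(items))
		for i, it := range items {
			idName, pkName := fmt.Sprintf("@id%d", i), fmt.Sprintf("@pk%d", i)
			clauses[i] = "(c.id = " + idName + " AND c.pk = " + pkName + ")"
			params = append(params, param{idName, it.ID}, param{pkName, it.PK})
		}
		return "SELECT * FROM c WHERE " + strings.Join(clauses, " OR "), params
	}
}

func main() {
	q, p := buildReadManyQuery([]item{{"1", "x"}, {"2", "y"}}, false)
	fmt.Println(q, p)
}
```

Values always travel as parameters rather than being spliced into the query text, which is what keeps the query safe for arbitrary IDs and partition key values.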

Replaced: ReadMany strategy (cosmos_container_read_many.go)

  • Deleted executeReadManyWithPointReads (N individual point reads).
  • Added executeReadManyWithQueries which:
    1. Fetches container properties and partition key ranges.
    2. Computes EPK for each item and groups by physical partition range (binary search over sorted ranges).
    3. Chunks each range's items at 1,000 per query.
    4. Executes queries concurrently using x-ms-documentdb-partitionkeyrangeid header routing.
  • Added input validation: empty item IDs now return an error immediately.
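
Step 2 above (range lookup by binary search, then grouping) can be sketched like this. The `pkRange` type, the function names, and the assumption that EPKs and range bounds are hex strings that compare correctly as plain strings are illustrative, not taken from the implementation.

```go
package main

import (
	"fmt"
	"sort"
)

// pkRange is a hypothetical physical partition key range covering
// [MinInclusive, MaxExclusive) in EPK space.
type pkRange struct {
	ID           string
	MinInclusive string
	MaxExclusive string
}

// findRange returns the index of the range containing epk, or -1.
// ranges must be sorted by MinInclusive and must not overlap.
func findRange(ranges []pkRange, epk string) int {
	// First range whose exclusive upper bound lies above the EPK.
	i := sort.Search(len(ranges), func(i int) bool {
		return epk < ranges[i].MaxExclusive
	})
	if i < len(ranges) && ranges[i].MinInclusive <= epk {
		return i
	}
	return -1
}

// groupByRange buckets item IDs by the ID of the range that owns their EPK.
// Items whose EPK falls outside every range are skipped in this sketch.
func groupByRange(ranges []pkRange, epkByItem map[string]string) map[string][]string {
	groups := make(map[string][]string)
	for itemID, epk := range epkByItem {
		if i := findRange(ranges, epk); i >= 0 {
			groups[ranges[i].ID] = append(groups[ranges[i].ID], itemID)
		}
	}
	return groups
}

func main() {
	ranges := []pkRange{{"0", "", "3F"}, {"1", "3F", "7F"}, {"2", "7F", "FF"}}
	epks := map[string]string{"item-a": "05C1", "item-b": "1A00", "item-c": "A0"}
	fmt.Println(groupByRange(ranges, epks))
}
```

Each lookup is logarithmic in the number of physical ranges, so grouping stays cheap even for large containers.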

API contract note: The order of items in the response is unspecified — callers must not rely on it matching the input order. Missing items (not found) are silently omitted.
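
Because of this contract, a caller that needs to line results up with its inputs has to do the matching itself. A minimal sketch, assuming the returned items are raw JSON documents with an `id` property and that the requested IDs are unique across the whole request (IDs are only guaranteed unique within a logical partition, so real code may need to key on the partition key as well); `matchByID` is a hypothetical helper, not part of azcosmos:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// matchByID indexes raw JSON items by their "id" property and reports
// which requested IDs are absent from the response.
func matchByID(requested []string, items [][]byte) (map[string][]byte, []string, error) {
	found := make(map[string][]byte, len(items))
	for _, raw := range items {
		var doc struct {
			ID string `json:"id"`
		}
		if err := json.Unmarshal(raw, &doc); err != nil {
			return nil, nil, err
		}
		found[doc.ID] = raw
	}
	var missing []string
	for _, id := range requested {
		if _, ok := found[id]; !ok {
			missing = append(missing, id) // silently omitted by ReadMany
		}
	}
	return found, missing, nil
}

func main() {
	items := [][]byte{[]byte(`{"id":"c"}`), []byte(`{"id":"a"}`)}
	found, missing, err := matchByID([]string{"a", "b", "c"}, items)
	fmt.Println(len(found), missing, err)
}
```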

Testing

  • Unit tests (cosmos_query_builder_test.go, cosmos_container_read_many_test.go): Full query text and parameter assertions for all query shapes, EPK range lookup, physical range grouping, and chunk splitting.
  • EPK baseline tests (internal/epk/epk_test.go): Validated against official Cosmos DB hash baseline XML test data.
  • Emulator integration tests (emulator_cosmos_read_many_items_test.go): Existing tests updated and passing.

Files changed

| File | Change |
| --- | --- |
| internal/epk/epk.go | New: EPK hash computation |
| internal/epk/epk_test.go | New: EPK baseline tests |
| internal/epk/testdata/*.xml | New: Official baseline test data |
| cosmos_query_builder.go | New: Parameterized query builder |
| cosmos_query_builder_test.go | New: Query builder unit tests |
| cosmos_container_read_many.go | Replaced point-reads with EPK-grouped queries |
| cosmos_container_read_many_test.go | New: Grouping/chunking unit tests |
| cosmos_container.go | Switch default strategy + add ID validation |
| partition_key.go | Add computeEffectivePartitionKey method |
| partition_key_test.go | EPK computation tests |
| emulator_cosmos_read_many_items_test.go | Updated integration tests |

…ring guarantee

- Replace point-read strategy description with query-based approach (V1: LPK grouping)
- Remove originalIndex from indexedItem type
- Explicitly document that returned item order is unspecified
- Resolve open question #3: Go SDK does not guarantee input order
- Update result matching section to reflect no-ordering contract

Replace executeReadManyWithPointReads with executeReadManyWithQueries,
which groups items by logical partition key and issues parameterized SQL
queries instead of N individual point reads.

New files:
- cosmos_query_builder.go: three query shapes (ID-only IN, PK+ID IN,
  OR-of-conjunctions) with support for nested/non-identifier PK paths,
  null PKs, and hierarchical partition keys
- cosmos_query_builder_test.go: 11 unit tests covering all query shapes

Changes:
- cosmos_container_read_many.go: delete point-read strategy, add
  executeReadManyWithQueries with concurrent chunk execution, continuation
  token pagination, and PK-header-based routing
- cosmos_container.go: wire new strategy, add empty-ID validation
- emulator_cosmos_read_many_items_test.go: add multi-PK integration test,
  update assertions to not depend on item ordering

The architecture is ready for EPK hashing (V2) to coalesce groups that
map to the same physical partition range.
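
The continuation token pagination mentioned in this commit follows the usual drain loop, sketched below. The `page` type and the `fetchFunc` callback are hypothetical stand-ins for one query request and its response; the SDK's own types are not shown in this PR description.

```go
package main

import "fmt"

// page is a hypothetical query response page.
type page struct {
	Items        []string
	Continuation string // empty when there are no more pages
}

// fetchFunc is a hypothetical stand-in for one query request.
type fetchFunc func(continuation string) (page, error)

// drain keeps requesting pages until the service stops returning a
// continuation token, accumulating every item along the way.
func drain(fetch fetchFunc) ([]string, error) {
	var all []string
	token := ""
	for {
		p, err := fetch(token)
		if err != nil {
			return nil, err
		}
		all = append(all, p.Items...)
		if p.Continuation == "" {
			return all, nil
		}
		token = p.Continuation
	}
}

// fakeFetch simulates a three-page result set keyed by continuation token.
func fakeFetch(token string) (page, error) {
	pages := map[string]page{
		"":   {Items: []string{"a"}, Continuation: "t1"},
		"t1": {Items: []string{"b"}, Continuation: "t2"},
		"t2": {Items: []string{"c"}},
	}
	p, ok := pages[token]
	if !ok {
		return page{}, fmt.Errorf("unknown continuation token %q", token)
	}
	return p, nil
}

func main() {
	fmt.Println(drain(fakeFetch))
}
```
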
Extract five focused functions from the monolithic method:
- groupItemsByLogicalPK: groups ItemIdentity values by serialised PK
- buildQueryChunks: splits groups into ≤1000-item chunks with SQL
- executeQueryChunks: concurrent goroutine pool over chunks
- executeOneChunk: single chunk query with continuation paging
- collectChunkResults: merges per-chunk results into response

Also promotes queryChunk and chunkResult to package-level types.
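
The ≤1000-item splitting done by buildQueryChunks amounts to a plain slice chunker. A minimal generic sketch (requires Go 1.18+; the name `chunk` and its signature are assumptions, not the PR's code):

```go
package main

import "fmt"

// chunk splits items into consecutive slices of at most size elements.
// The sub-slices share the input's backing array; the full slice
// expression caps their capacity so appends cannot spill into a neighbor.
func chunk[T any](items []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var out [][]T
	for len(items) > size {
		out = append(out, items[:size:size])
		items = items[size:]
	}
	if len(items) > 0 {
		out = append(out, items)
	}
	return out
}

func main() {
	for i, c := range chunk(make([]int, 2500), 1000) {
		fmt.Println(i, len(c))
	}
}
```
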
…titionKey

Add the EPK struct and method signature on PartitionKey in preparation
for V2 physical-range grouping. The method panics until the MurmurHash
implementation is wired in.

Add TestComputeEffectivePartitionKey_Baseline which enumerates all 29
cases across 4 XML baseline files (Singletons, Numbers, Strings, Lists)
and tests both V1 and V2 hash versions (58 subtests total).

Tests currently panic because computeEffectivePartitionKey is stubbed.

Remove category-based parsing in favor of a single JSON-aware parser.
Inputs are either 'UNDEFINED' or valid JSON. Post-unmarshal, magic
strings NaN/-Infinity/Infinity are converted to float64 equivalents.
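
A parser of the kind described in this commit might look like the sketch below. The `undefined` sentinel type and both function names are hypothetical; only the rules (the literal UNDEFINED, otherwise JSON, then the three magic strings mapped to float64) come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// undefined is a hypothetical sentinel for the literal input UNDEFINED.
type undefined struct{}

// parseBaselineValue accepts either the literal UNDEFINED or valid JSON.
func parseBaselineValue(input string) (any, error) {
	if input == "UNDEFINED" {
		return undefined{}, nil
	}
	var v any
	if err := json.Unmarshal([]byte(input), &v); err != nil {
		return nil, err
	}
	return convertMagic(v), nil
}

// convertMagic replaces the strings NaN, Infinity and -Infinity with their
// float64 equivalents, recursing into JSON arrays (hierarchical keys).
func convertMagic(v any) any {
	switch t := v.(type) {
	case string:
		switch t {
		case "NaN":
			return math.NaN()
		case "Infinity":
			return math.Inf(1)
		case "-Infinity":
			return math.Inf(-1)
		}
	case []any:
		for i := range t {
			t[i] = convertMagic(t[i])
		}
	}
	return v
}

func main() {
	for _, in := range []string{`UNDEFINED`, `"NaN"`, `["-Infinity", 1.5, "abc"]`} {
		v, err := parseBaselineValue(in)
		fmt.Printf("%v %v\n", v, err)
	}
}
```
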
- Add internal/epk package with MurmurHash3 (32+128 bit), hash input
  encoding, and EPK computation for V1, V2 Hash, and V2 MultiHash
- 58 baseline tests from Python SDK covering singletons, numbers,
  strings, and hierarchical partition keys
- Move mock_query_engine.go to internal/mock/ package
- Add computeEffectivePartitionKey method on PartitionKey
- Smoke test in partition_key_test.go verifying method delegation

…rouping in ReadMany

- Group items by physical partition range using EPK hashing instead of
  logical partition key, issuing fewer queries when multiple logical PKs
  map to the same physical range.
- Route queries with x-ms-documentdb-partitionkeyrangeid header instead
  of x-ms-documentdb-partitionkey.
- Replace indexedItem with ItemIdentity throughout.
- Add full query text and parameter assertions in all query builder tests.
- Add unit tests for EPK-based grouping and range lookup.

@analogrelay analogrelay marked this pull request as ready for review February 11, 2026 03:55
@analogrelay analogrelay requested a review from a team as a code owner February 11, 2026 03:55
Copilot AI review requested due to automatic review settings February 11, 2026 03:55
@analogrelay analogrelay changed the title from "azcosmos: WIP: Go-native ReadMany" to "azcosmos: Go-native ReadMany" Feb 11, 2026
Contributor

Copilot AI left a comment

Pull request overview

This draft PR advances a Go-native implementation of Cosmos DB ReadMany by adding effective partition key (EPK) hashing utilities and switching the default ReadMany path to a query-based strategy with a small query builder and supporting tests/mocks.

Changes:

  • Add internal/epk package (MurmurHash-based) and wire PartitionKey to compute EPKs.
  • Implement query-based ReadMany execution that groups items by physical partition key range and executes per-range parameterized queries concurrently.
  • Introduce a read-many query builder, mock query engine updates, and additional unit/emulator tests.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| sdk/data/azcosmos/partition_key.go | Adds internal EPK computation method on PartitionKey. |
| sdk/data/azcosmos/partition_key_test.go | Adds test coverage for EPK computation outputs. |
| sdk/data/azcosmos/internal/epk/epk.go | Implements EPK hashing logic (V1/V2 + multihash). |
| sdk/data/azcosmos/internal/epk/epk_test.go | Baseline-style tests validating hashing outputs against XML fixtures. |
| sdk/data/azcosmos/internal/epk/testdata/*.xml | Adds baseline vectors for EPK hashing verification. |
| sdk/data/azcosmos/cosmos_query_builder.go | Adds query builder for parameterized ReadMany query shapes. |
| sdk/data/azcosmos/cosmos_query_builder_test.go | Adds tests for query shape generation. |
| sdk/data/azcosmos/cosmos_container_read_many.go | Switches default ReadMany to query-based execution and adds EPK→range grouping + concurrency. |
| sdk/data/azcosmos/cosmos_container_read_many_test.go | Adds unit tests for range lookup, grouping, and chunking. |
| sdk/data/azcosmos/internal/mock/mock_query_engine.go | Moves/introduces mock query engine + adds a mock read-many pipeline. |
| sdk/data/azcosmos/emulator_cosmos_read_many_items_test.go | Updates tests for unordered returns and adds multi-logical-PK emulator coverage. |
| sdk/data/azcosmos/cosmos_container_query_engine_test.go | Updates tests to use the new internal/mock query engine package. |
| sdk/data/azcosmos/cosmos_container.go | Adds empty-ID validation and routes default ReadMany to the new query-based implementation. |

…FINED

The query builder generated IS_DEFINED(field) = false for null partition
keys, which fails to match documents where the PK field exists with a null
value. Changed to IS_NULL(field) which correctly matches these documents.
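
The distinction can be illustrated with a small predicate helper. This is a sketch only: `pkPredicate`, its signature, and the pre-escaped `field` argument are hypothetical and not the query builder's actual API.

```go
package main

import "fmt"

// pkPredicate returns the WHERE-clause fragment that matches one partition
// key value. field is a hypothetical, already-escaped property reference
// such as c.pk; needsParam reports whether paramName must be bound.
func pkPredicate(field string, value any, paramName string) (clause string, needsParam bool) {
	if value == nil {
		// IS_DEFINED(field) = false only matches documents that lack the
		// property entirely. A document storing an explicit null has the
		// property defined, so IS_NULL is the check that matches it.
		return "IS_NULL(" + field + ")", false
	}
	return field + " = " + paramName, true
}

func main() {
	fmt.Println(pkPredicate("c.pk", nil, "@pk0"))
	fmt.Println(pkPredicate("c.pk", "tenant-1", "@pk0"))
}
```
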
Add TestExecuteQueryChunks_CancelledContext to verify that context
cancellation mid-execution propagates through collectChunkResults as
an operation-level error. Add comment above executeOneChunk call site
clarifying the cancellation error flow.
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.

Use context.WithCancel to create a child context in executeQueryChunks.
When a chunk fails, cancelChunks() propagates cancellation into in-flight
HTTP requests, not just between pagination pages. Removes the done channel
parameter from executeOneChunk.
Member

@tvaron3 tvaron3 left a comment

LGTM

@simorenoh
Member

should we add a Changelog entry?

@analogrelay analogrelay merged commit c677338 into main Feb 24, 2026
13 checks passed
@analogrelay analogrelay deleted the ashleyst/native-read-many branch February 24, 2026 17:31
5 participants