
Add a es819 codec test to verify tryRead returns null if may contain duplicates #142409

Merged
parkertimmins merged 2 commits into elastic:main from parkertimmins:parker/add-codec-test-for-may-contain-duplicates
Feb 23, 2026

@parkertimmins parkertimmins commented Feb 12, 2026

Add a test to the es819 codec tests to verify the changes from #141926. It just checks that code paths which require the incoming docs to be free of duplicates return null from tryRead when passed docs that may contain duplicates. Also, update DenseBinaryDocValues to return null if mayContainDuplicates.
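
To make the contract described above concrete, here is a minimal sketch of a bulk reader that bails out with null when the docs may contain duplicates. The `Docs` class, the `tryRead` signature, and the string-array storage are all hypothetical stand-ins for illustration; they are not the actual Elasticsearch types. The sketch assumes a non-empty, sorted list of doc ids.

```java
import java.util.Arrays;

class TryReadSketch {
    /** Hypothetical stand-in for the list of doc ids passed to a bulk reader. */
    static final class Docs {
        final int[] ids;                    // assumed non-empty and sorted (non-decreasing)
        final boolean mayContainDuplicates; // true when the caller cannot rule out repeats

        Docs(int[] ids, boolean mayContainDuplicates) {
            this.ids = ids;
            this.mayContainDuplicates = mayContainDuplicates;
        }
    }

    /**
     * Bulk path that copies one contiguous slice of values.
     * Returns null when it cannot safely handle the input, so the caller can fall back.
     */
    static String[] tryRead(String[] values, Docs docs) {
        if (docs.mayContainDuplicates) {
            return null; // the density check below is only valid for unique doc ids
        }
        int first = docs.ids[0];
        int last = docs.ids[docs.ids.length - 1];
        if (last - first + 1 != docs.ids.length) {
            return null; // gaps between doc ids: not dense, no slice copy possible
        }
        return Arrays.copyOfRange(values, first, last + 1);
    }

    public static void main(String[] args) {
        String[] values = {"a", "b", "c", "d"};
        // Unique, dense ids: the bulk path succeeds.
        System.out.println(Arrays.toString(tryRead(values, new Docs(new int[] {1, 2, 3}, false))));
        // Ids {0, 0, 2} pass a count-based density check, so the duplicate flag must short-circuit.
        System.out.println(tryRead(values, new Docs(new int[] {0, 0, 2}, true)));
    }
}
```

The ids `{0, 0, 2}` show why the flag matters in this sketch: three ids spanning three doc positions look dense by count alone, yet a slice copy would return the wrong values.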

@elasticsearchmachine added the needs:triage (Requires assignment of a team area label) and v9.4.0 labels Feb 12, 2026
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Member

@martijnvg martijnvg left a comment


LGTM. I left one question that we can maybe consider in a followup.

    boolean binaryMultiValuedFormat
) throws IOException {
    if (docs.mayContainDuplicates()) {
        // isCompressed assumes there aren't duplicates

I do wonder whether we can move the docs.mayContainDuplicates() to:

} else if (docs.mayContainDuplicates() == false && isDense(firstDocId, lastDocId, count)) {

This would be for bulk reading of binary doc values. Bulk decoding really does require dense docs with no duplicates, but the other, less efficient bulk-reading path may not. Maybe we can record the last seen docId and, if the current docId is the same, skip appending the value?
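
One literal reading of this suggestion can be sketched as follows. Only the `isDense(firstDocId, lastDocId, count)` call shape and the `mayContainDuplicates() == false` guard come from the comment; the body of `isDense`, the `read` method, and the string-array storage are assumptions made for illustration and are not taken from the Elasticsearch code. Doc ids are assumed to be sorted.

```java
import java.util.ArrayList;
import java.util.List;

class DedupReadSketch {
    /** Assumed definition: the ids cover every doc between first and last exactly once. */
    static boolean isDense(int firstDocId, int lastDocId, int count) {
        return lastDocId - firstDocId + 1 == count;
    }

    /** Hypothetical reader: values[docId] stands in for the stored binary value. */
    static List<String> read(String[] values, int[] docIds, boolean mayContainDuplicates) {
        List<String> out = new ArrayList<>(docIds.length);
        if (docIds.length == 0) {
            return out;
        }
        int first = docIds[0];
        int last = docIds[docIds.length - 1];
        if (mayContainDuplicates == false && isDense(first, last, docIds.length)) {
            // Fast path: only taken when duplicates are ruled out, as the comment proposes.
            for (int docId = first; docId <= last; docId++) {
                out.add(values[docId]);
            }
            return out;
        }
        // Slower path: tolerate duplicates by remembering the last seen doc id.
        int lastSeen = -1;
        for (int docId : docIds) {
            if (docId == lastSeen) {
                continue; // same doc as the previous entry: do not append its value again
            }
            out.add(values[docId]);
            lastSeen = docId;
        }
        return out;
    }
}
```

Comparing against only the previous id is enough here because the ids are assumed sorted, so repeats are always adjacent. Whether skipping a repeated doc is the right output shape for callers is part of the open question raised in the comment.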

@parkertimmins parkertimmins merged commit 919ed33 into elastic:main Feb 23, 2026
35 checks passed
jdconrad pushed a commit to jdconrad/elasticsearch that referenced this pull request Feb 24, 2026
…duplicates (elastic#142409)

sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request Feb 24, 2026
…duplicates (elastic#142409)
