Validate individual offset values in BULK_OFFSETS bounds checks by ChrisHegarty · Pull Request #144643 · elastic/elasticsearch

ChrisHegarty · 2026-03-20T12:43:37Z

While working on bulk sparse scoring (#144557), I noticed that checkBulkOffsets and checkBBQBulkOffsets validated segment sizes but not individual offset values. An out-of-range or negative offset would silently read memory beyond the data segment, risking a crash or silently wrong results.

The solution is to replace the sequential size check with per-offset validation that checks each offset points to a valid vector within the data segment. The O(count) loop should be negligible relative to the O(count * dims) native call, but we've made the checks conditional on asserts to avoid any potential negative cost of this, and asserts should be good enough given our testing.
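As a rough illustration of the per-offset validation described above, here is a minimal sketch. It is not the code in JdkVectorLibrary: the class name, method name, parameters, and the assumption that each offset is an int ordinal multiplied by a byte pitch are all hypothetical.

```java
/** Hypothetical sketch; names and data layout are assumptions, not the PR's actual code. */
final class BulkOffsetCheckSketch {

    /**
     * Verifies that every offset selects a whole vector inside the data region.
     *
     * @param dataBytes   total size of the data region in bytes
     * @param offsets     vector ordinals supplied by the caller (assumed layout)
     * @param count       number of entries of {@code offsets} to check
     * @param pitch       bytes between the starts of consecutive vectors
     * @param vectorBytes size of one vector in bytes
     */
    static void checkEachOffset(long dataBytes, int[] offsets, int count, long pitch, long vectorBytes) {
        if (count < 0 || count > offsets.length) {
            throw new IllegalArgumentException("invalid count: " + count);
        }
        if (pitch <= 0 || vectorBytes <= 0 || vectorBytes > dataBytes) {
            throw new IllegalArgumentException("invalid pitch, vector size, or data size");
        }
        // Largest byte position at which a whole vector still fits.
        long maxStart = dataBytes - vectorBytes;
        for (int i = 0; i < count; i++) {
            long ordinal = offsets[i];
            // Compare by division so ordinal * pitch can never overflow.
            if (ordinal < 0 || ordinal > maxStart / pitch) {
                throw new IndexOutOfBoundsException("offset " + ordinal + " at index " + i + " is out of range");
            }
        }
    }
}
```

A check on the segment size alone would accept an offsets array such as `{0, 4}` over four 16-byte vectors, as long as the array itself had the right length; the loop above rejects it because ordinal 4 starts past the last whole vector.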

Note: INT4 skips size=2 (packedLen=1) because checkBulkOffsets computes rowBytes = packedLen * 4 / 8 which truncates to 0 via integer division, making the bounds check trivially pass. This is a pre-existing issue with how INT4 passes packed byte length (not element count) as the length parameter to the generic check formula. We can address this separately, if needed.
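The truncation can be reproduced in isolation. In the sketch below, only the formula `packedLen * 4 / 8` comes from the note above; the `fits` method is a hypothetical stand-in for a bounds check and is not the actual check in checkBulkOffsets.

```java
/** Hypothetical sketch of the INT4 row-size arithmetic; fits() is an assumed check shape. */
final class Int4RowBytesSketch {

    /** The formula quoted in the note: length * bits / 8, evaluated with integer division. */
    static long rowBytes(long length, int bitsPerElement) {
        return length * bitsPerElement / 8;
    }

    /** Assumed shape of a bounds check: the row at this ordinal must end inside the data. */
    static boolean fits(long dataBytes, long ordinal, long rowBytes) {
        return ordinal >= 0 && ordinal * rowBytes + rowBytes <= dataBytes;
    }
}
```

With `packedLen = 1`, `rowBytes(1, 4)` evaluates to `1 * 4 / 8 = 0`, so under this assumed check shape `fits(1, 1_000_000, 0)` returns `true`: a zero-byte row fits anywhere, and any non-negative ordinal passes.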

elasticsearchmachine · 2026-03-20T12:44:07Z

Pinging @elastic/es-search-relevance (Team:Search Relevance)

benwtrent

This would have been a very sneaky bug to track down!

ldematte · 2026-03-20T14:10:15Z

Note: INT4 skips size=2 (packedLen=1) because checkBulkOffsets computes rowBytes = packedLen * 4 / 8 which truncates to 0 via integer division, making the bounds check trivially pass. This is a pre-existing issue with how INT4 passes packed byte length (not element count) as the length parameter to the generic check formula. We can address this separately, if needed.

I think @thecoop already noticed and fixed that?

ldematte

I like the change, but I wonder if we should make that an "assert-like" check, running only with assertions enabled (e.g. in tests)

ChrisHegarty · 2026-03-20T14:32:05Z

I think @thecoop already noticed and fixed that?

It's still broken in main, but I did not try to fix it here. Just test and avoid it for now.

I like the change, but I wonder if we should make that an "assert-like" check, running only with assertions enabled (e.g. in tests)

I dunno. Maybe. I am worried about dereferencing the wrong memory, so checks are super important. The existing checks are always present, so I just followed the same pattern, tho this is expanding somewhat. I don't think that it will have any real noticeable effect on performance. However, we could check afterwards and move these to asserts if needed?

ldematte · 2026-03-20T14:52:03Z

I was already not 100% sure about the existing checks to be honest :)
I'm not sure how "light" they are, even compared with bulk operations. Remember, even the cost of calling a function is visible: with the small-ish bulk sizes we have and the level of optimization we get with SIMD, we are talking about nanoseconds. But you are probably right and I'm worrying for nothing.

I am worried about dereferencing the wrong memory, so checks are super important

Well, that ship has sailed the day we decided to go with native code I think :D It's true that in this case the check is meaningful though.

thecoop · 2026-03-23T09:59:04Z

Do we have info on what performance effect this has? It's changing the check from O(1) to some kind of O(n), so it's going to have some effect.

thecoop · 2026-03-23T10:00:26Z

Could we have a more in-depth check with assertions, and a O(1) top-level sanity check for production? All the code paths should be covered during tests, so there should be no need to run the full checks in production...right?

ldematte · 2026-03-23T11:56:28Z

Could we have a more in-depth check with assertions, and a O(1) top-level sanity check for production? All the code paths should be covered during tests, so there should be no need to run the full checks in production...right?

That could be a good middle ground; my suggestion was along the same lines, but a bit more radical -- assertions should be enough as tests should cover us, and we should have validation somewhere so that it's not possible to generate invalid data (e.g. ordinals that are negative or > the num of vectors).

ChrisHegarty · 2026-03-24T10:04:12Z

The O(count) per-offset bounds checks in checkBulkOffsets and checkBBQBulkOffsets are moved into separate validateBulkOffsets / validateBBQBulkOffsets methods, called via assert so they have zero cost in production. The validate methods also add alignment checks on the offsets/result segments and non-negative/positive guards on count, length, and pitch.
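The assert-gated pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: the class, the method names, and the `numVectors` parameter are hypothetical, and the alignment checks are omitted.

```java
/** Hypothetical sketch of an O(1) production check plus an assert-only O(count) check. */
final class AssertGatedCheckSketch {

    /** O(count) validation. Returns true so it can be the condition of an assert statement. */
    static boolean validateOffsets(int[] offsets, int count, int numVectors) {
        for (int i = 0; i < count; i++) {
            if (offsets[i] < 0 || offsets[i] >= numVectors) {
                throw new IndexOutOfBoundsException("offset " + offsets[i] + " at index " + i);
            }
        }
        return true;
    }

    /** Always-on O(1) sanity check; the per-offset loop runs only when assertions are enabled. */
    static void checkBulk(int[] offsets, int count, int numVectors) {
        if (count < 0 || count > offsets.length || numVectors < 0) {
            throw new IllegalArgumentException("invalid count or vector count");
        }
        // With assertions disabled (the JVM default), this expression is never evaluated.
        assert validateOffsets(offsets, count, numVectors);
    }
}
```

Running with `-ea` (as test runs typically do) makes `checkBulk` reject an out-of-range offset; without it, only the constant-time guard executes, which matches the trade-off discussed in the thread.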

libs/native/src/main/java/org/elasticsearch/nativeaccess/jdk/JdkVectorLibrary.java

ldematte

LGTM!
But what about Int4? Is that fixed already, or do we want to address it here, or in a separate PR?

ChrisHegarty · 2026-03-26T09:42:39Z

LGTM! But what about Int4? Is that fixed already, or do we want to address it here, or in a separate PR?

I already added an int4 unit test in this PR, so it should be covered and verified. Is there something else that I'm missing?

ldematte · 2026-03-26T09:44:44Z

I already added an int4 unit test in this PR, so it should be covered and verified. Is there something else that I'm missing?

I don't know.. probably not, I was referring to the note in the PR description. If that's solved, probably you just need to update the description.

ChrisHegarty · 2026-03-26T09:53:01Z

I already added an int4 unit test in this PR, so it should be covered and verified. Is there something else that I'm missing?

I don't know.. probably not, I was referring to the note in the PR description. If that's solved, probably you just need to update the description.

Oh yeah. That's a separate issue. If we want to do it at all.

Validate individual offset values in BULK_OFFSETS bounds checks

bfa8cd0

ChrisHegarty requested a review from a team as a code owner March 20, 2026 12:43

ChrisHegarty added >test Issues or PRs that are addressing/adding tests :Search Relevance/Vectors Vector search Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch labels Mar 20, 2026

Merge branch 'main' into validate-bulk-offset-values

ff0c41f

elasticsearchmachine added the v9.4.0 label Mar 20, 2026

[CI] Auto commit changes from spotless

7e9c4de

benwtrent approved these changes Mar 20, 2026

ChrisHegarty and others added 3 commits March 20, 2026 13:54

Merge branch 'main' into validate-bulk-offset-values

522db3a

Merge branch 'main' into validate-bulk-offset-values

ee35306

[CI] Auto commit changes from spotless

66637d2

ldematte reviewed Mar 20, 2026

ChrisHegarty and others added 3 commits March 24, 2026 10:01

use asserts

9146b84

Merge remote-tracking branch 'chegar/validate-bulk-offset-values' int…

5524685

…o validate-bulk-offset-values

Merge branch 'main' into validate-bulk-offset-values

a5e0ca1

ChrisHegarty requested review from ldematte and thecoop March 24, 2026 10:05

ChrisHegarty added 2 commits March 24, 2026 11:06

Merge branch 'main' into validate-bulk-offset-values

7fa3ebe

Merge branch 'main' into validate-bulk-offset-values

7acab9f

thecoop reviewed Mar 24, 2026

libs/native/src/main/java/org/elasticsearch/nativeaccess/jdk/JdkVectorLibrary.java

Merge branch 'main' into validate-bulk-offset-values

12f1327

ChrisHegarty and others added 6 commits March 25, 2026 07:27

Merge branch 'main' into validate-bulk-offset-values

5ab6c2b

revert

252fcca

Merge branch 'main' into validate-bulk-offset-values

166f034

Merge branch 'main' into validate-bulk-offset-values

a2e19b0

Merge remote-tracking branch 'chegar/validate-bulk-offset-values' int…

c672716

…o validate-bulk-offset-values

revert

21f78ab

ldematte approved these changes Mar 26, 2026

ChrisHegarty merged commit 0106d3e into elastic:main Mar 26, 2026
35 checks passed

ChrisHegarty deleted the validate-bulk-offset-values branch March 26, 2026 11:17
