Added Gating so VectorScorerInt4BenchmarkTests and VectorScorerInt4BulkBenchmarkTests Ignored on Java 21#145060
Conversation
|
Pinging @elastic/es-search-relevance (Team:Search Relevance) |
There was a problem hiding this comment.
Pull request overview
This PR addresses intermittent CI failures on Java 21 for the int4 vector scorer benchmark test suites by gating execution based on heap-segment support (JDK >= 22), avoiding code paths that currently NPE due to a null queryScorer.
Changes:
- Added
supportsHeapSegments()gating to avoid constructing/running Lucene/native int4 benchmark scorers when heap segments aren’t supported. - Conditionalized assertions and teardown for Lucene/native implementations in int4 benchmark tests (single and bulk).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| benchmarks/src/test/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerInt4BenchmarkTests.java | Adds heap-segment gating around Lucene/native bench setup/assertions in score and query-score tests. |
| benchmarks/src/test/java/org/elasticsearch/benchmark/vector/scorer/VectorScorerInt4BulkBenchmarkTests.java | Adds heap-segment gating around Lucene/native bench setup/assertions across sequential/random and query/bulk query tests. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| public void testQueryRandomBulk() throws Exception { | ||
|
|
||
| for (int i = 0; i < 100; i++) { | ||
| var vectorData = new VectorScorerInt4BulkBenchmark.VectorData(dims, 1000, 200); | ||
| var scalar = createBench(VectorImplementation.SCALAR, vectorData); | ||
| var lucene = createBench(VectorImplementation.LUCENE, vectorData); | ||
| var nativeBench = createBench(VectorImplementation.NATIVE, vectorData); | ||
| VectorScorerInt4BulkBenchmark lucene = null; | ||
| VectorScorerInt4BulkBenchmark nativeBench = null; | ||
| if (supportsHeapSegments()) { | ||
| lucene = createBench(VectorImplementation.LUCENE, vectorData); | ||
| nativeBench = createBench(VectorImplementation.NATIVE, vectorData); | ||
| } | ||
|
|
||
| try { | ||
| float[] expected = scalar.scoreQueryMultipleRandomBulk(); | ||
| assertArrayEquals("LUCENE queryRandomBulk", expected, lucene.scoreQueryMultipleRandomBulk(), delta); | ||
| assertArrayEquals("NATIVE queryRandomBulk", expected, nativeBench.scoreQueryMultipleRandomBulk(), delta); | ||
| if (supportsHeapSegments()) { | ||
| assertArrayEquals("LUCENE queryRandomBulk", expected, lucene.scoreQueryMultipleRandomBulk(), delta); | ||
| assertArrayEquals("NATIVE queryRandomBulk", expected, nativeBench.scoreQueryMultipleRandomBulk(), delta); | ||
| } |
There was a problem hiding this comment.
For the query-scoring tests (e.g., testQueryRandomBulk()), the method still runs when supportsHeapSegments() is false but then skips all lucene/native assertions and setup. This doesn't mark the test as ignored/skipped on Java 21 and still does the scalar work inside the loop. Consider adding assumeTrue("Only test with heap segments", supportsHeapSegments()) at the start of query-scoring test methods (as done in other scorer benchmark tests) and removing the repeated per-iteration conditional checks.
| import static org.elasticsearch.benchmark.vector.scorer.BenchmarkUtils.supportsHeapSegments; | ||
|
|
||
| public class VectorScorerInt4BenchmarkTests extends ESTestCase { |
There was a problem hiding this comment.
PR title/description says these tests are "ignored on Java 21", but this change only conditionally skips creating lucene/native benches and assertions; the test methods still execute and will report as passing rather than skipped on Java 21. If the intent is to truly ignore/skip on Java 21, consider using an assumption (assumeTrue/assumeFalse) or a JUnit ignore/conditional mechanism so CI reflects the tests as skipped.
| public void testScores() throws Exception { | ||
| for (int i = 0; i < 100; i++) { | ||
| var data = new VectorScorerInt4Benchmark.VectorData(dims); | ||
| var scalar = createBench(VectorImplementation.SCALAR, data); | ||
| var lucene = createBench(VectorImplementation.LUCENE, data); | ||
| var nativeBench = createBench(VectorImplementation.NATIVE, data); | ||
| VectorScorerInt4Benchmark lucene = null; | ||
| VectorScorerInt4Benchmark nativeBench = null; | ||
| if (supportsHeapSegments()) { | ||
| lucene = createBench(VectorImplementation.LUCENE, data); | ||
| nativeBench = createBench(VectorImplementation.NATIVE, data); | ||
| } |
There was a problem hiding this comment.
In testScores(), lucene/native benches are now only created when supportsHeapSegments() is true. But VectorScorerInt4Benchmark.setup() always initializes scorer for LUCENE/NATIVE and only gates queryScorer behind supportsHeapSegments(), so score() comparisons should still be valid on Java 21. This change reduces test coverage for LUCENE/NATIVE on Java 21 and doesn't actually "ignore" the test—only skips assertions. Consider always running the score() comparisons, and gate only the query-scoring tests (e.g., via assumeTrue("Only test with heap segments", supportsHeapSegments())).
| public void testQueryScores() throws Exception { | ||
| for (int i = 0; i < 100; i++) { | ||
| var data = new VectorScorerInt4Benchmark.VectorData(dims); | ||
| var scalar = createBench(VectorImplementation.SCALAR, data); | ||
| var lucene = createBench(VectorImplementation.LUCENE, data); | ||
| var nativeBench = createBench(VectorImplementation.NATIVE, data); | ||
| VectorScorerInt4Benchmark lucene = null; | ||
| VectorScorerInt4Benchmark nativeBench = null; | ||
| if (supportsHeapSegments()) { | ||
| lucene = createBench(VectorImplementation.LUCENE, data); | ||
| nativeBench = createBench(VectorImplementation.NATIVE, data); | ||
| } | ||
|
|
||
| try { | ||
| float expected = scalar.scoreQuery(); | ||
| assertEquals("LUCENE scoreQuery", expected, lucene.scoreQuery(), delta); | ||
| assertEquals("NATIVE scoreQuery", expected, nativeBench.scoreQuery(), delta); | ||
| if(supportsHeapSegments()) { | ||
| assertEquals("LUCENE scoreQuery", expected, lucene.scoreQuery(), delta); | ||
| assertEquals("NATIVE scoreQuery", expected, nativeBench.scoreQuery(), delta); | ||
| } |
There was a problem hiding this comment.
testQueryScores() currently runs even when supportsHeapSegments() is false, but then performs no lucene/native assertions (and still loops 100 times). Other scorer benchmark tests skip query-scoring tests entirely when heap segments aren't supported (e.g., VectorScorerFloatBenchmarkTests.java:69). Consider adding an assumeTrue at the start of this method so it is reported as skipped on Java 21 and avoids unnecessary work.
| float expected = scalar.score(); | ||
| assertEquals("LUCENE score", expected, lucene.score(), delta); | ||
| assertEquals("NATIVE score", expected, nativeBench.score(), delta); | ||
| if(supportsHeapSegments()) { | ||
| assertEquals("LUCENE score", expected, lucene.score(), delta); | ||
| assertEquals("NATIVE score", expected, nativeBench.score(), delta); | ||
| } | ||
| } finally { | ||
| scalar.teardown(); | ||
| lucene.teardown(); | ||
| nativeBench.teardown(); | ||
| if(supportsHeapSegments()) { | ||
| lucene.teardown(); |
There was a problem hiding this comment.
There are several if(supportsHeapSegments()) blocks without a space after if. This is inconsistent with the surrounding style in this file (e.g., if (supportsHeapSegments())) and may fail automated formatting/style checks. Please update to if (supportsHeapSegments()) consistently.
| public void testSequential() throws Exception { | ||
| for (int i = 0; i < 100; i++) { | ||
| var vectorData = new VectorScorerInt4BulkBenchmark.VectorData(dims, 1000, 200); | ||
| var scalar = createBench(VectorImplementation.SCALAR, vectorData); | ||
| var lucene = createBench(VectorImplementation.LUCENE, vectorData); | ||
| var nativeBench = createBench(VectorImplementation.NATIVE, vectorData); | ||
| VectorScorerInt4BulkBenchmark lucene = null; | ||
| VectorScorerInt4BulkBenchmark nativeBench = null; | ||
| if (supportsHeapSegments()) { | ||
| lucene = createBench(VectorImplementation.LUCENE, vectorData); | ||
| nativeBench = createBench(VectorImplementation.NATIVE, vectorData); | ||
| } | ||
|
|
||
| try { | ||
| float[] expected = scalar.scoreMultipleSequential(); | ||
| assertArrayEquals("LUCENE sequential", expected, lucene.scoreMultipleSequential(), delta); | ||
| assertArrayEquals("NATIVE sequential", expected, nativeBench.scoreMultipleSequential(), delta); | ||
| if (supportsHeapSegments()) { | ||
| assertArrayEquals("LUCENE sequential", expected, lucene.scoreMultipleSequential(), delta); | ||
| assertArrayEquals("NATIVE sequential", expected, nativeBench.scoreMultipleSequential(), delta); | ||
| } |
There was a problem hiding this comment.
The new supportsHeapSegments() gating is applied to sequential/random/sequentialBulk/randomBulk tests too. However, in VectorScorerInt4BulkBenchmark.setup(), scorer for LUCENE/NATIVE is always initialized and only queryScorer is gated behind supportsHeapSegments(), so these non-query tests should still be comparable on Java 21. Consider keeping lucene/native coverage for the non-query scoring tests and only skipping the query-scoring tests on Java 21.
ldematte
left a comment
There was a problem hiding this comment.
I think copilot is right; we should just skip the whole thing if on JDK 21.
We can just assumeTrue(supportsHeapSegments(), ..) in a static method.
|
If you want, I fixed the benchmarks as part of #145096 |
|
closed in favor of: #145096 |
Fixes: #145054
Fixes: #145055
Fixes: #145056
Fixes: #145057
Fixes: #145058