Skip certain metric assertions on Windows#144933
Conversation
|
Pinging @elastic/es-core-infra (Team:Core/Infra) |
|
Caution Review failedFailed to post review comments 📝 WalkthroughWalkthroughThis pull request updates infrastructure and vector operations across multiple areas. Pipeline configurations update GCP compute resources to use newer machine types (n4-standard/n4-custom variants) with hyperdisk-balanced storage. Gradle wrapper and native library versions are bumped to 9.4.1 and 1.0.70 respectively. Test infrastructure adds BFloat16 vector scoring implementations with accompanying benchmarks and test utilities. Several APM integration test cases are muted due to metric collection failures. Documentation expands with new ESQL functions (SPARKLINE), updated type support tables for histogram types transitioning to GA, and property support for flattened fields. Native C++ code adds BFloat16 vector operations and sparse bulk scoring primitives. Java libraries extend vector scoring with BFloat16 capability, consolidate heap-segment feature detection, and reorganize scorer implementations for cleaner architecture. ✨ Finishing Touches🧪 Generate unit tests (beta)
⚔️ Resolve merge conflicts
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@server/src/test/java/org/elasticsearch/monitor/metrics/SystemMetricsTests.java`:
- Around line 52-64: The test currently uses a single fdSupported flag; instead
gate each assertion on its respective probe separately: compute boolean
fdUsedSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0 and boolean
fdMaxSupported = ProcessProbe.getMaxFileDescriptorCount() >= 0, then assert
"jvm.fd.used" and "jvm.file_descriptor.count" against fdUsedSupported (and
fdUsedSupported && emitOTelMetrics for the OTel-name check), and assert
"jvm.fd.max" and "jvm.file_descriptor.limit" against fdMaxSupported (and
fdMaxSupported && emitOTelMetrics for the OTel-name check) so each metric is
validated independently of the other.
In
`@test/external-modules/apm-integration/src/javaRestTest/java/org/elasticsearch/test/apmintegration/MetricsApmIT.java`:
- Around line 78-103: The custom TestRule 'rule' must ensure
recordingApmServer.after() runs even if cluster.apply(...).evaluate() throws;
wrap the call to cluster.apply(s, description).evaluate() in an outer
try/finally that calls recordingApmServer.after() in the finally block, while
keeping the existing inner try/finally around base.evaluate() to preserve normal
shutdown ordering; update the lambda that creates the Statement so the outer
finally is the failure-path cleanup guard for recordingApmServer.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Pro
Run ID: aa9dc8ed-5016-4832-826a-c4a3d3bb64fa
📒 Files selected for processing (3)
muted-tests.ymlserver/src/test/java/org/elasticsearch/monitor/metrics/SystemMetricsTests.javatest/external-modules/apm-integration/src/javaRestTest/java/org/elasticsearch/test/apmintegration/MetricsApmIT.java
💤 Files with no reviewable changes (1)
- muted-tests.yml
| boolean fdSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0 && ProcessProbe.getMaxFileDescriptorCount() >= 0; | ||
| assertEquals("jvm.fd.used should be registered if supported", fdSupported, registeredGauges.contains("jvm.fd.used")); | ||
| assertEquals("jvm.fd.max should be registered if supported", fdSupported, registeredGauges.contains("jvm.fd.max")); | ||
| assertEquals( | ||
| "jvm.file_descriptor.count should be registered if emitting OTel metrics", | ||
| registeredGauges.contains("jvm.file_descriptor.count"), | ||
| emitOTelMetrics | ||
| fdSupported && emitOTelMetrics | ||
| ); | ||
| assertEquals( | ||
| "jvm.file_descriptor.limit should be registered if emitting OTel metrics", | ||
| registeredGauges.contains("jvm.file_descriptor.limit"), | ||
| emitOTelMetrics | ||
| fdSupported && emitOTelMetrics | ||
| ); |
There was a problem hiding this comment.
Gate each fd metric on its own probe result.
SystemMetrics registers jvm.fd.used/jvm.file_descriptor.count from getOpenFileDescriptorCount() and jvm.fd.max/jvm.file_descriptor.limit from getMaxFileDescriptorCount() independently. The new shared fdSupported boolean requires both probes to succeed, so this test can still fail or miss regressions when only one side returns -1.
Suggested fix
- boolean fdSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0 && ProcessProbe.getMaxFileDescriptorCount() >= 0;
- assertEquals("jvm.fd.used should be registered if supported", fdSupported, registeredGauges.contains("jvm.fd.used"));
- assertEquals("jvm.fd.max should be registered if supported", fdSupported, registeredGauges.contains("jvm.fd.max"));
+ boolean openFdSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0;
+ boolean maxFdSupported = ProcessProbe.getMaxFileDescriptorCount() >= 0;
+ assertEquals("jvm.fd.used should be registered if supported", openFdSupported, registeredGauges.contains("jvm.fd.used"));
+ assertEquals("jvm.fd.max should be registered if supported", maxFdSupported, registeredGauges.contains("jvm.fd.max"));
assertEquals(
"jvm.file_descriptor.count should be registered if emitting OTel metrics",
- registeredGauges.contains("jvm.file_descriptor.count"),
- fdSupported && emitOTelMetrics
+ openFdSupported && emitOTelMetrics,
+ registeredGauges.contains("jvm.file_descriptor.count")
);
assertEquals(
"jvm.file_descriptor.limit should be registered if emitting OTel metrics",
- registeredGauges.contains("jvm.file_descriptor.limit"),
- fdSupported && emitOTelMetrics
+ maxFdSupported && emitOTelMetrics,
+ registeredGauges.contains("jvm.file_descriptor.limit")
);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@server/src/test/java/org/elasticsearch/monitor/metrics/SystemMetricsTests.java`
around lines 52 - 64, The test currently uses a single fdSupported flag; instead
gate each assertion on its respective probe separately: compute boolean
fdUsedSupported = ProcessProbe.getOpenFileDescriptorCount() >= 0 and boolean
fdMaxSupported = ProcessProbe.getMaxFileDescriptorCount() >= 0, then assert
"jvm.fd.used" and "jvm.file_descriptor.count" against fdUsedSupported (and
fdUsedSupported && emitOTelMetrics for the OTel-name check), and assert
"jvm.fd.max" and "jvm.file_descriptor.limit" against fdMaxSupported (and
fdMaxSupported && emitOTelMetrics for the OTel-name check) so each metric is
validated independently of the other.
| @ClassRule | ||
| // Custom test rule which manually orders test fixtures. A RuleChain won't work here due to the specific order we require, which is: | ||
| // 1. Start the mock APM server | ||
| // 2. Start ES cluster | ||
| // 3. Run the test | ||
| // 4. Stop the mock APM server | ||
| // 5. Stop ES cluster | ||
| public static TestRule rule = (base, description) -> { | ||
| return new Statement() { | ||
| @Override | ||
| public void evaluate() throws Throwable { | ||
| recordingApmServer.before(); | ||
| Statement s = new Statement() { | ||
| @Override | ||
| public void evaluate() throws Throwable { | ||
| try { | ||
| base.evaluate(); | ||
| } finally { | ||
| recordingApmServer.after(); | ||
| } | ||
| } | ||
| }; | ||
| cluster.apply(s, description).evaluate(); | ||
| } | ||
| }; | ||
| }; |
There was a problem hiding this comment.
Add a fallback after() around the whole rule body.
If cluster.apply(...).evaluate() fails during cluster startup, the inner finally never runs because base.evaluate() is never entered. That leaves RecordingApmServer's server/thread alive for the rest of the suite. Since RecordingApmServer.after() is already idempotent, add an outer try/finally as a failure-path cleanup guard and keep the current inner finally to preserve the normal shutdown order.
Suggested fix
public static TestRule rule = (base, description) -> {
return new Statement() {
`@Override`
public void evaluate() throws Throwable {
- recordingApmServer.before();
- Statement s = new Statement() {
- `@Override`
- public void evaluate() throws Throwable {
- try {
- base.evaluate();
- } finally {
- recordingApmServer.after();
- }
- }
- };
- cluster.apply(s, description).evaluate();
+ try {
+ recordingApmServer.before();
+ Statement s = new Statement() {
+ `@Override`
+ public void evaluate() throws Throwable {
+ try {
+ base.evaluate();
+ } finally {
+ recordingApmServer.after();
+ }
+ }
+ };
+ cluster.apply(s, description).evaluate();
+ } finally {
+ recordingApmServer.after();
+ }
}
};
};📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| @ClassRule | |
| // Custom test rule which manually orders test fixtures. A RuleChain won't work here due to the specific order we require, which is: | |
| // 1. Start the mock APM server | |
| // 2. Start ES cluster | |
| // 3. Run the test | |
| // 4. Stop the mock APM server | |
| // 5. Stop ES cluster | |
| public static TestRule rule = (base, description) -> { | |
| return new Statement() { | |
| @Override | |
| public void evaluate() throws Throwable { | |
| recordingApmServer.before(); | |
| Statement s = new Statement() { | |
| @Override | |
| public void evaluate() throws Throwable { | |
| try { | |
| base.evaluate(); | |
| } finally { | |
| recordingApmServer.after(); | |
| } | |
| } | |
| }; | |
| cluster.apply(s, description).evaluate(); | |
| } | |
| }; | |
| }; | |
| `@ClassRule` | |
| // Custom test rule which manually orders test fixtures. A RuleChain won't work here due to the specific order we require, which is: | |
| // 1. Start the mock APM server | |
| // 2. Start ES cluster | |
| // 3. Run the test | |
| // 4. Stop the mock APM server | |
| // 5. Stop ES cluster | |
| public static TestRule rule = (base, description) -> { | |
| return new Statement() { | |
| `@Override` | |
| public void evaluate() throws Throwable { | |
| try { | |
| recordingApmServer.before(); | |
| Statement s = new Statement() { | |
| `@Override` | |
| public void evaluate() throws Throwable { | |
| try { | |
| base.evaluate(); | |
| } finally { | |
| recordingApmServer.after(); | |
| } | |
| } | |
| }; | |
| cluster.apply(s, description).evaluate(); | |
| } finally { | |
| recordingApmServer.after(); | |
| } | |
| } | |
| }; | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@test/external-modules/apm-integration/src/javaRestTest/java/org/elasticsearch/test/apmintegration/MetricsApmIT.java`
around lines 78 - 103, The custom TestRule 'rule' must ensure
recordingApmServer.after() runs even if cluster.apply(...).evaluate() throws;
wrap the call to cluster.apply(s, description).evaluate() in an outer
try/finally that calls recordingApmServer.after() in the finally block, while
keeping the existing inner try/finally around base.evaluate() to preserve normal
shutdown ordering; update the lambda that creates the Statement so the outer
finally is the failure-path cleanup guard for recordingApmServer.
| // 3. Run the test | ||
| // 4. Stop the mock APM server | ||
| // 5. Stop ES cluster | ||
| public static TestRule rule = (base, description) -> { |
There was a problem hiding this comment.
Should we refactor this into a TestRule we can reuse in both places?
|
I'll hold off on this so it goes after #143517 if it's still necessary |
| if (Constants.WINDOWS) { | ||
| // JVM file descriptor metrics are not emitted on Windows. | ||
| valueAssertions.remove("jvm.fd.used"); | ||
| valueAssertions.remove("jvm.fd.max"); |
Gate SystemMetricsTests FD gauge expectations on ProcessProbe support and fix assertEquals expected/actual order for OTel FD metrics. Skip jvm.fd.used / jvm.fd.max assertions on Windows in AbstractMetricsIT. Remove stale MetricsApmIT mute entries (superseded by ApmAgentMetricsIT and OtelMetricsIT). Closes elastic#144166 Closes elastic#144167 Closes elastic#144282 Made-with: Cursor
prdoyle
left a comment
There was a problem hiding this comment.
One question, but I'll just approve because this is a pretty low-consequence PR since it's only fixing tests.
| ) | ||
| ); | ||
|
|
||
| if (Constants.WINDOWS) { |
There was a problem hiding this comment.
Why does this check for Windows, while the other one checks ProcessProbe? I would have expected both of them to check the same thing.
…rics
* upstream/main: (21 commits)
Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:external-basic.topSnippetsFunction} elastic#145353
Mute org.elasticsearch.xpack.esql.qa.mixed.MixedClusterEsqlSpecIT test {csv-spec:external-basic.scoreFunction} elastic#145352
[DiskBBQ] Fix bug in NeighborQueue#popRawAndAddRaw (elastic#145324)
Fix dense_vector default index options when using BFLOAT16 (elastic#145202)
Use checked exceptions in entitlement constructor rules (elastic#145234)
ESQL: DS: datasource file plugins should not return TEXT types (elastic#145334)
Plumb DLM error store through to DlmFrozenTransition classes (elastic#145243)
Make Settings.Builder.remove() fluent (elastic#145294)
Add FLS tests for METRICS_INFO and TS_INFO (elastic#145211)
Fix flaky SecurityFeatureResetTests (elastic#145063)
[DOCS] Fix conflict markers in ESQL processing command list (elastic#145338)
Skip certain metric assertions on Windows (elastic#144933)
[ES|QL] Add schema reconciliation for multi-file external sources (elastic#145220)
Simplify DiskBBQ dynamic visit ratio to linear (elastic#142784)
ESQL: Disallow unmapped_fields=load with partial non-KEYWORD (elastic#144109)
[Transform] Track Linked Projects (elastic#144399)
Fix bulk scoring to process last batch instead of falling through to scalar tail (elastic#145316)
Clean up TickerScheduleEngineTests (elastic#145303)
[CI] ShardBulkInferenceActionFilterIT testRestart - Ensuring that secrets-inference index is available after full restart and unmuting test (elastic#145317)
Add CRUD doc to the DistributedArchitectureGuide (elastic#144710)
...
Make sure we don't check for metrics that are not enabled on Windows.
Closes: #144166
Closes: #144167
Closes: #144282
Closes: #144976
Closes: #145152