
Synthetic id upgrade test in serverless #142471

Merged
burqen merged 23 commits into elastic:main from
burqen:ap/2026.02.13.synthetic-id-rolling-upgrade-in-serverless
Mar 6, 2026

@burqen
Contributor

@burqen burqen commented Feb 13, 2026

Move TSDBSyntheticIdUpgradeIT to x-pack/plugin/logsdb/qa/rolling-upgrade. This module is shadowed in serverless, which means the test will also run in the serverless environment.

The new parent class does not take care of upgrading the nodes, so this is added to TSDBSyntheticIdUpgradeIT; otherwise the test logic is unchanged.

Relates ES-14000
Continuation of #141525

@burqen burqen added the >test, :StorageEngine/TSDB, and v9.4.0 labels Feb 13, 2026
@burqen burqen requested review from fcofdez, martijnvg and tlrx February 13, 2026 12:55
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

Contributor

Copilot AI left a comment

Pull request overview

Moves the TSDBSyntheticIdUpgradeIT rolling-upgrade coverage into the LogsDB rolling-upgrade QA module so it also runs in the serverless-shadowed variant, and adapts the test to perform node upgrades itself (since the new base class doesn’t parameterize upgrade phases).

Changes:

  • Relocated/adapted TSDBSyntheticIdUpgradeIT to extend AbstractLogsdbRollingUpgradeTestCase and run upgrades in-test via upgradeNode(i).
  • Added getClusterIndexVersion() helper to AbstractLogsdbRollingUpgradeTestCase to determine the (uniform) cluster index version before upgrades.
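
The in-test upgrade flow described above can be sketched roughly as follows. This is a minimal simulation, not the code from the PR: the class name, the phase labels, and the boolean array are hypothetical, and the assignment marked in the comment only stands in for a call like upgradeNode(i).

```java
import java.util.ArrayList;
import java.util.List;

final class RollingUpgradeSketch {

    /**
     * Upgrades nodes one at a time and records which phase the cluster is in
     * after each step. A real test would read and write indices at each phase.
     */
    static List<String> rollingUpgrade(int numNodes) {
        boolean[] upgraded = new boolean[numNodes];
        List<String> phases = new ArrayList<>();
        for (int i = 0; i < numNodes; i++) {
            upgraded[i] = true; // hypothetical stand-in for upgradeNode(i)
            int done = i + 1;
            phases.add(done == numNodes ? "fully-upgraded" : "mixed(" + done + "/" + numNodes + ")");
        }
        return phases;
    }

    public static void main(String[] args) {
        // Walk a three-node cluster through the mixed phases to the upgraded state.
        System.out.println(rollingUpgrade(3));
    }
}
```

Driving the loop from the test itself, rather than from a parameterized base class, means one test method sees every phase in order and can keep writing to indices created in earlier phases.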

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| x-pack/plugin/logsdb/qa/rolling-upgrade/src/javaRestTest/java/org/elasticsearch/xpack/logsdb/TSDBSyntheticIdUpgradeIT.java | Switches to LogsDB rolling-upgrade base class and rewrites test flow to perform rolling upgrades directly. |
| x-pack/plugin/logsdb/qa/rolling-upgrade/src/javaRestTest/java/org/elasticsearch/xpack/logsdb/AbstractLogsdbRollingUpgradeTestCase.java | Adds a helper to read/assert a uniform cluster IndexVersion from _nodes for pre-upgrade branching logic. |

Comments suppressed due to low confidence (2)

x-pack/plugin/logsdb/qa/rolling-upgrade/src/javaRestTest/java/org/elasticsearch/xpack/logsdb/TSDBSyntheticIdUpgradeIT.java:51

  • Comment grammar: "Cluster support" should be "Cluster supports" (or rephrase) to read correctly.

x-pack/plugin/logsdb/qa/rolling-upgrade/src/javaRestTest/java/org/elasticsearch/xpack/logsdb/TSDBSyntheticIdUpgradeIT.java:35

  • Avoid referencing the inherited static field cluster directly; prefer getCluster() (or qualify with AbstractLogsdbRollingUpgradeTestCase.cluster) so the access is explicit and future overrides/test harness changes are easier.


  public void testRollingUpgrade() throws IOException {
-     IndexVersion oldClusterIndexVersion = getOldClusterIndexVersion();
+     IndexVersion oldClusterIndexVersion = getClusterIndexVersion();
Member

Is there a test node feature for synthetic id? Some other bwc integration tests do:

assumeTrue("...", oldClusterHasFeature(MY_CONSTANT));

If there isn't a node feature, then I think you can use something like: gte_v9.4.0? But maybe it is better to use a node feature, since that makes it easier to skip integration tests in other contexts.

Contributor Author

I've added a test node feature in this commit ba04200

Contributor Author

Adding the node feature and relying on it for this branch makes the test fail when run against main (9.4.0-snapshot), because main doesn't have the node feature but does support synthetic id, so the test expects an exception but the cluster happily creates the index.

I will add the node feature in a different branch and then update this test once the node feature has been merged, to avoid this in-between state.

Member

👍, so when that is merged, this test can just do:

assumeTrue("old cluster needs to support synthetic id", oldClusterHasFeature(IndexFeatures.TIME_SERIES_SYNTHETIC_ID));

and then there should be no need to check the index version or to keep the getClusterIndexVersion() helper method.

Contributor Author

The most important part of this test is to verify the behavior when upgrading from a version without the feature to a version with it, and to make sure we fail in the expected way (index creation fails with a reasonable exception message). So assumeTrue would be counterproductive 😌

The only other thing the version is used for is the exception message verification, but maybe that's not a good enough reason to introduce getClusterIndexVersion? I'll see what I can do to remove it 👍
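
To illustrate the point about branching instead of skipping, here is a minimal, self-contained sketch. It is not the test from this PR: the feature id string, the class, and both methods are hypothetical stand-ins, and a plain Set models the features the old cluster advertises. The idea is that a cluster without the feature is still exercised, and the test asserts on the rejection rather than bailing out via assumeTrue.

```java
import java.util.Set;

final class FeatureBranchSketch {

    // Hypothetical feature id; the real constant discussed here is IndexFeatures.TIME_SERIES_SYNTHETIC_ID.
    static final String SYNTHETIC_ID_FEATURE = "hypothetical.time_series_synthetic_id";

    /** Simulates creating a synthetic id index on a cluster advertising the given features. */
    static void createSyntheticIdIndex(Set<String> clusterFeatures) {
        if (!clusterFeatures.contains(SYNTHETIC_ID_FEATURE)) {
            throw new IllegalArgumentException("synthetic id setting is not supported by this cluster");
        }
    }

    /** Runs the old-cluster step and reports which branch was verified. */
    static String runOldClusterStep(Set<String> oldClusterFeatures) {
        if (oldClusterFeatures.contains(SYNTHETIC_ID_FEATURE)) {
            createSyntheticIdIndex(oldClusterFeatures); // must succeed
            return "created";
        }
        try {
            createSyntheticIdIndex(oldClusterFeatures);
        } catch (IllegalArgumentException e) {
            return "rejected"; // the expected failure mode on an old cluster
        }
        throw new AssertionError("expected to throw but nothing was thrown");
    }

    public static void main(String[] args) {
        System.out.println(runOldClusterStep(Set.of(SYNTHETIC_ID_FEATURE)));
        System.out.println(runOldClusterStep(Set.of()));
    }
}
```

With assumeTrue, the second call would be skipped entirely, which is exactly the upgrade path this test is meant to cover.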

burqen and others added 5 commits February 16, 2026 10:35
Add IndexFeatures.TIME_SERIES_SYNTHETIC_ID and use it in
TSDBSyntheticIdUpgradeIT instead of checking index version, so BWC
tests can use oldClusterHasFeature() like other rolling upgrade tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
…exVersion

- Branch test on oldClusterHasFeature(IndexFeatures.TIME_SERIES_SYNTHETIC_ID)
- Relax assertNoWriteIndex: verify illegal_argument_exception and setting
  rejection message without depending on index version text
- Remove getClusterIndexVersion and getOldClusterVersion from base class

Co-authored-by: Cursor <cursoragent@cursor.com>
Member

@martijnvg martijnvg left a comment

LGTM, thanks @burqen!

…tatus

randomLongBetween(0, 1000) can yield 0, which triggers the assertion
in IndexShardSnapshotStatus.moveToDone that startTimeMillis != 0.
Use randomLongBetween(1, 1000) so the test satisfies the invariant.

Co-authored-by: Cursor <cursoragent@cursor.com>
Contributor

@fcofdez fcofdez left a comment

LGTM. I left a minor comment.

  switch (stage) {
      case DONE -> {
-         final long startTimeMillis = randomLongBetween(0, 1000);
+         final long startTimeMillis = randomLongBetween(1, 1000);
Contributor

is this change intended?

Contributor Author

Yes, this fixes a test failure. From the commit message (beautifully composed by Cursor):

randomLongBetween(0, 1000) can yield 0, which triggers the assertion
in IndexShardSnapshotStatus.moveToDone that startTimeMillis != 0.
Use randomLongBetween(1, 1000) so the test satisfies the invariant.
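
A small sketch of why the lower bound matters, assuming the random helper is inclusive on both ends (as the commit message implies). Everything here is a simplified stand-in: this randomLongBetween takes an explicit Random, and moveToDone only models the non-zero start time check, not the real IndexShardSnapshotStatus.

```java
import java.util.Random;

final class StartTimeInvariantSketch {

    /** Hypothetical helper: returns a value in [min, max], inclusive on both ends. */
    static long randomLongBetween(Random random, long min, long max) {
        return min + random.nextInt((int) (max - min + 1));
    }

    /** Simplified stand-in for the invariant: a start time of 0 is rejected. */
    static void moveToDone(long startTimeMillis) {
        if (startTimeMillis == 0) {
            throw new IllegalStateException("startTimeMillis must not be 0");
        }
    }

    /** Reports whether any of the draws from [min, max] produced 0. */
    static boolean everYieldsZero(long min, long max, int draws, long seed) {
        Random random = new Random(seed);
        for (int i = 0; i < draws; i++) {
            if (randomLongBetween(random, min, max) == 0) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // With a lower bound of 1, zero is outside the range, so this prints false.
        System.out.println(everYieldsZero(1, 1000, 100_000, 42L));
        // With a lower bound of 0, zero is a legal draw and would trip the check.
        try {
            moveToDone(randomLongBetween(new Random(), 0, 0));
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With bounds of 0 and 1000 the failure would only show up in roughly one of every 1001 runs, which is why it surfaces as an occasional test failure rather than a consistent one.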

Contributor

@fcofdez fcofdez Feb 18, 2026

Maybe we can move this into its own commit/PR?

assertIndexRead("second-mixed-cluster-index");
int numNodes = getCluster().getNumNodes();

if (oldClusterHasFeature(IndexFeatures.TIME_SERIES_SYNTHETIC_ID)) {
Contributor

The feature only guards the test execution?

Contributor Author

Yes, it's a substitute for checking the index version directly, as other tsdb tests do, and was suggested by @martijnvg. The node feature was introduced in #142581 and only exists in 9.4 and later, which makes it work the same way as checking the index version directly. Maybe Martijn can elaborate on why this is preferred?

Member

I think index versions can also be used, but cluster features are a more straightforward way to determine whether a cluster supports a specific feature? No need to introduce logic to extract index versions, and the cluster feature testing infrastructure (e.g. oldClusterHasFeature(...)) can just be reused.

Contributor Author

Thanks for clarifying Martijn!

@burqen
Contributor Author

burqen commented Feb 20, 2026

Waiting for #142581 to be promoted to production before merging this.

@fcofdez
Contributor

fcofdez commented Feb 24, 2026

can't we merge this @burqen?

@burqen
Contributor Author

burqen commented Feb 24, 2026

can't we merge this @burqen?

It looks like the node feature has finally been promoted to the whole serverless pipeline, so we are close! The synthetic id index feature (#142581) needs to be promoted all the way to prod first. Otherwise the test will fail with "expected to throw but nothing was thrown" when testing against serverless prod.

I triggered a new build to check, but I don't think it has been promoted yet.

EDIT: Still waiting for promotion

Contributor

@fcofdez fcofdez left a comment

LGTM

@burqen burqen merged commit cc5852b into elastic:main Mar 6, 2026
36 checks passed
spinscale pushed a commit to spinscale/elasticsearch that referenced this pull request Mar 6, 2026
Move TSDBSyntheticIdUpgradeIT to x-pack/plugin/logsdb/qa/rolling-upgrade
This module is shadowed in serverless which means the test will also test rolling
upgrade in serverless.

* Use TIME_SERIES_SYNTHETIC_ID node feature in upgrade test, remove IndexVersion
* Avoid analyse disk size in serverless, because that api is not available
* Add logging to track failure
* Add writing to all existing indices throughout the rolling upgrade

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
szybia added a commit to szybia/elasticsearch that referenced this pull request Mar 6, 2026
…locations

* upstream/main: (153 commits)
  ES|QL: Update docs for TOP_SNIPPETS and DECAY (elastic#143739)
  Correctly include endpoint id in log msg in AuthorizationPoller (elastic#143743)
  Bar searching or sorting on _seq_no when disabled (elastic#143600)
  Generalize `testClientCancellation` test (elastic#143586)
  JSON_EXTRACT: zero-copy byte slicing for object, array, and number extraction (elastic#143702)
  Track recycler pages in circuit breaker (elastic#143738)
  [ESQL] Enable distributed pipeline breakers for external sources via FragmentExec (elastic#143696)
  Adding 'mode' and 'codec' fields to ES monitoring template (elastic#143673)
  [ESQL] Columnar I/O and vectorized block conversion for external sources (elastic#143703)
  Fix flaky MMR diversification YAML tests (elastic#143706)
  ES|QL codegen: check builder arguments for vector support (elastic#143724)
  Add Views Security Model (elastic#141050)
  ESQL: Prevent pushdown of unmapped fields in filters and sorts (elastic#143460)
  Don't run seq_no pruning tests in release CI (elastic#143725)
  ESQL: Support intra-row field references in ROW command (elastic#140217)
  ES|QL: Remove implicit limit in FORK branches in CSV tests (elastic#143601)
  IndexRoutingTests with and without synthetic id (elastic#143566)
  Synthetic id upgrade test in serverless (elastic#142471)
  Disable "Review skipped" comments for PRs without specified labels (elastic#143728)
  Cleanup ES|QL T-Digest code duplication, add memory accounting (elastic#143662)
  ...
sidosera pushed a commit to sidosera/elasticsearch that referenced this pull request Mar 6, 2026
Labels

:StorageEngine/TSDB, Team:StorageEngine, >test, v9.4.0

5 participants