Skip to content

Rolling upgrade test for synthetic id#141525

Merged
burqen merged 11 commits intoelastic:mainfrom
burqen:ap/2026.01.29.synthetic-id-rolling-upgrade-test
Feb 3, 2026
Merged

Rolling upgrade test for synthetic id#141525
burqen merged 11 commits intoelastic:mainfrom
burqen:ap/2026.01.29.synthetic-id-rolling-upgrade-test

Conversation

@burqen
Copy link
Copy Markdown
Contributor

@burqen burqen commented Jan 29, 2026

Relates ES-14000

We expect clusters on an index version that doesn't have support for synthetic id to throw an exception when trying to create such index. If cluster has support, it should be possible to create the index.

Because synthetic id was partially introduced already in 9.3 we need to introduce a new IndexVersion and introduce a new Setting (change name of existing setting).

By introducing a new Setting we effectively isolate the 9.3 version of the synthetic id feature to the 9.3 branch and re-introduce it here in 9.4. We need to do this to not activate the previous version of the feature in 9.3 in a mixed cluster setting. For now we use a temporary name for this setting with the intent to rename it to the final name when feature is ready for production. The setting is given its final name in this PR, index.mapping.synthetic_id.

The new IndexVersion will prevent the renamed setting from being used until the whole cluster has support for it.

Depending on cluster indexVersion we expect index with
index.mapping.use_synthetic_id=true to either be refused by throwing
exception on creation or be accepted and be available for indexing
documents.
Previous IndexVersion used by synthetic id
was introduced in 9.3 and in order for upgrade
tests to work as we expect we need a new
index version introduced in 9.4.

Also remove assertion in favour for real exception
for more details about failure in tests.
public static final IndexVersion STORE_PATTERN_TEXT_FIELDS_IN_BINARY_DOC_VALUES = def(9_068_0_00, Version.LUCENE_10_3_2);
public static final IndexVersion DISK_BBQ_QUANTIZE_BITS = def(9_069_0_00, Version.LUCENE_10_3_2);
public static final IndexVersion ID_FIELD_USE_ES812_POSTINGS_FORMAT = def(9_070_0_00, Version.LUCENE_10_3_2);
public static final IndexVersion TIME_SERIES_USE_SYNTHETIC_ID_94 = def(9_069_0_00, Version.LUCENE_10_3_2);
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming, one of the hard problems in CS. More than open to suggestions!

@burqen
Copy link
Copy Markdown
Contributor Author

burqen commented Jan 29, 2026

Test still fails on 9.3 because I haven't found a way to turn the feature flag off for 9.3 nodes. To be continued...

public static final boolean TSDB_SYNTHETIC_ID_FEATURE_FLAG = new FeatureFlag("tsdb_synthetic_id").isEnabled();
public static final Setting<Boolean> USE_SYNTHETIC_ID = Setting.boolSetting(
"index.mapping.use_synthetic_id",
"index.mapping.use_synthetic_id_placeholder", // todo: please think long and hard about this name
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just picked a random name for now to be able to move forward. My suggestion is that we use this (or some other) temporary name for now and then when we flip/remove the feature flag we change the real setting name. In this way we get a clean atomic commit when the feature is activated and we avoid the previous trouble of setting being used across versions.

This setting also exist in 9.3 under the previous name. By renaming the
setting we effectively remove the 9.3 feature and reintroduce it under
a different name. We need to do this to not activate the incomplete
feature in 9.3 in a mixed cluster setting.

The new name is a placeholder for now and should be updated when feature
 is ready for production.
@burqen burqen force-pushed the ap/2026.01.29.synthetic-id-rolling-upgrade-test branch from b90143c to e3e000c Compare January 30, 2026 10:51
public static final boolean TSDB_SYNTHETIC_ID_FEATURE_FLAG = new FeatureFlag("tsdb_synthetic_id").isEnabled();
public static final Setting<Boolean> USE_SYNTHETIC_ID = Setting.boolSetting(
"index.mapping.use_synthetic_id",
"index.mapping.use_synthetic_id_placeholder", // todo: Placeholder name, update when feature flag is removed
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just picked a random name for now to be able to move forward. My suggestion is that we use this (or some other) temporary name for now and then when we flip/remove the feature flag we change the real setting name. In this way we get a clean atomic commit when the feature is activated and we avoid the previous trouble of setting being used across versions.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use_synthetic_id_doc_values?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little too internal I think. Our users don't care about the doc values details. How about one of these:

  • enable_synthetic_id
  • synthetic_id (especially since they will be used as "synthetic_id": "true")
  • with_synthetic_id (I don't like this one so much tbh)

In any case, do you agree with the approach to keep the placeholder name for now and then make the switch when we are happy and done with the feature to make it available in production?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming is hard :D, maybe we can pick synthetic_id? I would prefer to avoid changing the setting name more than once

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I've pushed a new commit with the name change

@burqen burqen marked this pull request as ready for review January 30, 2026 10:52
@elasticsearchmachine elasticsearchmachine added the needs:triage Requires assignment of a team area label label Jan 30, 2026
@burqen burqen requested review from fcofdez and tlrx January 30, 2026 10:54
@burqen burqen added >enhancement :StorageEngine/TSDB You know, for Metrics labels Jan 30, 2026
@elasticsearchmachine elasticsearchmachine added Team:StorageEngine and removed needs:triage Requires assignment of a team area label labels Jan 30, 2026
@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Copy link
Copy Markdown
Collaborator

Hi @burqen, I've created a changelog YAML for you.

Copy link
Copy Markdown
Contributor

@fcofdez fcofdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! I left a few minor comments

private static void assertIndexCanBeCreated(String indexName) throws IOException {
CreateIndexResponse response = null;
try {
response = createSyntheticIdIndex(indexName);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we assert that the index settings include the synthetic id setting?

}

public void testRollingUpgrade() throws IOException {
IndexVersion oldClusterIndexVersion = getOldClusterIndexVersion();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we need to guard this test with assumeTrue("Test should only run with feature flag", IndexSettings.TSDB_SYNTHETIC_ID_FEATURE_FLAG);?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The feature flag(s) that control if the setting can be used or not is in the nodes different processes and might not align with what we see in the test-process. So I don't think we should check it here.

}

if (isUpgradedCluster()) {
assertWriteIndex("upgraded-cluster-index");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we check that once we've upgraded the cluster the disk usage for the inverted index is 0?

import java.util.Locale;
import java.util.StringJoiner;

public class TSDBSyntheticIdUpgradeIT extends AbstractRollingUpgradeTestCase {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we test against the live commits in the serverless environments for test suites in this module. I think we should have that additional test coverage. Many logsdb en tsdb tests are now in x-pack/plugin/logsdb/qa/rolling-upgrade module. This test against stateless versions like the current module , but this module is also mirrored in serverless project.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! I would like to address this in a later PR because it will probably take quite a bit of time for me to dig into this (so much to learn), if you are ok with that? Do you have any further pointers to where I can read more about how we test against live commits in serverless?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is good with me. I think it is a matter of moving this test suite to the x-pack/plugin/logsdb/qa/rolling-upgrade module and adjusting how the tests upgrades nodes. See other test suites in that module.

When new node is master the failure message will be different. The
underlying reason for the failure is that indexVersion is too low. This
manifests as IndexMetadata and IndexSettings having different opinions
about use_synthetic_id.
@burqen
Copy link
Copy Markdown
Contributor Author

burqen commented Feb 2, 2026

Thanks for the feedback @fcofdez I've addressed your comments now, either through code fixes or by answering in the comment.

And also thank you @martijnvg, I commented on your suggestion to improve testing also for serverless.

@burqen burqen requested a review from a team as a code owner February 3, 2026 08:11
@burqen burqen requested a review from fcofdez February 3, 2026 08:12
Copy link
Copy Markdown
Contributor

@fcofdez fcofdez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for all the work to get this through!

@burqen burqen merged commit c1f336b into elastic:main Feb 3, 2026
35 checks passed
lukewhiting pushed a commit to lukewhiting/elasticsearch that referenced this pull request Feb 3, 2026
* Rolling upgrade test for synthetic id

Depending on cluster indexVersion we expect index with
index.mapping.use_synthetic_id=true to either be refused by throwing
exception on creation or be accepted and be available for indexing
documents.

* New IndexVersion for synthetic id

Previous IndexVersion used by synthetic id
was introduced in 9.3 and in order for upgrade
tests to work as we expect we need a new
index version introduced in 9.4.

* Rename `use_synthetic_id` to `synthetic_id`

This setting also exist in 9.3 under the previous name. By renaming the
setting we effectively remove the 9.3 feature and reintroduce it under
a different name. We need to do this to not activate the incomplete
feature in 9.3 in a mixed cluster setting.
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Feb 4, 2026
* Rolling upgrade test for synthetic id

Depending on cluster indexVersion we expect index with
index.mapping.use_synthetic_id=true to either be refused by throwing
exception on creation or be accepted and be available for indexing
documents.

* New IndexVersion for synthetic id

Previous IndexVersion used by synthetic id
was introduced in 9.3 and in order for upgrade
tests to work as we expect we need a new
index version introduced in 9.4.

* Rename `use_synthetic_id` to `synthetic_id`

This setting also exist in 9.3 under the previous name. By renaming the
setting we effectively remove the 9.3 feature and reintroduce it under
a different name. We need to do this to not activate the incomplete
feature in 9.3 in a mixed cluster setting.
@burqen
Copy link
Copy Markdown
Contributor Author

burqen commented Feb 13, 2026

Also test in serverless #142471

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants