Convert full cluster restart tests to new rest testing framework #93062
Conversation
Pinging @elastic/es-delivery (Team:Delivery)
I'm currently blocked on an issue where the tests assert that the request will result in a warning about trying to access a system index, but no such warning is returned. My knowledge of system indices is basically nil, but as I understand it there was no such thing prior to 7.10, so when upgrading from those versions we also upgrade those old indices to "system" indices. As far as I can tell this is done asynchronously after startup, so my hunch is that it hasn't completed before we go to run our next test, causing it to fail. I believe the reason this is only happening now is that we're simply running our tests much sooner after an upgrade, because there is no longer the overhead of starting a new test task, forking a JVM, etc. after the upgrade completes. I've "confirmed" this by adding a sleep before the failing test, which causes the failures to stop. So what's the best way forward here? Should we add a small delay before these tests? Is there a way to ask ES whether the system indices upgrade has completed? Should I just retry the assertion some number of times?
@droberts195 I'm hitting this error occasionally. I'm thinking this is similar to the above, where on "upgrade" there are some processes that happen asynchronously, and one side effect of the refactoring of how the upgrade tests work is that they now run much sooner after the upgrade, as there's less overhead between cluster orchestration and test execution. Does this sound right? If so, would wrapping those hidden index assertions in a retry be the right fix?
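The fix discussed below was to retry the assertion for a few seconds. A minimal sketch of that approach, assuming the test extends an ESTestCase-derived class so `assertBusy` is available; `assertMlIndicesAreHidden()` is a placeholder for the existing hidden-index assertions and the timeout is an illustrative choice:

```java
import java.util.concurrent.TimeUnit;

// Sketch only: assertMlIndicesAreHidden() stands in for the test's existing
// hidden-index assertions; 30 seconds is an illustrative timeout.
public void testMlIndicesAreHiddenAfterUpgrade() throws Exception {
    // The hidden-index migration runs asynchronously after the upgraded
    // cluster starts, so retry the assertion instead of asserting once.
    assertBusy(this::assertMlIndicesAreHidden, 30, TimeUnit.SECONDS);
}
```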
# Conflicts:
#   test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/AbstractLocalSpecBuilder.java
#   test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/DefaultLocalClusterSpecBuilder.java
#   test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/LocalClusterFactory.java
#   test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/LocalClusterSpec.java
#   test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/LocalSpecBuilder.java
When a cluster is upgraded from a version that didn't have hidden indices/aliases to one that does, we make the relevant ML indices and aliases hidden. However, the steps to do this are split over 5 actions that run sequentially, so the work might take a second or two after the upgraded cluster starts up. This can cause the test that asserts on the final state to fail if it runs very quickly after the upgraded cluster startup. The solution is to retry the test for a few seconds to give a chance for all the upgrade actions to complete. Relates elastic#93062
Yes, definitely. There are 5 steps that happen in sequence, and it's failing on the 3rd one in that build scan you linked: elasticsearch/x-pack/plugin/ml/src/main/java/org/elasticsearch/xpack/ml/MlInitializationService.java (line 203 in cbd615f)
Yes, I have opened #93148 to do this.
The issue with this test failure is actually that we were silently failing to install the plugin under test into the cluster. The root cause here was that the FIPS security policy file was not copied into the cluster config directory before we attempted to run the plugin installer. Since we pass the FIPS JVM arguments to all CLI tools as well, this caused plugin installation to fail. We now ensure that these files are copied before we attempt to run _any_ ES tools. Closes elastic#93303
It's pretty common to run a block of code in a `try ... catch` block that just passes exceptions off to a listener's `onFailure` method. This commit adds a small utility to encapsulate this, enabling some one-liners.
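The exact helper isn't shown in this thread, so the following is only a rough sketch of the pattern being encapsulated; the `runWithListener` name and the cut-down `ActionListener` and `CheckedConsumer` interfaces are illustrative stand-ins, not the real Elasticsearch API:

```java
// Minimal stand-ins for illustration; Elasticsearch's real ActionListener
// interface is richer than this.
interface ActionListener<T> {
    void onResponse(T response);
    void onFailure(Exception e);
}

interface CheckedConsumer<T, E extends Exception> {
    void accept(T t) throws E;
}

final class ListenerUtils {
    private ListenerUtils() {}

    // Run the action, routing any exception to the listener's onFailure
    // instead of letting it propagate. This turns the common
    //   try { ... } catch (Exception e) { listener.onFailure(e); }
    // pattern into a one-liner at the call site, e.g.
    //   ListenerUtils.runWithListener(listener, l -> l.onResponse(compute()));
    static <T> void runWithListener(ActionListener<T> listener,
                                    CheckedConsumer<ActionListener<T>, Exception> action) {
        try {
            action.accept(listener);
        } catch (Exception e) {
            listener.onFailure(e);
        }
    }
}
```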
For skipping replication (see the sketch after this list):
* ReplicationTracker and Group filter shards that are promotable to primary
* Remove unpromotable shards from in-sync allocations in metadata
* There is a new Refresh action for unpromotable replica shards

Fixes ES-4861

For skipping peer recovery:
* Unpromotable shards pass directly to STARTED, skipping some intermediate peer recovery stages and messages

Fixes ES-5257
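As a rough illustration of the first bullet, filtering a replication group down to the shard copies that can be promoted to primary might look like the sketch below; `ShardCopy` and `promotableToPrimary` are hypothetical stand-ins, not the actual Elasticsearch types:

```java
import java.util.List;

// Illustrative stand-in for a shard copy tracked by a replication group.
record ShardCopy(String allocationId, boolean promotableToPrimary) {}

final class ReplicationGroupSketch {
    private ReplicationGroupSketch() {}

    // Only shard copies that can be promoted to primary receive replicated
    // operations; unpromotable copies are skipped and instead brought up to
    // date via a dedicated refresh action.
    static List<ShardCopy> replicationTargets(List<ShardCopy> copies) {
        return copies.stream().filter(ShardCopy::promotableToPrimary).toList();
    }
}
```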
* Add a section about token-based authentication. It took me a considerable time to figure out the syntax for token-based authentication, so I thought: why not add it to the documentation?
* Update x-pack/docs/en/watcher/input/http.asciidoc
* Update x-pack/docs/en/watcher/input/http.asciidoc

---------

Co-authored-by: Abdon Pijpelink <[email protected]>
This PR adds support for building roles for remote_access authentication instances, under the new remote cluster security model. This change is stand-alone and not wired up to active code flows yet. A proof of concept in elastic#92089 highlights how the model change in this PR fits into the broader context of the fulfilling cluster processing cross cluster requests.
Change ilm and data streams dependencies to be test dependencies.
… (elastic#93383)

In elastic#93160, we never set the forced_refresh flag in the response. With this change, the bulk response now correctly reflects what happened. It also unblocks a bunch of YAML tests for Stateless. Relates ES-5292
Cache potentially duplicated values in the `DateProcessor`, avoiding the creation of short-lived, disposable objects across executions.
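A hedged sketch of the general idea, not the actual `DateProcessor` change: memoize values that tend to repeat across documents, for example parsed time zones, so each execution doesn't rebuild the same short-lived objects. The cache shape below is an assumption:

```java
import java.time.ZoneId;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative memoization of ZoneId lookups: a pipeline often resolves the
// same timezone string for every document, so cache the parsed value instead
// of re-creating it on each execution.
final class ZoneIdCache {
    private final Map<String, ZoneId> cache = new ConcurrentHashMap<>();

    ZoneId resolve(String zone) {
        return cache.computeIfAbsent(zone, ZoneId::of);
    }
}
```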
This change introduces the configuration option `ignore_missing_component_templates` as discussed in elastic#92426. The implementation [option 6](elastic#92426 (comment)) was picked with a slight adjustment, meaning no patterns are allowed.

## Implementation

During the creation of an index template, the list of component templates is checked to ensure that all component templates exist. This check is extended to skip any component templates which are listed under `ignore_missing_component_templates`. An index template that skips the check for the component template `logs-foo@custom` looks as follows:

```
PUT _index_template/logs-foo
{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}
```

The component template `logs-foo@package` has to exist before creation. It can be created with:

```
PUT _component_template/logs-foo@custom
{
  "template": {
    "mappings": {
      "properties": {
        "host.ip": {
          "type": "ip"
        }
      }
    }
  }
}
```

## Testing

For manual testing, different scenarios can be tested. To simplify testing, the commands from the `.http` file are added. Before each test run, a clean cluster is expected.

### New behaviour, missing component template

With the new config option, it must be possible to create an index template with a missing component template without getting an error:

```
### Add logs-foo@package component template
PUT http://localhost:9200/_component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.name": {
          "type": "keyword"
        }
      }
    }
  }
}

### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}

### Create data stream
PUT http://localhost:9200/_data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json

### Check if mappings exist
GET http://localhost:9200/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```

It is checked that all templates could be created and that the data stream mappings are correct.

### Old behaviour, with all component templates

In the following, a component template is made optional but it already exists. It is checked that it will show up in the mappings:

```
### Add logs-foo@package component template
PUT http://localhost:9200/_component_template/logs-foo@package
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.name": {
          "type": "keyword"
        }
      }
    }
  }
}

### Add logs-foo@custom component template
PUT http://localhost:9200/_component_template/logs-foo@custom
Authorization: Basic elastic password
Content-Type: application/json

{
  "template": {
    "mappings": {
      "properties": {
        "host.ip": {
          "type": "ip"
        }
      }
    }
  }
}

### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}

### Create data stream
PUT http://localhost:9200/_data_stream/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json

### Check if mappings exist
GET http://localhost:9200/logs-foo-bar
Authorization: Basic elastic password
Content-Type: application/json
```

### Check old behaviour

Ensure that the old behaviour still exists when a component template is used that is not part of `ignore_missing_component_templates`:

```
### Add logs-foo index template
PUT http://localhost:9200/_index_template/logs-foo
Authorization: Basic elastic password
Content-Type: application/json

{
  "index_patterns": ["logs-foo-*"],
  "data_stream": { },
  "composed_of": ["logs-foo@package", "logs-foo@custom"],
  "ignore_missing_component_templates": ["logs-foo@custom"],
  "priority": 500
}
```

Co-authored-by: Lee Hinman <[email protected]>
# Conflicts:
#   test/test-clusters/src/main/java/org/elasticsearch/test/cluster/local/LocalClusterFactory.java
💔 Backport failed
You can use sqren/backport to manually backport by running it locally.
Revert "Convert full cluster restart tests to new rest testing framework (elastic#93062)"

This reverts commit 013b2e5.
This PR converts all existing full cluster restart tests (except the ingest-geoip module) to use the new JUnit rule-based test clusters framework. The geoip module relies on an existing docker-compose test fixture, and we want to move away from those toward using TestContainers. That'll get done in a follow-up.
The main takeaway here is that the logic in build scripts is drastically reduced. That boilerplate is no longer needed since there's only a single test task per BWC version and the upgrade is done inside the test execution itself.
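For reference, a converted test in the new framework declares its cluster with a JUnit class rule instead of Gradle configuration. The sketch below shows the shape only; the builder calls and settings are assumptions rather than necessarily the exact API used in this PR:

```java
import org.elasticsearch.test.cluster.ElasticsearchCluster;
import org.junit.ClassRule;

// Sketch only: the builder methods and settings are illustrative of the
// rule-based test clusters framework referenced in the conflict list above.
public class ExampleFullClusterRestartIT {

    @ClassRule
    public static final ElasticsearchCluster cluster = ElasticsearchCluster.local()
        .nodes(2)
        .setting("xpack.security.enabled", "false")
        .build();

    // The old-version cluster is started, tests run, the cluster is upgraded
    // in place, and the same test task then runs its assertions against the
    // upgraded cluster; there is no separate Gradle test task per step.
}
```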