Mandatory data tier preference

Data tiers allow indices to be allocated to dedicated tiers of nodes. Such nodes typically have different characteristics, either physically (storage type, RAM:storage ratio) or from a usage standpoint (for example, the hot tier is expected to respond fast).

Using data tiers is optional in that using the `data` role will assign all data tiers to the node. However, if a cluster is using separate data tiers it is desirable to be explicit about where a specific index belongs.
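As a rough illustration of the point above, the following sketch models which tiers a node can host given its roles. The function name `tiers_for_roles` is hypothetical, and the model is deliberately simplified: it only captures that the generic `data` role implies every tier, while tier-specific roles imply only their own tier.

```python
# Simplified model, not Elasticsearch code: maps node roles to data tiers.
ALL_TIERS = ("data_content", "data_hot", "data_warm", "data_cold", "data_frozen")


def tiers_for_roles(roles):
    """Return the set of data tiers a node with these roles can host."""
    if "data" in roles:
        # The generic `data` role assigns all data tiers to the node.
        return set(ALL_TIERS)
    # Otherwise only the explicitly listed tier roles count.
    return {role for role in roles if role in ALL_TIERS}
```

A node with `["master", "data"]` would host every tier under this model, whereas `["data_hot", "ingest"]` would host only `data_hot`.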

Today we allow `index.routing.allocation.include._tier_preference` to be unspecified for an index. This prevents Elasticsearch and its clients from relying on knowing which tier an index/shard is located on, which affects the following:

* Autoscaling does not know which data tier to scale up.
* The `_tier` query will not know the tier of an index/shard.
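To make the problem concrete, here is a minimal sketch of how a client might try to work out the tier of an index from its settings. It assumes the tier preference is a comma-separated, ordered list and that the first tier with available nodes wins. The helper names `resolve_tier` and `tierless_indices` are hypothetical, and the input shape (index name mapped to a `settings` object with flat keys) is an assumption modelled on a flat settings response.

```python
TIER_PREFERENCE = "index.routing.allocation.include._tier_preference"


def resolve_tier(preference, available_tiers):
    """Pick the first preferred tier that has nodes, or None if undecidable."""
    if not preference:
        # No preference: the caller cannot tell which tier the index belongs to.
        return None
    for tier in preference.split(","):
        tier = tier.strip()
        if tier in available_tiers:
            return tier
    return None


def tierless_indices(settings_by_index):
    """Return the sorted names of indices that have no tier preference set."""
    return sorted(
        name
        for name, body in settings_by_index.items()
        if not body.get("settings", {}).get(TIER_PREFERENCE)
    )
```

With a preference of `data_warm,data_hot` and only hot and content nodes available, `resolve_tier` returns `data_hot`. With no preference at all it returns `None`, which is exactly the case where autoscaling or a tier-based query has nothing to go on.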

Furthermore, making the tier preference mandatory allows us to rely on it for future developments, such as balancing of shards, UI, monitoring and more. There is no known good use case for a tier-less index; allowing it only adds complexity for ourselves and for users, and it can be considered bad data.

The proposal here is to work towards making `index.routing.allocation.include._tier_preference` mandatory for all indices, in the following steps:

- [x] Add a cluster setting to signal that creating new indices should always result in a tier preference. When set, creating an index should add the default tier preference if no explicit preference was given in the request. This will be default off in 7.x, default on in 8.
  - [x] And, in fact, only on allowed in 8. (edit: it's always **treated** as on, in that we disregard the value of the setting)
- [x] Add deprecation info and warnings in 7.x, only for clusters that have data nodes without all data roles.
  - [x] Add deprecation info for indices without a tier preference set.
  - [x] Add deprecation warning when creating an index results in no tier preference set. This should include create index, rollover and create data stream.
- [x] Make ILM migrate action mandatory in 8.0, regardless of allocate action.
- [x] On 7.x, change the [`migrate_to_data_tiers`](https://www.elastic.co/guide/en/elasticsearch/reference/7.14/ilm-migrate-to-data-tiers.html) API to apply the default data tier preference to any index that would otherwise end up without one, and to set the cluster setting from the first work item so that new indices are assigned a tier preference.
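The first work item can be sketched as follows. This is an illustration of the intended behaviour, not the actual implementation: the function name and the `enforce` flag (standing in for the cluster setting) are hypothetical, and the default values `data_content` for regular indices and `data_hot` for data stream backing indices are an assumption based on the documented defaults rather than something stated in this issue.

```python
TIER_PREFERENCE = "index.routing.allocation.include._tier_preference"


def apply_default_tier_preference(settings, is_data_stream=False, enforce=True):
    """Return a copy of the index settings with a tier preference ensured.

    `enforce` stands in for the cluster setting described above
    (default off in 7.x, always treated as on in 8).
    """
    result = dict(settings)
    if result.get(TIER_PREFERENCE):
        # An explicit preference in the request always wins.
        return result
    if not enforce:
        # 7.x default: leave the index tier-less (deprecation warning applies).
        return result
    # Assumed defaults: hot for data stream backing indices, content otherwise.
    result[TIER_PREFERENCE] = "data_hot" if is_data_stream else "data_content"
    return result
```

Under this sketch, the 7.x `migrate_to_data_tiers` change amounts to running the same defaulting over existing tier-less indices and then switching `enforce` on for the cluster.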

In a future release (possibly 9.0) we should close the loop and:

* Remove the flag from cluster settings.
* Enforce that the tier preference cannot be set to null (we could consider doing this in 8.0 too).
* Evaluate at what point we need/want to drop the [`migrate_to_data_tiers`](https://www.elastic.co/guide/en/elasticsearch/reference/7.14/ilm-migrate-to-data-tiers.html) API from the code (8.x? 9.x?).

Mandatory data tier preference #76147
