Description
Using data tiers allows allocating indices to dedicated tiers of nodes. Such nodes would typically have different characteristics, either physically (storage type, RAM:storage ratio) or from a usage standpoint (my hot tier is expected to respond fast).
Using data tiers is optional in that using the data role will assign all data tiers to the node. However, if a cluster is using separate data tiers it is desirable to be explicit about where a specific index belongs.
Today we allow index.routing.allocation.include._tier_preference to be unspecified for an index. This prevents Elasticsearch and its clients from relying on knowing which tier an index/shard is located on, which affects the following:
- Autoscaling does not know which data tier to scale up.
- The _tier query will not know the tier of an index/shard.
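To make the problem concrete, here is a minimal sketch of how a consumer such as autoscaling might resolve the tier of an index from its settings. The function name `resolve_tier` and the `tiers_with_nodes` parameter are hypothetical and not part of Elasticsearch; the sketch only assumes that the preference is a comma-separated list and that the first listed tier with available nodes wins.

```python
from typing import Optional

TIER_PREFERENCE = "index.routing.allocation.include._tier_preference"


def resolve_tier(index_settings: dict, tiers_with_nodes: set) -> Optional[str]:
    """Return the first preferred tier that has nodes, or None if it cannot be determined.

    Hypothetical helper for illustration only.
    """
    preference = index_settings.get(TIER_PREFERENCE)
    if not preference:
        # No preference set: the index may sit on any data node, so a caller
        # such as autoscaling cannot attribute it to a specific tier.
        return None
    for tier in preference.split(","):
        tier = tier.strip()
        if tier in tiers_with_nodes:
            return tier
    # A preference exists, but none of the listed tiers currently has nodes.
    return None
```

With a preference of `data_warm,data_hot` and only hot nodes available, this resolves to `data_hot`; with no preference at all it returns `None`, which is the case the proposal wants to eliminate.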
Furthermore, making the tier preference mandatory allows us to rely on it for future developments, such as balancing of shards, UI, monitoring and more. There is no known good use case for a tier-less index; allowing it only adds complexity for ourselves and users, and such indices can be considered bad data.
The proposal here is to work towards having index.routing.allocation.include._tier_preference be mandatory for all indices, in the following steps:
- Add a cluster setting to signal that creating new indices should always result in a tier preference. When set, creating an index should add the default tier preference if no explicit preference was given in the request. This will be default off in 7.x, default on in 8.
  - In fact, on is the only value allowed in 8. (edit: it's always treated as on, in that we disregard the value of the setting)
- Add deprecation info and warnings in 7.x, only for clusters that have data nodes without all data roles:
  - Add deprecation info for indices without a tier preference set.
  - Add a deprecation warning when creating an index results in no tier preference being set. This should cover create index, rollover and create data stream.
- Make ILM migrate action mandatory in 8.0, regardless of allocate action.
- On 7.x, change the migrate_to_data_tiers API to apply the default data tier preference to any index that would otherwise end up with no tier preference, and to set the cluster setting mentioned in the first work item so that new indices are assigned a tier preference.
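The first two work items above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual implementation: the function names and the `enforce_default` flag (standing in for the proposed cluster setting, which this issue does not name) are hypothetical, and the sketch assumes the default preference is `data_content` for regular indices and `data_hot` for data stream backing indices.

```python
import warnings

TIER_PREFERENCE = "index.routing.allocation.include._tier_preference"
# Assumed list of tier roles; a node with the generic "data" role covers all of them.
ALL_DATA_TIERS = {"data_content", "data_hot", "data_warm", "data_cold", "data_frozen"}


def settings_for_new_index(requested: dict, enforce_default: bool,
                           is_data_stream: bool = False) -> dict:
    """Return index settings, adding a default tier preference when enforced."""
    settings = dict(requested)
    if settings.get(TIER_PREFERENCE):
        return settings  # an explicit preference always wins
    if enforce_default:
        # Assumed defaults, see the lead-in.
        settings[TIER_PREFERENCE] = "data_hot" if is_data_stream else "data_content"
    else:
        warnings.warn("index created without a tier preference", DeprecationWarning)
    return settings


def has_all_data_roles(roles: set) -> bool:
    return "data" in roles or ALL_DATA_TIERS <= roles


def indices_missing_tier_preference(node_roles: list, indices: dict) -> list:
    """Names of tier-less indices, reported only if some data node lacks data roles."""
    data_nodes = [r for r in node_roles if "data" in r or r & ALL_DATA_TIERS]
    if all(has_all_data_roles(r) for r in data_nodes):
        return []  # cluster does not use separate tiers: no deprecation info
    return sorted(name for name, s in indices.items() if not s.get(TIER_PREFERENCE))
```

For example, `indices_missing_tier_preference([{"data_hot"}, {"data_warm"}], {"a": {}})` reports `a`, while the same index on a cluster whose nodes all carry the generic `data` role is not reported.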
In a future release (possibly 9.0) we should close the loop and:
- Remove the flag from cluster settings.
- Enforce not setting tier preference to null (we could consider doing this in 8.0 too).
- Evaluate at what point we need/want to drop the migrate_to_data_tiers API from the code (8.x? 9.x?)
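Closing the loop on null values could look something like the check below. It is only a sketch: `validate_settings_update` is a hypothetical name, and it assumes a settings update arrives as a flat mapping in which a null or empty value means "remove the setting".

```python
TIER_PREFERENCE = "index.routing.allocation.include._tier_preference"


def validate_settings_update(update: dict) -> None:
    """Reject updates that would leave an index without a tier preference."""
    if TIER_PREFERENCE in update and not update[TIER_PREFERENCE]:
        # None or "" would clear the preference and produce a tier-less index.
        raise ValueError(f"[{TIER_PREFERENCE}] must not be null or empty")
```

Updates that do not touch the tier preference pass through unchanged; only an explicit attempt to clear it is rejected.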