Skip to content

Conversation

@andreidan
Copy link
Contributor

@andreidan andreidan commented Nov 10, 2020

This adds support for the searchable_snapshot ILM action in the hot phase.

We define a series of actions that cannot be executed after the index has been
mounted as a searchable snapshot. Namely: freeze, forcemerge, shrink,
and searchable_snapshot (also available in the cold phase).

If by virtue of snapshot/restoring a managed index or updating an ILM policy while it
is executing for an index, these actions could get to be executed on an index that was
mounted as searchable snapshot in the hot phase. If this happens the actions will
skip entirely. ILM will not move into the ERROR step.

Resolves #64656

"allowed after the managed index was mounted as searchable snapshot"));
}

public void testUpdatePolicyToAddPhasesYieldsInvalidActionsToBeSkipped() throws Exception {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dakrone do you think we should aim to prevent this case when updating the policy?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I don't think so, I think we should allow a user to remove searchable snapshots from the policy as well as adding other actions (post searchable snapshot for instance) that should run just fine.

StepKey countKey = new StepKey(phase, NAME, SegmentCountStep.NAME);

BranchingStep conditionalSkipShrinkStep = new BranchingStep(preForceMergeBranchingKey, checkNotWriteIndex, nextStepKey,
(index, clusterState) -> clusterState.getMetadata().index(index).getSettings().get("index.store.snapshot.index_name") != null);
Copy link
Contributor Author

@andreidan andreidan Nov 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dakrone there are several setting we configure for a searchable snapshot. I wonder if we should be more strict here and check the index name is configured for this setting? I chose not to as that's an invariant that should be guaranteed by the searchable-snapshots module. Is there a better setting to use here?

What do you think?

@andreidan andreidan requested a review from dakrone November 10, 2020 18:02
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this Andrei, I think this is looking good. I left some comments, and I think as followup for this, we should open a separate issue to decide whether we want to add this check behavior for closed indices (I believe we can check the index state for the INDEX_CLOSED block) and skip those.

Another thing that is missing is that we need to add the check in the SearchableSnapshotAction itself, so if a user specifies searchable snapshot in the cold phase but the index is already a searchable snapshot, we skip past it instead of trying to snapshot our snapshot index :)

"allowed after the managed index was mounted as searchable snapshot"));
}

public void testUpdatePolicyToAddPhasesYieldsInvalidActionsToBeSkipped() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No I don't think so, I think we should allow a user to remove searchable snapshots from the policy as well as adding other actions (post searchable snapshot for instance) that should run just fine.

@andreidan
Copy link
Contributor Author

@elasticmachine update branch

@andreidan andreidan marked this pull request as ready for review November 11, 2020 17:03
@andreidan andreidan added :Data Management/ILM+SLM Index and Snapshot lifecycle management v7.11.0 v8.0.0 labels Nov 11, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Nov 11, 2020
@andreidan andreidan requested a review from dakrone November 11, 2020 18:23
Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for iterating on this Andrei! I left a few really tiny comments, but nothing big

public static final String SLM_RETENTION_DURATION = "slm.retention_duration";
public static final String SLM_MINIMUM_INTERVAL = "slm.minimum_interval";

public static final String SNAPSHOT_INDEX_NAME = "index.store.snapshot.index_name";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super minor, but can you add a comment about what this is for (since it's not actually an ILM setting, just one we check)


IMPORTANT: If the `searchable_snapshot` action is used in the `hot` phase the
subsequent phases cannot define any of the `shrink`, `forcemerge`, `freeze` or
`searchable_snapshot` (also available in the cold phase) actions.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a blurb similar to what we have for forcemerge related to requiring the "rollover" action:

"To use the XYZ action in the hot phase, the rollover action must be present. If no rollover action is configured, ILM will reject the policy."

* A {@link LifecycleAction} which shrinks the index.
*/
public class ShrinkAction implements LifecycleAction {
private final Logger logger = LogManager.getLogger(ShrinkAction.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Super minor, but this can be static (it looks like it used to be but was dropped when it was moved)

@andreidan andreidan merged commit 7d45355 into elastic:master Nov 12, 2020
andreidan added a commit to andreidan/elasticsearch that referenced this pull request Nov 12, 2020
…lastic#64883)

This adds support for the searchable_snapshot ILM action in the hot phase.

We define a series of actions that cannot be executed after the index has been
mounted as a searchable snapshot. Namely: freeze, forcemerge, shrink,
and searchable_snapshot (also available in the cold phase).

If by virtue of snapshot/restoring a managed index or updating an ILM policy while it
is executing for an index, these actions could get to be executed on an index that was
mounted as searchable snapshot in the hot phase. If this happens the actions will
skip entirely. ILM will not move into the ERROR step.

(cherry picked from commit 7d45355)
Signed-off-by: Andrei Dan <[email protected]>
andreidan added a commit that referenced this pull request Nov 12, 2020
…phase (#64883) (#64988)

This adds support for the searchable_snapshot ILM action in the hot phase.

We define a series of actions that cannot be executed after the index has been
mounted as a searchable snapshot. Namely: freeze, forcemerge, shrink,
and searchable_snapshot (also available in the cold phase).

If by virtue of snapshot/restoring a managed index or updating an ILM policy while it
is executing for an index, these actions could get to be executed on an index that was
mounted as searchable snapshot in the hot phase. If this happens the actions will
skip entirely. ILM will not move into the ERROR step.

(cherry picked from commit 7d45355)
Signed-off-by: Andrei Dan <[email protected]>

* Use org.elasticsearch.common.collect collections (and Map)
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Jun 4, 2021
This changes the `CopyExecutionStateStep` to copy all of the existing state from the original to the
new index after a shrink or searchable snapshot ILM action.

It also has two other changes
First, it fixes issues in `LifecycleExecutionState.equals` and `hashCode` where parts were missing,
and modifies the tests better so they would catch these issues better.

Second, it moves the lifecycle policy validation out of the constructor to a new method. This is
required because `PolicyStepsRegistry` constructs a new `LifecyclePolicy` out of the cached phase
JSON combined with the policy from the cluster state. In the event that the cached JSON contains an
action that prevents subsequent actions from being defined, we need to leniently allow this
behavior, because those steps will be ignored (see elastic#64883 for this original implementation). The
policy is still validated when loaded elsewhere however.

Resolves elastic#73791
dakrone added a commit that referenced this pull request Jun 7, 2021
This changes the CopyExecutionStateStep to copy all of the existing state from the original to the
new index after a shrink or searchable snapshot ILM action.

It also has two other changes
First, it fixes issues in LifecycleExecutionState.equals and hashCode where parts were missing,
and modifies the tests better so they would catch these issues better.

Second, it moves the lifecycle policy validation out of the constructor to a new method. This is
required because PolicyStepsRegistry constructs a new LifecyclePolicy out of the cached phase
JSON combined with the policy from the cluster state. In the event that the cached JSON contains an
action that prevents subsequent actions from being defined, we need to leniently allow this
behavior, because those steps will be ignored (see #64883 for this original implementation). The
policy is still validated when loaded elsewhere however.

Resolves #73791
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Jun 7, 2021
This changes the CopyExecutionStateStep to copy all of the existing state from the original to the
new index after a shrink or searchable snapshot ILM action.

It also has two other changes
First, it fixes issues in LifecycleExecutionState.equals and hashCode where parts were missing,
and modifies the tests better so they would catch these issues better.

Second, it moves the lifecycle policy validation out of the constructor to a new method. This is
required because PolicyStepsRegistry constructs a new LifecyclePolicy out of the cached phase
JSON combined with the policy from the cluster state. In the event that the cached JSON contains an
action that prevents subsequent actions from being defined, we need to leniently allow this
behavior, because those steps will be ignored (see elastic#64883 for this original implementation). The
policy is still validated when loaded elsewhere however.

Resolves elastic#73791
# Conflicts:
#	x-pack/plugin/ilm/qa/multi-node/src/javaRestTest/java/org/elasticsearch/xpack/ilm/actions/SearchableSnapshotActionIT.java
dakrone added a commit to dakrone/elasticsearch that referenced this pull request Jun 7, 2021
This changes the CopyExecutionStateStep to copy all of the existing state from the original to the
new index after a shrink or searchable snapshot ILM action.

It also has two other changes
First, it fixes issues in LifecycleExecutionState.equals and hashCode where parts were missing,
and modifies the tests better so they would catch these issues better.

Second, it moves the lifecycle policy validation out of the constructor to a new method. This is
required because PolicyStepsRegistry constructs a new LifecyclePolicy out of the cached phase
JSON combined with the policy from the cluster state. In the event that the cached JSON contains an
action that prevents subsequent actions from being defined, we need to leniently allow this
behavior, because those steps will be ignored (see elastic#64883 for this original implementation). The
policy is still validated when loaded elsewhere however.

Resolves elastic#73791
# Conflicts:
#	x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/LifecyclePolicy.java
#	x-pack/plugin/ilm/qa/multi-node/src/javaRestTest/java/org/elasticsearch/xpack/ilm/actions/SearchableSnapshotActionIT.java
dakrone added a commit that referenced this pull request Jun 7, 2021
…73849)

This changes the CopyExecutionStateStep to copy all of the existing state from the original to the
new index after a shrink or searchable snapshot ILM action.

It also has two other changes
First, it fixes issues in LifecycleExecutionState.equals and hashCode where parts were missing,
and modifies the tests better so they would catch these issues better.

Second, it moves the lifecycle policy validation out of the constructor to a new method. This is
required because PolicyStepsRegistry constructs a new LifecyclePolicy out of the cached phase
JSON combined with the policy from the cluster state. In the event that the cached JSON contains an
action that prevents subsequent actions from being defined, we need to leniently allow this
behavior, because those steps will be ignored (see #64883 for this original implementation). The
policy is still validated when loaded elsewhere however.

Resolves #73791
# Conflicts:
#	x-pack/plugin/core/src/main/java/org/elasticsearch/xpack/core/ilm/LifecyclePolicy.java
#	x-pack/plugin/ilm/qa/multi-node/src/javaRestTest/java/org/elasticsearch/xpack/ilm/actions/SearchableSnapshotActionIT.java
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Data Management/ILM+SLM Index and Snapshot lifecycle management >enhancement Team:Data Management Meta label for data/management team v7.11.0 v8.0.0-alpha1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add searchable snapshot ILM action to the hot phase

5 participants