11[role="xpack"]
22[testenv="basic"]
33[[getting-started-snapshot-lifecycle-management]]
4- === Configure snapshot lifecycle policies
4+ === Tutorial: Automate backups with {slm-init}
55
6- Let's get started with {slm} ( {slm-init}) by working through a
7- hands-on scenario. The goal of this example is to automatically back up {es}
8- indices using the <<snapshot-restore,snapshots>> every day at a particular
9- time. Once these snapshots have been created, they are kept for a configured
10- amount of time and then deleted per a configured retention policy .
6+ This tutorial demonstrates how to automate daily backups of {es} indices using an {slm-init} policy.
7+ The policy takes <<modules-snapshots, snapshots>> of all indices in the cluster
8+ and stores them in a local repository.
9+ It also defines a retention policy and automatically deletes snapshots
10+ when they are no longer needed.
1111
12- [float]
12+ To manage snapshots with {slm-init}, you:
13+
14+ . <<slm-gs-register-repository, Register a repository>>.
15+ . <<slm-gs-create-policy, Create an {slm-init} policy>>.
16+
17+ To test the policy, you can manually trigger it to take an initial snapshot.
18+
19+ [discrete]
1320[[slm-gs-register-repository]]
1421==== Register a repository
1522
16- Before we can set up an SLM policy, we'll need to set up a
17- snapshot repository where the snapshots will be
18- stored. Repositories can use {plugins}/repository.html[many different backends],
19- including cloud storage providers. You'll probably want to use one of these in
20- production, but for this example we'll use a shared file system repository:
23+ To use {slm-init}, you must have a snapshot repository configured.
24+ The repository can be local (shared filesystem) or remote (cloud storage).
25+ Remote repositories can reside on S3, HDFS, Azure, Google Cloud Storage,
26+ or any other platform supported by a {plugins}/repository.html[repository plugin].
27+ Remote repositories are generally used for production deployments.
28+
29+ For this tutorial, you can register a local repository from
30+ {kibana-ref}/snapshot-repositories.html[{kib} Management]
31+ or use the put repository API:
2132
2233[source,console]
2334-----------------------------------
@@ -30,19 +41,26 @@ PUT /_snapshot/my_repository
3041}
3142-----------------------------------
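
To check that the cluster can actually read from and write to the repository before relying on it, you can verify it. This is a sketch using the verify snapshot repository API against the `my_repository` repository registered above:

[source,console]
-----------------------------------
POST /_snapshot/my_repository/_verify
-----------------------------------
// TEST[continued]

The response lists the nodes that successfully verified access to the repository.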
3243
33- [float ]
44+ [discrete]
3445[[slm-gs-create-policy]]
35- ==== Setting up a snapshot policy
46+ ==== Set up a snapshot policy
3647
37- Now that we have a repository in place, we can create a policy to automatically
38- take snapshots. Policies are written in JSON and will define when to take
39- snapshots, what the snapshots should be named, and which indices should be
40- included, among other things. We'll use the <<slm-api-put-policy>> API
41- to create the policy.
48+ Once you have a repository in place,
49+ you can define an {slm-init} policy to take snapshots automatically.
50+ The policy defines when to take snapshots, which indices should be included,
51+ and what to name the snapshots.
52+ A policy can also specify a <<slm-retention,retention policy>> and
53+ automatically delete snapshots when they are no longer needed.
4254
43- When configurating a policy, retention can also optionally be configured. See
44- the <<slm-retention,SLM retention>> documentation for the full documentation of
45- how retention works.
55+ TIP: Don't be afraid to configure a policy that takes frequent snapshots.
56+ Snapshots are incremental and make efficient use of storage.
57+
58+ You can define and manage policies through {kib} Management or with the put policy API.
59+
60+ For example, you could define a `nightly-snapshots` policy
61+ to back up all of your indices daily at 2:30AM UTC.
62+
63+ A put policy request defines the policy configuration in JSON:
4664
4765[source,console]
4866--------------------------------------------------
@@ -62,66 +80,64 @@ PUT /_slm/policy/nightly-snapshots
6280}
6381--------------------------------------------------
6482// TEST[continued]
65- <1> when the snapshot should be taken, using
66- <<schedule-cron,Cron syntax>>, in this
67- case at 1:30AM each day
68- <2> whe name each snapshot should be given, using
69- <<date-math-index-names,date math>> to include the current date in the name
70- of the snapshot
71- <3> the repository the snapshot should be stored in
72- <4> the configuration to be used for the snapshot requests (see below)
73- <5> which indices should be included in the snapshot, in this case, every index
74- <6> Optional retention configuration
75- <7> Keep snapshots for 30 days
76- <8> Always keep at least 5 successful snapshots
77- <9> Keep no more than 50 successful snapshots, even if they're less than 30 days old
78-
79- This policy will take a snapshot of every index each day at 1:30AM UTC.
80- Snapshots are incremental, allowing frequent snapshots to be stored efficiently,
81- so don't be afraid to configure a policy to take frequent snapshots.
82-
83- In addition to specifying the indices that should be included in the snapshot,
84- the `config` field can be used to customize other aspects of the snapshot. You
85- can use any option allowed in <<snapshots-take-snapshot,a regular snapshot
86- request>>, so you can specify, for example, whether the snapshot should fail in
87- special cases, such as if one of the specified indices cannot be found.
88-
89- [float]
83+ <1> When to take snapshots, specified in
84+ <<schedule-cron,Cron syntax>>: daily at 2:30AM UTC
85+ <2> How to name the snapshot: use
86+ <<date-math-index-names,date math>> to include the current date in the snapshot name
87+ <3> Where to store the snapshot
88+ <4> The configuration to be used for the snapshot requests (see below)
89+ <5> Which indices to include in the snapshot: all indices
90+ <6> Optional retention policy: keep snapshots for 30 days,
91+ retaining at least 5 and no more than 50 snapshots regardless of age
92+
93+ You can specify additional snapshot configuration options to customize how snapshots are taken.
94+ For example, you could configure the policy to fail the snapshot
95+ if one of the specified indices is missing.
96+ For more information about snapshot options, see <<snapshots-take-snapshot,snapshot requests>>.
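
For example, a variation of the `nightly-snapshots` policy could skip missing indices and exclude the cluster state. This is a sketch: `ignore_unavailable` and `include_global_state` are standard snapshot request options, and the other fields mirror the policy above:

[source,console]
--------------------------------------------------
PUT /_slm/policy/nightly-snapshots
{
  "schedule": "0 30 2 * * ?",
  "name": "<nightly-snap-{now/d}>",
  "repository": "my_repository",
  "config": {
    "indices": ["*"],
    "ignore_unavailable": true,
    "include_global_state": false
  }
}
--------------------------------------------------
// TEST[continued]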
97+
98+ [discrete]
9099[[slm-gs-test-policy]]
91100==== Test the snapshot policy
92101
93- While snapshots taken by SLM policies can be viewed through the standard snapshot
94- API, SLM also keeps track of policy successes and failures in ways that are a bit
95- easier to use to make sure the policy is working. Once a policy has executed at
96- least once, when you view the policy using the <<slm-api-get-policy>>,
97- some metadata will be returned indicating whether the snapshot was sucessfully
98- initiated or not.
102+ A snapshot taken by {slm-init} is just like any other snapshot.
103+ You can view information about snapshots in {kib} Management or
104+ get info with the <<snapshots-monitor-snapshot-restore, snapshot APIs>>.
105+ In addition, {slm-init} keeps track of policy successes and failures so you
106+ have insight into how the policy is working. If the policy has executed at
107+ least once, the <<slm-api-get-policy, get policy>> API returns additional metadata
108+ that shows whether the snapshot succeeded.
109+
110+ You can manually execute a snapshot policy to take a snapshot immediately.
111+ This is useful for taking snapshots before making a configuration change,
112+ upgrading, or testing a new policy.
113+ Manually executing a policy does not affect its configured schedule.
99114
100- Instead of waiting for our policy to run, let's tell SLM to take a snapshot
101- as using the configuration from our policy right now instead of waiting for
102- 1:30AM.
115+ For example, the following request manually triggers the `nightly-snapshots` policy:
103116
104117[source,console]
105118--------------------------------------------------
106119POST /_slm/policy/nightly-snapshots/_execute
107120--------------------------------------------------
108121// TEST[skip:we can't easily handle snapshots from docs tests]
109122
110- This request will kick off a snapshot for our policy right now, regardless of
111- the schedule in the policy. This is useful for taking snapshots before making
112- a configuration change, upgrading, or for our purposes, making sure our policy
113- is going to work successfully. The policy will continue to run on its configured
114- schedule after this execution of the policy.
123+
124+ After forcing the `nightly-snapshots` policy to run,
125+ you can retrieve the policy to get success or failure information.
115126
116127[source,console]
117128--------------------------------------------------
118129GET /_slm/policy/nightly-snapshots?human
119130--------------------------------------------------
120131// TEST[continued]
121132
122- This request will return a response that includes the policy, as well as
123- information about the last time the policy succeeded and failed, as well as the
124- next time the policy will be executed.
133+ Only the most recent success and failure are returned,
134+ but all policy executions are recorded in the `.slm-history*` indices.
135+ The response also shows when the policy is scheduled to execute next.
136+
137+ NOTE: The response shows if the policy succeeded in _initiating_ a snapshot.
138+ However, that does not guarantee that the snapshot completed successfully.
139+ It is possible for the initiated snapshot to fail if, for example, the connection to a remote
140+ repository is lost while copying files.
125141
126142[source,console-result]
127143--------------------------------------------------
@@ -143,44 +159,19 @@ next time the policy will be executed.
143159 "max_count": 50
144160 }
145161 },
146- "last_success": { <1>
147- "snapshot_name": "nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a", <2 >
148- "time_string": "2019-04-24T16:43:49.316Z",
162+ "last_success": {
163+     "snapshot_name": "nightly-snap-2019.04.24-tmtnyjtrsxkhbrrdcgg18a", <1>
164+ "time_string": "2019-04-24T16:43:49.316Z", <2>
149165 "time": 1556124229316
150166    },
151- "last_failure": { <3>
152- "snapshot_name": "nightly-snap-2019.04.02-lohisb5ith2n8hxacaq3mw",
153- "time_string": "2019-04-02T01:30:00.000Z",
154- "time": 1556042030000,
155- "details": "{\"type\":\"index_not_found_exception\",\"reason\":\"no such index [important]\",\"resource.type\":\"index_or_alias\",\"resource.id\":\"important\",\"index_uuid\":\"_na_\",\"index\":\"important\",\"stack_trace\":\"[important] IndexNotFoundException[no such index [important]]\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.indexNotFoundException(IndexNameExpressionResolver.java:762)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.innerResolve(IndexNameExpressionResolver.java:714)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver$WildcardExpressionResolver.resolve(IndexNameExpressionResolver.java:670)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndices(IndexNameExpressionResolver.java:163)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:142)\\n\\tat org.elasticsearch.cluster.metadata.IndexNameExpressionResolver.concreteIndexNames(IndexNameExpressionResolver.java:102)\\n\\tat org.elasticsearch.snapshots.SnapshotsService$1.execute(SnapshotsService.java:280)\\n\\tat org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:47)\\n\\tat org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:687)\\n\\tat org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:310)\\n\\tat org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:210)\\n\\tat org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:142)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:150)\\n\\tat org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:188)\\n\\tat org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:688)\\n\\tat 
org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:252)\\n\\tat org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:215)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)\\n\\tat java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)\\n\\tat java.base/java.lang.Thread.run(Thread.java:834)\\n\"}"
156- } ,
157- "next_execution": "2019-04-24T01:30:00.000Z", <4>
158- "next_execution_millis": 1556048160000
167+ "next_execution": "2019-04-24T01:30:00.000Z", <3>
168+ "next_execution_millis": 1556048160000
159169 }
160170}
161171--------------------------------------------------
162172// TESTRESPONSE[skip:the presence of last_failure and last_success is asynchronous and will be present for users, but is untestable]
163173
164- <1> information about the last time the policy successfully initated a snapshot
165- <2> the name of the snapshot that was successfully initiated
166- <3> information about the last time the policy failed to initiate a snapshot
167- <4> the next time the policy will execute
168-
169- NOTE: This metadata only indicates whether the request to initiate the snapshot was
170- made successfully or not - after the snapshot has been successfully started, it
171- is possible for the snapshot to fail if, for example, the connection to a remote
172- repository is lost while copying files.
173-
174- If you're following along, the returned SLM policy shouldn't have a `last_failure`
175- field - it's included above only as an example. You should, however, see a
176- `last_success` field and a snapshot name. If you do, you've successfully taken
177- your first snapshot using SLM!
178-
179- While only the most recent sucess and failure are available through the Get Policy
180- API, all policy executions are recorded to a history index, which may be queried
181- by searching the index pattern `.slm-history*`.
174+ <1> The name of the last snapshot that was successfully initiated by the policy
175+ <2> When the snapshot was initiated
176+ <3> When the policy will initiate the next snapshot
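
To audit runs beyond the most recent success and failure, you could search the history indices directly. This is a sketch; the field names `policy` and `@timestamp` are assumptions about the history document format:

[source,console]
--------------------------------------------------
GET /.slm-history*/_search
{
  "query": {
    "term": { "policy": "nightly-snapshots" }
  },
  "sort": [ { "@timestamp": { "order": "desc" } } ]
}
--------------------------------------------------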
182177
183- That's it! We have our first SLM policy set up to periodically take snapshots
184- so that our backups are always up to date. You can read more details in the
185- <<snapshot-lifecycle-management-api,SLM API documentation>> and the
186- <<modules-snapshots,general snapshot documentation.>>