From 49f51ff1387aee79fe46911519579e34107df727 Mon Sep 17 00:00:00 2001 From: "Jungtaek Lim (HeartSaVioR)" Date: Thu, 30 Jan 2020 16:23:37 +0900 Subject: [PATCH 1/5] [SPARK-30481][CORE][FOLLOWUP] Document event log compaction into new section of monitor.md --- docs/monitoring.md | 49 ++++++++++++++++++++++++++++++++++------------ 1 file changed, 36 insertions(+), 13 deletions(-) diff --git a/docs/monitoring.md b/docs/monitoring.md index 31bf1ebdecad3..c687ad38e65fe 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -95,6 +95,40 @@ The history server can be configured as follows: +### Applying compaction of old event log files + +A long-running streaming application can bring a huge single event log file which may cost a lot to maintain and +also requires bunch of resource to replay per each update in Spark History Server. + +Enabling spark.eventLog.rolling.enabled and spark.eventLog.rolling.maxFileSize would +let you have multiple event log files instead of single huge event log file which may help some scenarios on its own, +but it still doesn't help you reducing the overall size of logs. + +Spark History Server can apply 'compaction' on the rolling event log files to reduce the overall size of +logs, via setting the configuration spark.history.fs.eventLog.rolling.maxFilesToRetain on the +Spark History Server. + +When the compaction happens, History Server lists the all available event log files, and considers the event log files older than +retained log files as a target of compaction. For example, if the application A has 5 event log files and +spark.history.fs.eventLog.rolling.maxFilesToRetain is set to 2, first 3 log files will be selected to be compacted. + +Once it selects the files, it analyzes these files to figure out which events can be excluded, and rewrites these files +into one compact file with discarding some events. Once rewriting is done, original log files will be deleted. 
+ +The compaction tries to exclude the events which point to the outdated things like jobs, and so on. As of now, below describes +the candidates of events to be excluded: + +* Events for the job which is finished, and related stage/tasks events +* Events for the executor which is terminated +* Events for the SQL execution which is finished, and related job/stage/tasks events + +but the details can be changed afterwards. + +Please note that Spark History Server may not compact the old event log files if it figures out that not a lot of space +would be reduced during compaction. For streaming query (including Structured Streaming) we normally expect compaction +will run as each micro-batch will trigger one or more jobs which will be finished shortly, but compaction won't run +in many cases for batch queries. + ### Spark History Server Configuration Options Security options for the Spark History Server are covered more detail in the @@ -305,19 +339,8 @@ Security options for the Spark History Server are covered more detail in the Int.MaxValue The maximum number of event log files which will be retained as non-compacted. By default, - all event log files will be retained.
- Please note that compaction will happen in Spark History Server, which means this configuration - should be set to the configuration of Spark History server, and the same value will be applied - across applications which are being loaded in Spark History Server. This also means compaction - and cleanup would require running Spark History Server.
- Please set the configuration in Spark History Server, and spark.eventLog.rolling.maxFileSize - in each application accordingly if you want to control the overall size of event log files. - The event log files older than these retained files will be compacted into single file and - deleted afterwards.
NOTE: Spark History Server may not compact the old event log files if it figures - out not a lot of space would be reduced during compaction. For streaming query - (including Structured Streaming) we normally expect compaction will run, but for - batch query compaction won't run in many cases. + all event log files will be retained. The lowest value is 1 for a technical reason.
Please read the section "Applying compaction of old event log files" for more details. From 90a3f82b828ad7e33207e3e2812d42e036334e66 Mon Sep 17 00:00:00 2001 From: "Jungtaek Lim (HeartSaVioR)" Date: Thu, 30 Jan 2020 17:29:54 +0900 Subject: [PATCH 2/5] Add caution message as suggested from https://github.com/apache/spark/pull/27208#discussion_r369677061 --- docs/monitoring.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/docs/monitoring.md b/docs/monitoring.md index c687ad38e65fe..8b096914c92ff 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -129,6 +129,10 @@ would be reduced during compaction. For streaming query (including Structured St will run as each micro-batch will trigger one or more jobs which will be finished shortly, but compaction won't run in many cases for batch queries. +Please also note that this is a new feature introduced in Spark 3.0, and may not be completely stable. In some circumstance, +the compaction may exclude more events than you expect, leading to some UI issues on the History Server for the application. +Use with caution. + ### Spark History Server Configuration Options Security options for the Spark History Server are covered more detail in the From 3e74e05d2f2aad3bb4c37c03c2571e995a97d7c3 Mon Sep 17 00:00:00 2001 From: "Jungtaek Lim (HeartSaVioR)" Date: Fri, 31 Jan 2020 15:01:20 +0900 Subject: [PATCH 3/5] reflect review comments partially --- docs/monitoring.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/monitoring.md b/docs/monitoring.md index 8b096914c92ff..2ae99b7e11630 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -98,7 +98,7 @@ The history server can be configured as follows: ### Applying compaction of old event log files A long-running streaming application can bring a huge single event log file which may cost a lot to maintain and -also requires bunch of resource to replay per each update in Spark History Server.
+also requires a bunch of resources to replay on each update in Spark History Server. Enabling spark.eventLog.rolling.enabled and spark.eventLog.rolling.maxFileSize would let you have multiple event log files instead of single huge event log file which may help some scenarios on its own, but it still doesn't help you reducing the overall size of logs. @@ -108,7 +108,7 @@ Spark History Server can apply 'compaction' on the rolling event log files to re logs, via setting the configuration spark.history.fs.eventLog.rolling.maxFilesToRetain on the Spark History Server. -When the compaction happens, History Server lists the all available event log files, and considers the event log files older than +When the compaction happens, History Server lists all the available event log files, and considers the event log files older than retained log files as a target of compaction. For example, if the application A has 5 event log files and spark.history.fs.eventLog.rolling.maxFilesToRetain is set to 2, first 3 log files will be selected to be compacted. From 803663fd3e8f6d73ee5731f0f5a0228e4f65d776 Mon Sep 17 00:00:00 2001 From: "Jungtaek Lim (HeartSaVioR)" Date: Sun, 16 Feb 2020 14:07:02 +0900 Subject: [PATCH 4/5] Reflect review comments --- docs/monitoring.md | 29 +++++++++++++++++------------ 1 file changed, 17 insertions(+), 12 deletions(-) diff --git a/docs/monitoring.md b/docs/monitoring.md index 2ae99b7e11630..64974e596f0dc 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -95,25 +95,29 @@ The history server can be configured as follows: -### Applying compaction of old event log files +### Applying compaction on rolling event log files -A long-running streaming application can bring a huge single event log file which may cost a lot to maintain and +A long-running application (e.g. streaming) can bring a huge single event log file which may cost a lot to maintain and also requires a bunch of resources to replay on each update in Spark History Server.
Enabling spark.eventLog.rolling.enabled and spark.eventLog.rolling.maxFileSize would -let you have multiple event log files instead of single huge event log file which may help some scenarios on its own, +let you have rolling event log files instead of single huge event log file which may help some scenarios on its own, but it still doesn't help you reducing the overall size of logs. Spark History Server can apply 'compaction' on the rolling event log files to reduce the overall size of logs, via setting the configuration spark.history.fs.eventLog.rolling.maxFilesToRetain on the Spark History Server. -When the compaction happens, History Server lists all the available event log files, and considers the event log files older than -retained log files as a target of compaction. For example, if the application A has 5 event log files and -spark.history.fs.eventLog.rolling.maxFilesToRetain is set to 2, first 3 log files will be selected to be compacted. +Details will be described below, but please note in prior that 'compaction' is LOSSY operation. +'Compaction' will discard some events which will be no longer seen on UI - you may want to check which events will be discarded +before enabling the option. -Once it selects the files, it analyzes these files to figure out which events can be excluded, and rewrites these files -into one compact file with discarding some events. Once rewriting is done, original log files will be deleted. +When the compaction happens, the History Server lists all the available event log files for the application, and considers +the event log files having less index than the file with smallest index which will be retained as target of compaction. +For example, if the application A has 5 event log files and spark.history.fs.eventLog.rolling.maxFilesToRetain is set to 2, then first 3 log files will be selected to be compacted. 
+ +Once it selects the target, it analyzes them to figure out which events can be excluded, and rewrites them +into one compact file, discarding the events which are decided to be excluded. The compaction tries to exclude the events which point to the outdated things like jobs, and so on. As of now, below describes the candidates of events to be excluded: @@ -122,16 +126,17 @@ the candidates of events to be excluded: * Events for the executor which is terminated * Events for the SQL execution which is finished, and related job/stage/tasks events -but the details can be changed afterwards. +Once rewriting is done, original log files will be deleted in a best-effort manner. The History Server may not be able to delete +the original log files, but it will not affect the operation of the History Server. Please note that Spark History Server may not compact the old event log files if it figures out that not a lot of space -would be reduced during compaction. For streaming query (including Structured Streaming) we normally expect compaction +would be reduced during compaction. For streaming queries we normally expect compaction will run as each micro-batch will trigger one or more jobs which will be finished shortly, but compaction won't run in many cases for batch queries. -Please also note that this is a new feature introduced in Spark 3.0, and may not be completely stable. In some circumstance, +Please also note that this is a new feature introduced in Spark 3.0, and may not be completely stable. Under some circumstances, the compaction may exclude more events than you expect, leading to some UI issues on the History Server for the application. -Use with caution. +Use it with caution.
### Spark History Server Configuration Options From a95a4a4c10f3cd1a51cff93d9f19a2a5bbd0a2df Mon Sep 17 00:00:00 2001 From: "Jungtaek Lim (HeartSaVioR)" Date: Tue, 18 Feb 2020 07:39:15 +0900 Subject: [PATCH 5/5] Reflect review comments --- docs/monitoring.md | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/docs/monitoring.md b/docs/monitoring.md index 64974e596f0dc..44406eded70d1 100644 --- a/docs/monitoring.md +++ b/docs/monitoring.md @@ -104,12 +104,12 @@ Enabling spark.eventLog.rolling.enabled and spark.eventLog.ro let you have rolling event log files instead of single huge event log file which may help some scenarios on its own, but it still doesn't help you reducing the overall size of logs. -Spark History Server can apply 'compaction' on the rolling event log files to reduce the overall size of +Spark History Server can apply compaction on the rolling event log files to reduce the overall size of logs, via setting the configuration spark.history.fs.eventLog.rolling.maxFilesToRetain on the Spark History Server. -Details will be described below, but please note in prior that 'compaction' is LOSSY operation. -'Compaction' will discard some events which will be no longer seen on UI - you may want to check which events will be discarded +Details will be described below, but please note up front that compaction is a LOSSY operation. +Compaction will discard some events which will no longer be seen on the UI - you may want to check which events will be discarded before enabling the option. When the compaction happens, the History Server lists all the available event log files for the application, and considers @@ -119,8 +119,7 @@ For example, if the application A has 5 event log files and spark.history. Once it selects the target, it analyzes them to figure out which events can be excluded, and rewrites them into one compact file, discarding the events which are decided to be excluded.
-The compaction tries to exclude the events which point to the outdated things like jobs, and so on. As of now, below describes -the candidates of events to be excluded: +The compaction tries to exclude the events which point to outdated data. As of now, the following describes the candidates of events to be excluded: * Events for the job which is finished, and related stage/tasks events * Events for the executor which is terminated
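The retention rule these patches document (every event log file older than the `spark.history.fs.eventLog.rolling.maxFilesToRetain` newest ones becomes a compaction target, and the lowest allowed value is 1) can be sketched as follows. This is an illustrative model only, not Spark's actual implementation; the function name and the list-of-file-indices representation are invented for this sketch.

```python
def select_files_to_compact(file_indices, max_files_to_retain):
    """Model of the selection rule: with N rolling event log files and
    maxFilesToRetain = K, every file older than the K newest ones is a
    compaction target. The lowest allowed value for K is 1."""
    if max_files_to_retain < 1:
        raise ValueError("maxFilesToRetain must be at least 1")
    indices = sorted(file_indices)
    if max_files_to_retain >= len(indices):
        return []  # nothing old enough to compact
    return indices[:-max_files_to_retain]

# The example from the patch text: 5 event log files and
# maxFilesToRetain = 2, so the first 3 files are selected.
print(select_files_to_compact([1, 2, 3, 4, 5], 2))  # [1, 2, 3]
```

In practice this corresponds to enabling `spark.eventLog.rolling.enabled` (and optionally tuning `spark.eventLog.rolling.maxFileSize`) on the application side, while `spark.history.fs.eventLog.rolling.maxFilesToRetain` is set in the Spark History Server's own configuration, as the patches describe.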
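The lossy exclusion rule listed above (events tied to finished jobs, terminated executors, and finished SQL executions are candidates for discarding) can be modeled with a toy filter like the one below. The tuple event representation and all names here are hypothetical; real compaction operates on Spark's listener-event classes, not on structures like these.

```python
# Toy model of lossy compaction: events pointing to finished/terminated
# entities are dropped; events for still-live entities are kept.
finished = {
    "job": {1},             # job 1 finished; job 2 still running
    "executor": {"exec-1"},  # executor exec-1 terminated
    "sql": {10},             # SQL execution 10 finished
}

def is_excludable(event):
    kind, entity = event
    return entity in finished.get(kind, set())

events = [("job", 1), ("job", 2), ("executor", "exec-1"), ("sql", 10)]
compacted = [e for e in events if not is_excludable(e)]
print(compacted)  # [('job', 2)]
```

This also illustrates why the patches call compaction lossy: once the finished-job, terminated-executor, and finished-SQL events are rewritten away, the History Server UI can no longer show them for that application.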