From d39ceada7b8b8d9a6ec8db7675eec6f7388a595b Mon Sep 17 00:00:00 2001
From: Luca Canali
Date: Fri, 21 Feb 2020 10:54:02 +0100
Subject: [PATCH 1/5] updated following SPARK-30812

---
 docs/monitoring.md | 54 +++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 49 insertions(+), 5 deletions(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index c30aa9967939..45fe994ad69c 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -657,16 +657,60 @@ A list of the available metrics, with a short description:
 ### Executor Metrics
-Executor-level metrics are sent from each executor to the driver as part of the Heartbeat to describe the performance metrics of Executor itself like JVM heap memory, GC information.
-Executor metric values and their measured peak values per executor are exposed via the REST API at the end point `/applications/[app-id]/executors`.
-In addition, aggregated per-stage peak values of the executor metrics are written to the event log if `spark.eventLog.logStageExecutorMetrics` is true.
-Executor metrics are also exposed via the Spark metrics system based on the Dropwizard metrics library.
-A list of the available metrics, with a short description:
+Executor-level metrics are sent from each executor to the driver as part of the Heartbeat to describe
+the performance metrics of Executor itself like JVM heap memory, GC information.
+Executor metric values and their measured memory peak values per executor are exposed via the REST API in JSON and Prometheus format.
+The JSON end point is exposed at: `/applications/[app-id]/executors`, the Prometheus end point at: `/metrics/executors/prometheus`.
+The Prometheus end point is conditional to a configuration parameter: `spark.ui.prometheus.enabled=true` (the default is `false`).
+In addition, aggregated per-stage peak values of the executor memory metrics are written to the event log if
+`spark.eventLog.logStageExecutorMetrics` is true.
+Executor memory metrics are also exposed via the Spark metrics system based on the Dropwizard metrics library.
+A list of the available executor metrics, with a short description:
Executor Level Metric name Short description
rddBlocks RDD blocks in the block manager of this executor.
memoryUsed Storage memory used by this executor.
diskUsed Disk space used for RDD storage by this executor.
totalCores Number of cores available in this executor.
maxTasks Maximum number of tasks that can run concurrently in this executor.
activeTasks Number of tasks currently executing.
failedTasks Number of tasks that have failed in this executor.
completedTasks Number of tasks that have completed in this executor.
totalTasks Total number of tasks (running, failed and completed) in this executor.
totalDuration Elapsed time the JVM spent executing tasks in this executor.
+ The value is expressed in milliseconds.
totalGCTime Elapsed time the JVM spent in garbage collection summed in this Executor.

From d79b840a9292cbfb4dfc529cdd601828148c6e96 Mon Sep 17 00:00:00 2001
From: Luca Canali
Date: Mon, 30 Mar 2020 15:05:14 +0200
Subject: [PATCH 2/5] Address review comments.

---
 docs/monitoring.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index 45fe994ad69c..0abdf133cbd5 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -657,15 +657,14 @@ A list of the available metrics, with a short description:
 ### Executor Metrics
-Executor-level metrics are sent from each executor to the driver as part of the Heartbeat to describe
-the performance metrics of Executor itself like JVM heap memory, GC information.
-Executor metric values and their measured memory peak values per executor are exposed via the REST API in JSON and Prometheus format.
-The JSON end point is exposed at: `/applications/[app-id]/executors`, the Prometheus end point at: `/metrics/executors/prometheus`.
-The Prometheus end point is conditional to a configuration parameter: `spark.ui.prometheus.enabled=true` (the default is `false`).
+Executor-level metrics are sent from each executor to the driver as part of the Heartbeat to describe the performance metrics of Executor itself like JVM heap memory, GC information.
+Executor metric values and their measured memory peak values per executor are exposed via the REST API in JSON format and in Prometheus format.
+The JSON end point is exposed at: `/applications/[app-id]/executors`, and the Prometheus endpoint at: `/metrics/executors/prometheus`.
+The Prometheus endpoint is conditional to a configuration parameter: `spark.ui.prometheus.enabled=true` (the default is `false`).
 In addition, aggregated per-stage peak values of the executor memory metrics are written to the event log if
 `spark.eventLog.logStageExecutorMetrics` is true.
 Executor memory metrics are also exposed via the Spark metrics system based on the Dropwizard metrics library.
-A list of the available executor metrics, with a short description:
+A list of the available metrics, with a short description:
@@ -679,9 +678,10 @@ A list of the available executor metrics, with a short description:
- + - + +

From 3ae4a14191bae9f357dee4f6fe56f481f8f22bd8 Mon Sep 17 00:00:00 2001
From: Luca Canali
Date: Mon, 30 Mar 2020 15:06:47 +0200
Subject: [PATCH 3/5] typo

---
 docs/monitoring.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index 0abdf133cbd5..b2fef2a0a085 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -678,7 +678,7 @@ A list of the available metrics, with a short description:
-

From e48afd6dc5935eb595af27e545cb8e054a2a6397 Mon Sep 17 00:00:00 2001
From: Luca Canali
Date: Mon, 30 Mar 2020 15:46:49 +0200
Subject: [PATCH 4/5] Minor additional corrections to typos and use of case.

---
 docs/monitoring.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index b2fef2a0a085..ff64edf5e3e9 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -718,15 +718,15 @@ A list of the available metrics, with a short description:
- + - + - +

From 0fc3f2ae1dfe67e9544bad6a8408659bae979484 Mon Sep 17 00:00:00 2001
From: Luca Canali
Date: Mon, 30 Mar 2020 15:53:40 +0200
Subject: [PATCH 5/5] Fixed use of case.

---
 docs/monitoring.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/monitoring.md b/docs/monitoring.md
index ff64edf5e3e9..bfa0d3afb349 100644
--- a/docs/monitoring.md
+++ b/docs/monitoring.md
@@ -713,7 +713,7 @@ A list of the available metrics, with a short description:
-
Executor Level Metric name
memoryUsed Storage memory used by this executor.
diskUsed Disk space used for RDD storage by this executor.
Disk space used for RDD storage by this executor.
totalCores Number of cores available in this executor.
memoryUsed Storage memory used by this executor.
+
diskUsed Disk space used for RDD storage by this executor.
totalInputBytes Total input bytes summed in this Executor.
Total input bytes summed in this executor.
totalShuffleRead Total shuffer read bytes summed in this Executor.
Total shuffle read bytes summed in this executor.
totalShuffleWrite Total shuffer write bytes summed in this Executor.
Total shuffle write bytes summed in this executor.
maxMemory
totalGCTime Elapsed time the JVM spent in garbage collection summed in this Executor.
+ Elapsed time the JVM spent in garbage collection summed in this executor.
The value is expressed in milliseconds.
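
For readers who want to try the two endpoints described in this series, the following is a minimal Python sketch, not part of the patch. It builds the JSON and Prometheus URLs and derives a garbage-collection time fraction from the `totalGCTime` and `totalDuration` fields, both of which the table documents as milliseconds. Several names are assumptions for illustration: the helper functions, the `http://localhost:4040` base URL, the `/api/v1` REST prefix (taken from the general Spark REST API documentation rather than from this patch), the `id` field, and the sample payload values.

```python
import json
from urllib.parse import quote

# Assumption: the Spark REST API is rooted at /api/v1 under the UI base URL.
API_ROOT = "/api/v1"


def executor_endpoints(ui_base_url, app_id):
    """Return the JSON and Prometheus executor-metrics URLs for one application.

    The Prometheus URL only answers when spark.ui.prometheus.enabled=true.
    """
    base = ui_base_url.rstrip("/")
    return {
        "json": f"{base}{API_ROOT}/applications/{quote(app_id, safe='')}/executors",
        "prometheus": f"{base}/metrics/executors/prometheus",
    }


def gc_time_fraction(executor):
    """Fraction of task time spent in GC, or None when no task time was recorded."""
    duration = executor.get("totalDuration", 0)
    if duration <= 0:
        return None
    return executor.get("totalGCTime", 0) / duration


def summarize(payload):
    """Map each executor id in a JSON payload to its GC time fraction."""
    return {e["id"]: gc_time_fraction(e) for e in json.loads(payload)}


if __name__ == "__main__":
    # Hypothetical base URL, application id, and payload, for illustration only.
    print(executor_endpoints("http://localhost:4040", "app-123"))
    sample = json.dumps([
        {"id": "driver", "totalDuration": 0, "totalGCTime": 0},
        {"id": "1", "totalDuration": 2000, "totalGCTime": 500},
    ])
    print(summarize(sample))
```

The sketch deliberately performs no network request; the returned URLs can be passed to any HTTP client, and the response body of the JSON endpoint can be handed to `summarize` as a string.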