Conversation

@LucaCanali (Contributor) commented Oct 30, 2019

What changes were proposed in this pull request?

The Spark metrics system produces many different metrics, and not all of them are used at the same time. This PR proposes to introduce a configuration parameter that allows disabling the registration of the metrics in the "static sources" category.

Why are the changes needed?

This reduces load and clutter on the sink in cases where the metrics in question are not needed. The metrics registered as "static sources" fall under the CodeGenerator and HiveExternalCatalog namespaces and can produce a significant amount of data, since they are registered for both the driver and the executors.

Does this PR introduce any user-facing change?

It introduces a new configuration parameter, spark.metrics.static.sources.enabled.
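
For illustration, a minimal sketch of how such a gate could look inside Spark's metrics registration path (these are private[spark] APIs, so this only compiles inside the Spark codebase). The StaticSourceRegistration object and the registerStaticSources helper are assumptions made here for readability, not the literal diff; only the config key and the StaticSources/MetricsSystem names come from this PR and the review comments below.

import org.apache.spark.SparkConf
import org.apache.spark.metrics.MetricsSystem
import org.apache.spark.metrics.source.StaticSources

object StaticSourceRegistration {
  // Hedged sketch: register CodeGenerator and HiveExternalCatalog metrics
  // only when the flag is enabled; defaulting to true preserves the
  // current behavior.
  def registerStaticSources(conf: SparkConf, metricsSystem: MetricsSystem): Unit = {
    if (conf.getBoolean("spark.metrics.static.sources.enabled", defaultValue = true)) {
      // StaticSources.allSources == Seq(CodegenMetrics, HiveCatalogMetrics)
      StaticSources.allSources.foreach(metricsSystem.registerSource)
    }
  }
}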

How was this patch tested?

Manually tested.

$ cat conf/metrics.properties
*.sink.prometheusServlet.class=org.apache.spark.metrics.sink.PrometheusServlet
*.sink.prometheusServlet.path=/metrics/prometheus
master.sink.prometheusServlet.path=/metrics/master/prometheus
applications.sink.prometheusServlet.path=/metrics/applications/prometheus

$ bin/spark-shell

$ curl -s http://localhost:4040/metrics/prometheus/ | grep Hive
metrics_local_1573330115306_driver_HiveExternalCatalog_fileCacheHits_Count 0
metrics_local_1573330115306_driver_HiveExternalCatalog_filesDiscovered_Count 0
metrics_local_1573330115306_driver_HiveExternalCatalog_hiveClientCalls_Count 0
metrics_local_1573330115306_driver_HiveExternalCatalog_parallelListingJobCount_Count 0
metrics_local_1573330115306_driver_HiveExternalCatalog_partitionsFetched_Count 0

$ bin/spark-shell --conf spark.metrics.static.sources.enabled=false
$ curl -s http://localhost:4040/metrics/prometheus/ | grep Hive

@dongjoon-hyun (Member)

ok to test

@SparkQA commented Nov 5, 2019

Test build #113241 has finished for PR 26320 at commit 57c713f.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@LucaCanali (Contributor, Author)

Thank you @dongjoon-hyun for reviewing this.

@SparkQA commented Nov 5, 2019

Test build #113251 has finished for PR 26320 at commit 7e9e8bb.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member)

Retest this please.

@SparkQA commented Nov 7, 2019

Test build #113403 has finished for PR 26320 at commit 7e9e8bb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Nov 7, 2019

Test build #113405 has finished for PR 26320 at commit 5554489.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA commented Nov 8, 2019

Test build #113464 has finished for PR 26320 at commit 1a03124.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class SourceConfigSuite extends SparkFunSuite with LocalSparkContext

assert (metricsSystem.getSourcesByName("CodeGenerator").isEmpty)
assert (metricsSystem.getSourcesByName("HiveExternalCatalog").isEmpty)

sc.stop()

Review comment (Member):

this one too.

val sc = new SparkContext("local", "test", conf)
val metricsSystem = sc.env.metricsSystem

// Static sources should be registered

Review comment (Member):

should be -> should not be.

@dongjoon-hyun (Member)

Thank you for adding UTs, @LucaCanali.

@SparkQA commented Nov 8, 2019

Test build #113473 has finished for PR 26320 at commit 564ec07.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun (Member) left a review:

+1, LGTM. Merged to master.

I also tested this manually against http://localhost:4040/metrics/prometheus/, as in the PR description.

.createOptional

private[spark] val METRICS_STATIC_SOURCES_ENABLED =
  ConfigBuilder("spark.metrics.static.sources.enabled")

Review comment (Member):

Could you explain why they are static and the others are not static?

@dongjoon-hyun (Member) commented Nov 16, 2019:

Apache Spark defines them as follows:

private[spark] object StaticSources {
  /**
   * The set of all static sources. These sources may be reported to from any class, including
   * static classes, without requiring reference to a SparkEnv.
   */
  val allSources = Seq(CodegenMetrics, HiveCatalogMetrics)
}
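
For illustration, a hedged example of what "reported to from any class ... without requiring reference to a SparkEnv" looks like in practice (e.g. from a Spark code path or the spark-shell). The incrementHiveClientCalls helper name is an assumption inferred from the hiveClientCalls counter in the curl output above; this PR does not change these call sites:

// Illustration only: static/utility code bumps the shared source directly,
// with no SparkEnv or SparkContext in scope.
import org.apache.spark.metrics.source.HiveCatalogMetrics

HiveCatalogMetrics.incrementHiveClientCalls(1)  // surfaces as ..._HiveExternalCatalog_hiveClientCalls_Count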

Review comment (Member):

spark.metrics.staticSources.enabled might look better. Using static as the name space looks weird. cc @cloud-fan @Ngone51

Review comment (Contributor):

staticSources sounds better.

@LucaCanali (Contributor, Author):

I agree that staticSources sounds better.
BTW, there is probably room for a naming convention for the "switch parameters" that enable/disable metrics. Currently we have:
spark.metrics.static.sources.enabled
spark.app.status.metrics.enabled
spark.sql.streaming.metricsEnabled

@LucaCanali (Contributor, Author):

See also #26692

@green2k commented Aug 21, 2020

Is there any chance of having this backported to a Spark 2.x maintenance release (2.4.7)?

@dongjoon-hyun (Member)

Since SPARK-29654 is an improvement JIRA, we have no plan for backporting, @green2k. Sorry about that.
