-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-29654][CORE] Add configuration to allow disabling registration of static sources to the metrics system #26320
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-29654][CORE] Add configuration to allow disabling registration of static sources to the metrics system #26320
Conversation
|
ok to test |
core/src/main/scala/org/apache/spark/internal/config/package.scala
Outdated
Show resolved
Hide resolved
|
Test build #113241 has finished for PR 26320 at commit
|
|
Thank you @dongjoon-hyun for reviewing this. |
|
Test build #113251 has finished for PR 26320 at commit
|
|
Retest this please. |
core/src/main/scala/org/apache/spark/internal/config/package.scala
Outdated
Show resolved
Hide resolved
|
Test build #113403 has finished for PR 26320 at commit
|
|
Test build #113405 has finished for PR 26320 at commit
|
|
Test build #113464 has finished for PR 26320 at commit
|
core/src/test/scala/org/apache/spark/metrics/source/SourceConfigSuite.scala
Show resolved
Hide resolved
| assert (metricsSystem.getSourcesByName("CodeGenerator").isEmpty) | ||
| assert (metricsSystem.getSourcesByName("HiveExternalCatalog").isEmpty) | ||
|
|
||
| sc.stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this one too.
| val sc = new SparkContext("local", "test", conf) | ||
| val metricsSystem = sc.env.metricsSystem | ||
|
|
||
| // Static sources should be registered |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be -> should not be.
|
Thank you for adding UTs, @LucaCanali . |
|
Test build #113473 has finished for PR 26320 at commit
|
dongjoon-hyun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, LGTM. Merged to master.
I tested this with http://localhost:4040/metrics/prometheus/ manually like the PR description, too.
| .createOptional | ||
|
|
||
| private[spark] val METRICS_STATIC_SOURCES_ENABLED = | ||
| ConfigBuilder("spark.metrics.static.sources.enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain why they are static and the others are not static?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apache Spark defines like the following.
private[spark] object StaticSources {
/**
* The set of all static sources. These sources may be reported to from any class, including
* static classes, without requiring reference to a SparkEnv.
*/
val allSources = Seq(CodegenMetrics, HiveCatalogMetrics)
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spark.metrics.staticSources.enabled might look better. Using static as the name space looks weird. cc @cloud-fan @Ngone51
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
staticSources sounds better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that staticsSources sounds better.
BTW there is probably room for a naming convention for the "switch parameters" for enabling/disabling metrics? Currently we have:
spark.metrics.static.sources.enabled
spark.app.status.metrics.enabled
spark.sql.streaming.metricsEnabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also #26692
|
Is there any chance to have this backported to a minor version of Spark 2.x.x (2.4.7) ? |
|
Since SPARK-29654 is an improvement JIRA, we have no plan for backporting, @green2k . Sorry about that. |
What changes were proposed in this pull request?
The Spark metrics system produces many different metrics and not all of them are used at the same time. This proposes to introduce a configuration parameter to allow disabling the registration of metrics in the "static sources" category.
Why are the changes needed?
This allows to reduce the load and clutter on the sink, in the cases when the metrics in question are not needed. The metrics registerd as "static sources" are under the namespaces CodeGenerator and HiveExternalCatalog and can produce a significant amount of data, as they are registered for the driver and executors.
Does this PR introduce any user-facing change?
It introduces a new configuration parameter
spark.metrics.register.static.sources.enabledHow was this patch tested?
Manually tested.