Skip to content

Conversation

@LucaCanali
Copy link
Contributor

@LucaCanali LucaCanali commented Nov 27, 2019

What changes were proposed in this pull request?

This proposes to introduce a naming convention for Spark metrics configuration parameters used to enable/disable metrics source reporting using the Dropwizard metrics library: spark.metrics.sourceNameCamelCase.enabled and update 2 parameters to use this naming convention.

Why are the changes needed?

Currently Spark has a few parameters to enable/disable metrics reporting. Their naming pattern is not uniform and this can create confusion. Currently we have:
spark.metrics.static.sources.enabled
spark.app.status.metrics.enabled
spark.sql.streaming.metricsEnabled

Does this PR introduce any user-facing change?

Update parameters for enabling/disabling metrics reporting new in Spark 3.0: spark.metrics.static.sources.enabled -> spark.metrics.staticSources.enabled, spark.app.status.metrics.enabled -> spark.metrics.appStatusSource.enabled.
Note: spark.sql.streaming.metricsEnabled is left unchanged as it is already in use in Spark 2.x.

How was this patch tested?

Manually tested

@cloud-fan
Copy link
Contributor

ok to test


val STREAMING_METRICS_ENABLED =
buildConf("spark.sql.streaming.metricsEnabled")
buildConf("spark.metrics.sparkStreamingSource.enabled")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a breaking change to rename an already released config. Can we leave it as it is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or, what about renaming together at 3.0? We still can support legacy config names together, cannot we?
Also, cc @gatorsmile since he suggest renaming.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old config would have to be deprecated and be an alias for the new one, at least.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. +1 for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I fully agree on the need to deprecate the old parameter and adding an alias.
Another point I see about renaming spark.sql.streaming.metricsEnabled to spark.metrics.sparkStreamingSource.enabled, is that the configuration lives in [[SQLConf]] where all configuration parameters appear to have spark.sql as their prefix. For example, would using spark.sql.metrics.sparkStreamingSource.enabled be a better choice? I see pros and cons to that. Any advice?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding the voice of name name here: we may want to differentiate DStream and Structured Streaming. Before the change it is clear that it's for Structured Streaming (sql.streaming) which new name is not clear about that. If we want to only mention streaming it should also mention sql. If we are OK to have structuredStreaming as a part of name, that's fine.

@SparkQA
Copy link

SparkQA commented Nov 27, 2019

Test build #114531 has finished for PR 26692 at commit dd089f0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-30060][CORE][SS][DOC] Implement naming convention for Spark metrics configuration parameters used to enable/disable metrics sources Rename metrics enable/disable configs Nov 27, 2019
@dongjoon-hyun dongjoon-hyun changed the title Rename metrics enable/disable configs [SPARK-30060][CORE][SS] Rename metrics enable/disable configs Nov 27, 2019
.createWithDefault(true)

val STREAMING_METRICS_ENABLED =
buildConf("spark.sql.streaming.metricsEnabled")
Copy link
Member

@HyukjinKwon HyukjinKwon Nov 29, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you want to change this existing naming space, you should change all others consistently, e.g. spark.sql.streaming.fileSource.schema.forceNullable. sql.streaming looks pretty clear to me.

You should also consider moving from SQLConf to SparkConf if you want to change the name space consistently. (e.g., spark.pyspark vs spark.sql...pyspark)

It doesn't look like worth changing to me given the overhead. Let's keep this name space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that changing spark.sql.streaming.metricsEnabled appears to open a can of worms. Maybe we should just stick to the other 2 parameters that are easier to change.

@LucaCanali LucaCanali force-pushed the uniformNamingMetricsEnableParameters branch from dd089f0 to 852e865 Compare November 29, 2019 08:26
@SparkQA
Copy link

SparkQA commented Nov 29, 2019

Test build #114622 has finished for PR 26692 at commit 852e865.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

.intConf
.createWithDefault(Int.MaxValue)

val APP_STATUS_METRICS_ENABLED =
Copy link
Member

@dongjoon-hyun dongjoon-hyun Dec 3, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While we are consolidating this, shall we update APP_STATUS_METRICS_ENABLED -> METRICS_APP_STATUS_SOURCE_ENABLED together more consistently?

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me too except @dongjoon-hyun's comment. It should be best to keep PR description updated.

@LucaCanali LucaCanali changed the title [SPARK-30060][CORE][SS] Rename metrics enable/disable configs [SPARK-30060][CORE] Rename metrics enable/disable configs Dec 3, 2019
@SparkQA
Copy link

SparkQA commented Dec 3, 2019

Test build #114756 has finished for PR 26692 at commit 49052de.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

retest this please

@SparkQA
Copy link

SparkQA commented Dec 3, 2019

Test build #114763 has finished for PR 26692 at commit 49052de.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM. Merged to master.
Thank you all!

attilapiros pushed a commit to attilapiros/spark that referenced this pull request Dec 6, 2019
### What changes were proposed in this pull request?
This proposes to introduce a naming convention for Spark metrics configuration parameters used to enable/disable metrics source reporting using the Dropwizard metrics library:   `spark.metrics.sourceNameCamelCase.enabled` and update 2 parameters to use this naming convention.

### Why are the changes needed?
Currently Spark has a few parameters to enable/disable metrics reporting. Their naming pattern is not uniform and this can create confusion.  Currently we have:
`spark.metrics.static.sources.enabled`
`spark.app.status.metrics.enabled`
`spark.sql.streaming.metricsEnabled`

### Does this PR introduce any user-facing change?
Update parameters for enabling/disabling metrics reporting new in Spark 3.0: `spark.metrics.static.sources.enabled` -> `spark.metrics.staticSources.enabled`, `spark.app.status.metrics.enabled`  -> `spark.metrics.appStatusSource.enabled`.
Note: `spark.sql.streaming.metricsEnabled` is left unchanged as it is already in use in Spark 2.x.

### How was this patch tested?
Manually tested

Closes apache#26692 from LucaCanali/uniformNamingMetricsEnableParameters.

Authored-by: Luca Canali <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants