-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-21195][CORE] Dynamically register metrics from sources as they are reported #47352
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the delay in getting to this @robert3005 - somehow it got dropped from my todo list: and thanks a lot for continuing to work on this !
Given this is a long standing issue, I am inclined to merge this into 4.0
+CC @jiangxb1987 who has reviewed an old iteration of this (perhaps ?)
Also +CC @LuciferYang, @dongjoon-hyun who have committed/reviewed some aspects of this - the original contributors are no longer active, so I want to make sure there some additional set of eyes on this.
| private val sinks = new mutable.ArrayBuffer[Sink] | ||
| private val sources = new mutable.ArrayBuffer[Source] | ||
| private val registry = new MetricRegistry() | ||
| private val sourcesWithListeners = new mutable.HashMap[Source, MetricRegistryListener] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| private val sourcesWithListeners = new mutable.HashMap[Source, MetricRegistryListener] | |
| private val sourcesWithListeners = new mutable.ArrayBuffer[(Source, MetricRegistryListener)] |
Let us preserve the current semantics and not introduce assumption that Source has consistent equals/hashCode.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for the above comment.
|
Thank you for pinging me, @mridulm . BTW, given that there exist many PRs in the past, I'm wondering if there is a critical objection on this idea before. Do you have to know why this is not accepted before, @robert3005 ? |
|
Afaik there was not, I think we were lacking reviewers. I will address the comments in next couple of days |
|
Thank you for the reply, @robert3005 . Take your time because we still have enough time before the official feature freeze for Apache Spark 4.0.0. Just FYI, I'm the release manager of Apache Spark 4.0.0-preview2. RC1 will arrive next Monday. While I'm looking at the candidate for RC1, I visited here. |
|
Agree with @robert3005, this just did not get enough reviewer attention - not due to objections. |
c37fed4 to
67b5285
Compare
|
I had just realized that this pr is no longer necessary. As of November 2019 when spark updated to dropwizard metrics 4.1.1 the feature that I added here is part of dropwizard metrics. After I had made this pr back in dropwizard metrics 3 days I haven't been verifying that it does produce a noticable difference. I had updated the pr to just include the test (and the constructor change for ease of testing) |
|
Thank you for checking that. Yes, +1 for adding a test case, @robert3005 . |
|
We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable. |
What changes were proposed in this pull request?
MetricSystem picks up new metrics from sources that are added throughout execution. If you do measurements via dynamic proxies you might not want to redeclare all metrics that the proxies will create and you'd prefer them to get populated as they're being produced. Right now all sources are processed only onceat startup and metrics are picked up only if they have been registered statically at compile time. Behaviour I am proposing lets you not have to declare metrics in two places.
This had been previously suggested in #18406, #29980, #31267, #35357, #36889, #38209, #39755, #41120, #42684, #44315 and #45883.
Why are the changes needed?
Currently there's no way to access MetricRegistry that MetricsSystem uses to hold its state and as such it's not possible to reprocess a source. MetricsSystem throws if any metric had already been registered previously.
n.b. the MetricRegistry is added as a constructor argument to make testing easier but could as well be accessed via reflection as a private variable.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Added tests
Was this patch authored or co-authored using generative AI tooling?
No