-
Notifications
You must be signed in to change notification settings - Fork 378
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add openmetrics and exemplars support #482
Conversation
Very exciting, thanks for working on this! |
example/server.js
Outdated
@@ -91,6 +92,8 @@ setInterval(() => { | |||
|
|||
server.get('/metrics', async (req, res) => { | |||
try { | |||
// register.contentType = Registry.OPENMETRICS_CONTENT_TYPE; | |||
register.contentType = Registry.PROMETHEUS_CONTENT_TYPE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should go through setContentType
I guess?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, yeah there is a lot of inconsistency in how that flag is set, I'll iron them out.
bb10a72
to
6ae459f
Compare
e9492ce
to
e9f6c44
Compare
Hi @zbjornson could you please trigger the tests again? The PR should be ready for review :). |
thanks @voltbit & @zbjornson! we are excited about this feature! |
Hi @zbjornson. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skimmed through, looks good! Lots of code, but existent tests seems to pass, so safe enough?
CHANGELOG.md
Outdated
- feat: new option for calling `observe()` and `inc()` methods on the histogram | ||
and counter metric types that can be passed an object of format | ||
`{labels: a, value: b, exemplarLabels: c}` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be a separate PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right now the implementation is quite tightly coupled to the feature for exemplars. I though about making this more generic from the beginning but I was worried about the performance impact. I tried to provide 100% backwards compatibility and minimal performance impact with the change.
Right now the call with inc({value, labels, exemplarLabels})
can only be done if the metric was created with exemplars enabled.
If I make this call more generic - allow inc() and observe() to be called with an object - I would need to differentiate between the inc({value, labels, exemplarLabels})
and the call that already exists inc({labelNames})
. This might not be that bad, but it implies a check on the keys of the object every time inc/obs are used. I should also add it to all metrics, not just counter and histogram for the sake of consistency.
I am not sure which is the better aproach, let me know if you have feedback about this.
So the alternative to what we have now in the PR is: implement an object type check on all metric types and allow them to receive a single object containing all the relevant data for ins/obs operations. This might be useful on the long run, but it may as well be a feature that slows down the library for years until we need more fields like we do with exemplars (which are kind of a fringe feature anyway).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the alternative to what we have now in the PR is: implement an object type check on all metric types and allow them to receive a single object containing all the relevant data for ins/obs operations. This might be useful on the long run, but it may as well be a feature that slows down the library for years until we need more fields like we do with exemplars (which are kind of a fringe feature anyway).
One thing we could do is drop support for passing labels
(inc({labelNames})
like you mention). That way you either pass a single value or an object. I don't typeof arg === 'number'
adds much overhead, and then the object shape would be what we want.
Would require a major version of course, but that's not an issue. We wanna drop old nodes anyways
Merging registries of different types is undefined. The user needs to make sure | ||
all used registries have the same type (Prometheus or OpenMetrics versions). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
again, can we throw a usweful error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Docs should be updated to say it throws, no? Not undefined behavior
Hi guys. Are there, by chance, any updates on how this is progressing and an estimate, hopefully? |
Hi @dnutels I will start working again on this in a couple of weeks, I will implement the changes requested around mid May, but I cant work on it earlier. |
d16917e
to
5e10b87
Compare
New runs for benchmarks with the latest changes. Expand for benchmark resultsSummary:✗ registry ➭ getMetricsAsJSON#1 with 64 is 11.43% slower. Summary:✓ registry ➭ getMetricsAsJSON#1 with 64 is 1.990% faster. Summary:✗ registry ➭ getMetricsAsJSON#1 with 64 is 12.81% slower. |
d24e4fc
to
ebdbc91
Compare
Hi @zbjornson. any news regarding the PR ? |
Bumping for interest |
Bumping for interest again :) |
Bumping for interest as well, this would save my organization so much pain! |
bump for intrest. i really need this |
bump for interest. really important task |
bumping for interest! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this generally LGTM. Would love another set of eyes, tho.
Thanks for a fantastic contribution, @voltbit!
@@ -0,0 +1,66 @@ | |||
import * as prom from '../index'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need both a JS and a TS example - could you remove this one?
@@ -150,16 +185,29 @@ interface MetricConfiguration<T extends string> { | |||
name: string; | |||
help: string; | |||
labelNames?: T[] | readonly T[]; | |||
registers?: Registry[]; | |||
registers?: Registry<PrometheusContentType | OpenMetricsContentType>[]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
registers?: Registry<PrometheusContentType | OpenMetricsContentType>[]; | |
registers?: (Registry<PrometheusContentType> | Registry<OpenMetricsContentType>)[]; |
right? A registry itself cannot contain either
@@ -84,19 +84,30 @@ class AggregatorRegistry extends Registry { | |||
}); | |||
} | |||
|
|||
get contentType() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this needed? won't the class just inherit from super
?
@@ -27,14 +43,21 @@ class Histogram extends Metric { | |||
return acc; | |||
}, {}); | |||
|
|||
this.bucketExemplars = this.upperBounds.reduce((acc, upperBound) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this be created even if config.enableExemplars
isn't true
?
@zbjornson @siimon PTAL 🙂 |
Thanks for all the feedback and all the interest shown! I'll rebase and check the comments as soon as possible (this weekend most likely). |
@voltbit is there anything we can do to help you? |
- Added support for OpenMetrics format, including exemplars - Added support for exemplars with OpenTelemetry tracing data on default metrics - Added the option of passing params as one object to observe() and inc() methods
ebdbc91
to
bd807c6
Compare
@@ -1,6 +1,8 @@ | |||
'use strict'; | |||
|
|||
const OtelApi = require('@opentelemetry/api'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should be added to package.json
regContentType !== Registry.PROMETHEUS_CONTENT_TYPE && | ||
regContentType !== Registry.OPENMETRICS_CONTENT_TYPE | ||
) { | ||
throw new TypeError('Content type unsupported'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throw new TypeError('Content type unsupported'); | |
throw new TypeError(`Content type ${regContentType} is unsupported`); |
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as off-topic.
This comment was marked as spam.
This comment was marked as spam.
Superseded by #544 |
Overview
The main purpose of the PR is to enable the use of Exemplars for NodeJS codebase.
Relevant resources:
Grafana exemplars
Prometheus exemplars
OpenMetics spec
Prometheus format
Design
The OpenTelemetry standard is very close to the original Prometheus format. In order to keep the library as backwards compatible as possible the default format is kept unchanged (Prometheus) and the use of exemplars is disabled.
The new features can be toggled:
Each registry instance has an attribute (contentType) that will decide the format.
The two possible formats are defined by the constants
OPENMETRICS_CONTENT_TYPE
andPROMETHEUS_CONTENT_TYPE
which contain the HTTP content type.Future versions should default to the
1.0.0
version.Each metric has a flag for enabling the exemplar, the flag is put on the metrics supeclass for simplicity, but out of the currently implemented metric types only histograms and counters can have exemplars.
The biggest change to the code is the creation of separate functions for Counter increment and Histogram observe. Because the functions need to support a third optional parameter (exemplar labels) I have changed the way parameters are passed to the functions. Instead of using plain
(label, value)
the users will need to provide a single object with the format({labels, value, exemplarLabels})
.The change should not impact existing users, but users who want to use exemplars will need to use the new call format.
Exemplar object
Timestamp - is the time when the exemplar was created
Reference from the Golang client: https://github.com/prometheus/client_golang/blob/1b145cad6847a692bd07e872d64b7102d33213c6/prometheus/histogram.go#L432.
There is a hard 128 UTF-8 character limit on exemplar length.
The labels use for out of the box traces are
traceId
andspanId
, it feels more like JavaScript to me, there is no other reason for the name choice. The golang implementation seems to be usingtraceID
here and the Java impl. usestrace_id
here. The label used for exemplars can be changed in Grafana.Counters in OpenMetrics
Counters have a brekaing change in the form of an enforced
_total
suffix, it is not just a convention anymore. Examples:Prometheus
OpenMetrics
Prometheus ignores the comments related to name and type, but the name of the metrics changes too and has the potential to break dashboards/alerts etc. The current implementation follows the same approach as the Java implementation here:
https://github.com/prometheus/client_java/blob/master/simpleclient/src/main/java/io/prometheus/client/Counter.java#L72-L108
However, instead of applying the suffix at the level of the Counter object, this implementation applyes the change in the Registry object. The disadvantage is that the code is less elegant. The advantage is that the change is not breaking in any way for the existing users - the
_total
suffix will only be enforced by OpenMetrics registries, not Prometheus registries.In the future, when OpenMetrics becomes more widely adopted, the behaviour can be moved inside Counter object and made mandatory.
Benchmarks
Benchmark tests were not changed. They are using the default registry type (Prometheus) and no exemplars, so it is a check to see the impact for current users of the library. Ran 4 tests (results in gists bellow). The highest impact was on the registry benchmark with a ~10-15% performance hit.
https://gist.github.com/voltbit/55bfdafccb5a0458d0b2aff9703dae43
https://gist.github.com/voltbit/1e1097e6400638334e11da52fefcd5d4
https://gist.github.com/voltbit/41828df848a7132c1aad196414ea2d69
https://gist.github.com/voltbit/70539929453a2b95d4f9ac2df6707a9b
TODO
Strategy for registry merge when there are different registry formats_total
suffix on countersNot implemented
_created
suffix on any metrics