fix(outputs.stackdriver): Options to use official path and types #13454
var kind string
switch m.Type() {
case telegraf.Gauge, telegraf.Untyped:
This line says that an untyped metric should be sent as a gauge. From our experience with tons of customers using GMP, this logic will definitely lead to incorrect behavior. Tons of metrics get untyped at the source for reasons unknown to me, and we have to account for this because people really don't want to change their exporters.
The way we handle this with our other collectors is, when a metric is untyped, to send it twice: once with the suffix "unknown" and once with the suffix "unknown:counter". Then we do some heuristic magic on the query side to choose the gauge (unknown) or the counter (unknown:counter) depending on our best guess based on the query functions you use. If you then type the metric, we union the data.
This is a bit larger of a change but without it this commit will 100% lead to unhappy customers on our (Google's) end.
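As a rough illustration of the write-twice approach described in this comment, here is a minimal Go sketch. `ValueType`, `suffixesFor`, and the example metric name are hypothetical names defined only for this example; the `"counter"` and `"gauge"` suffix strings and the `name/suffix` layout are assumptions, and only `"unknown"` and `"unknown:counter"` come from the comment itself.

```go
package main

import "fmt"

// ValueType is a hypothetical stand-in for Telegraf's metric value type,
// defined locally so the sketch compiles on its own.
type ValueType int

const (
	Untyped ValueType = iota
	Counter
	Gauge
)

// suffixesFor returns the metric-type suffixes to emit for one metric.
// Typed metrics get a single suffix (assumed strings); untyped metrics
// are written twice so the query side can pick gauge or counter semantics.
func suffixesFor(t ValueType) []string {
	switch t {
	case Counter:
		return []string{"counter"}
	case Gauge:
		return []string{"gauge"}
	default:
		return []string{"unknown", "unknown:counter"}
	}
}

func main() {
	for _, s := range suffixesFor(Untyped) {
		// The "name/suffix" layout is an assumption for illustration only.
		fmt.Println("http_requests_total/" + s)
	}
}
```

Because the two suffixes produce two distinct metric names, both copies can go out in the same write without colliding.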
Thanks for taking a look at the PR and giving insight into how this ends up getting used.
> The way we handle this with our other collectors is, when a metric is untyped, to send it twice
Because the two points would have different metric types, both can be sent as part of the same timeseries, the only difference being the suffix of the metric type?
> This is a bit larger of a change but without it this commit will 100% lead to unhappy customers on our (Google's) end.
fwiw this change will not be the default behavior and is opt-in. We do not want to change things for existing users, who as of now are not even setting these types.
> Tons of metrics get untyped at the source for reasons unknown to me
My understanding is that any Telegraf metric not read from Prometheus or parsed with the Prometheus parser is stored internally as an untyped metric.
Thanks again!
> Thanks for taking a look at the PR and giving insight into how this ends up getting used.
> > The way we handle this with our other collectors is, when a metric is untyped, to send it twice
> Because the two points would have different metric types, both can be sent as part of the same timeseries, the only difference being the suffix of the metric type?
Yeah - you can send the two metrics in the same call for sure. They'd basically be identical time series with just the suffix of the metric name changed. In our system they're treated as two separate metrics (as they have different metric names due to the suffix), so there are no collisions or anything if you send both together.
> > This is a bit larger of a change but without it this commit will 100% lead to unhappy customers on our (Google's) end.
> fwiw this change will not be the default behavior and is opt-in. We do not want to change things for existing users, who as of now are not even setting these types.
Ack - makes sense, thanks.
> > Tons of metrics get untyped at the source for reasons unknown to me
> My understanding is that any Telegraf metric not read from Prometheus or parsed with the Prometheus parser is stored internally as an untyped metric.
Yeah - we're running into this issue with a different customer using InfluxDB. One of our official collectors isn't doing the write-twice thing by accident (bug), and they're unable to query counter metrics properly as a result. There also doesn't seem to be a way for them to change it at the source without modifying the exporter code, which they're loath to do, so they're stuck until we can make an upstream change to our collector.
Totally understandable on their part and a very common situation, which is why handling untyped metrics properly at ingestion time is so important on our end.
Thanks again!
Awesome, I have pushed a change to:
- Mark untyped metrics with "unknown"
- If we are using the official metric name format, and we run across a metric with an unknown kind, then we will create another timeseries, identical, but with the "unknown:counter" kind.
I need to add some additional test cases, but wanted to get this into folks' hands quickly to try out.
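The two bullets above could be sketched roughly as follows. This is not the plugin's actual code: `TimeSeries` and `expandUnknown` are hypothetical, simplified names, and the boolean flag merely stands in for whatever option enables the official metric name format.

```go
package main

import "fmt"

// TimeSeries is a hypothetical, simplified stand-in for the plugin's
// time series type; only the fields needed for the sketch are present.
type TimeSeries struct {
	Name  string
	Kind  string
	Value float64
}

// expandUnknown appends an "unknown:counter" twin for every series whose
// kind is "unknown", but only when the official naming format is enabled.
func expandUnknown(in []TimeSeries, official bool) []TimeSeries {
	out := make([]TimeSeries, 0, len(in))
	for _, ts := range in {
		out = append(out, ts)
		if official && ts.Kind == "unknown" {
			twin := ts // struct copy; identical apart from the kind
			twin.Kind = "unknown:counter"
			out = append(out, twin)
		}
	}
	return out
}

func main() {
	series := []TimeSeries{
		{Name: "cpu_usage", Kind: "gauge", Value: 0.5},
		{Name: "requests", Kind: "unknown", Value: 42},
	}
	for _, ts := range expandUnknown(series, true) {
		fmt.Printf("%s/%s = %v\n", ts.Name, ts.Kind, ts.Value)
	}
}
```

With the flag off the input passes through unchanged, which matches the opt-in behavior discussed earlier in the thread.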
I have been running this for about 20 hours and everything seems to be working correctly.
Awesome, thank you for the feedback!
Histogram and summary types are not supported by the plugin and will produce an error. I am sure we could look at changing that in the future with some help.
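A minimal sketch of what rejecting unsupported types might look like, assuming a locally defined `ValueType` enum rather than Telegraf's real one. `kindFor`, `errUnsupported`, and the returned kind strings are hypothetical and chosen only for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ValueType is a hypothetical stand-in for Telegraf's metric value type.
type ValueType int

const (
	Untyped ValueType = iota
	Counter
	Gauge
	Summary
	Histogram
)

// errUnsupported is a hypothetical sentinel error for this sketch.
var errUnsupported = errors.New("unsupported metric type")

// kindFor maps a value type to a kind string and returns an error for
// types the output cannot represent (summaries and histograms here).
func kindFor(t ValueType) (string, error) {
	switch t {
	case Gauge:
		return "gauge", nil
	case Counter:
		return "counter", nil
	case Untyped:
		return "unknown", nil
	default:
		return "", fmt.Errorf("%w: %d", errUnsupported, t)
	}
}

func main() {
	if _, err := kindFor(Histogram); err != nil {
		fmt.Println("skipping metric:", err)
	}
}
```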
No problem, do we have anything outstanding before we merge? @powersj
I need to update the README with a bit more info and get a final review from @srebhan. I'll work on the README now. Thanks again for the feedback!
Nice! Just one small typo...
Committed the typo fix... Thanks @powersj!
(cherry picked from commit 45f9942)
fixes: #13303