allow sending network traffic usage for app metrics #82
Conversation
client.go
Outdated
| loggregator.WithGaugeValue("memory_quota", float64(m.MemoryBytesQuota), "bytes"), | ||
| loggregator.WithGaugeValue("disk_quota", float64(m.DiskBytesQuota), "bytes"), | ||
| loggregator.WithGaugeValue("rx_bytes", float64(m.RxBytes), "bytes"), | ||
| loggregator.WithGaugeValue("tx_bytes", float64(m.TxBytes), "bytes"), |
In the demo of this, it looked like rx_bytes + tx_bytes were cumulative counters (value increasing over the life of the interface/container), rather than gauges (point-in-time values of usage/consumption).
Does it make sense to submit these via loggregator's EmitCounter logic instead?
On the metrics side we prefer modeling everything as counters rather than gauges when possible. With a counter, a lost datapoint only reduces precision; with a gauge, a lost datapoint is lost forever.
Specifically in this case it's often useful to know how much data a container has transmitted over a given window, which counters make possible.
Very good points from both of you!
We've changed the tx_bytes and rx_bytes to be sent as counters 👍 . See https://github.com/cloudfoundry/diego-logging-client/pull/82/files#diff-4b667feae66c9d46b21b9ecc19e8958cf4472d162ce0a47ac3e8386af8bbd8cfR246-R253
Example from having this deployed to one of our dev landscapes:
2023-08-10T13:27:50.27+0000 [ping/0] COUNTER rx_bytes:24842
2023-08-10T13:27:50.27+0000 [ping/0] COUNTER tx_bytes:24306
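For reference, a minimal sketch of what the switch from gauges to counters might look like using go-loggregator's EmitCounter. The `containerMetric` struct, the `emitNetworkTraffic` function, and the v9 import path are assumptions for illustration only and are not necessarily the actual diego-logging-client code:

```go
package main

import (
	loggregator "code.cloudfoundry.org/go-loggregator/v9" // or v8, depending on the vendored version
)

// containerMetric is a hypothetical stand-in for the PR's container metric type.
type containerMetric struct {
	RxBytes uint64
	TxBytes uint64
}

func emitNetworkTraffic(client *loggregator.IngressClient, m containerMetric) {
	// Counters carry a monotonically increasing total; a dropped envelope only
	// loses precision, because the next total still reflects all traffic so far.
	client.EmitCounter("rx_bytes", loggregator.WithTotal(m.RxBytes))
	client.EmitCounter("tx_bytes", loggregator.WithTotal(m.TxBytes))

	// The earlier approach in this diff emitted the same values as point-in-time gauges:
	// client.EmitGauge(
	// 	loggregator.WithGaugeValue("rx_bytes", float64(m.RxBytes), "bytes"),
	// 	loggregator.WithGaugeValue("tx_bytes", float64(m.TxBytes), "bytes"),
	// )
}
```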
With this feature, users that deploy this version of diego-release would see a significant increase in their foundation's total emitted metrics. That is not always desirable, since the cost of consuming those metrics can rise sharply after upgrading to this version. One suggestion would be to move to a model where operators can choose which app metrics to consume: default to what is emitted today, and allow anyone to override that default.
Very good point @winkingturtle-vmw! We implemented a feature flag for turning these metrics on and off, see:
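As a rough illustration of the kind of opt-in flag being discussed: a sketch in which the new counters are gated behind a boolean config field that defaults to off. The `Config` struct, the `emit_network_metrics` property name, and the `metricsEmitter` wiring below are hypothetical, not the actual diego-release property:

```go
package main

import (
	loggregator "code.cloudfoundry.org/go-loggregator/v9"
)

type Config struct {
	// Defaults to false so existing foundations see no new metrics after upgrading.
	EmitNetworkMetrics bool `json:"emit_network_metrics"`
}

type metricsEmitter struct {
	cfg    Config
	client *loggregator.IngressClient
}

func (e *metricsEmitter) emit(rxBytes, txBytes uint64) {
	if !e.cfg.EmitNetworkMetrics {
		// Operators who do not opt in keep the pre-upgrade metric volume.
		return
	}
	e.client.EmitCounter("rx_bytes", loggregator.WithTotal(rxBytes))
	e.client.EmitCounter("tx_bytes", loggregator.WithTotal(txBytes))
}
```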
What is this change about?
Include network traffic usage when sending app metrics.
What problem is it trying to solve?
Stakeholders can't observe the network traffic usage for a particular app.
What is the impact if the change is not made?
`cf tail -c metrics <app-guid> -f` will not show `rx_bytes` and `tx_bytes`.
How should this change be described in diego-release release notes?
Container network traffic is now being sent to the logging stack.
Please provide any contextual information.
Tag your pair, your PM, and/or team!