Conversation

@geigerj0 geigerj0 commented Jul 14, 2023

What is this change about?

Include network traffic usage when sending app metrics.

What problem is it trying to solve?

Stakeholders can't observe the network traffic usage for a particular app.

What is the impact if the change is not made?

cf tail -c metrics <app-guid> -f will not show rx_bytes and tx_bytes.

How should this change be described in diego-release release notes?

Container network traffic is now being sent to the logging stack.


@geigerj0 geigerj0 changed the title from "include network traffic usage when sending app metrics" to "allow sending network traffic usage for app metrics" Jul 14, 2023
@geigerj0 geigerj0 marked this pull request as ready for review July 18, 2023 06:50
@geigerj0 geigerj0 requested a review from a team as a code owner July 18, 2023 06:50
client.go Outdated
loggregator.WithGaugeValue("memory_quota", float64(m.MemoryBytesQuota), "bytes"),
loggregator.WithGaugeValue("disk_quota", float64(m.DiskBytesQuota), "bytes"),
loggregator.WithGaugeValue("rx_bytes", float64(m.RxBytes), "bytes"),
loggregator.WithGaugeValue("tx_bytes", float64(m.TxBytes), "bytes"),
Contributor

In the demo of this, it looked like rx_bytes and tx_bytes were cumulative counters (values increasing over the life of the interface/container) rather than gauges (point-in-time values of usage/consumption).

Does it make sense to submit these via loggregator's EmitCounter logic instead?
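
For reference, a minimal sketch of what the counter-based variant could look like, assuming go-loggregator v9's EmitCounter with the WithTotal and WithCounterAppInfo options; the emitNetworkCounters helper is hypothetical and not the code this PR ended up with:

```go
package example

import (
	loggregator "code.cloudfoundry.org/go-loggregator/v9"
)

// emitNetworkCounters is a hypothetical helper illustrating the counter-based
// alternative to the WithGaugeValue lines above: rx/tx are reported as
// cumulative totals rather than point-in-time gauge values.
func emitNetworkCounters(c *loggregator.IngressClient, appID string, instanceIndex int, rxBytes, txBytes uint64) {
	c.EmitCounter("rx_bytes",
		loggregator.WithTotal(rxBytes),                       // cumulative bytes received by the container
		loggregator.WithCounterAppInfo(appID, instanceIndex), // attribute the envelope to the app instance
	)
	c.EmitCounter("tx_bytes",
		loggregator.WithTotal(txBytes), // cumulative bytes transmitted by the container
		loggregator.WithCounterAppInfo(appID, instanceIndex),
	)
}
```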

Member

On the metrics side we prefer modeling everything as counters instead of gauges when possible. With a counter, a lost datapoint only reduces precision; with a gauge, a lost datapoint is lost forever.

Specifically in this case it's often useful to know how much data a container has transmitted over a given window, which counters make possible.
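
A tiny made-up illustration of that window calculation: with cumulative counters, traffic over a window is just the difference between two readings, so datapoints dropped in between only coarsen the window rather than losing the traffic.

```go
package main

import "fmt"

// windowBytes returns the traffic seen between two counter readings.
// Datapoints dropped inside the window don't affect the result; a counter
// reset (e.g. on container restart) would need extra handling, omitted here.
func windowBytes(startTotal, endTotal uint64) uint64 {
	return endTotal - startTotal
}

func main() {
	// Made-up readings of the tx_bytes counter at the window edges.
	fmt.Println(windowBytes(10_000, 24_306)) // 14306 bytes transmitted over the window
}
```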

Contributor Author

@geigerj0 geigerj0 Aug 10, 2023

Very good points from both of you!

We've changed the tx_bytes and rx_bytes to be sent as counters 👍 . See https://github.com/cloudfoundry/diego-logging-client/pull/82/files#diff-4b667feae66c9d46b21b9ecc19e8958cf4472d162ce0a47ac3e8386af8bbd8cfR246-R253

Example from having this deployed to one of our dev landscapes:

2023-08-10T13:27:50.27+0000 [ping/0] COUNTER rx_bytes:24842
2023-08-10T13:27:50.27+0000 [ping/0] COUNTER tx_bytes:24306

@winkingturtle-vmw
Contributor

With this feature, users that deploy this version of diego-release would see a significant increase in their foundation's total emitted metrics. This is not always ideal, since the cost of consuming those metrics can go up significantly after upgrading to this version. One suggestion would be to move to a model where operators can choose which app metrics to consume. We can always default to what's being emitted currently and allow the option for anyone to override the default value.

geigerj0 and others added 2 commits August 9, 2023 12:47

geigerj0 commented Aug 10, 2023

With this feature, users that deploy this version of diego-release would see a significant increase in their foundation's total emitted metrics. This is not always ideal, since the cost of consuming those metrics can go up significantly after upgrading to this version. One suggestion would be to move to a model where operators can choose which app metrics to consume. We can always default to what's being emitted currently and allow the option for anyone to override the default value.

Very good point @winkingturtle-vmw!

We implemented a feature flag for turning on/off these metrics, see
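
The flag itself isn't quoted here, but as a rough sketch of the shape such a gate can take (Config and EmitContainerNetworkMetrics are hypothetical names, not the actual diego-release property), emission of the new counters is simply skipped unless the flag is enabled:

```go
package example

import (
	loggregator "code.cloudfoundry.org/go-loggregator/v9"
)

// Config is a stand-in for the client configuration; the real flag name and
// plumbing in diego-logging-client / diego-release differ.
type Config struct {
	// EmitContainerNetworkMetrics is a hypothetical flag name. Defaulting it
	// to false keeps the emitted metric volume unchanged for operators who
	// don't opt in.
	EmitContainerNetworkMetrics bool
}

func maybeEmitNetworkCounters(cfg Config, c *loggregator.IngressClient, appID string, instanceIndex int, rxBytes, txBytes uint64) {
	if !cfg.EmitContainerNetworkMetrics {
		return // flag off: no new metrics, no extra downstream cost
	}
	c.EmitCounter("rx_bytes",
		loggregator.WithTotal(rxBytes),
		loggregator.WithCounterAppInfo(appID, instanceIndex),
	)
	c.EmitCounter("tx_bytes",
		loggregator.WithTotal(txBytes),
		loggregator.WithCounterAppInfo(appID, instanceIndex),
	)
}
```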

