Report resource usage counts by handling heartbeat events #35968
The PR changelog entry failed validation: Changelog entry not found in the PR body. Please add a "no-changelog" label to the PR, or changelog lines starting with
457c248 to f37ed08
Do we want to abort the listing operation if there is one bad report in storage? Can we log the failure and keep trying the rest of the reports instead?
Potentially, but if we make that change we should do it for user activity reports too, so for now I think it's best to be consistent with current behavior.
Probably something we want to look into. I know we've had various bugs where fetching resources failed because one bad resource aborted the entire operation.
Failing usage data submission will result in a cluster alert, which will hopefully prompt the customer into calling us. That'd still work if we skipped over invalid data, but we would need to tweak the logic around creating and deleting the cluster alerts (to still create one), since we don't want to keep ignoring some logic bug.
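To make the trade-off in this thread concrete, here is a minimal sketch of the two listing strategies. It is not the code in this PR: `usageReport`, `listStrict`, and `listLenient` are hypothetical names, and JSON stands in for the real stored report encoding. The lenient variant returns a skipped count so that a caller could still create the cluster alert instead of silently ignoring a logic bug.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// usageReport is a hypothetical stand-in for a stored usage report.
type usageReport struct {
	Name  string `json:"name"`
	Count int    `json:"count"`
}

// listStrict mirrors the behavior discussed as "current": the first
// undecodable item aborts the whole listing.
func listStrict(items [][]byte) ([]usageReport, error) {
	reports := make([]usageReport, 0, len(items))
	for i, raw := range items {
		var r usageReport
		if err := json.Unmarshal(raw, &r); err != nil {
			return nil, fmt.Errorf("decoding report %d: %w", i, err)
		}
		reports = append(reports, r)
	}
	return reports, nil
}

// listLenient logs and skips undecodable items. It reports how many were
// skipped so the caller can still raise an alert.
func listLenient(items [][]byte) (reports []usageReport, skipped int) {
	for i, raw := range items {
		var r usageReport
		if err := json.Unmarshal(raw, &r); err != nil {
			log.Printf("skipping invalid report %d: %v", i, err)
			skipped++
			continue
		}
		reports = append(reports, r)
	}
	return reports, skipped
}

func main() {
	items := [][]byte{
		[]byte(`{"name":"a","count":1}`),
		[]byte(`not json`),
		[]byte(`{"name":"b","count":2}`),
	}

	if _, err := listStrict(items); err != nil {
		fmt.Println("strict listing failed:", err)
	}

	reports, skipped := listLenient(items)
	fmt.Printf("lenient listing: %d reports, %d skipped\n", len(reports), skipped)
	if skipped > 0 {
		// Assumption: this is where alert creation would hook in.
		fmt.Println("would still raise a cluster alert")
	}
}
```

With the lenient variant, the alert create/delete logic would key off `skipped > 0` as well as off outright submission failures, which is the tweak mentioned above.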
f37ed08 to a03f93d
espadolini left a comment
do-not-merge because the .protos need to be updated in cloud master first, then copied here, other than that LGTM.
Protos were already merged in cloud master (see https://github.com/gravitational/cloud/pull/6823) but I pulled the latest in here (only differences were in comments).
6e930ff to 89c35c9
cc @timothyb89 - this pulls in some of your new bot protos. Let me know if that's okay.
I think it's fine, it'll just be competing with #35881 so one of us will have a conflict to resolve 🙂
…se-anon-key
* origin/master: (344 commits)
  Undelete CreateHostUserMode_HOST_USER_MODE_DROP (gravitational#36273)
  allow cwd to be changed in difftest (gravitational#35946)
  Auth device list component (gravitational#36235)
  make unified resources responsive (gravitational#35961)
  Support running Teleport in a "hot reload" mode (gravitational#35040)
  Prevent deleting enum values, allow deleting enum reservations in types.proto (gravitational#36248)
  Remove support for legacy (Amazon Linux 2) AMIs (gravitational#36153)
  Bump version(s) used for teleport-lab and teleport-quickstart (gravitational#36167)
  Allow Reconciler update handler to examine old value during update (gravitational#36171)
  Validate the user still exists during account reset (gravitational#35676)
  ButtonTextWithAddIcon shared component (gravitational#36103)
  Refactor hostname resolution for SSH connections via the WebUI (gravitational#35773)
  add structuredClone to jest JSDOMEnvironment (gravitational#36213)
  fix flaky `lib/auth` cache-enabled tests (gravitational#36216)
  Report resource usage counts by handling heartbeat events (gravitational#35968)
  Reviewer bot should use the stable version of Go (gravitational#36242)
  RFD 0153 Resource Guidelines (gravitational#34103)
  Use cmp and cmpots properly in operator tests (gravitational#36215)
  Relax Kubernetes CRD discovery when building cache (gravitational#36214)
  Add Access List messages to TAG protobuf (gravitational#36176)
  ...
Buddy PR for #34954
Closes #34954