Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve cloud build failure monitoring & alerting rule #48

Open
hyangah opened this issue Oct 25, 2024 · 0 comments
Open

improve cloud build failure monitoring & alerting rule #48

hyangah opened this issue Oct 25, 2024 · 0 comments

Comments

@hyangah
Copy link
Contributor

hyangah commented Oct 25, 2024

Since cloud build does not export success/failure metrics (https://issuetracker.google.com/issues/248609607)
the current rule uses the cloud build log and creates an incident when it sees "ERROR" level log message.
Unfortunately, that doesn't distinguish cases like

  • The internal CB trigger misbehaves and creates duplicate builds on the same commit simultaneously. In that case, only one of them will succeed thank to our own deployment locking mechanism. But the failed builds will fire alerts. (e.g. build id = b3d258c6).

  • When build is cancelled in favor of the following build request (e.g. cls submitted back-to-back??), the cancelled build produces "ERROR" level log and triggers alerts. (e.g. build id = d451fa13).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant