Enable metrics #324

Merged
openshift-merge-robot merged 2 commits into openshift:master from Danil-Grigorev:enable-metrics
Jul 8, 2020

Conversation

@Danil-Grigorev Danil-Grigorev commented May 21, 2020

Working on https://issues.redhat.com/browse/OCPCLOUD-784

This PR adds support for reporting the following Prometheus metrics and starts the controller-runtime metrics server so that Prometheus servers can scrape them:

  • mapi_instance_create_failed: Total count of "create" cloud api errors
  • mapi_instance_update_failed: Total count of "update" cloud api errors
  • mapi_instance_delete_failed: Total count of "delete" cloud api errors

Labels on these metrics (for AWS):

prometheus.Labels{
  "name": machine.Name,
  "namespace": machine.Namespace,
  "reason": error.Reason,
}
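
As a rough illustration of how a counter keyed by the name, namespace, and reason labels behaves, here is a minimal stdlib-only sketch. The `CounterVec` and `Labels` types are hypothetical stand-ins written for this example, not the Prometheus client types used in the PR, and the label values in `main` are made up.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// Labels mirrors the shape of a label map: label name -> label value.
type Labels map[string]string

// CounterVec is a hypothetical, stdlib-only stand-in for a labeled counter.
// Each distinct label set gets its own independent count.
type CounterVec struct {
	mu     sync.Mutex
	name   string
	counts map[string]float64
}

// NewCounterVec creates an empty counter family with the given metric name.
func NewCounterVec(name string) *CounterVec {
	return &CounterVec{name: name, counts: make(map[string]float64)}
}

// key builds a deterministic series key by sorting the label names first.
func key(l Labels) string {
	names := make([]string, 0, len(l))
	for n := range l {
		names = append(names, n)
	}
	sort.Strings(names)
	parts := make([]string, 0, len(names))
	for _, n := range names {
		parts = append(parts, fmt.Sprintf("%s=%q", n, l[n]))
	}
	return strings.Join(parts, ",")
}

// Inc adds one to the series identified by the label set.
func (c *CounterVec) Inc(l Labels) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[key(l)]++
}

// Value returns the current count for the label set (zero if never seen).
func (c *CounterVec) Value(l Labels) float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[key(l)]
}

func main() {
	createFailed := NewCounterVec("mapi_instance_create_failed")

	// Hypothetical label values, chosen only for illustration.
	labels := Labels{
		"name":      "worker-0",
		"namespace": "openshift-machine-api",
		"reason":    "InvalidConfiguration",
	}

	// Two failed "create" calls for the same machine and reason.
	createFailed.Inc(labels)
	createFailed.Inc(labels)

	fmt.Println(createFailed.name, createFailed.Value(labels))
}
```

Because the reason is part of the label set, two failures of the same machine with different reasons are counted as separate series.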

Depends on PR openshift/machine-api-operator#609, which introduces metrics support.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 21, 2020
@openshift-ci-robot

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@Danil-Grigorev Danil-Grigorev marked this pull request as ready for review May 21, 2020 15:25
@openshift-ci-robot openshift-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 21, 2020
@Danil-Grigorev Danil-Grigorev force-pushed the enable-metrics branch 7 times, most recently from 0d6bac3 to 889f784 Compare May 22, 2020 14:11
@Danil-Grigorev
Author

Waits for PR openshift/machine-api-operator#590 to merge.

@openshift-ci-robot openshift-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label May 30, 2020
@Danil-Grigorev Danil-Grigorev force-pushed the enable-metrics branch 3 times, most recently from 80b7172 to 2303036 Compare June 4, 2020 10:10
@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 4, 2020
go.mod Outdated

I'd prefer this to be moved back to openshift master before merging.
Hold until we can determine whether this is absolutely necessary.
/hold

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 4, 2020
Member

why is this ignoring this error now?

Author

The method always returns nil as an error.

Member

how is this related to this commit?

Author

It is not, but I didn't like it 😅

Member

How are these metrics actually being exposed?
Wouldn't this need to call metrics.Registry.MustRegister(failedInstanceCreateCount) or something similar?
https://github.com/kubernetes-sigs/controller-runtime/blob/c0438568a706ec61de31b92f4d76e7fb7e1007b9/pkg/internal/controller/metrics/metrics.go#L50
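
To make the registration question concrete, here is a minimal stdlib-only sketch of the general pattern: a counter can be incremented anywhere, but it only shows up on the metrics endpoint once it has been registered with the registry that the endpoint serves. The `counterVec`, `registry`, and `scrape` names are hypothetical helpers invented for this example; they are not the controller-runtime or Prometheus client APIs, and the sketch does not show how this PR wires things up.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sort"
	"sync"
)

// counterVec is a hypothetical stand-in for a labeled Prometheus counter.
type counterVec struct {
	mu     sync.Mutex
	name   string
	help   string
	series map[string]float64 // rendered label set -> count
}

func newCounterVec(name, help string) *counterVec {
	return &counterVec{name: name, help: help, series: map[string]float64{}}
}

// inc bumps the series for one machine/namespace/reason combination.
// %q only approximates Prometheus label escaping; it is fine for plain values.
func (c *counterVec) inc(machine, namespace, reason string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.series[fmt.Sprintf("name=%q,namespace=%q,reason=%q", machine, namespace, reason)]++
}

// write emits the counter in the Prometheus text exposition format.
func (c *counterVec) write(w io.Writer) {
	c.mu.Lock()
	defer c.mu.Unlock()
	fmt.Fprintf(w, "# HELP %s %s\n# TYPE %s counter\n", c.name, c.help, c.name)
	keys := make([]string, 0, len(c.series))
	for k := range c.series {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		fmt.Fprintf(w, "%s{%s} %g\n", c.name, k, c.series[k])
	}
}

// registry serves only the counters that were explicitly registered.
type registry struct {
	mu         sync.Mutex
	collectors []*counterVec
}

// mustRegister adds a counter to the registry and panics on a duplicate name.
func (r *registry) mustRegister(c *counterVec) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, existing := range r.collectors {
		if existing.name == c.name {
			panic("duplicate metric: " + c.name)
		}
	}
	r.collectors = append(r.collectors, c)
}

// ServeHTTP writes every registered counter, in registration order.
func (r *registry) ServeHTTP(w http.ResponseWriter, _ *http.Request) {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, c := range r.collectors {
		c.write(w)
	}
}

// scrape issues one in-process GET and returns the response body.
func scrape(h http.Handler) string {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/metrics", nil))
	return rec.Body.String()
}

func main() {
	created := newCounterVec("mapi_instance_create_failed", `Total count of "create" cloud api errors`)
	deleted := newCounterVec("mapi_instance_delete_failed", `Total count of "delete" cloud api errors`)

	reg := &registry{}
	reg.mustRegister(created) // deleted is deliberately left unregistered

	// Hypothetical label values, chosen only for illustration.
	created.inc("worker-0", "openshift-machine-api", "InvalidConfiguration")
	deleted.inc("worker-0", "openshift-machine-api", "Timeout")

	// Only the registered counter appears in the scraped output.
	fmt.Print(scrape(reg))
}
```

In this sketch the unregistered counter still accumulates values in memory, but a scraper never sees them, which is the failure mode the question above is probing for.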

Author

@JoelSpeed JoelSpeed left a comment

Did we need to update the vendor directory as part of this PR?

@enxebre Did you want to separate the providerID change so that it's a separate commit/PR to make it more obvious?

@Danil-Grigorev
Author

> Did we need to update the vendor directory as part of this PR?
>
> @enxebre Did you want to separate the providerID change so that it's a separate commit/PR to make it more obvious?

Created #331

@Danil-Grigorev
Author

/hold Waiting on openshift/machine-api-operator#590

@openshift-ci-robot openshift-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 18, 2020
@JoelSpeed

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 18, 2020
@JoelSpeed

/retest

@Danil-Grigorev
Author

/retest

@elmiko elmiko left a comment

/lgtm

go.mod Outdated
Member

why is this needed at all? gcp is not even a fork

@enxebre
Member

enxebre commented Jul 1, 2020

Can you please include a link to the mao counter PR in the PR desc and the commit desc?
Other than the above and #324 (comment) lgtm

@openshift-ci-robot openshift-ci-robot removed the lgtm Indicates that a PR is ready to be merged. label Jul 2, 2020
Danil-Grigorev added 2 commits July 6, 2020 19:01
- Report instance creation and deletion failures
- Report loadBalancer operation failures
- Include PR openshift/machine-api-operator#609 introducing metrics support
@Danil-Grigorev
Author

@enxebre @JoelSpeed @alexander-demichev I'd like to merge those today. Please approve and LGTM

@elmiko elmiko left a comment

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Jul 7, 2020
@JoelSpeed JoelSpeed left a comment

/approve

@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 8, 2020
@openshift-bot

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 34dd2b6 into openshift:master Jul 8, 2020
Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • lgtm: Indicates that a PR is ready to be merged.
  • release/4.6
