DFBUGS-789: Fix 'ceph_disk_occupation' query expressions #2812
@weirdwiz , @jmolmo , @umangachapagain please take a look. |
0b43c92
to
7f6fa87
Compare
I think the change is OK. As you can see, this metric never had the label "exported_instance", so the change in the label name probably comes from the ODF side. You will probably need to check when and why this label changed, and then verify that it does not affect other metrics. |
7f6fa87
to
2d544db
Compare
@aruniiird Would this be a blocker in 4.17? |
Correct, @jmolmo. I checked on the ODF / OCS side and couldn't find much. There is a chance that these records/alerts were not working for a long time. The changes in this PR are working, so the named records and alerts will be enabled from now onwards. |
@malayparida2000, this won't be a blocker (as the query may not have worked for some time), but it is a good candidate for a 4.17 z-stream release and for newer (4.18) releases. |
We have a customer BZ: DFBUGS-789 , related to this. Can we prioritize this? @malayparida2000 , @umangachapagain , please take a look. |
We should probably also move these changes to the ceph-mixin repo, if we're keeping that up to date.
/cherry-pick release-4.18 |
@weirdwiz: once the present PR merges, I will cherry-pick it on top of In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/cherry-pick release-4.17 |
@weirdwiz: once the present PR merges, I will cherry-pick it on top of In response to this:
/cherry-pick release-4.16 |
@weirdwiz: once the present PR merges, I will cherry-pick it on top of In response to this:
@aruniiird: This pull request references [Jira Issue DFBUGS-789](https://issues.redhat.com//browse/DFBUGS-789), which is invalid:
Comment In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Need to address changes in 'ceph_disk_occupation' metric labels.
What is the change in 'ceph_disk_occupation' metric?
'ceph_disk_occupation' result no longer has 'exported_instance' label, instead it has 'instance' label.
What is the issue we are facing because of it?
We are hitting 'PrometheusRuleFailures' due to this new label changes in our alerts / rules, where this metric is used.
Second issue is that we are not seeing any results for some of the query expressions.
What is the solution?
Update the query expressions, change 'exported_instance' to 'instance'. Any 'label_replace' action which changes 'exported_instance' label to 'instance' label is no longer required (as the 'instance' label is directly available now)
Signed-off-by: Arun Kumar Mohan <[email protected]>
2d544db
to
a81e357
Compare
@umangachapagain , @malayparida2000 , please take a look. Customers are asking for a solution... |
/jira refresh |
@aruniiird: This pull request references [Jira Issue DFBUGS-789](https://issues.redhat.com//browse/DFBUGS-789), which is valid. 3 validation(s) were run on this bug
In response to this:
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aruniiird, umangachapagain The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d62b2a2
into
red-hat-storage:main
@aruniiird: [Jira Issue DFBUGS-789](https://issues.redhat.com//browse/DFBUGS-789): All pull requests linked via external trackers have merged: [Jira Issue DFBUGS-789](https://issues.redhat.com//browse/DFBUGS-789) has been moved to the MODIFIED state. In response to this:
@weirdwiz: new pull request created: #2933 In response to this:
@weirdwiz: new pull request created: #2934 In response to this:
@weirdwiz: new pull request created: #2935 In response to this:
Need to address changes in 'ceph_disk_occupation' metric labels.
What is the change in 'ceph_disk_occupation' metric?
The 'ceph_disk_occupation' result no longer has the 'exported_instance' label; it has the 'instance' label instead.
What is the issue we are facing because of it?
We are hitting 'PrometheusRuleFailures' in our alerts / rules because of this label change.
The second issue is that some of the query expressions return no results.
What is the solution?
Update the query expressions, changing 'exported_instance' to 'instance'. Any 'label_replace' action that copies the 'exported_instance' label to the 'instance' label is no longer required, as the 'instance' label is now directly available.
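
As a rough illustration of the fix described above, the Python sketch below rewrites a query expression in the two ways the description mentions: it unwraps a 'label_replace' call that only copies 'exported_instance' into 'instance', and it renames any remaining 'exported_instance' references to 'instance'. The sample expressions, the helper name `migrate_expr`, and the regular expressions are assumptions made for illustration only; they are not taken from the actual rule files changed in this PR, and a regex cannot parse every PromQL expression, so this is not a general-purpose migration tool.

```python
import re

# Hypothetical sample expression; the real rules touched by the PR may differ.
OLD_EXPR = (
    'label_replace(ceph_disk_occupation, "instance", "$1", '
    '"exported_instance", "(.*)")'
)

# Matches the whole label name only, not a longer identifier containing it.
_LABEL = re.compile(r"\bexported_instance\b")

# Matches label_replace(<inner>, "instance", "$1", "exported_instance", "(.*)")
# where <inner> is a simple term without parentheses or commas (assumption).
_COPY = re.compile(
    r'label_replace\(\s*([^(),]+?)\s*,\s*"instance"\s*,\s*"\$1"\s*,\s*'
    r'"exported_instance"\s*,\s*"\(\.\*\)"\s*\)'
)


def migrate_expr(expr: str) -> str:
    """Return expr with the redundant label copy removed and the label renamed."""
    # Step 1: drop label_replace wrappers that only copy the old label.
    expr = _COPY.sub(r"\1", expr)
    # Step 2: rename any remaining references to the old label.
    return _LABEL.sub("instance", expr)


if __name__ == "__main__":
    print(migrate_expr(OLD_EXPR))
    print(migrate_expr("sum by (exported_instance) (ceph_disk_occupation)"))
```

Any expression rewritten this way would still need to be checked against a live Prometheus instance, since the sketch only handles the simple patterns shown.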