add ClusterResourceQuota prometheus alert for overprovisioning #1311

Merged
openshift-merge-robot merged 2 commits into red-hat-storage:master from anmolsachan:overprovision_alert
Aug 30, 2021

Conversation

@anmolsachan
Contributor

@anmolsachan anmolsachan commented Aug 24, 2021

Depends on #1282. This PR is required to alert users when the thresholds set via the functionality in #1282 are reached.

Signed-off-by: Anmol Sachan <anmol13694@gmail.com>

@openshift-ci openshift-ci bot requested review from davidvossel and jarrpa August 24, 2021 09:02
@anmolsachan anmolsachan force-pushed the overprovision_alert branch 3 times, most recently from fc5aa97 to 93c4f30 Compare August 24, 2021 10:33
@anmolsachan
Contributor Author

@umangachapagain @synarete Please review.

@anmolsachan anmolsachan changed the title add ClusterResourceQuota alert add ClusterResourceQuota prometheus alert for overprovisioning Aug 24, 2021
@umangachapagain umangachapagain added this to the OCS 4.9 milestone Aug 25, 2021
@openshift-ci openshift-ci bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 25, 2021
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 26, 2021
@jarrpa jarrpa added mvp Required for the next minimum viable product. priority/1-high labels Aug 26, 2021
Member

@jarrpa jarrpa left a comment

Looks sane at first glance, but I'm not an expert in any of this. Will let others LGTM.


// Duration to raise various Alerts
clusterObjectStoreStateAlertTime: '15s',
clusterResourceQuotaAlertTime: '0s',
Contributor

0s feels too aggressive. Any reason why it can't be 5s or 10s?

Contributor Author

Because PVC requests are absolute sizes, like 50GB or 100GB, and don't grow gradually the way usage does.

For example:

If the quota is set to 1TB and 750GB is already provisioned, then a new PVC request of 100GB brings the total to 850GB. Users should be informed about this immediately, because any PVC request larger than the remaining 150GB will be rejected immediately by the ClusterResourceQuota.
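
The arithmetic in the example above can be sketched in Python. This is only an illustration under stated assumptions, not code from this PR: the function name `quota_status` is hypothetical, 1TB is taken as 1000GB, and the 80% threshold is the one named in the commit message.

```python
def quota_status(quota_gb, provisioned_gb, request_gb, threshold=0.80):
    """Return (total_gb, remaining_gb, alert) after a new PVC request is admitted.

    Hypothetical helper for illustration; the threshold default mirrors the
    80% limit described in the commit message.
    """
    total = provisioned_gb + request_gb      # requested storage after the new PVC
    remaining = quota_gb - total             # headroom left under the quota
    alert = total > threshold * quota_gb     # alert once requests go beyond the threshold
    return total, remaining, alert


# Quota of 1TB (assumed to be 1000GB), 750GB provisioned, new 100GB PVC.
total, remaining, alert = quota_status(quota_gb=1000, provisioned_gb=750, request_gb=100)
print(total, remaining, alert)
```

Because one request moves the total from 750GB to 850GB in a single step, the threshold is crossed at once rather than approached gradually, which is the argument for not delaying the alert.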

Contributor

@umangachapagain umangachapagain left a comment

Good to merge, but holding off until the dependent PR is merged.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 27, 2021
@umangachapagain
Contributor

/hold

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 27, 2021
@jarrpa
Member

jarrpa commented Aug 27, 2021

Dependent PR merged.

/hold cancel

@openshift-ci openshift-ci bot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Aug 27, 2021
@openshift-ci openshift-ci bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 30, 2021
This commit adds a Prometheus alert to notify users when the PVC
requests for a particular StorageClass go beyond 80% of the limit
set by the user through the ClusterResourceQuota resource.

Signed-off-by: Anmol Sachan <anmol13694@gmail.com>
Signed-off-by: Anmol Sachan <anmol13694@gmail.com>
@agarwal-mudit
Member

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 30, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 30, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: agarwal-mudit, umangachapagain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [agarwal-mudit,umangachapagain]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 8c9b10a into red-hat-storage:master Aug 30, 2021
Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • lgtm: Indicates that a PR is ready to be merged.
  • mvp: Required for the next minimum viable product.
  • priority/1-high

6 participants