Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

autoscaling: add new crd for automatic storage scaling #3044

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

parth-gr
Copy link
Member

added a new crd for automatic storage scaling

@parth-gr
Copy link
Member Author

Please review the new CRD
/assign @BlaineEXE @malayparida2000 @iamniting @nb-ohad @malayparida2000

@parth-gr
Copy link
Member Author

/retest

@iamniting iamniting self-requested a review February 18, 2025 15:32
Copy link
Member

@iamniting iamniting left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you pls add all generated changes? Pls look at the failed github action for more details.

@parth-gr parth-gr force-pushed the auto-storage-crd branch 3 times, most recently from 81d56d9 to 2c6af46 Compare February 18, 2025 16:04
@parth-gr parth-gr requested a review from iamniting February 18, 2025 16:07
@parth-gr parth-gr requested a review from alfonsomthd February 19, 2025 08:02
Copy link
Member

@Madhu-1 Madhu-1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@parth-gr parth-gr force-pushed the auto-storage-crd branch 2 times, most recently from 3d3f124 to 37a903a Compare February 19, 2025 09:58
@parth-gr parth-gr requested a review from Madhu-1 February 19, 2025 09:59
@parth-gr
Copy link
Member Author

/assign @travisn @nb-ohad

@parth-gr parth-gr force-pushed the auto-storage-crd branch 3 times, most recently from 8a65924 to aa2e535 Compare February 26, 2025 09:15
@parth-gr parth-gr marked this pull request as draft February 27, 2025 12:05
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 27, 2025
@parth-gr parth-gr marked this pull request as ready for review March 4, 2025 08:37
@parth-gr parth-gr requested a review from travisn March 7, 2025 17:09
Copy link
Contributor

@travisn travisn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Mar 7, 2025
Copy link
Contributor

openshift-ci bot commented Mar 7, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: parth-gr, travisn

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 7, 2025
@parth-gr parth-gr requested a review from BlaineEXE March 7, 2025 17:55
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 7, 2025
Copy link
Contributor

openshift-ci bot commented Mar 7, 2025

New changes are detected. LGTM label has been removed.

@parth-gr parth-gr requested a review from travisn March 7, 2025 18:03
@parth-gr parth-gr requested a review from BlaineEXE March 7, 2025 18:33
@parth-gr parth-gr force-pushed the auto-storage-crd branch 2 times, most recently from f764483 to 0746a48 Compare March 7, 2025 18:40
Comment on lines +78 to +84
// LastExpansionStartTime is the time stamp of the last run start of the storage scaling
// +optional
LastExpansionStartTime metav1.Time `json:"lastExpansionStartTime,omitempty"`
// LastExpansionCompletionTime is the time stamp of the last run completion of the storage scaling
// +optional
LastExpansionCompletionTime metav1.Time `json:"lastExpansionCompletionTime,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest asking @nb-ohad and @iamniting whether they think these time fields should be pointers/nullable so that the output doesn't include the epoch date-time when time is equal to 0.

Suggested change
// LastExpansionStartTime is the time stamp of the last run start of the storage scaling
// +optional
LastExpansionStartTime metav1.Time `json:"lastExpansionStartTime,omitempty"`
// LastExpansionCompletionTime is the time stamp of the last run completion of the storage scaling
// +optional
LastExpansionCompletionTime metav1.Time `json:"lastExpansionCompletionTime,omitempty"`
// LastExpansionStartTime is the time stamp of the last run start of the storage scaling
// +optional
// +nullable
LastExpansionStartTime *metav1.Time `json:"lastExpansionStartTime,omitempty"`
// LastExpansionCompletionTime is the time stamp of the last run completion of the storage scaling
// +optional
// +nullable
LastExpansionCompletionTime *metav1.Time `json:"lastExpansionCompletionTime,omitempty"`

Copy link
Member Author

@parth-gr parth-gr Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its already see we used without pointer in lot of places where we report heartbeat, so not a concern for now, can discuss this async.

Comment on lines 71 to 74
// Phase describes the Phase of StorageAutoScaler
// +kubebuilder:validation:Enum="";'NotStarted;InProgress;Failed;Succeeded
Phase StorageAutoScalerPhase `json:"phase,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a previous note I added as a reply got missed. It'd be best to not enum the phase. This can only introduce errors creating the CRD, and I don't know of any resources that enum status items like this.

For comparison the StorageCluster status phase isn't enum-ed

// Phase describes the Phase of StorageCluster
// This is used by OLM UI to provide status information
// to the user
Phase string `json:"phase,omitempty"`

Copy link
Member Author

@parth-gr parth-gr Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why cant we use enum with +optional or + nullable?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had a better way,
+kubebuilder:default="NotStarted"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure if we can use enum with nullable, since nullable implies a type that is a pointer. It might be valid or might not. But in general, I don't think enums in status items are very helpful. In my experience, they are unlikely to help and likely to cause a problem, so don't bother. They won't help users, since users don't set the status themselves. Also, they can result in the operator being unable to set otherwise valid statuses, resulting in unexpected runtime errors that we need to bug fix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I removed +nullable, and used a default value,

The main reason here is to use enum is that coming from alert team, as alert are based on the phase they had to make sure operator is setting up correctly

Comment on lines 104 to 112
// The start storage capacity is the original storage capacity of the OSDs before the expansion in progress is completed.
// After the expansion is completed, this would be updated to the expected storage capacity.
// +optional
StartStorageCapacity resource.Quantity `json:"startStorageCapacity,omitempty"`
// The Expected storage capacity is the storage capacity that the auto-expansion has decided to set.
// This will be set on the storageCluster CR as the desired storage capacity of the OSDs.
// +optional
ExpectedStorageCapacity resource.Quantity `json:"expectedStorageCapacity,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious what benefit this gives us over StartOsdSize and ExpectedOsdSize. I think those are the things that actually control whether expansion will continue or not right?

Doesn't this have room for calculation errors between Ceph's reported total capacity and the PV/PVC capacity?

Copy link
Member Author

@parth-gr parth-gr Mar 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have to use these values with the alerts, #3045 (comment)

@parth-gr parth-gr requested a review from BlaineEXE March 7, 2025 18:52
added a new crd for automatic  storage scaling

Signed-off-by: parth-gr <[email protected]>
@parth-gr
Copy link
Member Author

parth-gr commented Mar 8, 2025

/retest

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants