-
Notifications
You must be signed in to change notification settings - Fork 533
WIP: PAO render initial draft #804
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
/priority important-soon |
|
|
||
| ### Goals | ||
|
|
||
| 1) PAO can execute outside of a cluster via podman/docker, consume user manifests from a directory and generate additional manifests to another (or the same) directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have an issue with saying that we would use a container image, but it feels like a level of detail that fits better in the Implementation section, rather than in the Goals. Placing it here narrows our options in a way that doesn't feel necessary. I might rewrite this as
PAO can be run outside of a cluster to generate manifests to pass to the OpenShift installer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, you are right, but I think it should be part of the design. How would the doc writers know what the expected user flow looks like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we talk about the delivery mechanism down in either the "Proposal" or "Implementation Details/Notes/Constraints" section, all of the information is still available for the tech writers.
| 1) PAO takes all PerformanceProfile yamls from the input directory and generates all the usual manifests as yaml files to the output directory | ||
| 1) When a command line option `--enable-workload-partitioning` is passed to PAO in render mode, it also generates the necessary configuration for the `management` partition to work properly | ||
| 1) PAO at day 2 must not make any changes to the existing generated objects it owns | ||
| 1) PAO sets the ownership references on the generated yamls so the generated files are properly linked to the PerformanceProfiles |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lines 89-92 are implementation details. They could go into the general proposal section around line 101 with a little bit of text to tie them together into a paragraph, or they could go down to line 110.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I would treat these as acceptance criteria. Because it is the "API contract" of the flow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just think you're being too specific, too early.
We usually try to set enough context about the requirements that a reviewer reading the details later can evaluate the choices we're making against the goals to see if they match or if we have gaps. If we set as a goal to implement something a specific way, with specific interfaces, then it's like we're not asking for feedback on the design.
So, for lines 89 and 90 we could say something like
- The PAO running outside of a cluster can provide input to the OpenShift installer to enable workload partitioning during cluster deployment in a way that does not require changes to the OpenShift installer.
Then for 91 maybe
- The configuration from the PAO that is passed to the installer is compatible with the configuration the PAO generates when running inside the cluster.
And after re-reading 92 with those proposed changes, I think it's OK to keep it as it is.
| - 1) [optional] The cluster administrator will collect PAO must-gather output from a test cluster with no partitioning enabled | ||
| 1) [optional] The cluster administrator will execute the Performance profile creator to get well formed PerformanceProfiles | ||
| 1) The PerformanceProfiles are placed into a directory | ||
| 1) PAO is executed using podman run -v /input/dir:/input -v /output/dir:/output registry/performance-addon-operator render [optional --enable-workload-partitioning] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| 1) PAO is executed using podman run -v /input/dir:/input -v /output/dir:/output registry/performance-addon-operator render [optional --enable-workload-partitioning] | |
| 1) PAO is executed using `podman run -v /input/dir:/input -v /output/dir:/output registry/performance-addon-operator render [optional --enable-workload-partitioning]` |
|
|
||
| This feature is activated locally by user and follows the lifecycle of PAO itself. There is no upgrade procedure to perform. | ||
|
|
||
| ### Version Skew Strategy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to worry about matching the PAO version and OCP version? I think we talked about that, but don't remember whether we came to a conclusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe the desired version could be passed into the render command so there could be validation against the running image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, the desired version is part of the PAO pull spec, so the validation would basically compare whether a command line argument matches the compiled in version string. We do not support generating older versions, you need to use older PAO for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should list the skew as a risk. We're going to want to make sure the documentation at least explains that the versions have to match.
|
|
||
| Similar to the `Drawbacks` section the `Alternatives` section is used to | ||
| highlight and record other possible approaches to delivering the value proposed | ||
| by an enhancement. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We did identify a couple of alternatives.
- Add PAO to the release payload so the render step could be run during bootstrapping. This was rejected because the PAO is not useful in all, or even most, clusters.
- Wait to enable partitioning until the PAO can be installed into the cluster on day 2. We may yet take this approach in the future, but we still consider rolling out the changes to enable partitioning on a running cluster potentially dangerous and certainly disruptive.
| 1) PAO is executed using podman run -v /input/dir:/input -v /output/dir:/output registry/performance-addon-operator render [optional --enable-workload-partitioning] | ||
| 1) The files from both input and output directories are passed over to the installer | ||
|
|
||
| ### Implementation Details/Notes/Constraints [optional] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest we document the required inputs (performance profile(s), workload_partitioning CR) and the outputs manifests (kubeconfig, tuned etc). The performance profile CR will also need to be an output of running the render command.
| This feature is about adding a `render` mode to PAO that would allow an admin to pre-compute all the Openshift manifests that are needed in a way that prevents any typing mistakes from happening. | ||
| These generated manifests can then be passed to the installer and later taken over by the PAO reconcile loop. | ||
|
|
||
| ## Motivation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In addition to providing a mechanism to enable to workload partioning feature, the application of the manifests at install time will reduce the e2e deployment time for solutions that use pao as there will be fewer reboots. Suggest we capture that as a motivation as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Based on some of the review feedback on #753, are we sure this will be true immediately? We may need to do some additional work in the MCO to get that benefit. We should still capture it as a goal, but not lose sight of the gap.
|
|
||
| ## Proposal | ||
|
|
||
| ### User Stories |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Integrating into an external orchestration system
|
@MarSik: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
Inactive enhancement proposals go stale after 28d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle stale |
|
Stale enhancement proposals rot after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Mark the proposal as fresh by commenting If this proposal is safe to close now please do so with /lifecycle rotten |
|
Rotten enhancement proposals close after 7d of inactivity. See https://github.com/openshift/enhancements#life-cycle for details. Reopen the proposal by commenting /close |
|
@openshift-bot: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
No description provided.