Support for Deploying Terminating Pods as Brokers #437
Conversation
One way would be to make the Configuration CRD change in a first PR and then add the implementation in another.
Good point. One thing we could do is never delete a Job -- let them stick around as "Completed" (which may be useful anyway for log purposes) -- and just deploy the new Job with an anti-affinity so it will not run as long as another Job is running.
controller/src/util/config_action.rs
);
}
Some(broker) => {
    if !should_recreate_config(&config, &prev_config_state) {
Sorry, it has been a while since I've looked at this code ... `should_recreate_config` is confusing to me.
Here we are only acting if it returns false. It seems like we used to do nothing if it returned false.
The name is also confusing to me. Are we actually recreating a Configuration, or are we just executing code as though the Configuration is deleted and recreated?
I agree the name is unhelpful. It works in the agent's context but not the controller's. It may be best to rename it to something like `broker_changed` or, more specifically, `only_broker_changed`. This check is done by the agent upon a Configuration change. If true, it does not recreate the Configuration but rather lets the change stand (no further action). The controller does this check, and if true, it redeploys the brokers to reflect the new config. This is our first step at more explicit Configuration modification handling. For this PR, we could leave this out and stick to the old way of deleting and reapplying a Configuration, but that means instances will be brought down and redeployed. We need to figure out how to handle this gracefully, and this was my first stab at it. It does mean, though, that if the controller goes down for some reason, a Configuration may be applied that does not reflect the system (since the controller was not there to redeploy the brokers).
@bfjelds @jiria @romoh On this note, I am thinking that, at least for this PR, we should remove support for handling broker changes and stick to our old approach of the agent deleting and reapplying modified Configurations. This would make this PR smaller, as it would remove the Configuration watcher from the controller, removing the "Configuration Event" section of the flow diagram here for the time being.
All in favor of splitting into smaller PRs.
Another thought on how to handle controller restarts: we could cache the config that was used to create a broker as a broker annotation. Then the controller could check whether the broker is based on the latest Configuration or not.
Good point. That can also be done simply via the Job name; I am using the Configuration generation in the Job name. However, someone could feasibly delete a Configuration and reapply a new one while the Controller is down, and then the Controller would wrongly think it has the right state.
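For illustration only, here is a rough sketch of what the two ideas could look like on a generated Job; the name format, annotation key, and container image are hypothetical, not what this PR actually produces:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  # Hypothetical: encode the Configuration generation in the Job name so the
  # Controller can tell which Configuration version the Job was built from
  name: akri-debug-echo-foo0-gen-2-job
  annotations:
    # Hypothetical alternative: cache the generation (or a hash of the
    # brokerSpec) in an annotation and compare it to the live Configuration
    akri.sh/configuration-generation: "2"
spec:
  template:
    spec:
      containers:
        - name: broker
          image: busybox
          command: ["sh", "-c", "echo 'Hello World' && sleep 5"]
      restartPolicy: Never
```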
@jiria Anti-affinity cannot be set in a JobSpec; however, we could set it in the PodSpecs, as illustrated in the "Note on future extension" section of the PR description above. That would restrict any two Pods of a Job from running on the same node, which seems okay, but there may be a scenario (it doesn't seem likely, though) where someone may want two identical Pods on the same node.
@romoh @jiria @bfjelds To reduce the size of this PR, I have removed the Configuration watcher (config_action.rs, configuration_state.rs) from the Controller and am now ignoring brokerSpec changes in the agent (config_action.rs) -- motivated by this note. Not only was this the most confusing section (the Controller and Agent coordinating Configuration recreation, letting Jobs finish before deleting them, managing Configuration state), it is also hard to justify: it was added only to enable modifying a brokerSpec without bringing down and recreating instances, and since instances are still directly tied to a Configuration, this seems unnecessary. I have updated the PR description -- the examples point to newly built containers. There should now be much less to review. I also updated the proposal and added a section on concerns over gracefully handling Configuration changes. Gracefully handling Configuration changes is a discussion that we can leave for another proposal and PR.
Why do we need the anti-affinity in this case? Re cleanup, imho some cleanup might be useful, but perhaps not killing the running jobs. Also, from another pov, we might not want to have two firmware updates running at the same time. Is there a good way for the two jobs to signal each other?
@jiria Per this comment and the changes associated with it, I removed all graceful handling of Configuration changes from this PR. As this thread shows, we have plenty to discuss here. For now, Jobs will be deleted immediately upon Configuration changes, just like Pods are -- we don't do anything graceful for them currently either. See this diagram from the proposal PR for a visual. As to your questions, "anti-affinity" was the signaling I was hoping for, i.e., I was looking for a way for a Job to know another Job is running and therefore not run; anti-affinity cannot do that for Jobs. I agree that we should think this through for firmware update settings and make a plan.
// Misplaced `resources`
// Valid: .request.object.spec.brokerSpec.brokerPodSpec.containers[*].resources
// Invalid: .request.object.spec.brokerSpec.brokerPodSpec.resources
const INVALID_BROKER_POD_SPEC: &str = r#" |
I think it would be good to have an invalid test where both job and pod specs are defined.
My understanding is that that type of request would never even get to the admission controller, because it would never be correctly applied to the cluster. I believe a Configuration must first be successfully applied (i.e., match the expected CRD format) and is then passed to the admission controller as an `AdmissionReview`. So this test is better suited as a unit test in configuration.rs, affirming that such a request does not pass validation as the Configuration CRD type.
I had the wrong thinking around this. Currently, our CRD does not specify `oneOf` with regard to `brokerSpec`. It seems that we cannot use the `oneOf` structural schema specifier since we need to set `x-kubernetes-preserve-unknown-fields: true`. We should do more investigation here to see whether our Configuration CRD can disallow specifying both types of brokers -- that way the error would not be caught by our application; applying the Configuration would simply error out. For now, here is the test that asserts that serde will catch this and error out the Agent: 20422dc
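To make the tension concrete, below is a hypothetical fragment of what the Configuration CRD's schema would need to express; whether the API server accepts `oneOf` alongside `x-kubernetes-preserve-unknown-fields` in this position is exactly the open question, and the field layout is illustrative only:

```yaml
# Hypothetical openAPIV3Schema fragment for the Configuration CRD.
openAPIV3Schema:
  type: object
  properties:
    spec:
      type: object
      properties:
        brokerSpec:
          type: object
          # Needed so arbitrary PodSpec/JobSpec fields are not pruned,
          # but possibly in conflict with using oneOf (the open question).
          x-kubernetes-preserve-unknown-fields: true
          # Desired constraint: exactly one broker variant is specified.
          oneOf:
            - required: ["brokerPodSpec"]
            - required: ["brokerJobSpec"]
          properties:
            brokerPodSpec:
              type: object
              x-kubernetes-preserve-unknown-fields: true
            brokerJobSpec:
              type: object
              x-kubernetes-preserve-unknown-fields: true
```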
What this PR does / why we need it:
This adds support for deploying Jobs to devices discovered by the Akri Controller. Previously, Akri only supported deploying Pods that never terminated. Adding Jobs will enable more scenarios with Akri, such as device management. It follows the design in this proposal: project-akri/akri-docs#17. Some specific changes to call out:
- The `brokerPodSpec` section of the Configuration changed to a `brokerType` enum with `brokerPodSpec` or `brokerJobSpec` options (a sketch is shown below).
- Workloads (Pods or Jobs) are deployed based on the `brokerType` of the Configuration.
- Logic to handle the `brokerType` of a Configuration being updated -- it will bring down and redeploy workloads and services.
- `completions`, `parallelism`, and `backoffLimit` are configurable. All other types of JobSpec customizations will need to be done by creating a Configuration yaml and modifying it. Instructions for this can be added to our docs, just as they are for PodSpecs.
- Removed the default image for debugEcho -- docs will need to be updated for this.
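To illustrate the new shape, here is a rough sketch of a Configuration using the `brokerJobSpec` variant; the apiVersion, discovery details, and image are assumptions for illustration, not exact values from this PR:

```yaml
# Illustrative only: a Configuration using the new brokerJobSpec variant.
apiVersion: akri.sh/v0
kind: Configuration
metadata:
  name: akri-debug-echo-foo
spec:
  discoveryHandler:
    name: debugEcho
    discoveryDetails: |+
      descriptions:
      - "foo0"
      - "foo1"
  brokerSpec:
    # Exactly one of brokerPodSpec or brokerJobSpec should be set
    brokerJobSpec:
      completions: 1
      parallelism: 1
      backoffLimit: 2
      template:
        spec:
          containers:
            - name: debug-echo-broker
              image: busybox
              command: ["sh", "-c", "echo 'Hello World' && sleep 5"]
          restartPolicy: Never
```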
Docs
I put in a PR that updates the Pod deployment docs in tandem with this one: project-akri/akri-docs#21. It also contains documentation on Jobs.
Tests
This PR includes unit tests. Should we extend our e2e workflows to test Jobs? Maybe we just test a subset of our platform/K8s distro matrix for Jobs to prevent too many runs. If so, can that be tabled to a later PR so as to not overload this one?
Special notes for your reviewer:
I went ahead and built and published some containers for folks to try out this implementation. You can run this on this Kubernetes playground if you do not have a cluster:
Clone and check out this branch, then package the Helm chart locally.
Install Akri with a debugEcho Configuration that will deploy a Job to each (`foo0` and `foo1`) discovered device. The Job will echo "Hello World" and then sleep for 5 seconds before terminating.

Say you are feeling more exuberant and want the Job to echo "Hello Amazing World": you can change the `brokerSpec`. Upgrade the installation to do so and watch Akri delete all the resources (instances, jobs) and recreate them, deploying the new Job (do `kubectl logs` on the Pods to verify).

Test out some of the Job settings and modify the Job to run twice per device (`completions=2`) in parallel (`parallelism=2`). These Helm settings simply modify equivalently named parts of the Kubernetes JobSpec; a sketch of those fields follows below.

Pods are still deployed the same as usual, and they can also be updated.
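For reference, a hedged sketch of the JobSpec fields those Helm settings map to (the rest of the Job, such as the pod template, is omitted, and the `backoffLimit` value shown is illustrative):

```yaml
# Fragment of the generated Job's spec; the pod template is omitted.
spec:
  completions: 2   # run the Job's Pod to successful completion twice per device
  parallelism: 2   # allow both runs to execute at the same time
  backoffLimit: 2  # retries before the Job is marked failed (illustrative value)
```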
Note on future extension
Many deployment scenarios can be fulfilled with Jobs just by adjusting `parallelism` (the number of the same Pod that should run concurrently) and `completions` (the number of times the Pods of the Job need to successfully complete). However, an additional scenario could be enabled by adding support in the controller for adding `podAntiAffinity` (by instance name).

For example, you could specify that you want 3 Jobs running concurrently (`parallelism=3` and `completions=3`) but that you want at most 1 on each node. Limiting one to run on each node can be done by setting `podAntiAffinity` in the JobSpec like so:
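A hedged sketch, assuming the broker Pods carry an instance label (the `akri.sh/instance` key and the instance name are illustrative assumptions, not necessarily what the controller would emit):

```yaml
# Hypothetical: anti-affinity keyed on an instance label so that at most one
# broker Pod for this instance is scheduled per node.
spec:
  template:
    metadata:
      labels:
        akri.sh/instance: akri-debug-echo-foo0
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: akri.sh/instance
                    operator: In
                    values: ["akri-debug-echo-foo0"]
              topologyKey: kubernetes.io/hostname
```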
This would need to be done by the controller rather than in the Helm charts, as it is likely desirable to set `podAntiAffinity` per device/instance rather than per Configuration.

If applicable:
- `cargo fmt`
- `cargo build`
- `cargo clippy`
- `cargo test`
- `cargo doc`
- `./version.sh`