daemon: Refuse to disable FIPS mode #1233
|
This makes sense to me. Should we refuse to let them change FIPS at all since Day 2 wouldn't be valid? |
Yeah, I was actually thinking that if day 2 isn't valid at all, we might want to drop FIPS from the MC completely, so you either install with FIPS or you don't. That seems like the safest option to me, without providing any way to disable it through the MCO. |
ashcrow
left a comment
The change itself is 👍 though I still wonder if we should remove the flag (or make it 100% read only).
Possibly but we currently rely on this code since we haven't landed an initramfs replacement yet. |
ee457ec to 7608545
|
/retest |
|
/skip |
|
/lgtm |
|
/override ci/prow/e2e-gcp-op I've been told there are currently some issues being looked at for the above job. Skipping. |
|
@ashcrow: Overrode contexts on behalf of ashcrow: ci/prow/e2e-gcp-op
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
/hold |
|
I just noticed this in the logs: |
|
@cgwalters I noticed that in the failed runs the FIPS e2e was also hitting OOD on 2 nodes. Sent you the link to the chat convo. |
|
Are we concerned about transitions in the opposite direction? Going from non-FIPS compliant to (technically still not) FIPS compliant seems more dangerous. |
|
@crawford Yes but see |
|
@ashcrow You added a hold - can you explain why? |
|
@cgwalters I added it as @kikisdeliveryservice wanted to review why the GCE job failed to ensure it was a flake. |
Held because we saw the failed FIPS test and wanted to investigate it (i.e. the stuff we've been looking at). @cgwalters If you are fine with merging it without it passing that job at all, I'm fine with removing the hold. I don't know enough about FIPS to say so myself... just lmk |
|
Hmm. I think this PR is quite safe, and will actually help our issues with the e2e-gcp-op test suite since it'll shorten it a bit. But, it can also probably wait until we've figured out that suite. |
Day 2 FIPS is broken and this test is consistently failing. Day 2 FIPS will be dropped in openshift#1233 so this test will be unneeded. Related-to: openshift#1233
|
It's late for me, but I've figured out that our test is failing precisely bc FIPS day2 is broken. I have a PR here removing the test to unblock the PRs that have been stuck since Friday: Can someone in the morning just double check and then merge this (and then approve my PR)? We have features blocked so it would be great to get the test gone and things back to working. |
|
/lgtm |
|
/hold cancel |
|
Needs a quick rebase |
Our new thought around this is that really FIPS should be a "day 1" operation, and we don't want to make it really easy to undo. See also openshift/installer#2594

Anyone who wants to force this can change the MC flag, then `oc debug node` and run the disable command by hand, then reboot.

Our MachineConfig merge semantics should make it hard for this to happen unless the admin explicitly deletes the installer-generated MC, but still.

Since we don't support it and don't want customers to do it by accident, let's disable it and also stop wasting compute hours testing it.

Further, a pending RHCOS change will delete the FIPS command entirely and move it into the initramfs. Cleanly handle that case by also refusing to enable FIPS "day 2", which is what we expect to be the future. But we still support enabling day 2 for testing until that RHCOS change lands.
7608545 to a42cb0c
|
OK rebased 🏄♂️ and I also added some code to more cleanly handle the pending RHCOS change to drop |
|
/lgtm |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: ashcrow, cgwalters, mrunalp. The full list of commands accepted by this bot can be found here. The pull request process is described here. Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/override ci/prow/e2e-aws-scaleup-rhel7 My understanding is the above failing test is unrelated |
|
@ashcrow: Overrode contexts on behalf of ashcrow: ci/prow/e2e-aws-scaleup-rhel7
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
|
@cgwalters: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
|
GCP errors /test e2e-gcp-upgrade |
This effectively reverts fb1c4b4.

e2e-fips is currently failing with `/bin/bash: line 15: nodes[i]: unbound variable`

Looking at this... we already have code to validate the state of FIPS in the MCO, see: https://github.com/openshift/machine-config-operator/blob/091afde36ac117ef8b782a85b38ae8783ddf4b70/pkg/daemon/update.go#L571
openshift/machine-config-operator#1252
openshift/machine-config-operator#1233

I think these types of checks should be the MCO's role, or if we choose not to do that, let's at least implement them in Go in the existing e2e suite and avoid nontrivial shell-in-YAML.
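As a rough illustration of what a Go-based FIPS state check could look like (instead of shell-in-YAML), here is a minimal sketch. It does not reproduce the MCO code linked above; `parseFIPSEnabled` and `readFIPSEnabled` are hypothetical helper names. The only external fact it relies on is that the Linux kernel reports FIPS mode as `0` or `1` in `/proc/sys/crypto/fips_enabled`.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// Kernel interface that reports whether FIPS mode is active.
const fipsProcPath = "/proc/sys/crypto/fips_enabled"

// parseFIPSEnabled (hypothetical helper) interprets the file content:
// "1" means enabled, "0" means disabled, anything else is an error.
func parseFIPSEnabled(content []byte) (bool, error) {
	switch v := strings.TrimSpace(string(content)); v {
	case "1":
		return true, nil
	case "0":
		return false, nil
	default:
		return false, fmt.Errorf("unexpected FIPS state %q", v)
	}
}

// readFIPSEnabled (hypothetical helper) reads and parses the given file.
func readFIPSEnabled(path string) (bool, error) {
	content, err := os.ReadFile(path)
	if err != nil {
		return false, fmt.Errorf("reading %s: %w", path, err)
	}
	return parseFIPSEnabled(content)
}

func main() {
	enabled, err := readFIPSEnabled(fipsProcPath)
	if err != nil {
		// Expected on systems without this proc file (e.g. non-Linux).
		fmt.Println("could not determine FIPS state:", err)
		return
	}
	fmt.Println("FIPS enabled:", enabled)
}
```

Splitting parsing from file access lets an e2e suite test the parsing logic against fixed inputs and run the file read only on real nodes.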