-
Notifications
You must be signed in to change notification settings - Fork 525
Add enhancement spec for api tls curves #1894
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
|
||
| ## Motivation | ||
|
|
||
| As cryptographic standards evolve, there is a growing need to support Post-Quantum Cryptography (PQC) to protect against future threats. This enhancement contributes directly to the goal of enabling PQC support in OpenShift. It provides the mechanism to configure specific TLS curves in the OpenShift API, allowing administrators to explicitly enable PQC-ready curves such as ML-KEM. This ensures OpenShift clusters can be configured to meet emerging security compliance requirements and future-proof communications. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe not necessary, up to you - but is worth making it explicit that as an admin I want to move into hybrid pqc (the first phase of getting pqc safe)
|
|
||
|
|
||
| ### Implementation Details/Notes/Constraints | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if no curves are provided what are the defaults - just down to whatever defaults are provided by the respective crypto modules? I would maybe make it as explicit as possible different configuration setups. Might be useful to see examples of proper config, bad config, etc. and the data flow for each maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. It is imperative that introducing this new field doesn't change existing configurations, where curves cannot be predefined.
If there are curves listed, I assume that would override the defaults and only the explicitly listed curves would be allowed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any intention to dictate a common default that all OpenShift components should use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I brought that question up in a separate meeting. There are a few facets to this.
Mapping from named profiles
Right now in OCP:
- library-go defines the mapping of profile name to MinTLSVersion and CipherSuites (which means it is possible for different controllers to use different defaults due to using different commits of library-go). This mapping is updated periodically to reflect Mozilla's definitions of the named profiles.
- The library-go mapping does not include curves (because it isn't yet part of the API), which means that the default curves end up coming from the TLS implementation.
Mozilla's mapping does specify curves, so I would expect that we would update library-go to include those curve mappings.
Custom Profile
If a custom profile does not mention curves, what should we suggest implementers do? I am not a cryptography person, so I don't think I'm qualified to make a judgement here, but I suspect that the safest thing to do is to honor the defaults from the crypto module. That is likely what most implementations do now.
Consistency among all OCP components
When a named profile is used, I suggested a future improvement if we want to further improve consistency. It may make sense to implement a controller for APIs that embed the TLS profile stanza, such that the centralized controller could implement the mapping and provide the specific parameters that all components should pass to the TLS server configs. I think this aspect should be treated as a follow-up since it would significantly increase the scope of the initial effort, which by itself will be a major improvement to our ability to consistently apply TLS configuration to all core components much more consistently than we do now.
|
|
||
| ## Version Skew Strategy | ||
|
|
||
| By default, TLS implementations (openssl, golang, etc...) fallback to a sensible default when curves are not set. Currently, openshift components that do not set curves exhibit this behavior. This should be verified during component onboarding. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh this might address above a bit - but might still be good to show a config example and then the expected result for the difference scenarios
|
@richardsonnick: The following test failed, say
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
| ### Risks and Mitigations | ||
|
|
||
| OpenShift components could forego utilizing the curves set in the API config. However, this is a risk | ||
| that exists in the current TLS config flow. This change will require coordination with component owners | ||
| to comply with the new TLS config field. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a risk that setting an incorrect TLS custom suite could brick the cluster?
I'm not sure what the configuration topology is, but if the apiserver-operator is querying the apiserver to understand what TLS profile to configure the apiserver with, and the apiserver-operator configures the apiserver with an invalid TLS profile, that would mean:
- The apiserver is no longer able to handshake with clients, so admins would not be able to patch the TLS profile back to a working state because the apiserver server would respond with a handshake failure.
- Even if the TLS profile could be patched somehow, the apiserver-operator could lose its connection with the now-broken apiserver, which would mean it would not know to re-configure the apiserver back to a working TLS profile.
Is there any proposed mitigation for such a possibility?
| The ability to set curves explicitely will also make it possible to align our | ||
| OpenShift TLS profiles to match the curves present in the [Mozilla TLS Profiles](https://wiki.mozilla.org/Security/Server_Side_TLS). | ||
|
|
||
| This change will require working with OpenShift component owners to use this new field, however this is outside the scope of this proposal. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We probably ought to make sure we understand how components are going to use these new fields and include stakeholders from each of the teams responsible for these components as part of this enhancement proposal.
Just adding a new field without any backing implementation isn't helpful to anyone.
| ... | ||
| ``` | ||
|
|
||
| ### API Extensions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please include one of:
- A link to the proposed API changes
- A copy of the proposed API changes
|
|
||
| #### Hypershift / Hosted Control Planes | ||
|
|
||
| N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supported on HyperShift? Are there any special considerations for HyperShift here?
|
|
||
| #### Single-node Deployments or MicroShift | ||
|
|
||
| N/A |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supported for single node? For microshift?
| There is a case where the administrator could incorrectly specificy a set of ciphersuites | ||
| that do not work with each other. For example using an RSA ciphersuite with a ECDHE curve (such as TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 and P-256). The default behavior OpenSSL as well as go's crypto/tls (both used extensively in OpenShift) is to fail at **TLS handshake time**. . The TLS server instance will start normally, but when TLS clients attempt to handshake with the TLS server, the handshake will fail with a `handshake failure` | ||
|
|
||
| I propose that this is the *desired* behavior of OpenShift. Administrators that utilize the custom TLS profile are foregoing the guaranteed correctly configured TLS profiles (such as Modern, Intermediate, etc.) and the system should comply accordingly. The landscape of TLS is constantly evolving and relying on the TLS implementation (Openssl, golang crypto/tls, etc) itself provides users more flexibility. Adding a validation system on top as part of the OpenShift would be cumbersome and restricts admins to use what we deem compatible rather than what the underylying TLS implementation is capable of. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a validation system on top as part of the OpenShift would be cumbersome and restricts admins to use what we deem compatible rather than what the underylying TLS implementation is capable of
We are generally expected to be the SMEs here. If there are combinations that we know ahead of time will not work, why should we not prevent users from doing that earlier in the loop?
Where possible we should provide appropriate guardrails to help prevent users from breaking their own clusters.
|
|
||
| ## Open Questions [optional] | ||
|
|
||
| 1. I propose that we should *not* include a validation system for custom TLS configurations. See "Proposal" for more context. However, this restricts the ability to ensure that the TLS config works when a new config is set, instead relying on failed handshakes to clue the administator in on any issues. There could also be "skew" between the supported curves and ciphers of the various TLS implementations utilized within OpenShift. This could make it difficult to reason about the state of the TLS capabilities of the cluster. Any thoughts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO we should probably only support curves that all of the affected components by this change are capable of supporting
|
|
||
|
|
||
| ### Implementation Details/Notes/Constraints | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any intention to dictate a common default that all OpenShift components should use?
Adds an enhancement request for the addition of a
curvesfield in the TLS profile spec of the OpenShift API config.