Design decisions related to ML-KEM and ML-DSA keys. #26652
Considerations of the very long-term support guarantees the project has are relevant here. It would be less than ideal to support an effectively dead format for half a decade or more.
The counter-argument is that if this is helpful to users, the "support" in question amounts to one line in a table. And we still don't know what LAMPS will come up with...
If there is yet another format spec after we release, then that will still require changes regardless of whether everything is data driven or not.
A further option could be having a minimal table of formats and letting the user specify an additional one (or more). Compatibility with everything weird and wonderful would be maintained without polluting the code with "unused" things.
First of all: what's the raw format? I missed that one being discussed or mentioned.

Second of all, I'm rather conflicted: I wish there were One True Format that we had all agreed on by now, but I don't see that happening very quickly at IETF, let alone it being frozen in an RFC. So I think we need to support all the different formats: that will lead to a much better user experience.

I need to put some blame on us, as we do ship oqsprovider in CentOS 10 Stream that will output PKCS#8 files with NIST OIDs but in OQS provider format. Sorry about it! (We just tried to fix it, so that it produces files in one of the older formats specified in the IETF draft, but that breaks some pretty core functionality: open-quantum-safe/oqs-provider#637). We will also extensively document that those are not final formats.

One saving grace here is that we're talking about the private key format, so unlike with values that we need to assume come from an attacker (public keys, TLS, etc.), I think we can apply Postel's law and be liberal in what we accept. So:
I haven't grokked ML-DSA yet (I've been focusing on side-channel analysis of ML-KEM), so I may be missing something, but it seems to me that the private and public parts of ML-DSA have slightly different variables: (ρ, ..., t_0) vs (ρ, t_1). Can we derive t_1 from the private key parameters? For ML-KEM, conversion of the private key to the public key is trivial, as the private key has the full public key embedded inside. Haven't looked at SLH-DSA at all... (I think it's just that the operation is computationally expensive?)
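To make the ML-KEM point above concrete, here is a rough Python sketch of slicing the embedded public (encapsulation) key out of an expanded ML-KEM-768 private key, following the component layout in FIPS 203 (dk = dk_PKE ‖ ek ‖ H(ek) ‖ z). This is an illustration of the layout only, not OpenSSL's decoder, and the function name is made up:

```python
import hashlib

# ML-KEM-768 component sizes per FIPS 203 (k = 3):
DK_PKE_LEN = 1152   # 384 * k
EK_LEN = 1184       # 384 * k + 32
H_LEN = 32          # SHA3-256 digest of ek
Z_LEN = 32          # implicit-rejection secret
DK_LEN = DK_PKE_LEN + EK_LEN + H_LEN + Z_LEN  # 2400

def extract_public_key(dk: bytes) -> bytes:
    """Slice the embedded encapsulation key out of an expanded
    ML-KEM-768 decapsulation key and verify its hash field."""
    assert len(dk) == DK_LEN, "not an expanded ML-KEM-768 private key"
    ek = dk[DK_PKE_LEN:DK_PKE_LEN + EK_LEN]
    h_ek = dk[DK_PKE_LEN + EK_LEN:DK_PKE_LEN + EK_LEN + H_LEN]
    # A mismatch here is exactly the kind of pub/priv inconsistency
    # that a pairwise check on import would catch.
    if hashlib.sha3_256(ek).digest() != h_ek:
        raise ValueError("embedded public key hash mismatch")
    return ek
```

Since the H(ek) field is part of the private key, a decoder can cheaply cross-check that the embedded public key is at least internally consistent.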
We default to retaining the seed value. If provided on input, or generated, it is included in the output (unless the user explicitly chooses not to retain or output the seed). I don't see the value of excluding some (non-default) formats from being written. Users may legitimately need them for interop. They're unlikely to output these by accident.

In ML-DSA, the public key can be computed from just the private parts of the key, and this is what happens on import. So there is no possibility of a mismatch in ML-DSA. By way of contrast, in ML-KEM the embedded public key could actually be incompatible with the enclosing private key, but (pending PR) we'll be doing a PCT that will exclude that possibility with high probability. [It would be good to have Paul comment on why the entropy in the PCT is not randomly generated; is there a good reason why high-quality random data might not yet be available during the PCT? If there isn't, I should use random data rather than the full private + public key material.]

As for "raw" keys, that's an EVP concept, to distinguish e.g. internal EC point encoding from wire EC point encoding. To hedge the bets a bit, the format table also supports bare non-ASN.1 forms of each of the public and private keys with no ASN.1 OCTET STRING wrapping. I hope those won't be the final formats, but we'll see.
For the most part, these PCTs are pointless. On key gen they are a pure waste of effort. On import there might be some minimal benefit for some algorithms in detecting mismatched or corrupted keys, but the chance of that is really small and any error will appear later. It isn't worth the performance hit of making the check for the benefit gained. For the PQ algorithms, the benefits are further reduced. Consequently, it is desirable to get past them as fast as possible. Using high-quality entropy is not required by the FIPS 140-3 IGs, so it's better and easier to avoid it.
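As a sketch of what such a pairwise consistency test amounts to, and why the entropy quality doesn't matter for it, here is a toy KEM in Python. The "KEM" is a hash-based stand-in, not ML-KEM, and all names are made up; the point is only that the PCT checks that encapsulation and decapsulation agree, for which any fixed or cheap entropy value suffices:

```python
import hashlib
import secrets

# Toy stand-in KEM (NOT ML-KEM): just enough structure to show the
# shape of a pairwise consistency test (PCT) after key generation.
def toy_keygen():
    sk = secrets.token_bytes(32)
    pk = hashlib.sha3_256(b"pk" + sk).digest()
    return sk, pk

def toy_encaps(pk, entropy):
    ct = hashlib.sha3_256(b"ct" + entropy).digest()
    ss = hashlib.sha3_256(pk + ct).digest()
    return ct, ss

def toy_decaps(sk, ct):
    pk = hashlib.sha3_256(b"pk" + sk).digest()
    return hashlib.sha3_256(pk + ct).digest()

def pairwise_consistency_test(sk, pk, entropy=b"\x00" * 32):
    # The PCT only needs encaps and decaps to agree; the entropy merely
    # has to be some value, so no DRBG output is required, which is why
    # high-quality randomness can be (cheaply) avoided here.
    ct, ss_enc = toy_encaps(pk, entropy)
    return toy_decaps(sk, ct) == ss_enc
```

A mismatched key pair fails the check regardless of what entropy was used, which is the only property the test needs.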
Excellent write-up @slontis. Right now there is a range of incompatible formats out there in current usage. There is a split among the early adopters in terms of what should be supported, there is no clear "one true format", and there may never be.

We need to make a pragmatic decision here, and it isn't a technical one. We can review the technical implementation to get a sense of the cost of maintaining it, and with the way it has been done, I for one think the cost is sufficiently low that the user benefit outweighs it. I've chatted with quite a few folks and no one is particularly happy with the state of things in terms of the mess, but we also need to keep in mind that things will not settle down any time soon.

We will need to be able to generate multiple output formats; that is very clear. There is not a single output format that is universally acceptable. Once we accept that there isn't a single one, it is a matter of accepting that we have a range, and the incremental cost of adding additional items to a table is small enough to be negligible in my view. Note I don't hold that view generically: if the code weren't clean and straightforward, it would tip the balance. And if there were a reasonable universal output format, that would also change my view.

@vdukhovni and I have been following all the developments on the LAMPS mailing list and participating in lots of on-list and off-list discussions, all aimed at seeing if we could get to a single format that works for everyone, and that is clearly not going to happen any time soon. We don't have the luxury of waiting a year or so to see how things pan out with deployments.
And once we are multi-format, I also don't see any benefit in not supporting the oqs-provider formats, as that eases things for the early adopters who have been using that code to prototype. We (OpenSSL) pointed people there prior to having our in-tree solutions, and when the effort to support it is minimal, I don't see a justification for not doing so.

We also already added the oqs-provider names for algorithms, which went in without any controversy, meaning those names that do not match the standard will get used and will be around "forever" as such. I'm much less inclined to propagate algorithm names that users will use going forward that do not match a specification document than I am to support file formats that are in use. But the aliases are in, and I'm not suggesting removing them. To me, compatibility with a user base that we also encouraged isn't a bad thing. It does come at a cost, but I don't see that cost as sufficiently high to justify removing working code (names or format handling).

I would also pick our default output format for maximum interoperability, and that remains a challenge, as the IETF format may end up not actually being maximally interoperable; it may lead to a balkanised user base. Maximum interoperability should remain our driver here in my view, with interoperability being more important than strict IETF specification compliance.
Given we already have the code, I am also inclined to allow all the possible implemented variants on input (assuming all are unambiguous and can be safely distinguished by ASN.1 tags or lengths) and to support full configurability of the output format. I am not so sure about the need to make the supported input formats configurable instead of just importing everything that we can import. One could argue it makes the potential attack surface smaller, but then I would say that should be build-time configurable.
The input formats are unambiguous. They are matched by both ASN.1 structure and length, except for the two "bare" formats of just the seed or just the key with no OCTET STRING wrapping, which are matched by length alone. The order of input formats is documented as irrelevant for that reason, so a decision not to support disabling some of them is possible. But then LAMPS purists couldn't turn off the "both" format if they hate it so much they can't abide supporting reading it, and the "bare" key format, which might not be in use anywhere and is perhaps more a proof of concept than something known to be used, couldn't be turned off (we could of course drop that one before the release, if we're sure it won't be one of the final choices forced by the LAMPS ASN.1 haters). Bottom line, I rather think that configurability serves us more in the current uncertain climate than any concern about purity or hypothetical long-term support cost. The situation is regrettable (in an ideal world LAMPS would not have gone rogue), but something we can easily deal with.
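The matching described here (ASN.1 plus length for wrapped forms, length alone for the bare forms) can be sketched roughly as follows. The DER handling is deliberately minimal, and the format names and ML-KEM-768 sizes are illustrative assumptions, not OpenSSL's actual identifiers:

```python
SEED_LEN = 64    # ML-KEM seed: d || z
PRIV_LEN = 2400  # expanded ML-KEM-768 private key

def _der_octet_payload(blob: bytes):
    """Return the payload if blob is exactly one DER OCTET STRING, else None."""
    if len(blob) < 2 or blob[0] != 0x04:
        return None
    if blob[1] < 0x80:               # short-form length
        n, off = blob[1], 2
    else:                            # long-form length
        k = blob[1] & 0x7F
        if len(blob) < 2 + k:
            return None
        n, off = int.from_bytes(blob[2:2 + k], "big"), 2 + k
    return blob[off:] if len(blob) == off + n else None

def match_format(blob: bytes) -> str:
    # ASN.1-wrapped forms are matched by tag and length; the "bare"
    # forms, having no structure at all, can only be matched by length.
    payload = _der_octet_payload(blob)
    if payload is not None and len(payload) == SEED_LEN:
        return "seed-only"
    if payload is not None and len(payload) == PRIV_LEN:
        return "priv-only"
    if len(blob) == SEED_LEN:
        return "bare-seed"
    if len(blob) == PRIV_LEN:
        return "bare-priv"
    raise ValueError("unrecognised private key encoding")
```

Because every format resolves to a distinct (tag, length) shape, the order of the table entries genuinely doesn't matter, which is the property the comment above relies on.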
As a person from the team having "helped" create this mess I probably should keep silent -- but just can't :) FWIW, I conceptually completely agree with the approach "accept any input, make output run-time configurable" to cater to all possible futures (standards), but I have to ask this: if it is a tenet that OpenSSL only implements finalized standards (?), would it be justifiable to have no (at least private key) encoders active by default in 3.5? A purist position would even be to not include any encoders, as there's no standard one can refer to -- but as the code/options are there, this default setting at least could make the unwary aware (of the problem of no commonly agreed standard being available) and force the people shipping distros to make a conscious decision (maybe/hopefully activating by config the standard finalized at time of packaging). For me at least this worked reasonably well: encoders for KEMs are by default not available in
No, it is not a tenet, and shipping without working encoders is not acceptable. Some users are already creating ML-KEM certificates, and these require being able to encode the public key, and to be able to load and use the private key in protocols that use static KEM keys (these are already in use). We would not ship if the algorithm or public-key encoding were ill-defined, because there would be no chance of interoperability, but private keys are a different matter. If we can handle a superset of the specification, and with luck have a better sense of the right default output format before long, we're solid, even if our default format is outside the spec; ideally we won't need to do that.
Thanks for the explanation/background. Makes sense.
I don't doubt that. My sole concern is how this "situation" is best represented (beyond documentation) towards all types of users so they're not caught off-guard. |
Added another question to the end of the top section.
It is a lot more than one line in a table. It needs unit and interop tests. Sure, they mightn't change much over time but they are a cost. |
I'm really glad someone brought this up. The policy for inclusion is a national or international standard. OQS certainly doesn't qualify as either. I'm not advocating waiting until the IETF finishes its lengthy processes, since that would be counterproductive. For ML-DSA we're going to have to make a stab at something if we want to include it; ML-DSA is kind of pointless without being able to encode and decode keys. I think we should be cautious about including too much here nonetheless.

For ML-KEM we could take a more conservative approach. ML-KEM's primary, and by far largest, use case is TLS. In this role encoders and decoders are not required. That is, we could ship ML-KEM without encoders & decoders and wait for the standards to settle before adding them. If there is a published standard for ML-KEM certs, this is moot; we should support them.

I really hate introducing new features with built-in legacy. Supporting OQS formats does exactly this.
One reason why I would really like to see the ability to load ML-KEM keys from a file (even if they are just raw byte strings) is side-channel testing. To be able to check that there is no leakage related to private key values, I need to know what key was actually used for the operation...
That is a valid use case arguing for an option to enable encoders/decoders, but IMO it does not demand having this enabled by default. Also, it is not an argument for supporting all kinds of formats.
As I've noted a number of times before, that policy is only for cryptographic algorithms and was put in place when we were getting requests for random-algorithm-of-the-week to be added. We made a rule that if the algorithm did not have at least national-level standard documentation we would not add it. This has not applied (and is not meant to apply) to anything else. We implement what we believe represents what our communities want to see in place, and generally that tends towards interoperable code; interoperable is more important than "correct".
I don't see it that way at all: we can make it more painful for the early adopters to move across, or we can make it easier. For the PQC work, with the PKCS#8 formats within the IETF changing over the last 6 months between various incompatible versions, and the latest details showing things are still not quite settled down, @vdukhovni implemented a logical range of formats, and having OQS support was effectively very low cost. As OQS was what we had been recommending for those that needed to experiment until such time as we had our built-in implementations, it would leave those users high and dry if we don't at least read the format that was written. And when the variations between the formats are such that they can easily be table-driven, it makes the cost/benefit decision lean a particular direction.
Correct, I just need a way to import a key; it doesn't have to be the standardised PKCS#8 format.
@tomato42 Do you need to 'only' know the key, or do you need to control the key? If it's the former, at least for TLS, you might be able to leverage Haven't tried it out, but something like below would be my starting point for such a callback function (example for the client side):
Apart from PKCS#8, EVP layer provides And |
Since leakage from private key operations is basically relevant only in the context of CMS or similar protocols, I'd rather not do it inside the TLS context at all. Also, I'd rather have the test harness as simple as possible, something like this, but one that loads a new key for every operation. I plan to also be able to test the seed format for private keys, so the amount of control over the actually used key (in terms of numerical values) is quite limited...
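The kind of minimal harness described here might look like the following sketch, where the private key operation is just a placeholder workload and keys would really be read from per-operation key files; all names are illustrative:

```python
import hashlib
import time

# Skeleton of a simple side-channel harness: for every measured
# operation we load a fresh, known private key, so the recorded
# samples can later be correlated against the key's actual values.
def private_key_operation(key: bytes, ct: bytes) -> bytes:
    # Placeholder standing in for e.g. ML-KEM decapsulation.
    return hashlib.sha3_256(key + ct).digest()

def measure(keys, ct):
    samples = []
    for key in keys:  # in a real harness, read each key file from disk
        t0 = time.perf_counter_ns()
        private_key_operation(key, ct)
        samples.append(time.perf_counter_ns() - t0)
    return samples
```

The essential property is only that each timing sample is paired with a key whose exact byte values are known to the analyst, which is why file-based key import (in whatever format) is the requirement here.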
yes, those should be sufficient for side-channel testing |
FWIW, this format (privkey+pubkey, no seed) is also what GnuTLS 3.8.9 reads and writes, exclusively. |
Correct, but this seems not to be a deliberate GnuTLS design decision but a mere limitation/consequence of it currently still using
GnuTLS 3.8.9 switched to leancrypto for PQC support, that's why I mentioned the version. ;-) I just wanted to mention it as another implementation that may benefit from OpenSSL's input/output flexibility. Which works great btw, well done! |
Before merging the ML-KEM or ML-DSA feature branches to master, the following design decisions should be agreed on as the correct approach.
Background:
ML-KEM and ML-DSA both generate their key pair components from an input seed.
As part of the FIPS standards, the public and private keys are encoded into formats that can be serialized.
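A minimal sketch of the seed-based generation mentioned above: the seed alone deterministically determines the whole key pair, which is why retaining it gives a compact, complete private-key representation. The SHAKE-256 expansion below is a stand-in for the actual FIPS 203/204 derivations, not the real algorithm:

```python
import hashlib

def expand_seed(seed: bytes, out_len: int) -> bytes:
    """Deterministically expand a short seed into key material.

    Stand-in for the real ML-KEM/ML-DSA key expansion, which derives
    every component of the key pair from the seed (d || z, 64 bytes,
    for ML-KEM; a 32-byte xi for ML-DSA)."""
    assert len(seed) in (32, 64)
    return hashlib.shake_256(seed).digest(out_len)
```

Because the expansion is deterministic, two parties holding the same seed always reconstruct identical key material, and a stored seed can always be cross-checked against a stored expanded key.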
There is also currently a raw format.
Because of these different requirements, OpenSSL is stuck in a difficult position as to what the format should be while IETF decides on the final format.
Because of these issues, the OpenSSL design allows many format choices.
To support this, OpenSSL currently provides provider config options for the input and output choices. If a format is omitted from the configured choices, attempting to load that type will fail with an error.
So this raises the following questions:
(1) Which options should be supported for input? Should we just always allow all formats to load and not use input format choices?
(2) Which options should be supported for output (should OQS and raw formats be allowed, for example)? Does it need a prioritized list?
When loading, there are cases where the seed, the private encoding and the public key could all be present. So what should happen in these cases?
(3) When importing, if both the seed and the private encoding are present, should we generate from the seed and make sure it matches the private encoding, or should we decode the private part and just store the seed?
(4) Since the public key can be derived from the private key, the same questions also apply to the public key.
(5) If either the seed or the private key is present when exporting, should the public key be exported as well?
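One possible shape for the import logic behind questions (3) to (5), sketched with toy hash-based derivations standing in for the real seed-to-private and private-to-public expansions (all names here are made up for the sketch):

```python
import hashlib

def derive_priv(seed: bytes) -> bytes:
    # Placeholder for the FIPS 203/204 seed-to-key expansion.
    return hashlib.shake_256(b"priv" + seed).digest(64)

def derive_pub(priv: bytes) -> bytes:
    # Placeholder for the private-to-public derivation.
    return hashlib.sha3_256(b"pub" + priv).digest()

def import_key(seed=None, priv=None, pub=None):
    """Prefer the seed; cross-check any components supplied alongside it."""
    if seed is not None:
        regenerated = derive_priv(seed)
        if priv is not None and priv != regenerated:
            raise ValueError("seed and private encoding disagree")
        priv = regenerated
    if priv is None:
        raise ValueError("no private key material")
    derived_pub = derive_pub(priv)
    if pub is not None and pub != derived_pub:
        raise ValueError("public key does not match private key")
    return {"seed": seed, "priv": priv, "pub": derived_pub}
```

This is the "regenerate and verify" answer to (3) and (4): anything derivable is recomputed, and any supplied copy is treated as a cross-check rather than trusted as-is.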
The following PR adds a pairwise test to the import: `openssl/providers/implementations/keymgmt/ml_kem_kmgmt.c`, line 456 (commit 8a14495).
The pairwise test checks that the public key matches the private key. Given that the private key is trusted, does that also imply that we trust the public key?