Consider using Sigstore for the ledger? #183

Open
dlorenc opened this issue Aug 21, 2023 · 11 comments


@dlorenc

dlorenc commented Aug 21, 2023

Sigstore infrastructure might meet the needs of the binary ledger here, did you consider using it?

@lann
Collaborator

lann commented Aug 21, 2023

Yes, we have considered integration with both rekor and fulcio.

Rekor doesn't provide all of the functionality our "package transparency" design requires. We may be interested in posting top-level registry checkpoints to the rekor public instance at some point, but that isn't currently a priority.

We may be very interested in integrating with fulcio for OIDC PKI in the future, but we don't yet support cert chains and again it isn't currently a priority.

@dlorenc
Author

dlorenc commented Aug 21, 2023

> Rekor doesn't provide all of the functionality our "package transparency" design requires. We may be interested in posting top-level registry checkpoints to the rekor public instance at some point, but that isn't currently a priority.

Could you share the design for package transparency? I'd love to read it.

@lann
Collaborator

lann commented Aug 21, 2023

Like so many early-stage projects, every doc we've put together is already hopelessly out of date. I think this talk (slides) is the latest published description of the system, but I'll summarize the current high-level design since I'm sure you aren't the only one who might be interested:

Package releases are entries in a hash-chained log very much like git commits, with each package getting a separate log. Each entry includes the hash of the previous entry, release metadata, and the hash of any associated release artifact. There are other entry types for permissions management, yanking releases, etc, which follow the same pattern.
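
A minimal sketch of such a hash-chained package log, for illustration only. The `Entry` fields, the entry kinds `"release"` and `"yank"`, and the JSON encoding are assumptions made for this example, not the registry's actual format:

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Entry:
    prev: Optional[str]      # hash of the previous entry, None for the first one
    kind: str                # hypothetical entry type, e.g. "release" or "yank"
    metadata: dict           # release metadata (shape is assumed)
    artifact: Optional[str]  # hash of the associated release artifact, if any

    def digest(self) -> str:
        # Canonical JSON encoding is an assumption of this sketch.
        body = json.dumps(
            {
                "prev": self.prev,
                "kind": self.kind,
                "metadata": self.metadata,
                "artifact": self.artifact,
            },
            sort_keys=True,
        )
        return sha256_hex(body.encode())


def append(log: list, kind: str, metadata: dict, artifact_bytes: Optional[bytes] = None) -> Entry:
    """Append an entry that commits to the hash of the current log head."""
    prev = log[-1].digest() if log else None
    artifact = sha256_hex(artifact_bytes) if artifact_bytes is not None else None
    entry = Entry(prev, kind, metadata, artifact)
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Check that every entry names the hash of the entry before it."""
    prev = None
    for entry in log:
        if entry.prev != prev:
            return False
        prev = entry.digest()
    return True
```

As with git commits, changing any earlier entry changes its hash, so every later entry stops verifying.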

Each entry is also submitted to a registry-global ledger as a (, ) pair. This pair is included in the registry-global verifiable log (like rekor) and in a verifiable map, which allows clients to retrieve an efficient proof of a package log's latest entry.
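
A rough sketch of the registry-global ledger, under stated assumptions. The two components of the pair are not spelled out above, so this example assumes a hypothetical (log ID, entry hash) pair. The log root uses the RFC 6962 Merkle tree hash as a stand-in for whatever the registry actually computes, and a plain dict stands in for the contents of the verifiable map (it produces no proofs):

```python
import hashlib


def leaf_bytes(log_id: str, entry_hash: str) -> bytes:
    # Hypothetical leaf encoding for a (log ID, entry hash) pair.
    return f"{log_id}:{entry_hash}".encode()


def merkle_root(leaves: list) -> bytes:
    """Merkle tree hash in the style of RFC 6962, section 2.1."""
    n = len(leaves)
    if n == 0:
        return hashlib.sha256(b"").digest()
    if n == 1:
        return hashlib.sha256(b"\x00" + leaves[0]).digest()
    k = 1
    while k * 2 < n:  # largest power of two strictly less than n
        k *= 2
    left = merkle_root(leaves[:k])
    right = merkle_root(leaves[k:])
    return hashlib.sha256(b"\x01" + left + right).digest()


class Ledger:
    def __init__(self) -> None:
        self.leaves = []  # append-only sequence of every submitted pair
        self.latest = {}  # log ID -> latest entry hash (verifiable map contents)

    def submit(self, log_id: str, entry_hash: str) -> None:
        self.leaves.append(leaf_bytes(log_id, entry_hash))
        self.latest[log_id] = entry_hash

    def log_root(self) -> bytes:
        return merkle_root(self.leaves)
```

The log records the full history in order, while the map answers "what is the latest entry for this package?" without replaying that history.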

A registry checkpoint is calculated from the tree head hashes of both the verifiable log and verifiable map. This checkpoint is timestamped, signed, and published by the registry as its commitment to the state of every package.
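
A hedged sketch of how a checkpoint could bind the two tree heads together. The field layout, the `b"checkpoint\x00"` domain tag, and the use of HMAC are all assumptions for illustration; a real registry would sign with an asymmetric key, and HMAC is used here only because it is in the Python standard library:

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    log_root: bytes  # tree head hash of the verifiable log (32 bytes)
    map_root: bytes  # tree head hash of the verifiable map (32 bytes)
    timestamp: int   # seconds since the Unix epoch

    def digest(self) -> bytes:
        # Both roots are fixed-length, so plain concatenation is unambiguous.
        return hashlib.sha256(
            b"checkpoint\x00"
            + self.log_root
            + self.map_root
            + self.timestamp.to_bytes(8, "big")
        ).digest()


def sign(checkpoint: Checkpoint, key: bytes) -> bytes:
    # Stand-in for a real signature scheme (assumption, see lead-in).
    return hmac.new(key, checkpoint.digest(), hashlib.sha256).digest()


def verify(checkpoint: Checkpoint, signature: bytes, key: bytes) -> bool:
    return hmac.compare_digest(sign(checkpoint, key), signature)
```

Because the digest covers both roots, a single signature commits the registry to the full history and to the latest entry of every package at that timestamp.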

@esoterra
Collaborator

Lann's overview is correct. That said, speaking of things constantly going out of date, the more recent talk is Package Transparency for WebAssembly Registries (Cloud Native Security Con 2023) (slides), which goes into much more detail on the design and would be a good starting point for learning more about it.

@znewman01

Are you planning on using a sparse Merkle tree as in the "verifiable map" link above? I generally consider a Merkle Binary Prefix Tree to be a better choice for verifiable maps (concretely faster in most cases). Happy to elaborate if that's useful.

@lann
Collaborator

lann commented Aug 21, 2023

We use a sparse merkle tree with optimizations for empty and single-leaf subtrees. I believe this is reasonably up-to-date: https://github.com/bytecodealliance/registry/blob/main/docs/merkle_tree.md

@esoterra
Collaborator

@znewman01 I've been meaning to follow up on your previous comment that there were other cryptographic data structures / techniques we ought to consider, but haven't had the time. I'm happy to take a look at MBPTs and see if they're a good fit.

The closer they are to a drop-in replacement, the easier and therefore more likely it is that we'll be able to make the switch if they are better. We don't have a lot of resources to invest in a rewrite of the data structures right now and we have a lot of other projects going on.

@znewman01

MBPT and "SMT with optimizations" are actually pretty much the same 🙂 I think you're in good shape based on my read. I just wanted to make sure you didn't have depth-256 proofs all over the place as in a naive SMT.

@esoterra
Collaborator

If you have some time, we'd appreciate any review or feedback on the SMT optimizations doc I wrote or the data structures we implemented. I'm sure there's room for improvements on both.

@esoterra
Collaborator

esoterra commented Aug 22, 2023

@znewman01 to keep things simple and be consistent with what we saw in other RFCs and papers, our optimizations are all transparent: the hashes of our trees are the same as they would be in a naive SMT. We just optimize the storage and serialization of the data structures and proofs where there are empty or single-leaf subtrees. So we aren't going to store or send 256 height proofs over the wire most of the time but we are going to compute 256 hashes still when evaluating the proofs. I'm not sure if that's what you mean by "not having depth-256 proofs".
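
To make the distinction concrete, here is a rough sketch of a "transparent" optimization of this kind. The bitmap encoding, the `b"leaf"` tag, and all function names are assumptions for illustration, not the format described in the registry's merkle_tree.md. The proof carries only the non-empty sibling hashes, and the verifier fills in precomputed empty-subtree hashes while still computing all 256 node hashes:

```python
import hashlib

DEPTH = 256


def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


# EMPTY[h] is the hash of an empty subtree of height h (h = 0 is an empty leaf).
EMPTY = [H(b"")]
for _ in range(DEPTH):
    EMPTY.append(H(EMPTY[-1], EMPTY[-1]))


def key_path(key: bytes) -> int:
    return int.from_bytes(hashlib.sha256(key).digest(), "big")


def leaf_hash(value: bytes) -> bytes:
    return H(b"leaf", value)


def subtree(items: dict, height: int) -> bytes:
    """Naive SMT hash of a subtree holding items {path: value}, skipping empty branches."""
    if not items:
        return EMPTY[height]
    if height == 0:
        (value,) = items.values()
        return leaf_hash(value)
    bit = 1 << (height - 1)
    left = {p: v for p, v in items.items() if not p & bit}
    right = {p: v for p, v in items.items() if p & bit}
    return H(subtree(left, height - 1), subtree(right, height - 1))


def prove(items: dict, path: int):
    """Return (bitmap, siblings), keeping only non-empty sibling hashes, bottom-up."""
    top_down, current = [], items
    for height in range(DEPTH, 0, -1):
        bit = 1 << (height - 1)
        same = {p: v for p, v in current.items() if (p & bit) == (path & bit)}
        other = {p: v for p, v in current.items() if (p & bit) != (path & bit)}
        top_down.append(subtree(other, height - 1))
        current = same
    bitmap, siblings = 0, []
    for level, sibling in enumerate(reversed(top_down)):
        if sibling != EMPTY[level]:
            bitmap |= 1 << level
            siblings.append(sibling)
    return bitmap, siblings


def verify(root: bytes, path: int, value: bytes, bitmap: int, siblings: list) -> bool:
    """Expand the compressed proof; this still computes DEPTH node hashes."""
    node, remaining = leaf_hash(value), iter(siblings)
    for level in range(DEPTH):
        if bitmap >> level & 1:
            sibling = next(remaining, None)
            if sibling is None:
                return False
        else:
            sibling = EMPTY[level]
        node = H(sibling, node) if path >> level & 1 else H(node, sibling)
    return node == root
```

With a handful of keys, the proof shrinks from 256 sibling hashes to a 256-bit bitmap plus a few hashes, while the root is unchanged from the naive tree.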

We did consider versions where the hash of a subtree with only one leaf is just hash(prefix || hash(key) || value) instead of the hash of all of the branches going down to the bottom, which means the root hashes don't match a naive SMT, but we decided against it.
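
For comparison, a small sketch of that rejected alternative next to the naive computation. Treating `prefix` as a one-byte domain-separation tag is an assumption of this example, as are the `b"leaf"` tag and the function names:

```python
import hashlib

DEPTH = 256


def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


# EMPTY[h] is the hash of an empty subtree of height h.
EMPTY = [H(b"")]
for _ in range(DEPTH):
    EMPTY.append(H(EMPTY[-1], EMPTY[-1]))


def naive_single_leaf_root(key: bytes, value: bytes) -> bytes:
    """Root of a naive SMT holding one leaf: hash every branch down to the bottom."""
    path = int.from_bytes(H(key), "big")
    node = H(b"leaf", value)
    for level in range(DEPTH):
        node = H(EMPTY[level], node) if path >> level & 1 else H(node, EMPTY[level])
    return node


def shortcut_single_leaf_root(key: bytes, value: bytes, prefix: bytes = b"\x01") -> bytes:
    """Alternative: a single-leaf subtree is hashed in one step (prefix is an assumed tag)."""
    return H(prefix, H(key), value)
```

The shortcut replaces 256 branch hashes with one, but its result necessarily differs from the naive root, so trees built this way are no longer interchangeable with a naive SMT.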

Having read more of the MPT description, it sounds like it's a more sophisticated version of that general approach we'd considered.

@znewman01

> So we aren't going to store or send 256 height proofs over the wire most of the time but we are going to compute 256 hashes still when evaluating the proofs. I'm not sure if that's what you mean by "not having depth-256 proofs".

Ahh yeah, you can do better using BPTs but I think avoiding the blowup in proof size is most important.

> Having read more of the MPT description, it sounds like it's a more sophisticated version of that general approach we'd considered.

Precisely. IMHO not worth further effort here.


4 participants