[IETF-readiness] Add Prior Art and Translation section, update deprecation FAQ entry #164
fine aside from the minor link formatting issue and a quibble about ULEB128 which is material and probably shouldn't be swept aside
I found only a minor typo. I agree with @rvagg that the minimal encoding of the ULEB128 should be mentioned, as that's one of those things that matters for content-addressed systems.
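To illustrate the minimal-encoding concern raised above, here is a rough sketch of a ULEB128 encoder and a decoder that rejects padded (non-minimal) encodings. The function names are hypothetical and not taken from any multiformats library. The point is that the same integer must always map to the same bytes, otherwise two byte strings could describe the same hash.

```python
def uleb128_encode(n: int) -> bytes:
    """Encode a non-negative integer as minimal ULEB128 (hypothetical helper)."""
    if n < 0:
        raise ValueError("ULEB128 encodes unsigned integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F          # low 7 bits of the remaining value
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def uleb128_decode_minimal(data: bytes) -> tuple[int, int]:
    """Decode one ULEB128 value from the start of `data`.

    Returns (value, bytes_consumed). Raises ValueError if the encoding is
    truncated or non-minimal (padded with a trailing zero group).
    """
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            # A multi-byte encoding whose final byte is 0x00 carries no
            # information in that byte, so a shorter encoding exists.
            if i > 0 and byte == 0:
                raise ValueError("non-minimal ULEB128 encoding")
            return value, i + 1
    raise ValueError("truncated ULEB128 value")
```

For example, `b"\x80\x00"` and `b"\x00"` both decode to 0 under a lenient decoder; the strict decoder in this sketch accepts only the latter.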
1. (for binary form:) prefix existing binary multihash with `0x42` to designate that what follows is a multicodec prefix followed by an ULEB128 hash value.
2. (for ASCII form:) convert the `0x42` prefix to URL format, i.e., `ni:///mh;` and then append a base64url, no-padding encoding of the entire binary multihash with prefix (and _without_ adding the additional base-64-url-no-padding prefix, `u`, if using a [multibase][] library for this base-encoding).
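The two steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, `0x42` is the value proposed in this PR rather than a confirmed registration, and "the entire binary multihash with prefix" is read here as the multihash including its own hash-function code and length bytes (with the `0x42` byte replaced by `ni:///mh;` in the ASCII form). The sha2-256 example uses multicodec code `0x12` and digest length `0x20`.

```python
import base64
import hashlib

# Assumption: suite ID value as proposed in this PR, not a confirmed registration.
NI_MH_SUITE_ID = 0x42


def sha256_multihash(data: bytes) -> bytes:
    """Build a sha2-256 multihash: <code 0x12><length 0x20><32-byte digest>."""
    digest = hashlib.sha256(data).digest()
    # Both 0x12 and 0x20 are below 0x80, so each fits in one varint byte.
    return bytes([0x12, len(digest)]) + digest


def to_ni_binary(multihash: bytes) -> bytes:
    """Step 1 (binary form): prefix the existing binary multihash with 0x42."""
    return bytes([NI_MH_SUITE_ID]) + multihash


def to_ni_url(multihash: bytes) -> str:
    """Step 2 (ASCII form): 'ni:///mh;' + base64url (no padding) of the multihash.

    Note that no multibase 'u' prefix is added to the encoded string.
    """
    encoded = base64.urlsafe_b64encode(multihash).rstrip(b"=").decode("ascii")
    return "ni:///mh;" + encoded


if __name__ == "__main__":
    mh = sha256_multihash(b"hello")
    print(to_ni_binary(mh).hex())
    print(to_ni_url(mh))
```

If a multibase library is used for the base64url step, its leading `u` would need to be stripped before appending, per the parenthetical in step 2.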
Is this a spec proposal? Doesn't seem to be anywhere else and seems to effectively be a separate spec.
Is the idea that if we merge this to then go to the nih registry and request `0x42`?
Side note: which number we use doesn't bother me but we used 0x42 for the CID tag in dag-cbor. I don't see NIH wanting CIDs more than multihashes so probably fine, but wanted to flag so it's documented here.
> Is the idea that if we merge this to then go to the nih registry and request `0x42`?
that's the beauty of it:
https://www.ietf.org/archive/id/draft-multiformats-multihash-07.html#name-the-mh-digest-algorithm
(I assumed the 42 was an intentional nod to that other iana registration!)
Co-authored-by: Volker Mische <[email protected]>
Co-authored-by: Rod Vagg <[email protected]>
…ge/multihash into feat/prior-art-refresh
Cannot find a good standard on this. Found some _different_ IANA ones:
In IETF's corpus of normative protocols, there are two partial overlaps worth knowing about to ensure a safe implementation:
> to ensure a safe implementation

What does this mean?
meaning, if you're really brownfield or ingesting unknown data and you get something that isn't a multiformat, here are some other prefixes you might want to sniff for as fallback, that might have been put there by other IETF self-description conventions 😄 . any wordsmithing help appreciated!
@bumblefudge @rvagg I switched to Approve given that my main issues have been resolved and I don't want to block merging if it's important.
However, I'd still like some clarity on the ones I have open. In particular, this PR adds a new spec (the NIH translation) into the doc. It seems fine, but it also requires grabbing a number in the NIH registry, which I don't know much about (are they likely to even let us reserve it?). I want to at least understand the state of things in the PR, even if we need to do a dance of merge, submit to NIH, then make another PR that states 0x42 is provisional and points to the open request for the NIH registry.
@aschmahmann i checked before opening the PR--
As part of the ongoing work of getting Multiformats ready for another attempt at IETF, I wanted to add some "prior art and translation" sections to show how a CID-powered system could, say, turn CIDs into DataURIs (tracked in a separate issue), how a binary CID can be wrapped in a little CBOR outer wrapper for mixed-tooling systems, etc. This is primarily to address what one interlocutor at IETF 118 called the "why bother reinventing wheels" question, which multiformats needs to answer in its next charter and draft specs, but also to situate it as a useful and complementary (rather than just redundant) member of the family of IETF specifications and standards.
This specific translation section seemed like the most urgent to do first, given the history and the layering (the section about "unchunked" CIDs is probably the one that needs the most massaging, honestly). I am indebted to @gobengo from web3.storage for prototyping the conversions in his great blog post, "the Secret of NIMHs". Speaking of prototypes, if there is interest I could theoretically spin up a little NPM repo that spits out `ni://...` and `ni://mh;` URLs for test vectors already in this repo or others, if it were needed, and/or just add an e2e example for each of the two forks in the algorithm described here for the conversion.

I'm test-ballooning here in multiformats/multibase a section that I will add to the IETF draft of multibase before re-applying, if it gets merged here and makes sense. Next up: dataURIs...