Consider base16k #11

Closed
sffc opened this issue Jul 14, 2021 · 5 comments

Comments

@sffc

sffc commented Jul 14, 2021

My colleague @markusicu has proposed Base16k as an alternative to Base64 that is more compact when the resulting text is stored as UTF-16. This could be useful when storing byte arrays in JSON, for example.

https://sites.google.com/site/markusicu/unicode/base16k

If we're going to be opinionated enough to add Base64 to the standard library, then perhaps we should explore some competitors on their own merits.

@bakkot
Collaborator

bakkot commented Jul 14, 2021

I would be very reluctant to add any encoding which is not already in widespread use.

@sffc
Author

sffc commented Jul 14, 2021

A philosophy in both Intl and Temporal is to nudge developers in the right direction. Popularity aside, if Base16k is better on the merits than Base64, then I think we should consider it.

To be clear, Base16k is not a "slam dunk"; it is significantly better when UTF-16 text is used for interchange, but a bit worse when UTF-8 text is used.
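The trade-off can be made concrete with a rough size estimate. The sketch below assumes each Base16k character carries 14 bits and takes 2 bytes in UTF-16 or 3 bytes in UTF-8, while each Base64 character carries 6 bits and takes 2 bytes in UTF-16 or 1 byte in UTF-8. It ignores padding and any length prefix, and `encodedBytes` and the 4200-byte payload are illustrative names and values, not part of any proposal.

```javascript
// Rough estimate of encoded size in bytes, ignoring padding and length prefixes.
// bitsPerChar: payload bits carried by one encoded character (assumption: 6 for Base64, 14 for Base16k).
// bytesPerChar: storage cost of one encoded character in the chosen text encoding.
function encodedBytes(payloadBytes, bitsPerChar, bytesPerChar) {
  const chars = Math.ceil((payloadBytes * 8) / bitsPerChar);
  return chars * bytesPerChar;
}

const payload = 4200; // arbitrary payload size in bytes

const sizes = {
  base64Utf8: encodedBytes(payload, 6, 1),    // 5600
  base64Utf16: encodedBytes(payload, 6, 2),   // 11200
  base16kUtf8: encodedBytes(payload, 14, 3),  // 7200
  base16kUtf16: encodedBytes(payload, 14, 2), // 4800
};

console.log(sizes);
```

Under these assumptions the UTF-16 overhead drops from roughly 2.67x to roughly 1.14x, while the UTF-8 overhead rises from roughly 1.33x to roughly 1.71x.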

@bakkot
Collaborator

bakkot commented Jul 14, 2021

A serialization format should be useful for communicating with other systems, so popularity is an important merit in its own right.

@jimmywarting

While I do not agree that we should be encouraging base64#6 (we should just deal with typed arrays / binary data anyway, rather than sending them as JSON):

One thing I'm not a particular fan of when it comes to baseXX encodings is that they are often hard-coded to one specific subset: base64, base64 web safe, base32, base16 (aka hex), or just about any other encoding. Some are case-insensitive and others are not.

If we are going to ship something like this, then I would rather have a more dynamic subset that works with anything you can think of:

const subsets = {
  base64: 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/',
  base64web: 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_',
  base32: '0123456789abcdefghjkmnpqrtuvwxyz',
  hex: '0123456789ABCDEF',
  binary: '01',
}
// ArrayBuffer.fromSubset is the API suggested in this comment; it does not exist.
ArrayBuffer.fromSubset('_w', { subset: subsets.base64web, padding: false })
ArrayBuffer.fromSubset('FFFE', { subset: subsets.hex })
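A minimal sketch of what such an alphabet-driven decoder could look like, assuming the alphabet length is a power of two, matching is case-sensitive, there is no padding character, and leftover bits at the end are discarded. `decodeWithAlphabet` is a hypothetical helper written for illustration; it is not part of this proposal or of any engine.

```javascript
// Hypothetical helper: decode `text` using an arbitrary power-of-two alphabet.
function decodeWithAlphabet(text, alphabet) {
  const symbols = [...alphabet];
  // Bits carried per character: log2 of the alphabet size.
  const bitsPerChar = 31 - Math.clz32(symbols.length);
  if (bitsPerChar < 1 || (1 << bitsPerChar) !== symbols.length) {
    throw new RangeError('alphabet length must be a power of two (at least 2)');
  }
  const lookup = new Map(symbols.map((ch, i) => [ch, i]));

  const out = [];
  let buffer = 0; // pending bits not yet emitted as a byte
  let bits = 0;   // number of valid bits in `buffer`
  for (const ch of text) {
    const value = lookup.get(ch);
    if (value === undefined) {
      throw new SyntaxError(`unexpected character: ${ch}`);
    }
    buffer = (buffer << bitsPerChar) | value;
    bits += bitsPerChar;
    while (bits >= 8) {
      bits -= 8;
      out.push((buffer >> bits) & 0xff);
    }
    buffer &= (1 << bits) - 1; // keep only the bits not yet emitted
  }
  return Uint8Array.from(out); // trailing partial byte, if any, is dropped
}

const base64web = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_';
const hex = '0123456789ABCDEF';

console.log(decodeWithAlphabet('_w', base64web)); // Uint8Array [ 255 ]
console.log(decodeWithAlphabet('FFFE', hex));     // Uint8Array [ 255, 254 ]
```

Because the lookup is case-sensitive, lowercase `'fffe'` would be rejected with the uppercase hex alphabet above; a real API would need to decide how case-insensitivity and padding are expressed.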

@bakkot
Collaborator

bakkot commented Feb 8, 2024

Closing as settled; this proposal will only include base64 (and base64url) and hex.

bakkot closed this as completed Feb 8, 2024