Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement blake3 #191

Closed
allada opened this issue Jul 16, 2023 · 10 comments
Closed

Implement blake3 #191

allada opened this issue Jul 16, 2023 · 10 comments
Assignees

Comments

@allada
Copy link
Member

allada commented Jul 16, 2023

Sha256 is known to be quite slow. There's a new kid in town, Blake3. It's crazy fast, super secure and just better overall.

BRE already supports it:
https://github.com/bazelbuild/remote-apis/blob/39c174e10d224c46b556d8d4615863804d5b2ff6/build/bazel/remote/execution/v2/remote_execution.proto#L1900

Bazel appears to be in the process of supporting it:
bazelbuild/bazel#18658

Micro-bench testing shows it is worth the effort:
https://gist.github.com/allada/6b4321a6487c2888ff73ce1cc0fc86ed

All results are on a 16 core i9 with (no threads):
1GB @ 10:

sha256: 35.546221869s
blake3: 2.346503712s

abs difference: 33.199718157s
% difference r: 1514.86%
% difference i: 6.60%

1MB @ 10_000:

sha256: 34.653737424s
blake3: 2.129524155s

abs difference: 32.524213269s
% difference r: 1627.30%
% difference i: 6.15%

1KB @ 1_000_000:

sha256: 3.714629289s
blake3: 725.049013ms

abs difference: 2.989580276s
% difference r: 512.33%
% difference i: 19.52%

100B @ 1_000_000:

sha256: 453.309362ms
blake3: 113.922135ms

abs difference: 339.387227ms
% difference r: 397.91%
% difference i: 25.13%

This is a significant difference and when things are under high workload, it is often because we are spending so much time hashing. This hash a very high chance of dramatically improving performance of the cas stores.

@aaronmondal
Copy link
Member

This looks fantastic! It even has an official implementation for Rust 😍

There don't seem to be any viable alternatives either since this is also officially endorsed by the RBE. As for usability it seems like Buck2 already supports it.

cc @JannisFengler since this might interest you 😄

@allada
Copy link
Member Author

allada commented Jul 16, 2023

For anyone that is interested in implementing it, I think a great first step is to support it in some of our internal Store<->Store mappings like DedupStore. Because in Dedup store we can use Blake3 for both content_store, since this is all implementation detail that client's don't know about.

@SpamDoodler
Copy link

I would be interested in working on this issue!

@SpamDoodler
Copy link

SpamDoodler commented Jul 18, 2023

The hash in check_missing_last_chunk_test() in cas/store/tests/dedup_store_test.rs seems to be wrong (it should be since it's computed with sha256).
How do I get the correct test hash for blake3 there?

Other tests are already passing.

@aaronmondal
Copy link
Member

aaronmondal commented Jul 18, 2023

How do I get the correct test hash for blake3 there?

If you use an assert to check for equality failing tests should print something like "you gave me hash but I expected other_hash". Then you can copy-paste it from that message and if the CI runner passes on that new hash things should be fine.

As an additional test you could also use the testvectors from the blake3 repo to make sure that the "basecase" of running blake3 on something is really the same: https://github.com/BLAKE3-team/BLAKE3/blob/master/test_vectors/test_vectors.json

@allada
Copy link
Member Author

allada commented Jul 21, 2023

Awesome thanks @SpamDoodler , but for some reason it looks like #205 's commit was attributed to me instead of you. This is probably because of some rebase issues. If you want I can revert and recommit if you really care about the attribution.

@chrisstaite-menlo
Copy link
Collaborator

chrisstaite-menlo commented Sep 14, 2023

The v2.3 format specifies the hash type in the URL, meaning that BLAKE3 can now be supported by CAS in general... #281 parses the value in. It's worth noting that the hash type is supposed to be interpreted based on length of the hash if it isn't specified. See the remote_execution.proto for more details.

Our capabilities response should also be updated to specify that we support BLAKE3.

allada added a commit that referenced this issue Nov 16, 2023
Adds DigestHasher and hashing implementation for Sha256. This will
be used to abstract the different hashing algos when computing
digests.

towards: #191
allada added a commit that referenced this issue Nov 19, 2023
Adds DigestHasher and hashing implementation for Sha256. This will
be used to abstract the different hashing algos when computing
digests.

towards: #191
allada added a commit that referenced this issue Nov 19, 2023
Adds DigestHasher and hashing implementation for Sha256. This will
be used to abstract the different hashing algos when computing
digests.

towards: #191
@allada
Copy link
Member Author

allada commented Nov 20, 2023

Done.

@allada allada closed this as completed Nov 20, 2023
TripleKai pushed a commit to TripleKai/turbo-cache that referenced this issue Nov 29, 2023
Adds DigestHasher and hashing implementation for Sha256. This will
be used to abstract the different hashing algos when computing
digests.

towards: TraceMachina#191
@prestonvanloon
Copy link

Does docker-compose have blake3 support? Or is there some config that needs to be adjusted for it to work? I tried from f9e537c and CAS seemed very unhappy about it.

native_link_local_cas_1  | [2023-12-04T21:50:52.618Z ERROR native_link_service::bytestream_server] Write Resp: 0.000190608 None Status { code: Internal, message: "Received erroneous partial chunk: Error { code: Internal, messages: [\"Writer was dropped before EOF was sent\"] } : During buf_channel::take() : Failed to read take in update in compression store : --- : Received erroneous partial chunk: Error { code: Internal, messages: [\"Writer was dropped before EOF was sent\"] } : Failed to receive data in filesystem store : While processing with temp file \"~/.cache/native-link/tmp_path-cas/bb3bde56c0b4e37f6ed21c4bbc996f2a550c4cf483e588e92d31000000000000-22578\" : Inner store update in compression store failed : Compression underlying store update failed : --- : Hashes do not match, got: bb3bde56c0b4e37f6ed21c4bbc996f2a550c4cf483e588e9b063c4b3841e93dc but digest hash was c6ed54e3f70d5c943750369a564b7e49adb001a40fbb234765e9aa7bee6b66c0 : Error updating inner store : In ByteStreamServer::write()", source: None }

@allada
Copy link
Member Author

allada commented Dec 5, 2023

Does docker-compose have blake3 support? Or is there some config that needs to be adjusted for it to work? I tried from f9e537c and CAS seemed very unhappy about it.

Yes it does work but it looks like you are using VerifyStore, which has not been updated to support it yet. Currently the verify_store is hard-codded to sha256 currently. In theory it will be easy to migrate to use the global default hashing config, but to use the request specific hash function will require all stores to be updated to pipe the hash function around.

Here's the line where it's hard codded:
https://github.com/TraceMachina/native-link/blob/f9e537c3f9b5656be6251902640ff003a5b8cc48/native-link-store/src/verify_store.rs#L140

The easiest way is if you are able to disable hash verification in the verify store in the config it should work. I'll see about getting the VerifyStore updated too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants