feat: Sequential insertion in indexed trees#10111
Conversation
…c-packages into arv/insert_sequential
| { | ||
| Signal signal(1); | ||
| typename TreeType::AddCompletionCallback completion = [&](const auto&) -> void { signal.signal_level(0); }; | ||
| bool success = true; |
There was a problem hiding this comment.
How does the sequential insertion compare with the batch insertion?
The mutli-threaded batch insertion doesn't perform as well as I had hoped. It was much better when we used pedersen as a much greater portion of the time was spent hashing.
There was a problem hiding this comment.
With witnesses, it's worse (that's expected since you're fetching more witnesses)
However, without witnesses, it's better single threaded (yay) but worse multithreaded. That one is a weird one, since I expected sparse_batch_update (which is the only thing sequential insertion without witnesses does) to perform better than sparse_batch_update and then append. Do you know why this could be the case? Should I profile this thing?
Here is the bench results I got on mainframe (not the best place for benching but good enough I hope)
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/2/iterations:1000 1.84 ms 0.028 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/4/iterations:1000 2.99 ms 0.031 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/8/iterations:1000 5.48 ms 0.029 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/16/iterations:1000 10.2 ms 0.027 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/32/iterations:1000 20.2 ms 0.034 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/64/iterations:1000 39.4 ms 0.041 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/512/iterations:10 303 ms 0.081 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/1024/iterations:10 647 ms 0.093 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/2048/iterations:10 1271 ms 0.125 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/4096/iterations:10 2471 ms 0.237 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/8192/iterations:10 4966 ms 0.388 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/2/iterations:1000 2.22 ms 0.018 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/4/iterations:1000 4.92 ms 0.029 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/8/iterations:1000 9.24 ms 0.029 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/16/iterations:1000 18.2 ms 0.028 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/32/iterations:1000 36.2 ms 0.033 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/64/iterations:1000 73.7 ms 0.044 ms 1000
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/512/iterations:10 575 ms 0.076 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/1024/iterations:10 1187 ms 0.091 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/2048/iterations:10 2406 ms 0.106 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/4096/iterations:10 5038 ms 0.148 ms 10
single_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/8192/iterations:10 11375 ms 0.942 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/2/iterations:1000 1.75 ms 0.022 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/4/iterations:1000 1.69 ms 0.027 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/8/iterations:1000 2.32 ms 0.023 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/16/iterations:1000 4.16 ms 0.033 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/32/iterations:1000 6.99 ms 0.029 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/64/iterations:1000 13.8 ms 0.030 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/512/iterations:10 82.9 ms 0.057 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/1024/iterations:10 143 ms 0.083 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/2048/iterations:10 341 ms 0.115 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/4096/iterations:10 663 ms 0.174 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, BATCH>/8192/iterations:10 1489 ms 0.226 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/2/iterations:1000 1.03 ms 0.020 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/4/iterations:1000 1.43 ms 0.019 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/8/iterations:1000 2.74 ms 0.020 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/16/iterations:1000 4.95 ms 0.022 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/32/iterations:1000 10.0 ms 0.025 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/64/iterations:1000 20.6 ms 0.024 ms 1000
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/512/iterations:10 126 ms 0.053 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/1024/iterations:10 279 ms 0.072 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/2048/iterations:10 658 ms 0.104 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/4096/iterations:10 1641 ms 0.213 ms 10
multi_thread_indexed_tree_with_witness_bench<Poseidon2, SEQUENTIAL>/8192/iterations:10 4417 ms 0.286 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/2/iterations:1000 1.32 ms 0.019 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/4/iterations:1000 1.80 ms 0.023 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/8/iterations:1000 2.54 ms 0.025 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/16/iterations:1000 4.04 ms 0.025 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/32/iterations:1000 7.11 ms 0.028 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/64/iterations:1000 13.4 ms 0.037 ms 1000
single_thread_indexed_tree_bench<Poseidon2, BATCH>/512/iterations:10 64.9 ms 0.077 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/1024/iterations:10 120 ms 0.099 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/2048/iterations:10 211 ms 0.089 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/4096/iterations:10 389 ms 0.159 ms 10
single_thread_indexed_tree_bench<Poseidon2, BATCH>/8192/iterations:10 777 ms 0.696 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/2/iterations:1000 1.01 ms 0.018 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/4/iterations:1000 1.47 ms 0.019 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/8/iterations:1000 2.20 ms 0.024 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/16/iterations:1000 3.84 ms 0.023 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/32/iterations:1000 6.64 ms 0.030 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/64/iterations:1000 12.5 ms 0.032 ms 1000
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/512/iterations:10 63.7 ms 0.074 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/1024/iterations:10 112 ms 0.079 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/2048/iterations:10 205 ms 0.108 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/4096/iterations:10 374 ms 0.152 ms 10
single_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/8192/iterations:10 730 ms 0.402 ms 10
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/2/iterations:1000 1.43 ms 0.026 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/4/iterations:1000 1.68 ms 0.033 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/8/iterations:1000 1.98 ms 0.028 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/16/iterations:1000 2.55 ms 0.029 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/32/iterations:1000 3.75 ms 0.034 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/64/iterations:1000 6.05 ms 0.034 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/512/iterations:100 31.7 ms 0.060 ms 100
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/1024/iterations:100 71.2 ms 0.095 ms 100
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/2048/iterations:100 124 ms 0.106 ms 100
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/4096/iterations:100 248 ms 0.153 ms 100
multi_thread_indexed_tree_bench<Poseidon2, BATCH>/8192/iterations:100 594 ms 0.245 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/2/iterations:1000 0.859 ms 0.020 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/4/iterations:1000 0.987 ms 0.022 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/8/iterations:1000 1.30 ms 0.024 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/16/iterations:1000 2.46 ms 0.034 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/32/iterations:1000 3.68 ms 0.040 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/64/iterations:1000 5.84 ms 0.031 ms 1000
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/512/iterations:100 40.1 ms 0.069 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/1024/iterations:100 62.7 ms 0.086 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/2048/iterations:100 139 ms 0.115 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/4096/iterations:100 246 ms 0.153 ms 100
multi_thread_indexed_tree_bench<Poseidon2, SEQUENTIAL>/8192/iterations:100 808 ms 0.301 ms 100
There was a problem hiding this comment.
As discussed. Looks like the multi-threaded performs a little better in most scenarios. I think we should wait until the compete use case of the world state has settled further and then optimise.
| const size_t initial_size = 1024 * 16; | ||
| std::vector<NullifierLeafValue> initial_batch(initial_size); | ||
| for (size_t i = 0; i < batch_size; ++i) { | ||
| for (size_t i = 0; i < initial_size; ++i) { |
There was a problem hiding this comment.
That look like a bug, nice catch
| // block inside this scope in order to keep current_batch/completion alive until the end of all operations | ||
| signal.wait_for_level(); | ||
| PublicDataTree::AddCompletionCallback completion = [&](const auto&) -> void { signal.signal_decrement(); }; | ||
| wrapper.tree->add_or_update_values_sequentially(public_writes, completion); |
There was a problem hiding this comment.
Nice, so we don't have to do the sync in batches.
8a442fd to
f4586ca
Compare
|
This should be ready for review now |
| index_t index = leaf_index; | ||
| uint32_t level = depth_; | ||
| fr new_hash = HashingPolicy::hash(leaf.get_hash_inputs()); | ||
| fr new_hash = leaf.value.is_empty() ? 0 : HashingPolicy::hash(leaf.get_hash_inputs()); |
There was a problem hiding this comment.
Should we use fr::zero() instead of 0?
There was a problem hiding this comment.
I think recently there was a group of PRs to change fr::zero() to 0. The rationale is that the constructor of fr is constexpr, so the cost is the same (compile-time), while readability is (subjectively) worse. I have no preference personally.
There was a problem hiding this comment.
Me neither, but since in other places of this file fr::zero is used, I don't mind making it the same
PhilWindle
left a comment
There was a problem hiding this comment.
Just the one comment
🤖 I have created a release *beep* *boop* --- <details><summary>aztec-package: 0.65.2</summary> ## [0.65.2](aztec-package-v0.65.1...aztec-package-v0.65.2) (2024-11-28) ### Features * New proving broker ([#10174](#10174)) ([6fd5fc1](6fd5fc1)) </details> <details><summary>barretenberg.js: 0.65.2</summary> ## [0.65.2](barretenberg.js-v0.65.1...barretenberg.js-v0.65.2) (2024-11-28) ### Miscellaneous * **barretenberg.js:** Synchronize aztec-packages versions </details> <details><summary>aztec-packages: 0.65.2</summary> ## [0.65.2](aztec-packages-v0.65.1...aztec-packages-v0.65.2) (2024-11-28) ### Features * Fee foresight support ([#10262](#10262)) ([9e19244](9e19244)) * New proving broker ([#10174](#10174)) ([6fd5fc1](6fd5fc1)) * Sequential insertion in indexed trees ([#10111](#10111)) ([bfd9fa6](bfd9fa6)) * Swap polys to facilitate dynamic trace overflow ([#9976](#9976)) ([b7b282c](b7b282c)) ### Bug Fixes * Don't store indices of zero leaves. ([#10270](#10270)) ([c22be8b](c22be8b)) * Expect proper duplicate nullifier error patterns in e2e tests ([#10256](#10256)) ([4ee8344](4ee8344)) ### Miscellaneous * Check artifact consistency ([#10271](#10271)) ([6a49405](6a49405)) * Dont import things that themselves import jest in imported functions ([#10260](#10260)) ([9440c1c](9440c1c)) * Fix bad merge in integration l1 publisher ([#10272](#10272)) ([b5a6aa4](b5a6aa4)) * Fixing sol warnings ([#10276](#10276)) ([3d113b2](3d113b2)) * Pull out sync changes ([#10274](#10274)) ([391a6b7](391a6b7)) * Pull value merger code from sync ([#10080](#10080)) ([3392629](3392629)) * Remove default gas settings ([#10163](#10163)) ([c9a4d88](c9a4d88)) * Replace relative paths to noir-protocol-circuits ([654d801](654d801)) * Teardown context in prover coordination test ([#10257](#10257)) ([7ea3888](7ea3888)) </details> <details><summary>barretenberg: 0.65.2</summary> ## [0.65.2](barretenberg-v0.65.1...barretenberg-v0.65.2) (2024-11-28) ### Features * Sequential insertion in indexed trees ([#10111](#10111)) ([bfd9fa6](bfd9fa6)) * Swap polys to facilitate dynamic trace overflow ([#9976](#9976)) ([b7b282c](b7b282c)) ### Bug Fixes * Don't store indices of zero leaves. ([#10270](#10270)) ([c22be8b](c22be8b)) ### Miscellaneous * Pull value merger code from sync ([#10080](#10080)) ([3392629](3392629)) </details> --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
🤖 I have created a release *beep* *boop* --- <details><summary>aztec-package: 0.65.2</summary> ## [0.65.2](AztecProtocol/aztec-packages@aztec-package-v0.65.1...aztec-package-v0.65.2) (2024-11-28) ### Features * New proving broker ([#10174](AztecProtocol/aztec-packages#10174)) ([6fd5fc1](AztecProtocol/aztec-packages@6fd5fc1)) </details> <details><summary>barretenberg.js: 0.65.2</summary> ## [0.65.2](AztecProtocol/aztec-packages@barretenberg.js-v0.65.1...barretenberg.js-v0.65.2) (2024-11-28) ### Miscellaneous * **barretenberg.js:** Synchronize aztec-packages versions </details> <details><summary>aztec-packages: 0.65.2</summary> ## [0.65.2](AztecProtocol/aztec-packages@aztec-packages-v0.65.1...aztec-packages-v0.65.2) (2024-11-28) ### Features * Fee foresight support ([#10262](AztecProtocol/aztec-packages#10262)) ([9e19244](AztecProtocol/aztec-packages@9e19244)) * New proving broker ([#10174](AztecProtocol/aztec-packages#10174)) ([6fd5fc1](AztecProtocol/aztec-packages@6fd5fc1)) * Sequential insertion in indexed trees ([#10111](AztecProtocol/aztec-packages#10111)) ([bfd9fa6](AztecProtocol/aztec-packages@bfd9fa6)) * Swap polys to facilitate dynamic trace overflow ([#9976](AztecProtocol/aztec-packages#9976)) ([b7b282c](AztecProtocol/aztec-packages@b7b282c)) ### Bug Fixes * Don't store indices of zero leaves. ([#10270](AztecProtocol/aztec-packages#10270)) ([c22be8b](AztecProtocol/aztec-packages@c22be8b)) * Expect proper duplicate nullifier error patterns in e2e tests ([#10256](AztecProtocol/aztec-packages#10256)) ([4ee8344](AztecProtocol/aztec-packages@4ee8344)) ### Miscellaneous * Check artifact consistency ([#10271](AztecProtocol/aztec-packages#10271)) ([6a49405](AztecProtocol/aztec-packages@6a49405)) * Dont import things that themselves import jest in imported functions ([#10260](AztecProtocol/aztec-packages#10260)) ([9440c1c](AztecProtocol/aztec-packages@9440c1c)) * Fix bad merge in integration l1 publisher ([#10272](AztecProtocol/aztec-packages#10272)) ([b5a6aa4](AztecProtocol/aztec-packages@b5a6aa4)) * Fixing sol warnings ([#10276](AztecProtocol/aztec-packages#10276)) ([3d113b2](AztecProtocol/aztec-packages@3d113b2)) * Pull out sync changes ([#10274](AztecProtocol/aztec-packages#10274)) ([391a6b7](AztecProtocol/aztec-packages@391a6b7)) * Pull value merger code from sync ([#10080](AztecProtocol/aztec-packages#10080)) ([3392629](AztecProtocol/aztec-packages@3392629)) * Remove default gas settings ([#10163](AztecProtocol/aztec-packages#10163)) ([c9a4d88](AztecProtocol/aztec-packages@c9a4d88)) * Replace relative paths to noir-protocol-circuits ([654d801](AztecProtocol/aztec-packages@654d801)) * Teardown context in prover coordination test ([#10257](AztecProtocol/aztec-packages#10257)) ([7ea3888](AztecProtocol/aztec-packages@7ea3888)) </details> <details><summary>barretenberg: 0.65.2</summary> ## [0.65.2](AztecProtocol/aztec-packages@barretenberg-v0.65.1...barretenberg-v0.65.2) (2024-11-28) ### Features * Sequential insertion in indexed trees ([#10111](AztecProtocol/aztec-packages#10111)) ([bfd9fa6](AztecProtocol/aztec-packages@bfd9fa6)) * Swap polys to facilitate dynamic trace overflow ([#9976](AztecProtocol/aztec-packages#9976)) ([b7b282c](AztecProtocol/aztec-packages@b7b282c)) ### Bug Fixes * Don't store indices of zero leaves. ([#10270](AztecProtocol/aztec-packages#10270)) ([c22be8b](AztecProtocol/aztec-packages@c22be8b)) ### Miscellaneous * Pull value merger code from sync ([#10080](AztecProtocol/aztec-packages#10080)) ([3392629](AztecProtocol/aztec-packages@3392629)) </details> --- This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).
Uh oh!
There was an error while loading. Please reload this page.