streamer/sigverify: use bounded channels between streamers and sigver #9732
Conversation
Codecov Report: ✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##           master    #9732   +/-  ##
=======================================
  Coverage    82.7%    82.7%
=======================================
  Files         901      901
  Lines      324458   324458
=======================================
+ Hits       268337   268379    +42
+ Misses      56121    56079    -42
The proposed size of 50K transactions is equivalent to 50ms of buffering at 1M TPS, which is far more than the streamer is ever supposed to let through. The total_handle_chunk_to_packet_send_full_err metric will be incremented whenever we bump against the channel size here. I would consider wiring in some logic to throttle connections for a few ms when this buffer becomes full; that would be far better than spilling excess transactions on the floor after the clients have already sent them.
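The throttling idea suggested above could be sketched as follows. This is a hypothetical illustration, not the PR's code: the send_with_throttle helper, the retry count, and the 2 ms pause are all assumptions, and std's sync_channel stands in for the channel actually used by the streamer.

```rust
use std::sync::mpsc::{sync_channel, SyncSender, TrySendError};
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper: instead of dropping a packet when the bounded
/// channel is full, pause the connection briefly and retry, pushing
/// backpressure onto the sender rather than spilling transactions.
fn send_with_throttle<T>(sender: &SyncSender<T>, mut item: T, max_retries: usize) -> Result<(), T> {
    for _ in 0..max_retries {
        match sender.try_send(item) {
            Ok(()) => return Ok(()),
            // Channel full: throttle for a few ms, then try again.
            Err(TrySendError::Full(returned)) => {
                item = returned;
                sleep(Duration::from_millis(2));
            }
            // Receiver gone: give the item back to the caller.
            Err(TrySendError::Disconnected(returned)) => return Err(returned),
        }
    }
    // Still full after all retries: the caller may drop the packet and
    // bump the send-full error metric.
    Err(item)
}

fn main() {
    let (tx, rx) = sync_channel::<u32>(1);
    assert!(send_with_throttle(&tx, 7, 3).is_ok());
    assert_eq!(rx.recv().unwrap(), 7);
}
```

The trade-off is that a slow sigverify stage now stalls the connection for a few milliseconds instead of silently discarding traffic the client already paid to send.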
…anza-xyz#9732) use bounded channels between streamers and sigver
this change failed to update send-side error handling properly; handled in #10498
Backports to the beta branch are to be avoided unless absolutely necessary for fixing bugs, security issues, and perf regressions. Changes intended for backport should be structured such that a minimum effective diff can be committed separately from any refactoring, plumbing, cleanup, etc. that are not strictly necessary to achieve the goal. Any of the latter should go only into master and ride the normal stabilization schedule. Exceptions include CI/metrics changes, CLI improvements and documentation updates on a case by case basis.


Problem
Streamer and sigverify exchange information using unbounded channels.
This leads to unnecessary allocations.
Summary of Changes
Instead, use a bounded channel between streamer and sigverify for TPU transactions. Votes and forwards will be addressed separately.
To determine the size of the channel, I propose to use metric data. It should be noted that versions 3.1 and earlier have a separate batching stage between sigverify and streamer, which was removed in 4.0. That batching stage used a bounded channel of 250k packets. During the high-load event of 10-10-25 it was never saturated, so 250k can be considered an upper bound. On the other hand, the metric tpu-verifier.max_packets indicates how many packets sigverify consumes at once from its input channel; its peak on the same date was 20k. Hence, I think 50k is a more reasonable upper bound.
The profiler results are the following:
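The core change can be sketched with std's bounded sync_channel. This is illustrative only: the stream_packets helper, the no-consumer setup, and the local dropped counter (standing in for the total_handle_chunk_to_packet_send_full_err metric) are assumptions, not the actual streamer code.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Illustrative sketch: push `n` packets through a bounded channel of the
/// given capacity while no consumer is draining it, then report how many
/// were buffered and how many overflowed (what the send-full error metric
/// would record in the real pipeline).
fn stream_packets(capacity: usize, n: usize) -> (usize, usize) {
    // Bounded channel replacing the previous unbounded one between the
    // streamer (producer) and sigverify (consumer).
    let (sender, receiver) = sync_channel::<usize>(capacity);
    let mut dropped = 0;
    for packet in 0..n {
        // try_send never blocks the streamer thread; Full means the buffer
        // is at capacity and the packet is spilled.
        if let Err(TrySendError::Full(_)) = sender.try_send(packet) {
            dropped += 1;
        }
    }
    drop(sender);
    // The sigverify side drains whatever fit into the buffer.
    let received = receiver.iter().count();
    (received, dropped)
}

fn main() {
    // 50k capacity as proposed; 10 packets overflow in this worst case.
    let (received, dropped) = stream_packets(50_000, 50_010);
    println!("received={received} dropped={dropped}");
}
```

Unlike an unbounded channel, the buffer's memory footprint is now fixed at allocation time, and saturation becomes an observable metric event rather than silent memory growth.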
