ledger: Don't keep repeated slots in the leader schedule#9126
ledger: Don't keep repeated slots in the leader schedule#9126vadorovsky merged 1 commit intoanza-xyz:masterfrom
Conversation
a5744e9 to
28b9f63
Compare
28b9f63 to
ab32710
Compare
|
If this PR represents a change to the public RPC API:
Thank you for keeping the RPC clients in sync with the server API @vadorovsky. |
00e2390 to
9e01c16
Compare
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## master #9126 +/- ##
========================================
Coverage 83.3% 83.3%
========================================
Files 845 845
Lines 319896 320030 +134
========================================
+ Hits 266511 266641 +130
- Misses 53385 53389 +4 🚀 New features to boost your workflow:
|
Tests were constructing schedules with `repeat > 1` but feeding arrays whose length wasn’t a multiple of `repeat`, so some leaders never repeated the expected number of times. That inconsistency also blocks follow-up work on optimizing the schedule calculation (see anza-xyz#9126). Add a debug assert that the schedule length is divisible by `repeat`, and fix the tests to generate properly aligned schedules. Change `TEST_SLOTS_PER_EPOCH` in rpc from 129 (an incorrect, unaligned value) to 256, which stays above the delinquent-slot threshold while keeping the tests fast.
Tests were constructing schedules with `repeat > 1` but feeding arrays whose length wasn’t a multiple of `repeat`, so some leaders never repeated the expected number of times. That inconsistency also blocks follow-up work on optimizing the schedule calculation (see anza-xyz#9126). Add a debug assert that the schedule length is divisible by `repeat`, and fix the tests to generate properly aligned schedules. Change `TEST_SLOTS_PER_EPOCH` in rpc from 129 (an incorrect, unaligned value) to 256, which stays above the delinquent-slot threshold while keeping the tests fast. Ref: anza-xyz#8280
Tests were constructing schedules with `repeat > 1` but feeding arrays whose length wasn’t a multiple of `repeat`, so some leaders never repeated the expected number of times. That inconsistency also blocks follow-up work on optimizing the schedule calculation (see anza-xyz#9126). Add a debug assert that the schedule length is divisible by `repeat`, and fix the tests to generate properly aligned schedules. Change `TEST_SLOTS_PER_EPOCH` in rpc from 129 (an incorrect, unaligned value) to 256, which stays above the delinquent-slot threshold while keeping the tests fast. Ref: anza-xyz#8280
Tests were constructing schedules with `repeat > 1` but feeding arrays whose length wasn’t a multiple of `repeat`, so some leaders never repeated the expected number of times. That inconsistency also blocks follow-up work on optimizing the schedule calculation (see #9126). Add a debug assert that the schedule length is divisible by `repeat`, and fix the tests to generate properly aligned schedules. Change `TEST_SLOTS_PER_EPOCH` in rpc from 129 (an incorrect, unaligned value) to 256, which stays above the delinquent-slot threshold while keeping the tests fast. Ref: #8280
9e01c16 to
d46fd8a
Compare
7e1401f to
3d68019
Compare
75bec31 to
8475e2d
Compare
4e76a2d to
b1d981a
Compare
|
I will try to slowlana this tomorrow with some epoch boundary ledger, but it should become more visible when applied together with #8451 |
brooksprumo
left a comment
There was a problem hiding this comment.
I still need to go through get_leader_upcoming_slots(). Here's some quick comments though.
| // Find out how many slots from the starting chunk we still have to | ||
| // yield. | ||
| let offset_in_chunk = offset_in_epoch % repeat; |
There was a problem hiding this comment.
I think there is an issue here if the offset_chunk is not in index.
If not found, then the offset in the chunk should be 0. But if offset_in_epoch is in the middle of a leader span (not the first slot of four, for example), then offset_in_chunk will be non-zero, skipping those initial slots of the next leader span.
So if .binary_search() returns Err, offset_in_chunk should be set to 0. Probably can match on the result to handle this?
Here's a test to put in test_get_leader_upcoming_slots_with_repeat():
assert!(
leader_schedule
.get_leader_upcoming_slots(&leader_a.id, 5)
.take(10)
.eq([8, 9, 10, 11, 12, 13, 14, 15, 20, 21])
);There was a problem hiding this comment.
Good catch! Fixed and added more tests.
80c3058 to
9873863
Compare
Leader schedule structs keep a vector with public keys that reflect the leadership of them for each slot. However, leaders always repeat for N slots (currently 4). That results in having a vector with each element being subsequently repeated for N times. Reduce the size of vectors 4 times by storing only non-repeating leaders, then utilizing flattened `std::iter::repeat` and floor division to present the correct information to the callers of the `LeaderSchedule` trait.
1972a00 to
f8a6480
Compare
There is no visible difference in profiles and the benchmark shows <1ms difference. The main win from this PR is that we will be able to use |
Problem
Leader schedule structs keep a vector with public keys that reflect the leadership of them for each slot. However, leaders always repeat for N slots (currently 4). That results in having a vector with each element being subsequently repeated for N times.
Summary of Changes
Reduce the size of vectors 4 times by storing only non-repeating leaders, then utilizing flattened
std::iter::repeatand floor division to present the correct information to the callers of theLeaderScheduletrait.