-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix pruner timing issue with batch pruning #14929
base: develop
Are you sure you want to change the base?
Conversation
@@ -336,7 +325,7 @@ func (s *Store) SaveBlock(ctx context.Context, signed interfaces.ReadOnlySignedB | |||
// if a `saveBlindedBeaconBlocks` key exists in the database. Otherwise, we check if the last | |||
// blocked stored to check if it is blinded, and then write that `saveBlindedBeaconBlocks` key | |||
// to the DB for future checks. | |||
func (s *Store) shouldSaveBlinded(ctx context.Context) (bool, error) { | |||
func (s *Store) shouldSaveBlinded() (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this ctx variable as it was unused
beacon-chain/db/pruner/pruner.go
Outdated
@@ -159,6 +169,24 @@ func (p *Service) prune(slot primitives.Slot) error { | |||
return nil | |||
} | |||
|
|||
func (p *Service) pruneBatches(pruneUpto primitives.Slot) (int, error) { | |||
ctx, cancel := context.WithDeadline(p.ctx, time.Now().Add(defaultPruningWindow)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ctx, cancel := context.WithDeadline(p.ctx, time.Now().Add(defaultPruningWindow)) | |
ctx, cancel := context.WithTimeout(p.ctx, defaultPruningWindow) |
A little easier this way and achieves the same result
for { | ||
select { | ||
case <-ctx.Done(): | ||
return numBatches, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this routine always way until the context has expired?
What happens when there is nothing to do? Seems like this routine should exit the loop early if there are no batches left to prune.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I haven't mentioned the number of batches to prune. The idea is to prune as many batches in 3 seconds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The underlying db method returns the number of slots pruned. If the index query for the slot range returns no results, it will early return zero. Looks like on line 191 of pruner.go
, we return from the select if the number of slots is zero. So I think the answer to Preston's question is that the db method will return 0 as soon as it runs out of slots to prune and then we'll break out of this select loop.
for k, v := c.Seek(key); k != nil; k, v = c.Prev() { | ||
slot := bytesutil.BytesToSlotBigEndian(k) | ||
if slot > end { | ||
continue // Seek will seek to the next key *after* the given one if not present |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I hope it will be clear to readers that this only applies on the init condition (since we only seek in the first iteration of the loop and call Prev after that) but it could be worth stating that more explicitly, like: "If the highest slot which we seek to at the beginning of the for loop does not exist, the loop will see the slot after the high slot. continue
here jumps to the c.Prev()
call which rewinds to whatever slot comes just before that".
What type of PR is this?
Bug fix
What does this PR do? Why is it needed?
This PR fixes a bug in pruner where it was taking a lot of time to prune a large database therefore blocking everything else because boltDB locks the database until pruning is completed.
Other notes for review
Solved the above by pruning in batches.
Acknowledgements