Skip to content

Conversation

@potuz
Copy link
Contributor

@potuz potuz commented Jul 27, 2022

This PR ensures that in the situation described in the linked issues, where after pruning INVALID blocks, no viable head is possible, we satisfy the following properties

This is achieved by allowing head to refer to an INVALID block, both in Forkchoice and in blockchain-service.head. Even though both store and forkchoice can have an INVALID root as head, the node itself and the blocks/states would have been removed from forkchoice and database.

Moreover, as a result of this prunning the blockchain.head and the forkchoice's cached head may differ, depending on wether the INVALID status was a result of notifyNewPayload (no block is inserted to forkchoice) or notifyForkchoiceUpdated (the block is inserted in forkchoice and not saved in blockchain.head)

Another danger is that any call to forkchoice.Head() while on this status will result in an error since there are no viable tips.

Fixes #10782
Fixes #10777

s.headLock.RUnlock()

return s.IsOptimisticForRoot(ctx, s.head.root)
return s.IsOptimisticForRoot(ctx, headRoot)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsOptimisticForRoot now requires a lock.

Comment on lines 340 to 344
headRoot, err := s.HeadRoot(ctx)
if err == nil && bytes.Equal(headRoot, root[:]) {
return true, nil
}
return true, errInvalidNilSummary
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it's safer to return true by default, even erroring out. The first if here ensures that if the head root is not in forkchoice and not in DB then we return that we are optimistic. The only reason I can currently think this may happen is only if we have pruned it and the headroot is invalid.

"blockRoot": fmt.Sprintf("%#x", bytesutil.Trunc(headRoot[:])),
"invalidCount": len(invalidRoots),
}).Warn("Pruned invalid blocks, could not update head root")
return nil, invalidBlock{error: ErrInvalidPayload, root: arg.headRoot, invalidAncestorRoots: invalidRoots}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will log here the pruned blocks, this is a code smell for a few reasons:

  1. It repeats the log below, it can be done by having two different log messages the only difference between this one and the one below in the regular pruning on invalid blocks is that the one below contains the new head root (that we failed to compute here). I didn't care much about the repetition because this code should not really happen in runtime unless some pretty extreme situation.
  2. This is a fairly sensitive change. On the one hand it fixes a bad bug: we were returning nil, allowing onBlock to succeed without any error when our head was INVALID! on the other hand it changes the return path of a recursive function, care must be taken by reviewers in all possible paths here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here's what is bothering me a little bit. We are returning nil for every err != nil under ErrInvalidPayloadStatus. Should we change most of them to invalidBlock{error: ErrInvalidPayload, root: arg.headRoot, invalidAncestorRoots: invalidRoots}? Ok to do this in another PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It bothers me too, but we used to return errors on everything and it was bad rendering the node locked with minor things. I feel this situation shouldn't happen often so it's kinda safe to put it here, I haven't thought deeply on the other errors here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we only add the invalid block error here ?

invalidBlock{error: ErrInvalidPayload, root: arg.headRoot, invalidAncestorRoots: invalidRoots}

In this whole conditional branch, this is the only location where we return the error.

require.Equal(t, true, IsInvalidBlock(err))
}

// See the description in #10777 and #10782 for the full setup
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests added here cover the main circumstances of a single branch being pruned and no viable head being available. It checks the behavior when the INVALID return comes from NotifyNewPayload and from FCU. Note however that in order to trigger justification change, at least one INVALID block needs to be imported to forkchoice, so this requires at least one INVALID block to be returned as SYNCING/ACCEPTED by notify_newPayload.

These tests do not cover initial sync which can be added if requested, but the paths seem to be contained to the execution engine functions, so it should be similar from onBlock to onBlockBatch. They also do not cover rebooting from this condition. This is a bit more dangerous since our starting head would be non-existent.

}
}

syncCommitteeBits := make(bitfield.Bitvector512, 64)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function prepared a full SyncAggregate, but the signature would fail at the fork. Fixing that perhaps is probable, but since this is not used anywhere on any test, I decided that it was easier to just return the empty participation case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See below: I had to reuse the previous behavior because of the monitor's test

return nil, err
}
blockHash := indexToHash(uint64(slot))
newExecutionPayload := &enginev1.ExecutionPayload{
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be improved by including transactions and configuring some of the fields that are hardcoded here.

NumAttestations uint64
NumDeposits uint64
NumVoluntaryExits uint64
NumTransactions uint64 // Only for post Bellatrix blocks
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is unused now

)

// BlockSignatureBellatrix calculates the post-state root of the block and returns the signature.
func BlockSignatureBellatrix(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is exactly the same as the non-bellatrix version therefore I removed it.

func BlockSignature(
bState state.BeaconState,
block *ethpb.BeaconBlock,
block interface{},
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is to use this function to sign Altair and Bellatrix blocks as well.

"github.com/prysmaticlabs/prysm/time/slots"
)

func generateSyncAggregate(st state.BeaconState, privs []bls.SecretKey, parentRoot [32]byte) (*ethpb.SyncAggregate, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function may be salvaged, the signature here is wrong and is never really used. Perhaps it can be saved in a way that it would sign correctly on fork transition, but since it's not used I figured it was better removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to reuse it because it turned out to be used by the monitor indirectly in tests

@potuz potuz marked this pull request as ready for review July 31, 2022 10:43
@potuz potuz requested a review from a team as a code owner July 31, 2022 10:44
votedKeys, votedIndices, didntVoteIndices, err := FilterSyncCommitteeVotes(s, sync)
if err != nil {
return nil, err
return nil, errors.Wrap(err, "could not filter sync committee votes")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this related to the PR? I don't mind the changes, I'm just curious

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I see, you made it better because you had to work through GenerateFullBlockAltair for tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, things were failing and I didn't know where

potuz and others added 2 commits July 31, 2022 14:35
Co-authored-by: terencechain <terence@prysmaticlabs.com>
terencechain
terencechain previously approved these changes Aug 1, 2022
// if the requested root is the headroot we should treat the
// node as optimistic. This can happen if we pruned INVALID
// nodes and no viable head is available.
headRoot, err := s.HeadRoot(ctx)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this work, we have a head for which there is no state summary/block in the db ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and even worse, it can be different than forkchoice's

"blockRoot": fmt.Sprintf("%#x", bytesutil.Trunc(headRoot[:])),
"invalidCount": len(invalidRoots),
}).Warn("Pruned invalid blocks, could not update head root")
return nil, invalidBlock{error: ErrInvalidPayload, root: arg.headRoot, invalidAncestorRoots: invalidRoots}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we only add the invalid block error here ?

invalidBlock{error: ErrInvalidPayload, root: arg.headRoot, invalidAncestorRoots: invalidRoots}

In this whole conditional branch, this is the only location where we return the error.

@mkalinin
Copy link

mkalinin commented Aug 2, 2022

Another danger is that any call to forkchoice.Head() while on this status will result in an error since there are no viable tips.

Why will it result in an error? According to the specification, get_head should return store.justified_checkpoint.root if no viable tip is available

@potuz
Copy link
Contributor Author

potuz commented Aug 2, 2022

Another danger is that any call to forkchoice.Head() while on this status will result in an error since there are no viable tips.

Why will it result in an error? According to the specification, get_head should return store.justified_checkpoint.root if no viable tip is available

This is implementation dependant. The specs cannot say what a function does or doesn't do. In this case the computed head will be the justified checkpoint but we return an error because it's not viable for head

Co-authored-by: Nishant Das <nishdas93@gmail.com>
@terencechain
Copy link
Collaborator

Chatted with @potuz offline. The ideal solution would be:

Implement this method in fork choice

(f *forkchocice) AreAllTipsInvalid() bool {}

Then in IsOptmistic, we add the following breaking:

func (s *Service) IsOptimistic(ctx context.Context) (bool, error) {
   if s.cfg.ForkChoiceStore.AreAllTipsInvalid {
      return true, nil
   }
}

That can be done in another PR. For now, I'm going to move forward with this one

@potuz potuz merged commit 4b46dea into develop Aug 2, 2022
@delete-merged-branch delete-merged-branch bot deleted the invalid-head branch August 2, 2022 14:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Update justification when pruning INVALID nodes Wrong justification on init sync nodes

4 participants