-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Block validation flow v2 + Batch (serial) sig verification #2250
Conversation
|
||
### TransitionVerifiedBeaconBlocks | ||
|
||
A block that passes the state transition checks and can be successfully applied to the beacon chain is considered `TransitionVerified`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, the state transition checks all signatures (unless it's a sigverified), so how can this type of block happen? I think in all cases we want to run signature checks before or as part of state transition, but never after (as running state transition on blocks that haven't passed sigver is fishy)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's up to us. In the current state it can't happen because I do crypto verif first hence why TransitionVerified isn't materialized in the code.
That said currently before this PR, within a state_transition epoch/block we did all the checks first and crypto verification last because it is much more expensive. This is an option that we can keep in mind, depending on which worst case scenario we want: either faulty or malicious parties use valid crypto and fake blocks and in that case we want to verify state transition first, or they use valid blocks and invalid crypto and in that case we want to verify crypto first.
return err((ValidationResult.Reject, Invalid)) | ||
|
||
static: doAssert sizeof(SignedBeaconBlock) == sizeof(SigVerifiedSignedBeaconBlock) | ||
let sigVerifBlock = cast[SigVerifiedSignedBeaconBlock](signedBlock) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this certainly looks fishy... I'm surprised it even works with the List types and Trusted vs normal bls values
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's AFAIK the simplest way to convert. I changed TrustedSig so that it's a "distinct" type from normal Signatures (just an object with the exact same fields) and has the same byte representation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it's a distinct type, SigVerifiedBeaconBlock(signedBlock)
should work - without the cast - I'm a bit afraid the cast might do bad things to the gc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, looked it up, it's not a distinct type, just a similar type.. so when assigning something to something else, the gc needs new refcounts for seqs - I'm not sure that will happen with cast
- check the generated C
code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, looking a bit more at this, the codegen does some extremely inefficient things for this cast - it will generate a temporary union type with the two types as members, then genericAssign
twice, once for each type, in and out of the union.. it's "correct:ish" but extremely wasteful.
here's something better:
var sigVerifBlock: SigVerifiedSignedBeaconBlock
assign(sigVerifiedBlock, cast[ptr SigVerifiedSignedBeaconBlock](signedBlock)[])
using stew/assign - this "should" work as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I think casting through pointer in the function call will produce no copy (because verified blocks are never used where they are produced, they are passed to another consumer).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess, yeah - the only thing I'm thinking about is strict pointer aliasing - the nim impl is inefficient but correct (with stew/assign it would be decently efficient as well but...)..
hm, not sure what the right thing here is tbh - if I look at vtune
, copying blocks is one of the highest cpu consumers apart from crypto because of how extremely slow genericAssign is.. we could do the pointer thing which represents a small risk of strict aliasing issues, but we're not looping here or anything such, so perhaps it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New codegen
#line 235 "/home/beta/Programming/Status/nim-beacon-chain/beacon_chain/block_pools/clearance.nim"
T29_ = addRawBlockCheckStateTransition__ln6s2fp4XrmWohuHu9bOV8A(dag, quarantine, (&(*((tyObject_SigVerifiedSignedBeaconBlock__FeS1XMPF6R63Ue3RhifLCg*) (signedBlock)))), (&cache)); valRes = T29_.Field0; blockErr = T29_.Field1;
// [...]
#line 242 "/home/beta/Programming/Status/nim-beacon-chain/beacon_chain/block_pools/clearance.nim"
addResolvedBlock__kpy9cmxed3iYehH784xNCUA(dag, quarantine, (&(*(*dag)).clearanceState), (&(*((tyObject_TrustedSignedBeaconBlock__hdN6xn3RN9aAfpVMBC35UEw*) (signedBlock)))), parent, (&cache), onBlockAdded);
#2255 was needed to unbreak unstable. I think all concerns are addressed for this PR. |
@@ -304,12 +304,16 @@ func toBeaconBlockSummary(v: SomeBeaconBlock): BeaconBlockSummary = | |||
parent_root: v.parent_root, | |||
) | |||
|
|||
# TODO: we should only store TrustedSignedBeaconBlock in the DB. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with the new asTrusted
where memory layout is the same for all block kinds, this should actually be a simple change
…rypto and state transition
… br_hmac_drbg_context in stdlib_assertions.nim.c)
…e pubkey validations is now done first in signatures_batch)
681a0a8
to
2378ccc
Compare
The PR has been rebased and should be good to merge. Following #2259 I modified signatures_batch.nim. Previously since signatures and pubkeys were verified at deserialization it could assume that the keys were correct and avoid |
* bump nim-blscurve * Outline the block validation flow * introduce the SigVerified types, pass the tests * Split clearance/quarantine to prepare for batch crypto verif * Add a batch signature collector * Make clearance use SigVerified block and split verification between crypto and state transition * Always use signedBeaconBlock for the onBlockAdded callback * RANDAO signing_root is the epoch instead of the full block * Support skipping BLS for testing * Fix compilation of the validator client * Try to fix strange errors MacOS and Jenkins (Clang, unknown type name br_hmac_drbg_context in stdlib_assertions.nim.c) * address #2250 (comment) * address #2250 (comment) * onBlockAdded callback should use TrustedSignedBeaconBlock #2250 (comment) * address #2250 (comment) * Use the application RNG: #2250 (comment) * Improve codegen of conversion zero-cost) * Quick fixes with loadWithCache after #2259 (TODO: graceful error since pubkey validations is now done first in signatures_batch) * Graceful handle rogue pubkeys and signatures now that those are lazy-loaded
Overview
This reworks the block validation flow to introduce
SigVerified
blocks besides the existing blocks andtrusted
blocks.This mention but does not introduce
TransitionVerified
blocks as without generics or static enum that would lead to lots of duplication.The flow matrix is the following
Notice that at the moment
TrustedAttestation
is imprecise and actually meansSigVerifiedAttestation
.The current code checks cryptography then state meaning
SigVerifiedAttestation
is a state that only exists between those lines of code:nimbus-eth2/beacon_chain/block_pools/clearance.nim
Lines 199 to 205 in ea8fe25
Implementation
This reimplements the branch https://github.com/status-im/nimbus-eth2/tree/block-validation-flow which is stuck due to issues mentioned in #2219. In particular, besides workarounds needed to avoid compiler ICE, the branch is stuck trying to deal with generic types in macro nim-lang/RFCs#44
Switching to multithreading
Switching to multithreading requires changing this line
nimbus-eth2/beacon_chain/spec/signatures_batch.nim
Lines 285 to 286 in ea8fe25
to
batchVerify
and compile with-d:openmp
.