-
Notifications
You must be signed in to change notification settings - Fork 189
[Execution] Reload unexecuted blocks to execution queues on startup #73
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
engine/execution/ingestion/engine.go
Outdated
} | ||
|
||
finalizedHeight := header.Height | ||
futureHeight := uint64(8655590) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can't be hardcoded, since we'll eventually run into this number
…/flow-go into leo/4920-reload-unexecuted-blocks
|
||
// starting from the first unexecuted block, go through each unexecuted and finalized block | ||
// reload its block to execution queues | ||
for height := firstUnexecuted; height <= final.Height; height++ { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We used to fix by starting from the last executed block, which is 1 height lower than the first unexecuted block.
The reason we needed that fix is that the write operations to update the executed height might be interrupted by a restart, which causes the highest executed height to be inaccurate. Inaccurate means, the
This new approach no longer depend on the highest executed height. Instead, it goes through each finalized block and pending block, and check if the block has actually been executed before loading them the execution queue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this solve the problem of moving finalized height? (finalized block increased while we where running older blocks)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point.
I think the finalized height is moving because the follower engine started processing blocks, which triggers BlockProcessable events during the reloading.
I made a fix to lock the queue during the reloading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch 👍
@@ -241,8 +241,9 @@ func (c *Core) prune(final *flow.Header) { | |||
} | |||
} | |||
|
|||
prunedHeights := len(c.heights) - initialHeights | |||
prunedBlockIDs := len(c.blockIDs) - initialBlockIDs | |||
prunedHeights := initialHeights - len(c.heights) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saw the log printing the prunedHeights was negative, and realized the math here needs to reversed.
@@ -50,6 +51,24 @@ type ReadOnlyExecutionState interface { | |||
GetCollection(identifier flow.Identifier) (*flow.Collection, error) | |||
} | |||
|
|||
// IsBlockExecuted returns whether the block has been executed. | |||
// it checks whether the statecommitment exists in execution state. | |||
func IsBlockExecuted(ctx context.Context, state ReadOnlyExecutionState, block flow.Identifier) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reusable function builds on top of abstractions.
|
||
// ES is a mocked version of execution state that | ||
// simulates some of its behavior for testing purpose | ||
type ES struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be something a bit more descriptive
return commit, nil | ||
} | ||
|
||
func ExecuteBlock(t *testing.T, es *ES, block *flow.Block) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason this isn't just attached to the ES
struct?
// If you are testing a module that depends on protocol state's | ||
// behavior, but you don't want to mock up the methods and its return | ||
// value, then just use this module | ||
type PS struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you decide to change the ES
struct, would suggest changing this as well
"github.com/onflow/flow-go/storage" | ||
) | ||
|
||
// PS is a mocked version of protocol state, which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little bit hesitant on having a mocked Protocol State, but it does make the unit tests much more isolated, which is nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't implement this mock, we will be using another mock, which more or less have to re-implement a partial behaviors, which can't be reused.
But now, this mock can be reused and extended.
Testing engines' unittests that depend on protocol state could just use this mocked version
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, I really like the structure you have implemented Leo. Very clean 👏
I have some concerns regarding:
- information consistency during the reloading of the block (multiple calls to
e.state.Final()
which might return different values). - handling of spork's root block seems too much tailored to a genesis block with height 0 and parent block
flow.Zero
. Both conditions don't hold for a general root block of a spork.
respective comments are marked with a
@@ -50,6 +51,24 @@ type ReadOnlyExecutionState interface { | |||
GetCollection(identifier flow.Identifier) (*flow.Collection, error) | |||
} | |||
|
|||
// IsBlockExecuted returns whether the block has been executed. | |||
// it checks whether the statecommitment exists in execution state. | |||
func IsBlockExecuted(ctx context.Context, state ReadOnlyExecutionState, block flow.Identifier) (bool, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason why we don't attach this method to ReadOnlyExecutionState
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I thought about that.
If IsBlockExecuted
was a method of ReadOnlyExecutionState
, then each implementation would have to provide the implementation.
However, IsBlockExecuted
is just a function build on top of StateCommitmentByBlockID
, so if you have implemented StateCommitmentByBlockID
, then you should get IsBlockExecuted
for free.
} | ||
|
||
func (es *ExecutionState) StateCommitmentByBlockID(ctx context.Context, blockID flow.Identifier) (flow.StateCommitment, error) { | ||
commit, ok := es.commits[blockID] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are you using this in a concurrent setting? If so, es.commits
should be protected by a mutex. Just asking because in PersistStateCommitment
you explicitly lock es.commits
} | ||
|
||
func (es *ExecutionState) ExecuteBlock(t *testing.T, block *flow.Block) { | ||
_, ok := es.commits[block.Header.ParentID] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same here: we might want to protect es.commits
bu a mutex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think It's necessary, since only concurrent write will need protection.
There is no write here, only one read from the commits
.
Also see this experiment, where it's able to read the value without a lock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Go Memory Model states:
Note that a read r may observe the value written by a write w that happens concurrently with r. Even if this occurs, it does not imply that reads happening after r will observe writes that happened before w.
As far as I understand statement, you are claiming that a read would be able to observe the latest concurrent read even without a lock. This is not true. You may be observing some writes, but there is no guarantee which ones and when without a mutex.
Co-authored-by: Alexander Hentschel <[email protected]>
Co-authored-by: Alexander Hentschel <[email protected]>
Co-authored-by: Alexander Hentschel <[email protected]>
…/flow-go into leo/4920-reload-unexecuted-blocks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the revisions Leo 👍
50c7b8e
to
c5fb677
Compare
Fix #4920
This PR fixes the issue that on startup, all the unexecuted blocks need to be reloaded into execution queue, and fetch collection for them for execution. Otherwise, the execution might be halt.