Skip to content

[Execution] Reload unexecuted blocks to execution queues on startup #73

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 53 commits into from
Nov 2, 2020

Conversation

zhangchiqing
Copy link
Member

@zhangchiqing zhangchiqing commented Oct 15, 2020

Fix #4920

This PR fixes the issue that on startup, all the unexecuted blocks need to be reloaded into execution queue, and fetch collection for them for execution. Otherwise, the execution might be halt.

}

finalizedHeight := header.Height
futureHeight := uint64(8655590)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can't be hardcoded, since we'll eventually run into this number

@zhangchiqing zhangchiqing requested review from Kay-Zee and removed request for psiemens October 26, 2020 04:23
@zhangchiqing zhangchiqing changed the title Leo/4920 reload unexecuted blocks [Execution] Reload unexecuted blocks to execution queues on startup Oct 26, 2020

// starting from the first unexecuted block, go through each unexecuted and finalized block
// reload its block to execution queues
for height := firstUnexecuted; height <= final.Height; height++ {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to fix by starting from the last executed block, which is 1 height lower than the first unexecuted block.

The reason we needed that fix is that the write operations to update the executed height might be interrupted by a restart, which causes the highest executed height to be inaccurate. Inaccurate means, the

This new approach no longer depend on the highest executed height. Instead, it goes through each finalized block and pending block, and check if the block has actually been executed before loading them the execution queue.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this solve the problem of moving finalized height? (finalized block increased while we where running older blocks)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point.

I think the finalized height is moving because the follower engine started processing blocks, which triggers BlockProcessable events during the reloading.

I made a fix to lock the queue during the reloading

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch 👍

@@ -241,8 +241,9 @@ func (c *Core) prune(final *flow.Header) {
}
}

prunedHeights := len(c.heights) - initialHeights
prunedBlockIDs := len(c.blockIDs) - initialBlockIDs
prunedHeights := initialHeights - len(c.heights)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw the log printing the prunedHeights was negative, and realized the math here needs to reversed.

@@ -50,6 +51,24 @@ type ReadOnlyExecutionState interface {
GetCollection(identifier flow.Identifier) (*flow.Collection, error)
}

// IsBlockExecuted returns whether the block has been executed.
// it checks whether the statecommitment exists in execution state.
func IsBlockExecuted(ctx context.Context, state ReadOnlyExecutionState, block flow.Identifier) (bool, error) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reusable function builds on top of abstractions.


// ES is a mocked version of execution state that
// simulates some of its behavior for testing purpose
type ES struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be something a bit more descriptive

return commit, nil
}

func ExecuteBlock(t *testing.T, es *ES, block *flow.Block) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason this isn't just attached to the ES struct?

// If you are testing a module that depends on protocol state's
// behavior, but you don't want to mock up the methods and its return
// value, then just use this module
type PS struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if you decide to change the ES struct, would suggest changing this as well

"github.com/onflow/flow-go/storage"
)

// PS is a mocked version of protocol state, which
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little bit hesitant on having a mocked Protocol State, but it does make the unit tests much more isolated, which is nice.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we don't implement this mock, we will be using another mock, which more or less have to re-implement a partial behaviors, which can't be reused.

But now, this mock can be reused and extended.

Testing engines' unittests that depend on protocol state could just use this mocked version

@zhangchiqing zhangchiqing requested a review from ramtinms October 30, 2020 00:10
Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, I really like the structure you have implemented Leo. Very clean 👏

I have some concerns regarding:

  • information consistency during the reloading of the block (multiple calls to e.state.Final() which might return different values).
  • handling of spork's root block seems too much tailored to a genesis block with height 0 and parent block flow.Zero. Both conditions don't hold for a general root block of a spork.

respective comments are marked with a ⚠️ sign

@@ -50,6 +51,24 @@ type ReadOnlyExecutionState interface {
GetCollection(identifier flow.Identifier) (*flow.Collection, error)
}

// IsBlockExecuted returns whether the block has been executed.
// it checks whether the statecommitment exists in execution state.
func IsBlockExecuted(ctx context.Context, state ReadOnlyExecutionState, block flow.Identifier) (bool, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why we don't attach this method to ReadOnlyExecutionState?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I thought about that.

If IsBlockExecuted was a method of ReadOnlyExecutionState, then each implementation would have to provide the implementation.

However, IsBlockExecuted is just a function build on top of StateCommitmentByBlockID, so if you have implemented StateCommitmentByBlockID, then you should get IsBlockExecuted for free.

}

func (es *ExecutionState) StateCommitmentByBlockID(ctx context.Context, blockID flow.Identifier) (flow.StateCommitment, error) {
commit, ok := es.commits[blockID]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you using this in a concurrent setting? If so, es.commits should be protected by a mutex. Just asking because in PersistStateCommitment you explicitly lock es.commits

}

func (es *ExecutionState) ExecuteBlock(t *testing.T, block *flow.Block) {
_, ok := es.commits[block.Header.ParentID]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here: we might want to protect es.commits bu a mutex

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think It's necessary, since only concurrent write will need protection.

There is no write here, only one read from the commits.

Also see this experiment, where it's able to read the value without a lock

Copy link
Member

@AlexHentschel AlexHentschel Nov 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️
The Go Memory Model states:

Note that a read r may observe the value written by a write w that happens concurrently with r. Even if this occurs, it does not imply that reads happening after r will observe writes that happened before w.

As far as I understand statement, you are claiming that a read would be able to observe the latest concurrent read even without a lock. This is not true. You may be observing some writes, but there is no guarantee which ones and when without a mutex.

Copy link
Member

@AlexHentschel AlexHentschel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the revisions Leo 👍

@zhangchiqing zhangchiqing force-pushed the leo/4920-reload-unexecuted-blocks branch from 50c7b8e to c5fb677 Compare November 2, 2020 21:42
@zhangchiqing zhangchiqing merged commit 8ef6e57 into master Nov 2, 2020
@zhangchiqing zhangchiqing deleted the leo/4920-reload-unexecuted-blocks branch November 2, 2020 22:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants