[Execution] Reload unexecuted blocks to execution queues on startup #73

zhangchiqing · 2020-10-15T04:11:06Z

This PR fixes an issue on startup: all unexecuted blocks need to be reloaded into the execution queues, and their collections need to be fetched, so that they can be executed. Otherwise, execution might halt.

Kay-Zee · 2020-10-25T19:19:18Z

engine/execution/ingestion/engine.go

+	}
+
+	finalizedHeight := header.Height
+	futureHeight := uint64(8655590)


This can't be hardcoded, since we'll eventually run into this number

zhangchiqing · 2020-10-26T21:53:36Z

engine/execution/ingestion/engine.go

+
+	// starting from the first unexecuted block, go through each unexecuted and finalized block
+	// reload its block to execution queues
+	for height := firstUnexecuted; height <= final.Height; height++ {


We used to fix this by starting from the last executed block, which is one height below the first unexecuted block.

The reason we needed that fix is that the write operations to update the executed height might be interrupted by a restart, which causes the highest executed height to be inaccurate. Inaccurate means, the

This new approach no longer depends on the highest executed height. Instead, it goes through each finalized block and each pending block, and checks whether the block has actually been executed before loading it into the execution queues.

Does this solve the problem of a moving finalized height (the finalized block increasing while we were running older blocks)?

Good point.

I think the finalized height is moving because the follower engine started processing blocks, which triggers BlockProcessable events during the reloading.

I made a fix to lock the queue during the reloading.

Good catch 👍
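To make the locking fix concrete, here is a minimal sketch of the idea discussed in this thread. The `engine` type, its fields, and both method names are hypothetical stand-ins, not the actual ingestion engine: the reload loop holds a mutex for its whole duration, so a concurrent block-processable handler cannot touch the queue until reloading is done.

```go
package main

import "sync"

// engine is a hypothetical, heavily simplified stand-in for the
// ingestion engine; heights stand in for blocks.
type engine struct {
	mu       sync.Mutex
	executed map[uint64]bool // heights that have already been executed
	queue    []uint64        // heights waiting for execution
}

// reload enqueues every height in [first, final] that has not been
// executed. The lock is held for the whole loop, so handlers that run
// concurrently must wait until reloading has finished.
func (e *engine) reload(first, final uint64) {
	e.mu.Lock()
	defer e.mu.Unlock()
	for h := first; h <= final; h++ {
		if e.executed[h] {
			continue // already executed, nothing to reload
		}
		e.queue = append(e.queue, h)
	}
}

// onBlockProcessable is what a concurrently running follower would
// call; it takes the same lock and therefore cannot interleave with
// reload.
func (e *engine) onBlockProcessable(height uint64) {
	e.mu.Lock()
	defer e.mu.Unlock()
	e.queue = append(e.queue, height)
}
```

With heights 5 and 6 marked as executed, `reload(5, 9)` enqueues 7, 8 and 9, and a later `onBlockProcessable(10)` appends 10 after them.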

zhangchiqing · 2020-10-26T21:54:22Z

module/synchronization/core.go

@@ -241,8 +241,9 @@ func (c *Core) prune(final *flow.Header) {
 		}
 	}

-	prunedHeights := len(c.heights) - initialHeights
-	prunedBlockIDs := len(c.blockIDs) - initialBlockIDs
+	prunedHeights := initialHeights - len(c.heights)


I saw that the log printed a negative value for prunedHeights, and realized the math here needs to be reversed.
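The sign issue can be illustrated in isolation. The sketch below uses a hypothetical `pruneBelow` helper (not the actual `prune` method): since pruning only removes entries, the count of pruned items is the initial size minus the final size.

```go
package main

// pruneBelow is a hypothetical helper: it deletes every height below
// the cutoff and returns how many entries were removed.
func pruneBelow(heights map[uint64]struct{}, cutoff uint64) int {
	initial := len(heights)
	for h := range heights {
		if h < cutoff {
			delete(heights, h) // deleting during range is safe in Go
		}
	}
	// The map can only shrink here, so initial - len(heights) is never
	// negative. The reversed subtraction would yield a value <= 0.
	return initial - len(heights)
}
```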

zhangchiqing · 2020-10-26T21:54:55Z

engine/execution/state/state.go

@@ -50,6 +51,24 @@ type ReadOnlyExecutionState interface {
 	GetCollection(identifier flow.Identifier) (*flow.Collection, error)
 }

+// IsBlockExecuted returns whether the block has been executed.
+// it checks whether the statecommitment exists in execution state.
+func IsBlockExecuted(ctx context.Context, state ReadOnlyExecutionState, block flow.Identifier) (bool, error) {


A reusable function built on top of abstractions.

Kay-Zee · 2020-10-29T23:03:15Z

utils/unittest/mocks/execution_state.go

+
+// ES is a mocked version of execution state that
+// simulates some of its behavior for testing purpose
+type ES struct {


Could be something a bit more descriptive

Kay-Zee · 2020-10-29T23:03:46Z

utils/unittest/mocks/execution_state.go

+	return commit, nil
+}
+
+func ExecuteBlock(t *testing.T, es *ES, block *flow.Block) {


Any reason this isn't just attached to the ES struct?

Kay-Zee · 2020-10-29T23:04:20Z

utils/unittest/mocks/protocol_state.go

+// If you are testing a module that depends on protocol state's
+// behavior, but you don't want to mock up the methods and its return
+// value, then just use this module
+type PS struct {


if you decide to change the ES struct, would suggest changing this as well

Kay-Zee · 2020-10-29T23:05:21Z

utils/unittest/mocks/protocol_state.go

+	"github.com/onflow/flow-go/storage"
+)
+
+// PS is a mocked version of protocol state, which


A little bit hesitant on having a mocked Protocol State, but it does make the unit tests much more isolated, which is nice.

If we don't implement this mock, we would have to use another mock that more or less re-implements part of this behavior and can't be reused.

But now, this mock can be reused and extended.

Unit tests for engines that depend on protocol state can just use this mocked version.

AlexHentschel

Generally, I really like the structure you have implemented Leo. Very clean 👏

I have some concerns regarding:

Information consistency while reloading blocks: multiple calls to e.state.Final() might return different values.
Handling of the spork's root block seems too tailored to a genesis block with height 0 and parent block flow.Zero. Neither condition holds for a general root block of a spork.

The respective comments are marked with a ⚠️ sign.
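Both concerns can be sketched together under simplifying assumptions. All names below (`state`, `snapshot`, `pin`, `heightsAfterRoot`) are hypothetical, and the root block's own state is assumed to be given, so iteration starts one above the root height rather than at 0. The point is only that the finalized height is read once and then reused.

```go
package main

// state is a hypothetical stand-in for the protocol state. Final
// simulates a moving finalized height: every call returns a larger
// value, as if the follower kept finalizing blocks.
type state struct {
	root  uint64
	final uint64
}

func (s *state) Root() uint64 { return s.root }

func (s *state) Final() uint64 {
	s.final++
	return s.final
}

// snapshot pins the values read from the state at one point in time.
type snapshot struct {
	Root  uint64
	Final uint64
}

// pin reads Root and Final exactly once.
func pin(s *state) snapshot {
	return snapshot{Root: s.Root(), Final: s.Final()}
}

// heightsAfterRoot lists the heights to inspect, starting just above
// the root height (which need not be 0) and ending at the pinned
// finalized height.
func heightsAfterRoot(snap snapshot) []uint64 {
	var hs []uint64
	for h := snap.Root + 1; h <= snap.Final; h++ {
		hs = append(hs, h)
	}
	return hs
}
```

Because the loop bounds come from the pinned `snapshot`, later calls to `Final()` no longer affect which heights are reloaded.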

engine/execution/ingestion/engine.go

engine/execution/state/state.go

engine/execution/ingestion/engine.go

AlexHentschel · 2020-10-30T05:01:55Z

engine/execution/state/state.go

@@ -50,6 +51,24 @@ type ReadOnlyExecutionState interface {
 	GetCollection(identifier flow.Identifier) (*flow.Collection, error)
 }

+// IsBlockExecuted returns whether the block has been executed.
+// it checks whether the statecommitment exists in execution state.
+func IsBlockExecuted(ctx context.Context, state ReadOnlyExecutionState, block flow.Identifier) (bool, error) {


Is there a reason why we don't attach this method to ReadOnlyExecutionState?

Yes, I thought about that.

If IsBlockExecuted was a method of ReadOnlyExecutionState, then each implementation would have to provide the implementation.

However, IsBlockExecuted is just a function built on top of StateCommitmentByBlockID, so if you have implemented StateCommitmentByBlockID, then you get IsBlockExecuted for free.
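The design choice can be sketched as follows. This is a simplified illustration, not the code from the PR: the `commitReader` interface, the `errNotFound` sentinel, the string block IDs, and the omitted `context.Context` parameter are all assumptions made to keep the example self-contained.

```go
package main

import "errors"

// errNotFound is a hypothetical sentinel standing in for the storage
// layer's "not found" error.
var errNotFound = errors.New("not found")

// commitReader is a hypothetical, trimmed-down stand-in for
// ReadOnlyExecutionState: only the one method the helper needs.
type commitReader interface {
	StateCommitmentByBlockID(blockID string) ([]byte, error)
}

// isBlockExecuted reports whether a state commitment exists for the
// block. It is a free function, so every commitReader gets it without
// implementing anything extra.
func isBlockExecuted(state commitReader, blockID string) (bool, error) {
	_, err := state.StateCommitmentByBlockID(blockID)
	if err == nil {
		return true, nil
	}
	if errors.Is(err, errNotFound) {
		return false, nil // no commitment means the block was not executed
	}
	return false, err // any other error is passed on to the caller
}

// mapState is a toy implementation used to exercise the helper.
type mapState map[string][]byte

func (m mapState) StateCommitmentByBlockID(blockID string) ([]byte, error) {
	commit, ok := m[blockID]
	if !ok {
		return nil, errNotFound
	}
	return commit, nil
}
```

Any other type with a `StateCommitmentByBlockID` method, such as a test mock, can be passed to `isBlockExecuted` unchanged.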

AlexHentschel · 2020-10-30T05:19:18Z

utils/unittest/mocks/execution_state.go

+}
+
+func (es *ExecutionState) StateCommitmentByBlockID(ctx context.Context, blockID flow.Identifier) (flow.StateCommitment, error) {
+	commit, ok := es.commits[blockID]


are you using this in a concurrent setting? If so, es.commits should be protected by a mutex. Just asking because in PersistStateCommitment you explicitly lock es.commits

AlexHentschel · 2020-10-30T05:19:47Z

utils/unittest/mocks/execution_state.go

+}
+
+func (es *ExecutionState) ExecuteBlock(t *testing.T, block *flow.Block) {
+	_, ok := es.commits[block.Header.ParentID]


Same here: we might want to protect es.commits by a mutex.

I don't think it's necessary, since only concurrent writes need protection.

There is no write here, only one read from the commits.

Also see this experiment, where it's able to read the value without a lock

⚠️
The Go Memory Model states:

Note that a read r may observe the value written by a write w that happens concurrently with r. Even if this occurs, it does not imply that reads happening after r will observe writes that happened before w.

As far as I understand your statement, you are claiming that a read would be able to observe the latest concurrent write even without a lock. This is not true. You may observe some writes, but without a mutex there is no guarantee which ones or when.
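A minimal sketch of what protecting the reads could look like, assuming a simplified mock with string block IDs. The `commitStore` type and its method names are hypothetical and not the mock from this PR. Writers take the exclusive lock, readers take the shared lock, so a read that starts after a write has finished is guaranteed to observe it.

```go
package main

import "sync"

// commitStore is a hypothetical, simplified version of a mocked
// execution state that maps block IDs to state commitments.
type commitStore struct {
	mu      sync.RWMutex
	commits map[string][]byte
}

func newCommitStore() *commitStore {
	return &commitStore{commits: make(map[string][]byte)}
}

// Persist stores a commitment under the exclusive lock.
func (s *commitStore) Persist(blockID string, commit []byte) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.commits[blockID] = commit
}

// ByBlockID reads a commitment under the shared lock, so it never
// races with a concurrent Persist.
func (s *commitStore) ByBlockID(blockID string) ([]byte, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	commit, ok := s.commits[blockID]
	return commit, ok
}

// persistAll writes one commitment per block ID, each from its own
// goroutine, and waits for all writers to finish.
func persistAll(s *commitStore, blockIDs []string) {
	var wg sync.WaitGroup
	for _, id := range blockIDs {
		wg.Add(1)
		go func(id string) {
			defer wg.Done()
			s.Persist(id, []byte(id))
		}(id)
	}
	wg.Wait()
}
```

Without the read lock, a concurrent map read and write is a data race in Go, which the race detector (`go test -race`) would typically flag.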

AlexHentschel

Thanks for the revisions Leo 👍

zhangchiqing added 6 commits October 14, 2020 19:29

log receiving collection

c28e513

reload all finalized and unexecuted blocks

3529525

print block height

6539803

update metrics

1d6ba0f

show block height in extend call

d322a7c

log missing parent

c18760d

zhangchiqing requested review from m4ksio and psiemens as code owners October 15, 2020 04:11

zhangchiqing added 7 commits October 14, 2020 21:17

linting

f159cd1

fix logging

1b43735

log last finalized

6f6d976

adjust log level

bdd0485

force executing on startup

d86c30e

pull until a future height

e50f7e5

fix linting

0ce8341

Kay-Zee requested changes Oct 25, 2020

zhangchiqing and others added 6 commits October 25, 2020 12:32

Update engine/execution/ingestion/engine.go

67ad9fd

Co-authored-by: Kan Zhang <[email protected]>

add comments

e8f48a6

deduplicate queue

cfa26e3

refactor block reloading

b50d86d

Merge branch 'master' into leo/4920-reload-unexecuted-blocks

4cb1354

fix IsBlockExecuted

c604e5b

zhangchiqing requested review from Kay-Zee and removed request for psiemens October 26, 2020 04:23

zhangchiqing added 3 commits October 25, 2020 21:24

Merge branch 'leo/4920-reload-unexecuted-blocks' of github.com:onflow…

34c646d

…/flow-go into leo/4920-reload-unexecuted-blocks

exclude the last executed block from the execution queues

7a523f3

update comment

b79dc3a

zhangchiqing changed the title ~~Leo/4920 reload unexecuted blocks~~ [Execution] Reload unexecuted blocks to execution queues on startup Oct 26, 2020

zhangchiqing commented Oct 26, 2020

log queue head, and refactor with IsBlockExecuted

9f044b1

zhangchiqing added 3 commits October 29, 2020 15:35

fix lint

210e56d

fix lint

e9eb8b7

reload the last executed block instead of the last finalized executed…

bc9910a

… block

Kay-Zee approved these changes Oct 29, 2020

zhangchiqing added 4 commits October 29, 2020 16:09

Merge branch 'master' into leo/4920-reload-unexecuted-blocks

b974edc

refactor mocks

70917c7

fix lint

114a8e8

update comment

905a973

zhangchiqing requested a review from ramtinms October 30, 2020 00:10

fix logging

b6a9788

ramtinms approved these changes Oct 30, 2020

AlexHentschel requested changes Oct 30, 2020

AlexHentschel reviewed Oct 30, 2020

zhangchiqing and others added 5 commits October 29, 2020 23:27

Apply suggestions from code review

12c327a

Co-authored-by: Alexander Hentschel <[email protected]>

Update state/protocol/badger/mutator.go

a26a683

Co-authored-by: Alexander Hentschel <[email protected]>

Update engine/execution/state/state.go

6eea192

Co-authored-by: Alexander Hentschel <[email protected]>

pin the snapshot for reloading blocks

70fdcc2

Merge branch 'master' into leo/4920-reload-unexecuted-blocks

cfa0cdc

zhangchiqing requested a review from AlexHentschel October 30, 2020 06:43

zhangchiqing added 5 commits October 30, 2020 14:02

Merge branch 'master' into leo/4920-reload-unexecuted-blocks

12426d6

Merge branch 'master' into leo/4920-reload-unexecuted-blocks

f3c6bb2

fix tests

a08e985

Merge branch 'leo/4920-reload-unexecuted-blocks' of github.com:onflow…

5754117

…/flow-go into leo/4920-reload-unexecuted-blocks

add additional logging

c0f837b

AlexHentschel approved these changes Nov 2, 2020

View reviewed changes

locking the read of statecommitment in execution state mock

c5fb677

zhangchiqing force-pushed the leo/4920-reload-unexecuted-blocks branch from 50c7b8e to c5fb677 Compare November 2, 2020 21:42

Merge branch 'master' into leo/4920-reload-unexecuted-blocks

02ac631

zhangchiqing merged commit 8ef6e57 into master Nov 2, 2020

zhangchiqing deleted the leo/4920-reload-unexecuted-blocks branch November 2, 2020 22:03
