Skip to content

node: Add follower node for sync mode#5009

Merged
winder merged 28 commits into
algorand:masterfrom
Eric-Warehime:sync-mode
Feb 1, 2023
Merged

node: Add follower node for sync mode#5009
winder merged 28 commits into
algorand:masterfrom
Eric-Warehime:sync-mode

Conversation

@Eric-Warehime
Copy link
Copy Markdown
Contributor

@Eric-Warehime Eric-Warehime commented Jan 12, 2023

Adds AlgorandFollowerNode which allows users to control the sync round on the catchup service and disables the agreement service.

Also adds the ledger delta APIs to the router when running and data node. This is all put behind a config flag, and a few new interfaces were required to get the APIs to work with both the data node and the full node.

Comment thread daemon/algod/api/server/router.go Outdated
@codecov
Copy link
Copy Markdown

codecov Bot commented Jan 12, 2023

Codecov Report

Merging #5009 (c0f9cf1) into master (149ebb8) will decrease coverage by 0.10%.
The diff coverage is 20.38%.

@@            Coverage Diff             @@
##           master    #5009      +/-   ##
==========================================
- Coverage   53.62%   53.53%   -0.10%     
==========================================
  Files         432      433       +1     
  Lines       54057    54277     +220     
==========================================
+ Hits        28990    29055      +65     
- Misses      22821    22969     +148     
- Partials     2246     2253       +7     
Impacted Files Coverage Δ
config/localTemplate.go 43.75% <ø> (ø)
daemon/algod/api/server/router.go 12.19% <0.00%> (-0.97%) ⬇️
daemon/algod/api/server/v2/handlers.go 0.82% <0.00%> (-0.01%) ⬇️
daemon/algod/server.go 4.39% <0.00%> (-0.23%) ⬇️
libgoal/libgoal.go 2.53% <0.00%> (-0.06%) ⬇️
node/error.go 0.00% <0.00%> (ø)
node/node.go 4.29% <0.00%> (+0.01%) ⬆️
node/follower_node.go 25.69% <25.69%> (ø)
catchup/service.go 68.94% <63.63%> (-0.39%) ⬇️
cmd/tealdbg/debugger.go 72.69% <0.00%> (-0.81%) ⬇️
... and 6 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Eric-Warehime Eric-Warehime marked this pull request as ready for review January 13, 2023 01:26
@Eric-Warehime Eric-Warehime requested a review from winder January 13, 2023 17:05
@winder winder requested review from cce and shiqizng January 17, 2023 16:54
Comment thread config/localTemplate.go Outdated
// NodeSyncMode launches the node in "sync" or "data" mode. This turns off agreement service,
// and APIs related to broadcasting transactions, and enables APIs which retrieve detailed information
// from ledger caches.
NodeSyncMode bool `version[26]:"false"`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be 27?

Copy link
Copy Markdown
Contributor

@cce cce Jan 18, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, we have PRs open like #5018 collecting v27 changes for the next release currently, v26 already went out in the last one, but if this doesn't make v27 it will end up in v28

Comment thread node/data_node.go Outdated
Comment thread daemon/algod/server.go Outdated
Comment thread node/data_node.go Outdated
Copy link
Copy Markdown
Contributor

@winder winder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code duplication is a little disappointing. I know the earlier PRs had some more advanced changes to avoid it, but maybe there is some middle ground?

The changes here are quite straightforward, great complexity improvement over the previous work!

Comment thread daemon/algod/server.go Outdated
Comment thread node/data_node.go Outdated
Comment thread node/data_node.go Outdated
Comment thread node/data_node.go Outdated
Comment thread daemon/algod/api/server/router.go Outdated
@cce
Copy link
Copy Markdown
Contributor

cce commented Jan 18, 2023

Have you tested running a node that just does catchup and has no agreement service? Once you get to the latest block I think you will run into an issue where you will only be checking for a new block every 17 seconds (the value of deadlineTimeout in the code linked below) because catchup is built to assume new blocks are generally always going to come from agreement. However with your sync mode and explicit start/stop stuff maybe you won't run into this?
https://github.com/algorand/go-algorand/blob/master/catchup/service.go#L565-L582

Comment thread node/data_node.go Outdated
Comment thread node/data_node.go Outdated
Comment thread node/data_node.go Outdated
Comment thread node/data_node.go Outdated
Comment thread node/data_node.go Outdated
Copy link
Copy Markdown
Contributor

@winder winder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just have some small suggestions

Comment thread config/localTemplate.go Outdated
// NodeFollowerMode launches the node in "follower" mode. This turns off the agreement service,
// and APIs related to broadcasting transactions, and enables APIs which can retrieve detailed information
// from ledger caches and can control the ledger round.
NodeFollowerMode bool `version[27]:"false"`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: EnableFollowMode would fit a little better with other settings.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kind of don't want it to be lumped in with all of the other "enable" settings just because it's also disabling certain functionality. But the other "Mode" settings are integers that serve as sort of enums whereas this is a boolean so I guess it does make sense to go in the "Enable" category just by virtue of its possible values.

Comment thread daemon/algod/api/server/v2/handlers.go Outdated
Comment on lines +1292 to +1296
sRound := v2.Node.GetSyncRound()
if sRound > 0 && uint64(cpRound) > sRound {
err = fmt.Errorf("catchpoint round %v greater than sync round %v", cpRound, sRound)
return badRequest(ctx, err, fmt.Sprintf(errFailedToStartCatchup, err), v2.Log)
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about moving this check to v2.Node.StartCatchup or SetCatchpointCatchupMode?

Co-authored-by: cce <51567+cce@users.noreply.github.com>
Eric-Warehime and others added 3 commits January 25, 2023 08:29
Co-authored-by: cce <51567+cce@users.noreply.github.com>
@Eric-Warehime Eric-Warehime requested a review from winder January 25, 2023 17:36
winder
winder previously approved these changes Jan 25, 2023
@winder winder requested review from algorandskiy and cce January 25, 2023 18:08
Copy link
Copy Markdown
Contributor

@algorandskiy algorandskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Although there is some code duplication but generatlization would require quite a lot code and not sure if it's worth the effort.
Please fix reviewdog remarks and consider adding an e2e test checking the follower can successfully start and serve requests.

Comment thread daemon/algod/api/server/router.go
Comment thread node/follower_node.go
rootDir string
genesisID string
genesisHash crypto.Digest
devMode bool // is this node operates in a developer mode ? ( benign agreement, broadcasting transaction generates a new block )
Copy link
Copy Markdown
Contributor

@cce cce Jan 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking about it now, a follower node won't really work with dev mode, right, because it has no transaction pool to assemble its own blocks? Maybe just error if it is configured?

Comment thread node/follower_node.go Outdated
node.devMode = genesis.DevMode

if node.devMode {
cfg.DisableNetworking = true
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without a transaction pool, and with networking disabled, enabling both follow mode and dev mode at the same time makes this node basically unusable right?

Comment thread node/follower_node.go

// BroadcastSignedTxGroup errors in follower mode
func (node *AlgorandFollowerNode) BroadcastSignedTxGroup(_ []transactions.SignedTxn) (err error) {
return fmt.Errorf("cannot broadcast txns in sync mode")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for now, but it would be possible to broadcast signed tx groups using gossip and skip the tx pool, if you didn't want local validation/feedback on your txn

cce
cce previously approved these changes Jan 30, 2023
Copy link
Copy Markdown
Contributor

@cce cce left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Eric-Warehime Eric-Warehime dismissed stale reviews from cce and winder via 1126399 January 30, 2023 22:23
@Eric-Warehime
Copy link
Copy Markdown
Contributor Author

@algorandskiy Added e2e test--let me know how it looks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants