Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvserver,rac2: initialize RaftNode early #129727

Merged
merged 2 commits into from
Sep 9, 2024
Merged

Conversation

pav-kv
Copy link
Collaborator

@pav-kv pav-kv commented Aug 27, 2024

This PR moves the RaftNode interface initialization to the point when raft.RawNode is created. This will be needed for correct initialization of LogTracker from the initial raft state.

The RaftNode interface is moved to replica_rac2 integration package, to reduce cluttering of kvserver.

Part of #129508

@pav-kv pav-kv requested review from a team as code owners August 27, 2024 16:29
Copy link

blathers-crl bot commented Aug 27, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@pav-kv pav-kv force-pushed the init-raft branch 5 times, most recently from 57a3fb4 to ec4381a Compare August 27, 2024 17:42
@pav-kv
Copy link
Collaborator Author

pav-kv commented Aug 27, 2024

@sumeerbhola This is ready for review.

Since the RawNode creation and setting the replica descriptor are scattered in code/time (notably, during range splits), we better need to initialize the RaftNode early to be sure that stable index tracking state is inited with the RawNode and misses no updates.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine but I am curious whether this was necessary. We currently initialized RaftNode on the first call to OnDescChangedLocked since that happens when the replica is initialized. Before that, there should be no processing of MsgStorageAppends, whether as a leader or a replica.

Initialization or state changes via calling into Processor are more prone to integration bugs where someone forgets to call something if another such place is added in kvserver that performs a state transition, which is why I've been trying to minimize them. They are currently limited to OnDestroyRaftMuLocked/OnDescChangedLocked and for the v1=>v2 transition SetEnabledWhenLeaderRaftMuLocked. A pull from Processor has some risk if it is called too early, but it isn't subject to such integration wiring bugs.

For this change, I suppose we can assume that kvserver code is stable enough that new places won't be added where r.mu.internalRaftGroup is set.

Reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @pav-kv)


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor.go line 484 at r1 (raw file):

func (p *processorImpl) InitRaftLocked(rn RaftNode) {
	p.raftMu.raftNode = rn
	// TODO(pav-kv): initialize the stable log mark tracking from RawNode state.

I am curious what this TODO is about.
The admittedTracker stable log mark doesn't need to be initialized here. It can wait until the callback from a MsgAppResp or in a HandleRaftReadyRaftMuLocked.


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor.go line 557 at r1 (raw file):

		panic(errors.AssertionFailedf("tenantId was changed from %s to %s",
			p.raftMu.tenantID, tenantID))
	}

can you add an assertion that raftNode is already initialized.


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor_test.go line 305 at r1 (raw file):

			p.opts.NodeID, p.opts.StoreID, p.opts.RangeID, p.opts.ReplicaID, tenantID,
			enabledLevelString(p.mu.enabledWhenLeader))
		p.InitRaftLocked(r.raftNode)

there are existing test cases that call handleRaftReady before initializing RawNode and ensure that the processorImpl is correctly doing nothing instead of causing a null pointer exception. Those will no longer get exercised by eagerly calling InitRaftLocked. Can you make this an explicit command, or roll this into the first "on-desc-locked".

Copy link
Collaborator Author

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree, this might be unnecessary. It would be good to confirm that "uninitialized replica" (one with RawNode but no descriptor) won't write to storage, and if that's the case we're good. Putting this on hold for now. Needed this to initialize LogTracker from the other PR, but it's ok to do it next to RaftNodeLocked() call too.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @pav-kv)

Epic: none
Release note: none
This commit moves the RaftNode interface initialization to the point
when raft.RawNode is created. This will be needed for correct
initialization of LogTracker from the initial raft state.

Epic: none
Release note: none
Copy link
Collaborator Author

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR now seems necessary, per context in #130171. Yeah, we can assume that the code initializing RawNode is fairly stable. Today, there is only one place where RawNode is created, and I don't expect it to get out hand.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor.go line 484 at r1 (raw file):

Previously, sumeerbhola wrote…

I am curious what this TODO is about.
The admittedTracker stable log mark doesn't need to be initialized here. It can wait until the callback from a MsgAppResp or in a HandleRaftReadyRaftMuLocked.

This will be the place for logTracker.init call. Motivation is explained in #130171 (comment): for correct stable index tracking we don't want LogTracker initialization racing with RawNode.Ready() handling.


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor.go line 557 at r1 (raw file):

Previously, sumeerbhola wrote…

can you add an assertion that raftNode is already initialized.

Done.


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor_test.go line 305 at r1 (raw file):

Previously, sumeerbhola wrote…

there are existing test cases that call handleRaftReady before initializing RawNode and ensure that the processorImpl is correctly doing nothing instead of causing a null pointer exception. Those will no longer get exercised by eagerly calling InitRaftLocked. Can you make this an explicit command, or roll this into the first "on-desc-locked".

Done.

Copy link
Collaborator Author

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor_test.go line 323 at r3 (raw file):

				return builderStr()

			case "init-raft":

Could also move this to set-raft-state, and init raft node if it's nil.

Copy link
Collaborator

@sumeerbhola sumeerbhola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 8 files at r2, 5 of 5 files at r3, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @pav-kv)


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor.go line 301 at r3 (raw file):

type Processor interface {
	// InitRaftLocked is called when RaftNode is initialized for the Replica.
	// NB: can be called twice before the Replica is fully initialized.

And these two RawNodes (uninitialized replica, and later when initialized) are consistent with each other?

Just curious -- why did we need to recreate the RawNode? Is it because we don't have RaftAppliedIndex until the Replica is initialized -- I would have thought snapshot application (which Raft knows about) would have fixed that part of its state.


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor_test.go line 323 at r3 (raw file):

Previously, pav-kv (Pavel Kalinnikov) wrote…

Could also move this to set-raft-state, and init raft node if it's nil.

this is fine

Copy link
Collaborator Author

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @sumeerbhola)


pkg/kv/kvserver/kvflowcontrol/replica_rac2/processor.go line 301 at r3 (raw file):
I think this has to do with the fact that uninitialized replica needs to be able to participate in elections, before it gets the initial snapshot/RaftAppliedIndex and initialized.

And these two RawNodes (uninitialized replica, and later when initialized) are consistent with each other?

I think they are, at least RawNode always assumes that it starts from a durable/consistent state. Arguably it would be better to init RawNode exactly once, and move it forward to the new state when the replica is fully inited - should consider doing that in the future. The last related PR is #104657.

@pav-kv
Copy link
Collaborator Author

pav-kv commented Sep 9, 2024

bors r=sumeerbhola

@craig craig bot merged commit 60313dc into cockroachdb:master Sep 9, 2024
23 checks passed
@pav-kv pav-kv deleted the init-raft branch September 9, 2024 15:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants