
Conversation

@wenyihu6
Contributor

@wenyihu6 wenyihu6 commented Aug 7, 2025

Stacked on top of #151369.

Epic: CRDB-25222
Release note: none


mmaprototype: add MakePendingRangeChangeForTesting

This commit improves the String methods for several types in mmaprototype to
make datadriven test output more readable. It also exports a
MakePendingRangeChangeForTesting helper to facilitate constructing
pendingReplicaChanges in future datadriven tests.


mmaintegration: add mmaintegration datadriven tests

This commit adds a datadriven test for the mmaintegration package, verifying
the basic functionality of allocator sync. It tests how replica changes are
converted to mma replica changes and how the store pool is checked, using mocks
for components like the store pool and mma allocator.
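As a rough illustration of how such a datadriven test is typically wired with github.com/cockroachdb/datadriven (the command names, arguments, and handler bodies below are assumptions for illustration, not the actual test):

package mmaintegration_test

import (
    "fmt"
    "testing"

    "github.com/cockroachdb/datadriven"
)

func TestAllocatorSyncDataDriven(t *testing.T) {
    // Walk every testdata file and dispatch on the commands it contains.
    datadriven.Walk(t, "testdata/mmaintegration", func(t *testing.T, path string) {
        datadriven.RunTest(t, path, func(t *testing.T, d *datadriven.TestData) string {
            switch d.Cmd {
            case "init":
                // Hypothetical: construct the allocator sync with test fakes.
                var mmaEnabled bool
                if d.HasArg("mma_enabled") {
                    d.ScanArgs(t, "mma_enabled", &mmaEnabled)
                }
                return fmt.Sprintf("initialized mma_enabled=%t", mmaEnabled)
            case "post-apply":
                // Hypothetical: report the outcome of a previously registered change.
                var id int
                var success bool
                d.ScanArgs(t, "id", &id)
                d.ScanArgs(t, "success", &success)
                return fmt.Sprintf("applied change %d with success=%t", id, success)
            default:
                return fmt.Sprintf("unknown command: %s", d.Cmd)
            }
        })
    })
}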


mmaintegration: add store load datadriven tests

This commit adds a datadriven test for MakeStoreLoadMsg, covering a basic store
load scenario and one with near-zero cpu and byte size utilization.

wenyihu6 added 15 commits August 6, 2025 22:29
This commit adds allocator_op.go and allocator_sync.go, introducing
trackedAllocatorChange and basic coordination helpers for allocator sync. These
helpers register changes before they are applied and, after a change is
applied, post-apply its effect on the store pool. Note that allocator sync is
not plumbed or used yet. Future commits will plumb it into the replicate queue,
lease queue, and mma store rebalancer to coordinate changes.
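For a sense of the shape of this coordination, a minimal sketch only; the fields, method names, and callback signatures are assumptions, and the MMA branch described in the later commits below is folded in:

package mmaintegration

import "sync"

// trackedAllocatorChange is sketched here with illustrative fields only.
type trackedAllocatorChange struct {
    loadDelta         int64 // load effect to reflect in the local store pool
    registeredWithMMA bool  // set only when MMA is enabled at registration time
}

type allocatorSyncSketch struct {
    mu      sync.Mutex
    nextID  int64
    tracked map[int64]trackedAllocatorChange

    // Stand-ins for the store pool and MMA allocator dependencies.
    updateStorePool func(loadDelta int64)
    informMMA       func(changeID int64, success bool)
}

// registerChange is called before a change is applied and returns an ID that
// correlates the later post-apply call.
func (as *allocatorSyncSketch) registerChange(c trackedAllocatorChange) int64 {
    as.mu.Lock()
    defer as.mu.Unlock()
    as.nextID++
    as.tracked[as.nextID] = c
    return as.nextID
}

// postApply is called after the change is applied: it reflects the effect on
// the local store pool and, if the change was registered with MMA, informs
// MMA of the outcome.
func (as *allocatorSyncSketch) postApply(changeID int64, success bool) {
    as.mu.Lock()
    c, ok := as.tracked[changeID]
    delete(as.tracked, changeID)
    as.mu.Unlock()
    if !ok {
        return
    }
    if success {
        as.updateStorePool(c.loadDelta)
    }
    if c.registeredWithMMA {
        as.informMMA(changeID, success)
    }
}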
Previously, allocator sync files were added. This commit wires allocator sync
into the lease queue, replicate queue, and mma store rebalancer. Each node has a
single allocator sync instance responsible for coordinating with the
corresponding mma allocator. Note that while allocator_sync is now plumbed, it
is not yet actively used.
Previously, allocator sync was plumbed into the replicate queue, lease queue,
and mma store rebalancer. This commit updates the lease and replicate queues to
call allocator sync to register changes before they are applied, and again
post-apply to reflect load changes in the local store pool. For now, allocator sync
is only used for updating the store pool. Future commits will extend this to
coordinate with mma.
Previously, allocator sync was called before and after applying changes, but
only to update the store pool. This commit extends it to also register changes
with the mma allocator. If mma is enabled at registration time, allocator sync
tracks the change and informs mma during PostApply. If mma is disabled, it
continues to only update the store pool.
Previously, allocator sync tracked external changes from the replicate and lease
queues. This commit extends it to register mma-driven changes as well. On
PostApply, allocator sync now informs mma of the outcome. Note that it is not
being used yet. Future commits will call into it from mma store rebalancer.
Previously, mma PreApply was introduced to register mma changes with allocator
sync. This commit integrates it into the mma store rebalancer. To handle
failures that occur before a change is even registered (such as when a replica
doesn’t exist), this commit also adds MarkChangesAsFailed to allocator sync.
This function bypasses the trackedChanges map since the change is not meant to
be applied. It must only be called on failure, and allocator sync is responsible
for notifying MMA of the failure.
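Roughly, the failure path might look like the following sketch; the notifier interface and the signature are assumptions for illustration, not the actual code:

package mmaintegration

// mmaFailureNotifier is a hypothetical stand-in for the MMA-facing interface.
type mmaFailureNotifier interface {
    notifyChangeFailed(changeID int64)
}

type allocatorSyncFailureSketch struct {
    mma mmaFailureNotifier
}

// MarkChangesAsFailed must only be called on failure. The changes bypass the
// trackedChanges map because they were never going to be applied; allocator
// sync notifies MMA of the failure directly.
func (as *allocatorSyncFailureSketch) MarkChangesAsFailed(changeIDs []int64) {
    for _, id := range changeIDs {
        as.mma.notifyChangeFailed(id)
    }
}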
This commit adds MakeStoreLoadMsg, moved from allocator_mma_integration.go in
mmaprototypehelpers. Eventually, all functions from mmaprototypehelpers will be
migrated to pkg/kv/kvserver/mmaintegration as part of productionizing the code.
Note that no tests have been added yet - a TODO has been left in place to
address this. Added to the spreadsheet as well.
This commit adds InvalidSyncChangeID to allocator sync, used only by asim to
indicate that no change is currently in progress. This is needed because asim
returns control to the tick loop rather than waiting for changes to complete
during simulations. It therefore needs to track the ID of the change currently in progress.
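For illustration only (the underlying type and zero value are assumptions), the sentinel could look like:

package mmaintegration

// syncChangeID identifies a change tracked by allocator sync; the concrete
// type here is an assumption for illustration.
type syncChangeID int64

// InvalidSyncChangeID means no change is currently in progress, which is what
// asim stores between ticks while no change has been registered.
const InvalidSyncChangeID syncChangeID = 0

func (id syncChangeID) isValid() bool { return id != InvalidSyncChangeID }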
Previously, all helper functions from
pkg/kv/kvserver/allocator/mmaprototypehelpers were migrated to
pkg/kv/kvserver/mmaintegration. This commit completes the transition by
switching all usages of mmaprototypehelpers over to pkg/kv/kvserver/mmaintegration.
This commit addresses minor code review comments.
Previously, the replicate queue could skip updating allocator sync when
rlm.AdminTransferLease failed, due to early returns on error. This commit
ensures allocator sync is always updated before exiting the function, regardless
of success or failure.
This commit integrates the store pool and MMA allocator into allocator sync via
an interface to facilitate testing within this package.
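A minimal sketch of what that interface seam might look like; the interface names, methods, and signatures are assumptions, not the actual code:

package mmaintegration

// storePoolIntf and mmaAllocatorIntf are hypothetical narrow interfaces that
// allocator sync would depend on instead of the concrete implementations.
type storePoolIntf interface {
    updateLocalStoreAfterRebalance(changeID int64, success bool)
}

type mmaAllocatorIntf interface {
    adjustPendingChangesDisposition(changeID int64, success bool)
}

type allocatorSyncDeps struct {
    storePool storePoolIntf
    mma       mmaAllocatorIntf
}

// Production code would pass the real store pool and MMA allocator here,
// while the datadriven tests in this package pass lightweight test fakes.
func newAllocatorSyncDeps(sp storePoolIntf, mma mmaAllocatorIntf) allocatorSyncDeps {
    return allocatorSyncDeps{storePool: sp, mma: mma}
}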
This commit improves the String methods for several types in mmaprototype to
make datadriven test output more readable. It also exports a
MakePendingRangeChangeForTesting helper to facilitate constructing
pendingReplicaChanges in future datadriven tests.
This commit adds a datadriven test for the mmaintegration package, verifying
the basic functionality of allocator sync. It tests how replica changes are
converted to mma replica changes and how the store pool is checked, using mocks
for components like the store pool and mma allocator.
This commit adds a datadriven test for MakeStoreLoadMsg, covering a basic store
load scenario and one with near-zero cpu and byte size utilization.
@blathers-crl

blathers-crl bot commented Aug 7, 2025

Your pull request contains more than 1000 changes. It is strongly encouraged to split big PRs into smaller chunks.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Member

This change is Reviewable

@wenyihu6 wenyihu6 changed the title from "mmaintegration: add store load datadriven tests" to "mmaintegration: add mmaintegration datadriven tests" Aug 7, 2025
@wenyihu6
Contributor Author

wenyihu6 commented Aug 7, 2025

These are the operations I mentioned on Slack:

func makeChangeReplicasOperation(opType string, from, to int) changeReplicasOp {
    switch opType {
    case "add-nonvoter":
        return makeAddNonVoterOp(to)
    case "replace-voter-with-promotion":
        return makeReplaceVoterWithPromotionOp(from, to)
    case "rebalance-voter-with-promotion-and-demotion":
        return makeRebalanceVoterOpWithPromotionAndDemotion(from, to)
    default:
        panic(fmt.Sprintf("unknown operation: %s", opType))
    }
}
I'm hoping to cover all combinations of replica changes that the replicate queue might generate, to facilitate testing with thrashing prevention. For example, the ones here are based on how the replicate queue constructs changes for adding or replacing a non-voter:
ops := kvpb.MakeReplicationChanges(roachpb.ADD_NON_VOTER, newNonVoter)
if removeIdx < 0 {
    log.KvDistribution.Infof(ctx, "adding non-voter %+v: %s",
        newNonVoter, rangeRaftProgress(repl.RaftStatus(), existingNonVoters))
} else {
    stats = stats.trackRemoveMetric(allocatorimpl.NonVoterTarget, replicaStatus)
    removeNonVoter := existingNonVoters[removeIdx]
    log.KvDistribution.Infof(ctx, "replacing non-voter %s with %+v: %s",
        removeNonVoter, newNonVoter, rangeRaftProgress(repl.RaftStatus(), existingNonVoters))
    ops = append(ops,
        kvpb.MakeReplicationChanges(roachpb.REMOVE_NON_VOTER, roachpb.ReplicationTarget{
            StoreID: removeNonVoter.StoreID,
            NodeID:  removeNonVoter.NodeID,
        })...)
}
The other two operations are from
ops = kvpb.ReplicationChangesForPromotion(newVoter)
and
promo := kvpb.ReplicationChangesForPromotion(addTarget)
demo := kvpb.ReplicationChangesForDemotion(removeTarget)
Let me know if you think going that route might be too heavyweight, though.

Regenerates the datadriven output since I changed the string methods.
@wenyihu6 wenyihu6 marked this pull request as ready for review August 7, 2025 13:22
@wenyihu6 wenyihu6 requested review from a team as code owners August 7, 2025 13:22
@wenyihu6 wenyihu6 requested review from sumeerbhola and removed request for a team August 7, 2025 13:22
Collaborator

@sumeerbhola sumeerbhola left a comment


I may not be properly understanding the question.

IMO, we need to ensure that the tests cover conversion to/from the MMA and SMA formats quite exhaustively, i.e., mmaprototype.PendingRangeChange to/from {changeReplicasOp,leaseTransferOp}. That may mean standalone (non-datadriven) tests for convertReplicaChangeToMMA, and lifting out the code in MMAPreApply that converts from mmaprototype.PendingRangeChange to what SMA needs and testing that function separately.

Alternatively, one could just test those conversions using the AllocatorSync datadriven test, if that is as simple as standalone tests. I don't have an opinion either way.


Flushed some comments. I didn't look at everything, since these comments may change the tests.

Reviewed 2 of 2 files at r13, 4 of 10 files at r14, 3 of 4 files at r15, 1 of 1 files at r16.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained


pkg/kv/kvserver/mmaintegration/datadriven_mmaintegration_test.go line 62 at r14 (raw file):

// mockMMAAllocator implements the mmaAllocator interface for testing.
type mockMMAAllocator struct {

I think mockMMAAllocator and mockStorePool are doing more than what they need to, which makes this test code more complicated and makes it harder to check at a glance that the testdata output is correct. We don't really care what their internal state is when testing AllocatorSync. All we care about is that AllocatorSync calls their interface methods as expected, with the expected parameters. So one could remove all this internal state and just print those calls. As an example of this pattern, see TestProcessorBasic in replica_rac2. Similar to AllocatorSync, the processorImpl there is mostly glue code, and that is what is being tested, while the things it calls are abstracted via test interfaces (testRangeController being the main one):

var sched testRaftScheduler
var piggybacker testAdmittedPiggybacker
var q testACWorkQueue
var rcFactory testRangeControllerFactory

And as a nit, they are not really mocks (which assert on calls), so they could be renamed with a test prefix instead of mock.
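For concreteness, a minimal sketch of the call-recording style suggested here (the interface and method signature are assumptions): the test implementation keeps no simulated state, it just records each call so the datadriven test can echo it.

package mmaintegration_test

import (
    "fmt"
    "strings"
)

// storePoolForTest is a hypothetical narrow interface used by AllocatorSync.
type storePoolForTest interface {
    UpdateLocalStoreAfterRebalance(storeID int64, success bool)
}

// testStorePool records the calls made to it instead of simulating store pool
// state; the recorded lines become the datadriven output.
type testStorePool struct {
    buf strings.Builder
}

var _ storePoolForTest = (*testStorePool)(nil)

func (sp *testStorePool) UpdateLocalStoreAfterRebalance(storeID int64, success bool) {
    fmt.Fprintf(&sp.buf, "storePool.UpdateLocalStoreAfterRebalance(s%d, success=%t)\n",
        storeID, success)
}

// output returns and clears the recorded calls.
func (sp *testStorePool) output() string {
    s := sp.buf.String()
    sp.buf.Reset()
    return s
}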


pkg/kv/kvserver/mmaintegration/testdata/mmaintegration/add_nonvoter.txt line 10 at r14 (raw file):

----
sync_change_id:1
mma_change:

I was initially confused by the mma_change phrasing here. Looking at the code, it seems this is just the change, formatted in the way that MMA likes to receive it, but it may not necessarily be passed to MMA (when the change originates in MMA itself). If my understanding is correct, how about just renaming it to change:?


pkg/kv/kvserver/mmaintegration/testdata/mmaintegration/add_nonvoter.txt line 27 at r14 (raw file):

post-apply id=1 success=true
----
applied change 1 with success=true

I was expecting to see two lines of output, corresponding to the call to storePool.UpdateLocalStoreAfterRebalance and mmaState.AdjustPendingChangesDisposition.
To figure out whether the behavior is correct, one needs to look at the state printed below, which is definitely a reasonable pattern if AllocatorSync were glued to real things, say A and B: one would need to print the state of A and B since one can't intercept the AllocatorSync calls to them. But here it is glued to test implementations. See more on this below.


pkg/kv/kvserver/mmaintegration/testdata/storeload/almost_zero_util.txt line 7 at r17 (raw file):

# write_bytes_per_second=1024 logical_bytes=2048 capacity=1000 available=200
# lease_count=500 range_count=1000 node_cpu_rate_capacity=2000
# node_cpu_rate_usage=1000 stores_cpu_rate=1500 num_stores=2 timestamp=1234567890

nit: let's not repeat the command below in this comment. It just adds more work for the reader, who is left wondering whether there is some subtle difference between the two.


pkg/kv/kvserver/mmaintegration/testdata/storeload/almost_zero_util.txt line 12 at r17 (raw file):

# - desc.NodeCapacity.NodeCPURateCapacity = 200000000000
# If capacity == available, byte size capacity is Available(200).
# - desc.Capacity.Capacity = desc.Capacity.Available

I don't understand desc.Capacity.Capacity = desc.Capacity.Available. Is the LHS supposed to be the capacity we compute? If so, it is StoreLoadMsg.Capacity[ByteSize] and not the descriptor capacity.


pkg/kv/kvserver/mmaintegration/testdata/storeload/almost_zero_util.txt line 13 at r17 (raw file):

# If capacity == available, byte size capacity is Available(200).
# - desc.Capacity.Capacity = desc.Capacity.Available
make-store-load-msg store_id=1 cpu_per_second=1000 write_bytes_per_second=1024 logical_bytes=2048 capacity=200 available=200 lease_count=500 range_count=1000 node_cpu_rate_capacity=200000000000 node_cpu_rate_usage=1 stores_cpu_rate=1500 num_stores=2 timestamp=1234567890

Don't change this now, since you've already made it a datadriven test, but a test that mostly exercises a single command could be written more simply as a bunch of test cases with a testCase struct containing the input and the expected output.
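For comparison, a table-driven version might look roughly like this (the input struct and the makeStoreLoadMsgString helper are illustrative stand-ins for the real MakeStoreLoadMsg inputs and output):

package mmaintegration_test

import (
    "fmt"
    "testing"
)

// storeLoadInput is an illustrative subset of the real inputs.
type storeLoadInput struct {
    storeID             int32
    cpuPerSecond        float64
    capacity, available int64
}

// makeStoreLoadMsgString stands in for formatting the real MakeStoreLoadMsg result.
func makeStoreLoadMsgString(in storeLoadInput) string {
    return fmt.Sprintf("store_id=%d cpu_per_second=%g capacity=%d available=%d",
        in.storeID, in.cpuPerSecond, in.capacity, in.available)
}

func TestMakeStoreLoadMsgCases(t *testing.T) {
    testCases := []struct {
        name     string
        input    storeLoadInput
        expected string
    }{
        {
            name:     "basic",
            input:    storeLoadInput{storeID: 1, cpuPerSecond: 1000, capacity: 1000, available: 200},
            expected: "store_id=1 cpu_per_second=1000 capacity=1000 available=200",
        },
        {
            name:     "almost-zero-util",
            input:    storeLoadInput{storeID: 1, cpuPerSecond: 1, capacity: 200, available: 200},
            expected: "store_id=1 cpu_per_second=1 capacity=200 available=200",
        },
    }
    for _, tc := range testCases {
        t.Run(tc.name, func(t *testing.T) {
            if got := makeStoreLoadMsgString(tc.input); got != tc.expected {
                t.Errorf("got %q, want %q", got, tc.expected)
            }
        })
    }
}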


pkg/kv/kvserver/mmaintegration/testdata/mmaintegration/add_nonvoter_mma.txt line 7 at r14 (raw file):

init mma_enabled=true
----
range_id=1 [replicas:{(r1*:n1,s1,voter),(r2:n2,s2,non-voter)} usage=(cpu:150,raft-cpu:50,write-band:1024,byte-size:2048)]

These tests fiddle with a single range. That means one needs a different test file for each small test case. It also means they don't test multiple changes in flight at the same time. One could construct, say, 3 ranges with scaled-up usage (range_id=2 has 2x the usage, range_id=3 has 3x), and fiddle with all of them in a single test. Of course, what I am saying is subjective, so it's up to you how to structure this -- but I do think we need a test with multiple in-flight changes, some of them failing.

@tbg tbg self-requested a review August 25, 2025 08:37
@tbg tbg removed their request for review September 1, 2025 10:21
@tbg
Member

tbg commented Sep 1, 2025

Unassigning for now, but ready to return to this once your current project is wrapped up @wenyihu6

@wenyihu6
Contributor Author

wenyihu6 commented Sep 3, 2025

I’ll revisit this after the branch cut since it’s test-only, unless either of you feels it should be prioritized. Added an item to our spreadsheet and assigned it to myself.
