
kvserver: replicas on decommissioning node never being replaced #130199

@kvoli

Description


Describe the problem

We have observed replicas on a decommissioning node (<3) never being replaced by the corresponding range leaseholder nodes over a period of 80 minutes.

The stall was resolved by manually enqueueing the blocking ranges into the leaseholder's replicate queue.

This is surprising, as the replica scanner should check each replica, against every store queue, once every 10 minutes. Manually enqueueing the ranges via the advanced debug page without skipping the shouldQueue check succeeded, which demonstrates that if the scanner had called shouldQueue on these replicas, they would have been enqueued. There is no indication that this occurred.
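
For reference, a minimal sketch of the manual workaround from SQL rather than the debug page. This assumes the crdb_internal.kv_enqueue_replica builtin (the SQL counterpart of the advanced debug page's enqueue action, taking a range ID, queue name, and a skip-shouldQueue flag) and the replicas column of crdb_internal.ranges_no_leases; the store ID 99 is a placeholder for the decommissioning store:

# Sketch: find ranges that still hold a replica on the decommissioning
# store (store ID 99 is a placeholder) and enqueue each into the
# replicate queue WITHOUT skipping the shouldQueue check.
# (This may need to be run against the leaseholder node rather than node 1.)
rp sql $cluster:1 -- -e "
    SELECT range_id,
           crdb_internal.kv_enqueue_replica(range_id, 'replicate', false)
    FROM crdb_internal.ranges_no_leases
    WHERE replicas @> ARRAY[99];"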

To Reproduce

Attempts at reproducing the issue haven't been successful so far. The methods tested are shown below.

Details

Set up the cluster and run the workload:

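# Note: "rp" below appears to be shorthand for "roachprod".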
export cluster=austen-decom-repro
roachprod create $cluster -n 41
roachprod put $cluster ./artifacts/cockroach cockroach
roachprod start $cluster:1-40
rp sql $cluster:1 -- -e 'CREATE DATABASE kv2'
rp sql $cluster:1 -- -e 'ALTER RANGE default CONFIGURE ZONE USING num_replicas = 5'
rp sql $cluster:1 -- -e 'ALTER DATABASE kv2 CONFIGURE ZONE USING num_replicas = 5'
rp run $cluster:1 -- './cockroach workload run kv --init --splits=6000 --min-block-bytes=16384 --max-block-bytes=16384 --insert-count=100000000 --max-rate=100 {pgurl:1}'
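
A quick sanity check, not in the original steps, to confirm the zone configuration applied before kicking off the loops:

# Optional: verify the 5x replication factor took effect.
rp sql $cluster:1 -- -e 'SHOW ZONE CONFIGURATION FROM DATABASE kv2'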

Chain decommissions:

for iteration in $(seq 1 10000); do
    echo "Starting iteration $iteration"

    # Loop through nodes
    for node in $(seq 20 40); do
        echo "Processing node $node (Iteration $iteration)"

        # Run drain command
        rp run $cluster:$node -- './cockroach node drain --insecure --self'

        # Run decommission command
        rp run $cluster:$node -- './cockroach node decommission --insecure --self'

        # Wipe the node
        rp wipe $cluster:$node

        # Put artifacts
        rp put $cluster:$node ./artifacts/cockroach

        # Start the node
        rp start $cluster:$node

        echo "Finished processing node $node (Iteration $iteration)"
        echo "------------------------"
    done

    echo "Finished iteration $iteration"
    echo "========================"
done
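
A monitoring sketch (not part of the original reproduction) that can run alongside the loop; cockroach node status --decommission reports the replica count remaining on each decommissioning node, so a count that stops shrinking for well over 10 minutes would indicate the stall described above:

# Watch replicas drain off decommissioning nodes once a minute.
while true; do
    rp run $cluster:1 -- './cockroach node status --decommission --insecure'
    sleep 60
done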

Restart a node every 3 minutes:

for iteration in $(seq 1 10000); do
    # Loop through nodes
    for node in $(seq 2 19); do
        echo "Restarting node $node (Iteration $iteration)"

        # Restart the node
        rp stop $cluster:$node
        # Start the node
        rp start $cluster:$node

        sleep 180

        echo "Finished restarting node $node (Iteration $iteration)"
        echo "------------------------"
    done

    echo "Finished iteration $iteration"
    echo "========================"
done

Expected behavior

The replica scanner enqueues decommissioning ranges into the replicate queue at least once every replica_count * 100ms (the minimum scanner interval) or 10 minutes, whichever is greater.

Decommissioning therefore does not stall for this reason.
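
As a back-of-envelope check for this reproduction cluster (assuming the 6,000 initial splits and 5x replication above, spread roughly evenly across 40 stores), the 10-minute bound is the one that should apply:

# ~6000 ranges * 5 replicas / 40 stores ≈ 750 replicas per store.
# 750 * 100ms = 75s, well under 10 minutes, so the scanner should
# visit every replica at least once per 10-minute window.
echo "$(( 6000 * 5 / 40 )) replicas/store; $(( 6000 * 5 / 40 / 10 ))s min scan interval"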

Environment:

  • CockroachDB v23.1.22. Other versions may be affected, but the issue has so far only been observed on a v23.1.22 cluster.

Additional context

Manual intervention is required to complete a decommission.

Jira issue: CRDB-41920

Labels

  • A-kv-distribution: Relating to rebalancing and leasing.
  • C-bug: Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
  • O-support: Would prevent or help troubleshoot a customer escalation - bugs, missing observability/tooling, docs.
  • P-3: Issues/test failures with no fix SLA.
  • T-kv: KV Team.
  • branch-release-23.1: Used to mark GA and release blockers, technical advisories, and bugs for 23.1.
