Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in PersistentTopic#addReplicationCluster #12723

Closed
Jason918 opened this issue Nov 10, 2021 · 2 comments · Fixed by #12729
Closed

Race condition in PersistentTopic#addReplicationCluster #12723

Jason918 opened this issue Nov 10, 2021 · 2 comments · Fixed by #12729
Assignees
Labels
release/2.8.3 release/2.9.1 type/bug The PR fixed a bug or issue reported a bug
Milestone

Comments

@Jason918
Copy link
Contributor

Describe the bug
In class org.apache.pulsar.broker.service.persistent.PersistentTopic, we have field replicators of class ConcurrentOpenHashMap. There is a race condiftion in method addReplicationCluster.

replicators.computeIfAbsent(remoteCluster, r -> {
    try {
        return new PersistentReplicator(PersistentTopic.this, cursor, localCluster,
                remoteCluster, brokerService, (PulsarClientImpl) replicationClient);
    } catch (PulsarServerException e) {
        log.error("[{}] Replicator startup failed {}", topic, remoteCluster, e);
    }
    return null;
});

// clean up replicator if startup is failed
if (replicators.containsKey(remoteCluster) && replicators.get(remoteCluster) == null) {
    replicators.remove(remoteCluster);
}

It's clear that there is a race condition if multi threads would run in this code. For example
Thread A is just about to execute replicators.remove, Thread B inserts a non-null PersistentReplicator. Then thread A will delete the PersistentReplicator which thread B just created.

And there is no other thread safe measures applied to these code.

Expected behavior

It should be thead safe.

Screenshots
Not applicable.

Desktop (please complete the following information):

  • OS: mac

Additional context
Nop

@Jason918 Jason918 added the type/bug The PR fixed a bug or issue reported a bug label Nov 10, 2021
@Jason918
Copy link
Contributor Author

Jason918 commented Nov 10, 2021

I found another interesting thing is that in org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap, if we call map.put(key, null), it throws an NullPointerException. But we can use map.computeIfAbsent(key, k->null) to insert a null value into the map, which leads to a confusing situation that map.containsKey(key) is falsse, but there is actually a "key" in the map with null value.

Not the same behavior in java.util.concurrent.ConcurrentHashMap, a key with null value will be dismissed.

@Jason918
Copy link
Contributor Author

I am working on a fix.

@codelipenghui codelipenghui added this to the 2.10.0 milestone Nov 10, 2021
codelipenghui pushed a commit that referenced this issue Nov 15, 2021
…ster (#12729)

### Motivation

See #12723

### Modifications

Add a method org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap#removeNullValue to remove null value   in a thread safe way.
codelipenghui pushed a commit that referenced this issue Nov 18, 2021
…ster (#12729)

See #12723

Add a method org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap#removeNullValue to remove null value   in a thread safe way.

(cherry picked from commit a3fe00e)
eolivelli pushed a commit to eolivelli/pulsar that referenced this issue Nov 29, 2021
…ster (apache#12729)

### Motivation

See apache#12723

### Modifications

Add a method org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap#removeNullValue to remove null value   in a thread safe way.
lhotari pushed a commit to datastax/pulsar that referenced this issue Dec 3, 2021
…ster (apache#12729)

See apache#12723

Add a method org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap#removeNullValue to remove null value   in a thread safe way.

(cherry picked from commit a3fe00e)
(cherry picked from commit 230e1ac)
fxbing pushed a commit to fxbing/pulsar that referenced this issue Dec 19, 2021
…ster (apache#12729)

### Motivation

See apache#12723

### Modifications

Add a method org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap#removeNullValue to remove null value   in a thread safe way.
codelipenghui pushed a commit that referenced this issue Dec 20, 2021
…ster (#12729)

See #12723

Add a method org.apache.pulsar.common.util.collections.ConcurrentOpenHashMap#removeNullValue to remove null value   in a thread safe way.

(cherry picked from commit a3fe00e)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release/2.8.3 release/2.9.1 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants