Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] [broker] Fix infinite ack of Replicator after topic is closed #20232

Conversation

poorbarcode
Copy link
Contributor

@poorbarcode poorbarcode commented May 5, 2023

Motivation

In the scenarios[1] below, the replicator only acknowledges messages and does not replicate them, but if the topic is closing now, the operation acknowledge(in other words: cursor.asyncDelete) will be failed by the error Cursor was already closed, then it will retry again(see the code [2]) and fail again....loop.

[1]: scenarios of skip messages:

  • the message can not be deserialized.
  • the entry is a transaction marker.
  • the message has been aborted by the transaction.
  • the message is repeated from another cluster.
  • the message is expired(see the retention policy).
  • the message specifies the target cluster, but the replicator is not the owner of the target cluster.

[2]: Code of retry to acknowledge messages.
https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentReplicator.java#L545

Note: I tried to write a test, but it took me more than an hour but failed, so there was no test of this change.

Modifications

Just like the existing solution of read entry fail(see the code [3]): if ack failed by CursorAlreadyClosedException, just close the replicator(will not retry).

[3]: Code of the existing solution of read entry fail
https://github.com/apache/pulsar/blob/master/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentReplicator.java#L434

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

  • x

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label May 5, 2023
@poorbarcode poorbarcode self-assigned this May 5, 2023
Copy link
Contributor

@liangyepianzhou liangyepianzhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good work!

@Technoboy- Technoboy- added this to the 3.1.0 milestone May 6, 2023
@poorbarcode
Copy link
Contributor Author

/pulsarbot rerun-failure-checks

@codecov-commenter
Copy link

codecov-commenter commented May 6, 2023

Codecov Report

Merging #20232 (5635bf0) into master (0dd238a) will increase coverage by 38.42%.
The diff coverage is 85.71%.

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #20232       +/-   ##
=============================================
+ Coverage     34.48%   72.90%   +38.42%     
- Complexity    12537    31961    +19424     
=============================================
  Files          1614     1868      +254     
  Lines        126170   138597    +12427     
  Branches      13771    15247     +1476     
=============================================
+ Hits          43509   101044    +57535     
+ Misses        77053    29518    -47535     
- Partials       5608     8035     +2427     
Flag Coverage Δ
inttests 24.27% <0.00%> (+0.14%) ⬆️
systests 24.91% <0.00%> (?)
unittests 72.20% <85.71%> (+39.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...mon/policies/data/stats/SubscriptionStatsImpl.java 69.56% <50.00%> (-0.35%) ⬇️
...roker/service/persistent/PersistentReplicator.java 68.45% <100.00%> (+24.77%) ⬆️

... and 1506 files with indirect coverage changes

@poorbarcode
Copy link
Contributor Author

/pulsarbot rerun-failure-checks

@Technoboy- Technoboy- merged commit 9841364 into apache:master May 7, 2023
poorbarcode added a commit that referenced this pull request May 7, 2023
poorbarcode added a commit that referenced this pull request May 7, 2023
poorbarcode added a commit that referenced this pull request May 7, 2023
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request May 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants