Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix][broker] Fix RoaringBitmap.contains can't check value 65535 #20176

Merged

Conversation

coderzc
Copy link
Member

@coderzc coderzc commented Apr 25, 2023

Motivation

RoaringBitmap/RoaringBitmap#623

MessageRedeliveryController uses the RoaringBitmap under the covers to track entryIds to redeliver.
The bug in RoaringBitmap (that has now been fixed) caused at least the case that the bit index 65535 couldn't be set.
The consequence of this would be that a message that should be redelivered doesn't get delivered and this could get Key_Shared subscriptions to get stuck. This could impact also other subscription types and cause a backlog to build up.
This bug was introduced in 2.10.2 with #17804 since that's when MessageRedeliveryController switched to use the RoaringBitmap based datastructure.

Modifications

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

(or)

This change is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(example:)

  • Added integration tests for end-to-end deployment with large payloads (10MB)
  • Extended integration test for recovery after broker failure

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository:

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Apr 25, 2023
@coderzc coderzc changed the title [fix][broker] Fix RoaringBitmap contains can't check value 65535 [fix][broker] Fix RoaringBitmap.contains can't check value 65535 Apr 25, 2023
@coderzc coderzc added type/bug The PR fixed a bug or issue reported a bug area/broker labels Apr 25, 2023
@codelipenghui
Copy link
Contributor

Interesting, ConcurrentBitmapSortedLongPairSet also used the same method.

@coderzc
Copy link
Member Author

coderzc commented Apr 25, 2023

Interesting, ConcurrentBitmapSortedLongPairSet also used the same method.

Yeah, I'm not sure if our usage is wrong

@coderzc coderzc closed this May 8, 2023
@coderzc coderzc reopened this May 8, 2023
@codecov-commenter
Copy link

Codecov Report

Merging #20176 (e28b882) into master (35e9897) will increase coverage by 35.24%.
The diff coverage is n/a.

Impacted file tree graph

@@              Coverage Diff              @@
##             master   #20176       +/-   ##
=============================================
+ Coverage     37.64%   72.89%   +35.24%     
- Complexity    12522    31963    +19441     
=============================================
  Files          1691     1868      +177     
  Lines        128933   138594     +9661     
  Branches      14061    15246     +1185     
=============================================
+ Hits          48540   101026    +52486     
+ Misses        74070    29531    -44539     
- Partials       6323     8037     +1714     
Flag Coverage Δ
inttests 24.14% <ø> (-0.02%) ⬇️
systests 24.77% <ø> (-0.10%) ⬇️
unittests 72.20% <ø> (+39.09%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

see 1431 files with indirect coverage changes

Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go. One comment that can be follow-up insight.

}

for (long i = 1; i <= 100_000; i++) {
assertTrue(roaringBitmap.contains(i, i + 1));
Copy link
Member

@tisonkun tisonkun May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add description so that we know which i fails instead of generic:

java.lang.AssertionError:
Expected :true
Actual   :false

Copy link
Member

@tisonkun tisonkun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good to go. One comment that can be follow-up insight.

@tisonkun
Copy link
Member

tisonkun commented May 8, 2023

Merging...

@tisonkun tisonkun merged commit 2f9f5df into apache:master May 8, 2023
poorbarcode pushed a commit that referenced this pull request May 8, 2023
poorbarcode pushed a commit that referenced this pull request May 8, 2023
poorbarcode pushed a commit that referenced this pull request May 8, 2023
coderzc added a commit that referenced this pull request May 8, 2023
nicoloboschi pushed a commit to datastax/pulsar that referenced this pull request May 11, 2023
@lhotari
Copy link
Member

lhotari commented May 17, 2023

@coderzc Please update the PR description with the impact of this bug.

Here's my version of describing the impact:
MessageRedeliveryController uses the RoaringBitmap under the covers to track entryIds to redeliver.
The bug in RoaringBitmap (that has now been fixed) caused at least the case that the bit index 65535 couldn't be set.
The consequence of this would be that a message that should be redelivered doesn't get delivered and this could get Key_Shared subscriptions to get stuck. This could impact also other subscription types and cause a backlog to build up.
This bug was introduced in 2.10.2 with #17804 since that's when MessageRedeliveryController switched to use the RoaringBitmap based datastructure.

@codelipenghui @coderzc Is this the correct interpretation?

@codelipenghui
Copy link
Contributor

@lhotari Yes, looks good to me. Thanks.

@coderzc coderzc added the cherry-picked/branch-2.9 Archived: 2.9 is end of life label May 31, 2023
@michaeljmarshall
Copy link
Member

As discussed on the mailing list https://lists.apache.org/thread/w4jzk27qhtosgsz7l9bmhf1t7o9mxjhp, there is no plan to release 2.9.6, so I am going to remove the release/2.9.6 label

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants