Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[fix] [broker] Fix break change: could not subscribe partitioned topic with a suffix-matched regexp due to a mistake of PIP-145 #21885

Merged
merged 7 commits into from
Jan 15, 2024

Conversation

poorbarcode
Copy link
Contributor

@poorbarcode poorbarcode commented Jan 11, 2024

Motivation

  • Before PIP-145, it used the Partitioned topic name to match the Regexp. For example, the partitioned topic public/default/tp has one partition named public/default/tp-partition-0. Pulsar will use public/default/tp to match the Regexp[1].
  • After PIP-145, it used the partition name to match the Regexp. For example, the partitioned topic public/default/tp has one partition named public/default/tp-partition-0. Pulsar will use public/default/tp-partition-0 to match the Regexp[2].

It is a break change that is not expected; these changes are due to a mistake, and we should correct it.

Issue

If a consumer tries to subscribe to a partitioned topic with a suffix-matched regexp, it does not work.

  • create topic persistent://public/default/tp
  • create a consumer with regexp persistent://public/default/tp$
  • the pattern consumer has no internal consumers.

You can reproduce the issue by the test testPreciseRegexpSubscribe.

  • it can work when using 2.10.x
  • it can not work when using >=2.11.0

Modifications

Revert the behavior as the original implementation(before PIP-145)

Footnotes

[1]: the implementation of 2.10.x

  • Pulsar did the check of regexp on the client-side.
  • BinaryProtoLookupService.getTopicsUnderNamespace merge partitions to one partitioned topic. E.g.) Pulsar client merges public/default/tp-partition-0 and public/default/tp-partition-1 to ``public/default/tp`
  • PulsarClientImpl.topicsPatternFilter Use the partitioned topic name to match the regexp.

BinaryProtoLookupService.getTopicsUnderNamespace: https://github.com/apache/pulsar/blob/branch-2.10/pulsar-client/src/main/java/org/apache/pulsar/client/impl/BinaryProtoLookupService.java#L332-L338

List<String> result = new ArrayList<>();
r.forEach(topic -> {
    String filtered = TopicName.get(topic).getPartitionedTopicName();
    if (!result.contains(filtered)) {
        result.add(filtered);
    }
});

PulsarClientImpl.topicsPatternFilter : https://github.com/apache/pulsar/blob/branch-2.10/pulsar-client/src/main/java/org/apache/pulsar/client/impl/PulsarClientImpl.java#L568-L577

public static List<String> topicsPatternFilter(List<String> original, Pattern topicsPattern) {
    final Pattern shortenedTopicsPattern = topicsPattern.toString().contains("://")
        ? Pattern.compile(topicsPattern.toString().split("\\:\\/\\/")[1]) : topicsPattern;

    return original.stream()
        .map(TopicName::get)
        .map(TopicName::toString)
        .filter(topic -> shortenedTopicsPattern.matcher(topic.split("\\:\\/\\/")[1]).matches())
        .collect(Collectors.toList());
}

[2]: the implementation of >=2.11.0

public static List<String> filterTopics(List<String> original, Pattern topicsPattern) {

    final Pattern shortenedTopicsPattern = topicsPattern.toString().contains(SCHEME_SEPARATOR)
            ? Pattern.compile(SCHEME_SEPARATOR_PATTERN.split(topicsPattern.toString())[1]) : topicsPattern;

    return original.stream()
            .map(TopicName::get)
            .map(TopicName::toString)
            .filter(topic -> shortenedTopicsPattern.matcher(SCHEME_SEPARATOR_PATTERN.split(topic)[1]).matches())
            .collect(Collectors.toList());
}

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: x

@github-actions github-actions bot added the doc-not-needed Your PR changes do not impact docs label Jan 11, 2024
@poorbarcode poorbarcode self-assigned this Jan 11, 2024
@poorbarcode poorbarcode added this to the 3.3.0 milestone Jan 11, 2024
@poorbarcode poorbarcode added release/3.0.3 release/2.11.4 release/3.1.3 release/3.2.1 type/bug The PR fixed a bug or issue reported a bug category/reliability The function does not work properly in certain specific environments or failures. e.g. data lost labels Jan 11, 2024
@poorbarcode poorbarcode changed the title Fix/break change of pip 145 [fix] [broker] Break change of could not subscribe partitioned topic with a suffix-matched regexp due to a mistake of PIP-145 Jan 11, 2024
@poorbarcode poorbarcode changed the title [fix] [broker] Break change of could not subscribe partitioned topic with a suffix-matched regexp due to a mistake of PIP-145 [fix] [broker] Fix break change: could not subscribe partitioned topic with a suffix-matched regexp due to a mistake of PIP-145 Jan 11, 2024
@codelipenghui
Copy link
Contributor

/pulsarbot run-failure-checks

@poorbarcode poorbarcode requested a review from coderzc January 15, 2024 02:08
@codecov-commenter
Copy link

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (176bdea) 73.58% compared to head (a0eeae9) 73.52%.
Report is 5 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff              @@
##             master   #21885      +/-   ##
============================================
- Coverage     73.58%   73.52%   -0.06%     
- Complexity    32325    32345      +20     
============================================
  Files          1859     1859              
  Lines        138263   138359      +96     
  Branches      15153    15159       +6     
============================================
- Hits         101736   101732       -4     
- Misses        28644    28743      +99     
- Partials       7883     7884       +1     
Flag Coverage Δ
inttests 24.12% <0.00%> (+0.02%) ⬆️
systests 23.67% <63.63%> (-0.11%) ⬇️
unittests 72.82% <81.81%> (-0.03%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...ava/org/apache/pulsar/common/topics/TopicList.java 89.65% <81.81%> (-5.59%) ⬇️

... and 62 files with indirect coverage changes

@Technoboy- Technoboy- merged commit 4ebbd2f into apache:master Jan 15, 2024
poorbarcode added a commit that referenced this pull request Jan 15, 2024
…ic with a suffix-matched regexp due to a mistake of PIP-145 (#21885)

(cherry picked from commit 4ebbd2f)
poorbarcode added a commit that referenced this pull request Jan 15, 2024
…ic with a suffix-matched regexp due to a mistake of PIP-145 (#21885)

(cherry picked from commit 4ebbd2f)
Technoboy- pushed a commit that referenced this pull request Jan 19, 2024
…ic with a suffix-matched regexp due to a mistake of PIP-145 (#21885)
@Technoboy- Technoboy- modified the milestones: 3.3.0, 3.2.0 Jan 19, 2024
closeAllConnections.invoke(pool, new Object[]{});
}

public void reconnectAllConnections() throws Exception {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks Pulsar SQL / Trino tests in branch-3.1 and before. I created #21976 to address the problem.

poorbarcode added a commit that referenced this pull request Feb 28, 2024
…ic with a suffix-matched regexp due to a mistake of PIP-145 (#21885)

(cherry picked from commit 4ebbd2f)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 1, 2024
…ic with a suffix-matched regexp due to a mistake of PIP-145 (apache#21885)

(cherry picked from commit 4ebbd2f)
(cherry picked from commit ce8c291)
mukesh-ctds pushed a commit to datastax/pulsar that referenced this pull request Mar 6, 2024
…ic with a suffix-matched regexp due to a mistake of PIP-145 (apache#21885)

(cherry picked from commit 4ebbd2f)
(cherry picked from commit ce8c291)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category/reliability The function does not work properly in certain specific environments or failures. e.g. data lost cherry-picked/branch-2.11 cherry-picked/branch-3.0 cherry-picked/branch-3.1 cherry-picked/branch-3.2 doc-not-needed Your PR changes do not impact docs release/2.11.4 release/3.0.3 release/3.1.3 type/bug The PR fixed a bug or issue reported a bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants