Branch 2.5 All test #14

Open
wants to merge 110 commits into
base: master

Commits on Jan 6, 2020

  1. Release 2.5.0

    sijie committed Jan 6, 2020
    f2afad3

Commits on Feb 17, 2020

  1. Fix create consumer on partitioned topic while disable topic auto creation. (apache#5572)
    
    ### Motivation
    
    Currently, disabling topic auto-creation causes consumer creation to fail on a partitioned topic. Since the partitioned topic has already been created, its partitions should still be created when topic auto-creation is disabled.
    
    ### Modifications
    
    By default, creating a partitioned topic also tries to create all of its partitions. If creating the partitions fails, users can run `create-missed-partitions` to repair the topic.
    
    Users who already have a partitioned topic whose partitions were never created can also use `create-missed-partitions` to repair it.
    (cherry picked from commit 602f1c2)
    codelipenghui authored and jiazhai committed Feb 17, 2020
    80ad05f

Commits on Mar 21, 2020

  1. Fixed static linking on C++ lib on MacOS (apache#5581)

    * Fixed static linking on C++ lib on MacOS
    
    * Use `-undefined dynamic_lookup` when linking on Mac to not include python's own runtime
    
    * Fixed searching for protobuf
    
    (cherry picked from commit 125a588)
    merlimat authored and tuteng committed Mar 21, 2020
    2efd47d
  2. [pulsar-broker] close managed-ledgers before giving up bundle ownership to avoid bad zk-version (apache#5599)
    
    ### Motivation
    
    We have seen multiple occurrences, like the one below, where unloading a topic doesn't complete and gets stuck. The broker gives up ownership after a timeout, and closing the ml-factory then closes the unclosed managed-ledger, which corrupts the metadata zk-version. The topic, now owned by a new broker, keeps failing with the exception `ManagedLedgerException$BadVersionException`.
    
    Right now, while unloading a bundle, the broker removes ownership of the bundle after a timeout even if the topic's managed-ledger has not been closed successfully. `ManagedLedgerFactoryImpl` then closes the unclosed managed-ledger on broker shutdown, which causes a bad zk-version on the new broker, and because of that cursors are not able to update cursor-metadata in zk.
    
    ```
    01:01:13.452 [shutdown-thread-57-1] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Disabling ownership: my-property/my-cluster/my-ns/0xd0000000_0xe0000000
    :
    01:01:13.653 [shutdown-thread-57-1] INFO  org.apache.pulsar.broker.service.BrokerService - [persistent://my-property/my-cluster/my-ns/topic-partition-53] Unloading topic
    :
    01:02:13.677 [shutdown-thread-57-1] INFO  org.apache.pulsar.broker.namespace.OwnedBundle - Unloading my-property/my-cluster/my-ns/0xd0000000_0xe0000000 namespace-bundle with 0 topics completed in 60225.0 ms
    :
    01:02:13.675 [shutdown-thread-57-1] ERROR org.apache.pulsar.broker.namespace.OwnedBundle - Failed to close topics in namespace my-property/my-cluster/my-ns/0xd0000000_0xe0000000 in 1/MINUTES timeout
    01:02:13.677 [pulsar-ordered-OrderedExecutor-7-0-EventThread] INFO  org.apache.pulsar.broker.namespace.OwnershipCache - [/namespace/my-property/my-cluster/my-ns/0xd0000000_0xe0000000] Removed zk lock for service unit: OK
    :
    01:02:14.404 [shutdown-thread-57-1] INFO  org.apache.bookkeeper.mledger.impl.ManagedLedgerImpl - [my-property/my-cluster/my-ns/persistent/topic-partition-53] Closing managed ledger
    ```
    
    ### Modification
    
    This fix makes sure that the broker closes the managed-ledger before giving up bundle ownership, to avoid the exception below on the new broker that the bundle moves to:
    ```
    
    01:02:30.995 [bookkeeper-ml-workers-OrderedExecutor-3-0] ERROR org.apache.bookkeeper.mledger.impl.ManagedCursorImpl - [my-property/my-cluster/my-ns/persistent/topic-partition-53][my-sub] Metadata ledger creation failed
    org.apache.bookkeeper.mledger.ManagedLedgerException$BadVersionException: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion
    Caused by: org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion
            at org.apache.zookeeper.KeeperException.create(KeeperException.java:118) ~[zookeeper-3.4.13.jar:3.4.13-2d71af4dbe22557fda74f9a9b4309b15a7487f03]
            at org.apache.bookkeeper.mledger.impl.MetaStoreImplZookeeper.lambda$null$125(MetaStoreImplZookeeper.java:288) ~[managed-ledger-original-2.4.5-yahoo.jar:2.4.5-yahoo]
            at org.apache.bookkeeper.mledger.util.SafeRun$1.safeRun(SafeRun.java:32) [managed-ledger-original-2.4.5-yahoo.jar:2.4.5-yahoo]
            at org.apache.bookkeeper.common.util.SafeRunnable.run(SafeRunnable.java:36) [bookkeeper-common-4.9.0.jar:4.9.0]
            at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
            at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
            at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) [netty-all-4.1.32.Final.jar:4.1.32.Final]
            at java.lang.Thread.run(Thread.java:834) [?:?]
    ```
    
    (cherry picked from commit 0a259ab)
    rdhabalia authored and tuteng committed Mar 21, 2020
    2074c7b
  3. Expose bookkeeper expose explicit lac in broker.conf (apache#5822)

    ### Motivation
    
    Expose the bookkeeper explicit lac configuration in broker.conf.
    It's related to apache#3828 and apache#4976: some Pulsar SQL users need to enable the explicitLacInterval so that they can get the last message in Pulsar SQL.
    
    (cherry picked from commit 4fd17d4)
    codelipenghui authored and tuteng committed Mar 21, 2020
    f98d4c1
  4. [build] Skip javadoc task for pulsar-client-kafka-compact modules (apache#5836)
    
    *Motivation*
    
    pulsar-client-kafka-compact depends on pulsar-client implementation hence it pulls in
    protobuf dependencies. This results in `class file for com.google.protobuf.GeneratedMessageV3 not found`
    errors when generating javadoc for those modules.
    
    *Modifications*
    
    Skip javadoc tasks for these modules. Because:
    
    - pulsar-client-kafka-compact is a kafka wrapper. Kafka already provides javadoc for this API.
    - we didn't publish the javadoc for this module.
    
    (cherry picked from commit 97f9431)
    sijie authored and tuteng committed Mar 21, 2020
    0f44118
  5. add_backlogSize_in_topicStat (apache#5914)

    (cherry picked from commit d1d5cf7)
    k2la authored and tuteng committed Mar 21, 2020
    33b2e24
  6. Allow to enable/disable delayed delivery for messages on namespace (apache#5915)

    * Allow to enable/disable delayed delivery for messages on namespace
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * add isDelayedDeliveryEnabled function
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * add delayed_delivery_time process logic
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * add test case
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * update admin cli docs
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * fix comments
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * fix comments
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * fix comments
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * update import lib
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * avoid import *
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * fix comments
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * fix comments
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * remove unused code
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * fix comments
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * add test case for delayed delivery messages
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * fix comments
    
    Signed-off-by: xiaolong.ran <[email protected]>
    
    * fix comments
    
    Signed-off-by: xiaolong.ran <[email protected]>
    (cherry picked from commit f0d339e)
    wolfstudy authored and tuteng committed Mar 21, 2020
    e3877d8
  7. Fix negative un-ack messages in consumer stats (apache#5929)

    Fixes apache#5755
    
    ### Motivation
    
    Fix the negative un-acked message count in consumer stats when maxUnackedMessagesPerConsumer=0 is set.
    
    ### Verifying this change
    
    Added unit test
    
    (cherry picked from commit 9d94860)
    codelipenghui authored and tuteng committed Mar 21, 2020
    63a66d0
  8. Upgrade Avro to 1.9.1 (apache#5938)

    ### Motivation
    
    Currently, Pulsar uses Avro 1.8.2, a version released two years ago. The latest version of Avro is 1.9.1, which uses FasterXML's Jackson 2.x instead of Codehaus's Jackson 1.x. Jackson is prone to security issues, so we should not keep using older versions.
    https://blog.godatadriven.com/apache-avro-1-9-release
    
    ### Modifications
    
    Avro 1.9 has some major changes:
    
    - The library used to handle logical datetime values has changed from Joda-Time to JSR-310 (apache/avro#631)
    - Namespaces no longer include "$" when generating schemas containing inner classes using ReflectData (apache/avro#283)
    - Validation of default values has been enabled (apache/avro#288). This results in a validation error when parsing the following schema:
    ```json
    {
      "name": "fieldName",
      "type": [
        "null",
        "string"
      ],
      "default": "defaultValue"
    }
    ```
    The default value of a nullable field must be null (cf. https://issues.apache.org/jira/browse/AVRO-1803), and the default value of the field as above is actually null. However, this PR disables the validation in order to maintain the traditional behavior.
    
    (cherry picked from commit d6f240e)
    Masahiro Sakamoto authored and tuteng committed Mar 21, 2020
    31fcb38
  9. Avoid using same OpAddEntry between different ledger handles (apache#5942)
    
    ### Motivation
    
    Avoid using same OpAddEntry between different ledger handles.
    
    ### Modifications
    
    Add a state to OpAddEntry. If an op is handled by a new ledger handle, the op is set to the CLOSED state. When the legacy callback fires, it checks the op state, and only an op in the INITIATED state is processed.
    
    When a ledger rollover happens, pendingAddEntries is processed. While processing pendingAddEntries, a new OpAddEntry is created from each old OpAddEntry so that different ledger handles never use the same OpAddEntry.
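
    For illustration only, the state check described above might look roughly like the sketch below. It is not the code from this change: the class `AddOp`, the `COMPLETED` state, and the `copyForNewLedger` helper are hypothetical names, and only `INITIATED` and `CLOSED` come from the description.

    ```java
    import java.util.concurrent.atomic.AtomicReference;

    /** Hypothetical stand-in for an add-entry operation; not Pulsar's OpAddEntry. */
    final class AddOp {
        // INITIATED and CLOSED come from the description above; COMPLETED is assumed.
        enum State { INITIATED, COMPLETED, CLOSED }

        private final AtomicReference<State> state = new AtomicReference<>(State.INITIATED);
        private final byte[] payload;

        AddOp(byte[] payload) {
            this.payload = payload;
        }

        /** Ledger callback path: only an op that is still INITIATED is processed. */
        boolean tryComplete() {
            return state.compareAndSet(State.INITIATED, State.COMPLETED);
        }

        /**
         * Rollover path: close this op so that a late callback from the old ledger
         * handle is ignored, and return a fresh op for the new ledger handle.
         */
        AddOp copyForNewLedger() {
            if (!state.compareAndSet(State.INITIATED, State.CLOSED)) {
                throw new IllegalStateException("op is already " + state.get());
            }
            return new AddOp(payload);
        }

        State state() {
            return state.get();
        }
    }
    ```

    With this shape, a callback that arrives from the old ledger handle after rollover calls `tryComplete()` on a CLOSED op and is simply ignored, while the copy proceeds on the new handle.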
    
    (cherry picked from commit 7ec17b2)
    codelipenghui authored and tuteng committed Mar 21, 2020
    0729333
  10. Prevent creation of regular topic with the same name as existing partitioned topic (apache#5943)
    
    ### Motivation
    
    
    Currently, it is not possible to create a partitioned topic with the same name as an existing non-partitioned topic, but the reverse is possible.
    
    ```
    $ ./bin/pulsar-admin topics create persistent://public/default/t1
    $ ./bin/pulsar-admin topics create-partitioned-topic -p 2 persistent://public/default/t1
    
    16:12:50.418 [AsyncHttpClient-5-1] WARN  org.apache.pulsar.client.admin.internal.BaseResource - [http://localhost:8080/admin/v2/persistent/public/default/t1/partitions] Failed to perform http put request: javax.ws.rs.ClientErrorException: HTTP 409 Conflict
    This topic already exists
    
    Reason: This topic already exists
    
    $ ./bin/pulsar-admin topics create-partitioned-topic -p 2 persistent://public/default/t2
    $ ./bin/pulsar-admin topics create persistent://public/default/t2
    $ ./bin/pulsar-admin topics list public/default
    
    "persistent://public/default/t2"
    "persistent://public/default/t1"
    
    $ ./bin/pulsar-admin topics list-partitioned-topics public/default
    
    "persistent://public/default/t2"
    ```
    
    Such non-partitioned topics are unusable and should not be allowed to be created.
    
    ### Modifications
    
    When creating a non-partitioned topic, a "409 Conflict" error is returned if a partitioned topic with the same name already exists.
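
    A minimal sketch of the symmetric name check, assuming a hypothetical in-memory `TopicRegistry` (the real broker checks its metadata store and answers over HTTP; the class, method, and enum names here are made up for illustration):

    ```java
    import java.util.HashSet;
    import java.util.Set;

    /** Hypothetical in-memory registry; a real broker would consult its metadata store. */
    final class TopicRegistry {
        enum Result { CREATED, CONFLICT }

        private final Set<String> partitioned = new HashSet<>();
        private final Set<String> nonPartitioned = new HashSet<>();

        /** Rejects the request if any topic with this name already exists. */
        synchronized Result createPartitioned(String topic) {
            if (partitioned.contains(topic) || nonPartitioned.contains(topic)) {
                return Result.CONFLICT;
            }
            partitioned.add(topic);
            return Result.CREATED;
        }

        /** The same check in the other direction, which is the case this change adds. */
        synchronized Result createNonPartitioned(String topic) {
            if (partitioned.contains(topic) || nonPartitioned.contains(topic)) {
                return Result.CONFLICT;
            }
            nonPartitioned.add(topic);
            return Result.CREATED;
        }
    }
    ```

    In this sketch `CONFLICT` plays the role of the "409 Conflict" response.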
    
    (cherry picked from commit 7fd3f70)
    Masahiro Sakamoto authored and tuteng committed Mar 21, 2020
    d662a4a
  11. [pulsar-broker] Clean up closed producer to avoid publish-time for producer (apache#5988)

    * [pulsar-broker] Clean up closed producer to avoid publish-time for producer
    
    * fix test cases
    
    (cherry picked from commit 0bc54c5)
    rdhabalia authored and tuteng committed Mar 21, 2020
    6be3149
  12. Fix unit test (apache#6006)

    ### Motivation
    
    Merging apache#5599 introduced code that conflicts with the master branch, probably because apache#5599 was not rebased onto master.
    
    ### Verifying this change
    
    This is a test change
    
    (cherry picked from commit 275854e)
    codelipenghui authored and tuteng committed Mar 21, 2020
    f3ca73f
  13. Expose lastConsumedTimestamp and lastAckedTimestamp to consumer stats (apache#6051)
    
    ---
    
    Master Issue: apache#6046
    
    *Motivation*
    
    Let users check the timestamps to tell whether acknowledgement and consumption
    are happening.
    
    *Modifications*
    
    - Add lastConsumedTimestamp and lastAckedTimestamp to consumer stats
    
    *Verify this change*
    
    - Pass the test `testConsumerStatsLastTimestamp`
    
    (cherry picked from commit 5728977)
    zymap authored and tuteng committed Mar 21, 2020
    08ef06d
  14. Fix issue 5505 (apache#6060)

    (cherry picked from commit 56280ea)
    ntysdd authored and tuteng committed Mar 21, 2020
    ffa0b04
  15. make acker transient (apache#6064)

    (cherry picked from commit c90854a)
    yjshen authored and tuteng committed Mar 21, 2020
    4fbdc95
  16. PIP-55: Refresh Authentication Credentials (apache#6074)

    * PIP-55: Refresh Authentication Credentials
    
    * Fixed import order
    
    * Do not check for original client credential if it's not coming through proxy
    
    * Fixed import order
    
    * Fixed mocked test assumption
    
    * Addressed comments
    
    * Avoid to print NPE on auth refresh check if auth is disabled
    
    (cherry picked from commit 4af5223)
    merlimat authored and tuteng committed Mar 21, 2020
    5680a81
  17. Fix zero queue consumer message redelivery (apache#6076)

    Motivation
    Message redelivery does not work well with a zero queue consumer when using receive() or listeners to consume messages. This pull request tries to fix it.
    
    Modifications
    Add the missing trackMessage() method call in the zero queue size consumer.
    
    Verifying this change
    New unit tests added.
    
    (cherry picked from commit 787bee1)
    codelipenghui authored and tuteng committed Mar 21, 2020
    46a1b28
  18. Support delete inactive topic when subscriptions caught up (apache#6077)

    ### Motivation
    
    Currently, Pulsar supports deleting inactive topics that have no active producers and no subscriptions. This pull request adds support for deleting inactive topics whose subscriptions are all caught up and that have no active producers or consumers.
    
    ### Modifications
    
    Expose the inactive topic delete mode in broker.conf. In the future, we can support namespace-level configuration for the inactive topic delete mode.
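
    As a rough sketch of how the two delete modes differ, the eligibility check could be modelled as below. The class name, the enum constant names, and the per-subscription backlog array are assumptions made for this example and are not taken from the change itself.

    ```java
    /** Hypothetical eligibility check for deleting an inactive topic. */
    final class InactiveTopicCheck {
        // Constant names are illustrative; the actual configuration values may differ.
        enum Mode { DELETE_WHEN_NO_SUBSCRIPTIONS, DELETE_WHEN_SUBSCRIPTIONS_CAUGHT_UP }

        /**
         * @param backlogPerSubscription one entry per subscription, holding the
         *                               number of messages that are not yet acknowledged
         */
        static boolean canDelete(Mode mode, int producers, int consumers,
                                 long[] backlogPerSubscription) {
            if (producers > 0 || consumers > 0) {
                return false; // the topic is still in use
            }
            switch (mode) {
                case DELETE_WHEN_NO_SUBSCRIPTIONS:
                    return backlogPerSubscription.length == 0;
                case DELETE_WHEN_SUBSCRIPTIONS_CAUGHT_UP:
                    for (long backlog : backlogPerSubscription) {
                        if (backlog > 0) {
                            return false; // at least one subscription is behind
                        }
                    }
                    return true;
                default:
                    throw new AssertionError(mode);
            }
        }
    }
    ```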
    
    (cherry picked from commit dc7abd8)
    codelipenghui authored and tuteng committed Mar 21, 2020
    196aa81
  19. Add a message on how to make log refresh immediately when starting a component (apache#6078)
    
    ### Motivation
    
    Some users may be confused when the pulsar/bookie log is not flushed immediately.
    
    ### Modifications
    
    Add a message in `bin/pulsar-daemon` when starting a component.
    
    (cherry picked from commit 4f461c3)
    murong00 authored and tuteng committed Mar 21, 2020
    d7aa4d7
  20. Fix message redelivery for zero queue consumer while using async api to receive messages (apache#6090)
    
    Fix message redelivery for zero queue consumer while using async api to receive messages
    
    (cherry picked from commit d5fca06)
    codelipenghui authored and tuteng committed Mar 21, 2020
    693b59d
  21. [Functions] The argument and description for dead letter topic is wrong (apache#6101)
    
    *Motivation*
    
    Related to apache#6084
    
     apache#5400 introduces `customRuntimeOptions` in function details. But the description was wrong. The mistake was probably introduced by bad merges.
    
    *Modification*
    
    Fix the argument and description for `deadletterTopic` and `customRuntimeOptions`.
    
    (cherry picked from commit c6e258d)
    sijie authored and tuteng committed Mar 21, 2020
    e32173e
  22. [Websocket] Websocket doesn't set the correct cluster data (apache#6102)
    
    *Motivation*
    
    Fixes apache#5997
    Fixes apache#6079
    
    A regression was introduced in apache#5486. If the websocket service is running as part of
    pulsar standalone, the cluster data is set with null service urls. As a result, the
    service url is not set correctly in the pulsar client and an illegal argument exception
    ("Param serviceUrl must not be blank.") is thrown.
    
    *Modifications*
    
    1. Pass `null` when constructing the websocket service, so that the local cluster data can
       be refreshed when creating the pulsar client.
    2. Set the cluster data after both broker service and web service started and ports are allocated.
    
    (cherry picked from commit 49a9897)
    sijie authored and tuteng committed Mar 21, 2020
    a2c3858
  23. Fix zeroQueueConsumer using listener (apache#6106)

    ### Motivation
    
    Available permits of ZeroQueueConsumer must be 1 or less; however, for a ZeroQueueConsumer using a listener they may become greater than 1.
    
    
    ### Modifications
    
    If the listener is processing a message, ZeroQueueConsumer doesn't send a permit when it reconnects to the broker.
    
    
    ### Reproduction
    1. A ZeroQueueConsumer using a listener consumes a topic.
    
    2. Unload that topic (or restart a broker) while the listener is processing a message.
    
    3. ZeroQueueConsumer sends a permit when it reconnects to the broker.
    https://github.com/apache/pulsar/blob/v2.5.0/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ZeroQueueConsumerImpl.java#L133
    
    4. ZeroQueueConsumer also sends a permit when it finishes processing the message.
    https://github.com/apache/pulsar/blob/v2.5.0/pulsar-client/src/main/java/org/apache/pulsar/client/impl/ZeroQueueConsumerImpl.java#L163
    
    5. Available permits become 2.
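
    The reproduction steps and the fix can be modelled with a small permit counter. This is only a sketch under assumptions: `ZeroQueuePermits` and its method names are hypothetical, and it assumes that a fresh connection starts with zero permits on the broker side.

    ```java
    /** Hypothetical model of the permits a zero queue consumer has granted to the broker. */
    final class ZeroQueuePermits {
        private int permitsAtBroker;
        private boolean listenerBusy;

        /** Connection opened or re-opened (assumed to start with zero permits). */
        synchronized void onConnected() {
            permitsAtBroker = 0;
            if (!listenerBusy) {
                permitsAtBroker++; // ask for exactly one message
            }
            // When the listener is busy, the permit is sent later by onListenerDone().
        }

        /** The broker used one permit to deliver a message to the listener. */
        synchronized void onMessage() {
            permitsAtBroker--;
            listenerBusy = true;
        }

        /** The listener finished, so the consumer asks for the next message. */
        synchronized void onListenerDone() {
            listenerBusy = false;
            permitsAtBroker++;
        }

        synchronized int permitsAtBroker() {
            return permitsAtBroker;
        }
    }
    ```

    Replaying the steps (connect, receive a message, reconnect while the listener is busy, listener finishes) leaves one permit in this model. Without the `listenerBusy` check in `onConnected()` the same sequence would end with two.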
    
    (cherry picked from commit c09314c)
    hrsakai authored and tuteng committed Mar 21, 2020
    13fd6b3
  24. [Broker]Reset cursor with a non-exists position (apache#6120)

    `ManagedCursorImpl.asyncResetCursor` is used in three kinds of circumstances:
    - REST API: create a subscription with a messageId. Per the documentation: Reset subscription to message position closest to given position.
    - REST API: reset a subscription to a given position. Per the documentation: Reset subscription to message position closest to given position.
    - Consumer seek command.
    
    In all the cases above, when the user provides a MessageId, we should make a best effort to find the closest position instead of throwing an InvalidCursorPosition exception.
    
    This is because a user who provides an invalid position has no way of finding a valid one: ledger ids for a given topic may not be contiguous, and only brokers are aware of their order. Therefore, we should avoid throwing an invalid cursor position error and instead find the nearest position and perform the reset.
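
    To make "nearest position" concrete, here is one possible resolution rule, sketched over a sorted map of ledger id to entry count. The rule itself (clamp inside an existing ledger, otherwise jump to the start of the next ledger, otherwise fall back to the last entry) and the `NearestPosition` class are assumptions for this example and are not necessarily what `asyncResetCursor` does.

    ```java
    import java.util.Map;
    import java.util.NavigableMap;

    /** Hypothetical resolver; ledgers maps ledgerId to its (positive) entry count. */
    final class NearestPosition {
        /** Returns {ledgerId, entryId} of a valid position near the requested one. */
        static long[] resolve(NavigableMap<Long, Long> ledgers, long ledgerId, long entryId) {
            if (ledgers.isEmpty()) {
                throw new IllegalArgumentException("no ledgers");
            }
            Long entries = ledgers.get(ledgerId);
            if (entries != null) {
                // The ledger exists: clamp the entry id into [0, entries - 1].
                long clamped = Math.max(0, Math.min(entryId, entries - 1));
                return new long[] {ledgerId, clamped};
            }
            // Ledger ids may have gaps: move to the start of the next ledger.
            Map.Entry<Long, Long> next = ledgers.higherEntry(ledgerId);
            if (next != null) {
                return new long[] {next.getKey(), 0};
            }
            // Past the last ledger: fall back to its last entry.
            Map.Entry<Long, Long> last = ledgers.lastEntry();
            return new long[] {last.getKey(), last.getValue() - 1};
        }
    }
    ```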
    
    (cherry picked from commit d2f37a7)
    yjshen authored and tuteng committed Mar 21, 2020
    4c3112a
  25. [pulsar-admin] allow tenant admin to manage subscription permission (apache#6122)
    
    ### Motivation
    In apache#2981, we added support for granting subscriber permission to manage subscription-based APIs. However, the grant-subscription-permission API requires super-user access, which creates too much dependency on the system admin when many tenants want to grant subscription permission.
    So, allow each tenant admin to manage subscription permission in order to reduce the administrative effort for the super user.
    
    (cherry picked from commit 254e54b)
    rdhabalia authored and tuteng committed Mar 21, 2020
    2672574
  26. Add timeout to search for web service URLs to avoid web threads getting stuck (apache#6124)
    
    (cherry picked from commit d42cfa1)
    Masahiro Sakamoto authored and tuteng committed Mar 21, 2020
    ea134ff
  27. Fix broker client tls settings error (apache#6128)

    When the broker creates its internal client, it sets tlsTrustCertsFilePath to "getTlsCertificateFilePath()", but it should be "getBrokerClientTrustCertsFilePath()".
    
    (cherry picked from commit 1fcccd6)
    jiazhai authored and tuteng committed Mar 21, 2020
    7c13b40
  28. Output resource usage rate to log on broker (apache#6152)

    ### Motivation
    
    When a broker is under heavy load, the following log may be output and some topics may be unloaded.
    
    > Attempting to shed load on broker101.pulsar.xxx.yahoo.co.jp:4080, which has max resource usage above threshold 0.8708186149597168% > 0.85% -- Offloading at least 0.36224863451117845 MByte/s of traffic
    
    This log means that the usage rate of CPU, memory, direct memory, input bandwidth, or output bandwidth has exceeded the threshold, but we don't know which resource usage is high.
    
    ### Modifications
    
    Output these resource usages along with the above log.
    
    > Attempting to shed load on broker101.pulsar.xxx.yahoo.co.jp:4080, which has resource usage 87.08% above threshold 85.0% -- Offloading at least 0.36224863451117845 MByte/s of traffic (cpu: 87.08%, memory: 12.71%, directMemory: 17.19%, bandwidthIn: 11.28%, bandwidthOut: 0.00%)
    
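    As an illustration of the per-resource breakdown appended to the log, a formatter along these lines would produce the `name: NN.NN%` pairs. `ResourceUsageLog` and its methods are hypothetical names, and usage values are assumed to be fractions between 0.0 and 1.0.

    ```java
    import java.util.Locale;
    import java.util.Map;

    /** Hypothetical formatter for per-resource usage; not the broker's own code. */
    final class ResourceUsageLog {
        /** Formats usage fractions (0.0 to 1.0) as "name: 12.34%" pairs, in map order. */
        static String describe(Map<String, Double> usage) {
            StringBuilder sb = new StringBuilder();
            for (Map.Entry<String, Double> e : usage.entrySet()) {
                if (sb.length() > 0) {
                    sb.append(", ");
                }
                sb.append(String.format(Locale.ROOT, "%s: %.2f%%", e.getKey(), e.getValue() * 100));
            }
            return sb.toString();
        }

        /** Highest usage fraction, i.e. the value compared against the threshold. */
        static double max(Map<String, Double> usage) {
            double max = 0.0;
            for (double v : usage.values()) {
                max = Math.max(max, v);
            }
            return max;
        }
    }
    ```

    Passing a `LinkedHashMap` keeps the resources in insertion order (cpu, memory, directMemory, and so on), which matches the order shown in the log line above.
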
    (cherry picked from commit 9b296d8)
    Masahiro Sakamoto authored and tuteng committed Mar 21, 2020
    077bd71
  29. [Issue-5994]: Start proxy pods when at least one broker pod is running (apache#6158)
    
    ### Motivation
    Fixes apache#5994:
    If the proxy service comes up before the brokers are up and reachable there will be HTTP 403 when running `bin/pulsar-admin` commands from inside the proxy pod.
     
    The proxy will also not be able to connect to the brokers when data is pushed through binary port with the following error:
    ```bash
    Caused by: org.apache.pulsar.broker.service.BrokerServiceException$PersistenceException: org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty bookies available
    	... 14 more
    Caused by: org.apache.bookkeeper.mledger.ManagedLedgerException: Not enough non-faulty bookies available
    22:11:07.633 [pulsar-web-32-6] INFO  org.eclipse.jetty.server.RequestLog - 172.17.0.6 - - [24/Jan/2020:22:11:07 +0000] "PUT /admin/v2/persistent/public/functions/assignments HTTP/1.1" 500 2528 "-" "Pulsar-Java-v2.5.0" 280
    ```
    
    #### Workaround:
    Restart the proxy pods once brokers pods are running
    
    #### Proposed solution:
    Hold off starting of the proxies until at least one broker is reachable in the cluster. 
    
    ### Modifications
    
    The changes are in the `proxy-deployment.yaml` helm template file, which now defines an init container that runs before the proxy is started. The init container waits until a broker is reachable by running nslookup on the broker service, sleeping 30 seconds between retries and retrying up to as many times as there are brokers.
    
    An alternative solution that doesn't always work was `until nslookup broker-service; sleep 2; done;`, but 403 would still sometimes occur (it could have been a fluke, but I saw it happen once).
    
    ### Verifying this change
    1) Follow the instructions on deploying with helm and run:
    `helm install pulsar --values pulsar/values-mini.yaml ./pulsar/`. 
    2) Wait until all the services are up and running.  
    3) Connect to proxy pod and run `bin/pulsar-admin broker-stats monitoring-metrics` - no 403 or permission errors should arise
    4) Set up tenant, namespace
    5) Push data into a topic - No errors in the proxy logs and client is able to push data into cluster through proxies
    
    (cherry picked from commit b838c59)
    roman-popenov authored and tuteng committed Mar 21, 2020
    a9df727
  30. add missing check to dashboard-ingress (helm chart) (apache#6160)

    ### Motivation
    
    If you deploy Pulsar using the helm chart and disable monitoring with
    
    ```
    extras:
      dashboard: no
    
    ```
    
    but you have the ingress of the dashboard set to true
    
    ```
    dashboard:
      ingress:
        enabled: true
    ```
    	
    
    the helm chart will create an ingress that points to a non-existing service because the dashboard itself was not deployed.
    
    
    ### Modifications
    
    I've added the same check that is already in place in dashboard-service and dashboard-deployment
    
    ### Verifying this change
    
    I don't know of any automated tests; I tested it manually. In the end it's the same "if" that is already in place in dashboard-service and dashboard-deployment.
    
    
    ### Does this pull request potentially affect one of the following parts:
    
    Affects deployment via helm chart. An unwanted ingress object is suppressed.
    
    ### Documentation
    
    No documentation needed.
    
    (cherry picked from commit efee516)
    tmemenga authored and tuteng committed Mar 21, 2020
    850ede6
  31. Restore clusterDispatchRate policy for compatibility (apache#6176)

    Co-authored-by: Sijie Guo <[email protected]>
    (cherry picked from commit 9b46930)
    Masahiro Sakamoto authored and tuteng committed Mar 21, 2020
    2d9e063
  32. Supports evenly distribute topics count when splits bundle (apache#6241)

    ### Motivation
    
    Currently, a bundle split divides the bundle into two parts of the same size. When a bundle contains only a few topics, this does not work well: topics are assigned to brokers according to the hash of the topic name, and hashing does not spread a small number of topics evenly across the two parts.
    
    So, this PR introduces an option (-balance-topic-count) for bundle split. When it is set to true, the given bundle is split into 2 parts, each with the same number of topics.
    
    It also introduces a new Load Manager implementation named `org.apache.pulsar.broker.loadbalance.impl.BalanceTopicCountModularLoadManager`. The new implementation splits bundles by balancing the topic count; otherwise it does not differ from ModularLoadManagerImpl.
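
    The difference between the two split strategies can be sketched as follows. This is a simplified model, not the implementation in this PR: `BundleSplitSketch` and its methods are hypothetical, a bundle is treated as the hash range `[lo, hi)`, and topic hashes are assumed to be distinct.

    ```java
    import java.util.Arrays;

    /** Hypothetical comparison of range-based and topic-count-based bundle splits. */
    final class BundleSplitSketch {
        /** Default behaviour: cut the hash range [lo, hi) at its midpoint. */
        static long splitByRange(long lo, long hi) {
            return lo + (hi - lo) / 2;
        }

        /**
         * Topic-count behaviour: pick the boundary so that half of the topics hash
         * below it. Assumes at least two topics with distinct hashes.
         */
        static long splitByTopicCount(long[] topicHashes) {
            long[] sorted = topicHashes.clone();
            Arrays.sort(sorted);
            return sorted[sorted.length / 2];
        }

        /** Number of topics that land in the lower part [lo, boundary). */
        static int countBelow(long[] topicHashes, long boundary) {
            int n = 0;
            for (long h : topicHashes) {
                if (h < boundary) {
                    n++;
                }
            }
            return n;
        }
    }
    ```

    For example, with topic hashes 10, 12, 15 and 900 in the range `[0, 1000)`, the midpoint split puts three topics in the lower part and one in the upper part, while the topic-count split puts two on each side.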
    
    (cherry picked from commit 1c099da)
    codelipenghui authored and tuteng committed Mar 21, 2020
    1736aa1
  33. Introduce maxMessagePublishBufferSizeInMB configuration to avoid broker OOM (apache#6178)
    
    Motivation
    Introduce maxMessagePublishBufferSizeInMB configuration to avoid broker OOM.
    
    Modifications
    If the total size of the messages being processed exceeds this value, the broker stops reading data from the connection. When the available size is greater than half of maxMessagePublishBufferSizeInMB, the broker resumes auto-reading data from the connection.
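
    The pause and resume rule above can be sketched as a small limiter. `PublishBufferLimiter` and its method names are hypothetical, sizes are in bytes rather than MB, and the real broker toggles auto-read on its network connections instead of a boolean flag.

    ```java
    /** Hypothetical model of the publish buffer limit; not the broker's own code. */
    final class PublishBufferLimiter {
        private final long maxBytes;
        private long pendingBytes;
        private boolean autoRead = true;

        PublishBufferLimiter(long maxBytes) {
            this.maxBytes = maxBytes;
        }

        /** A message arrived and is now being processed. */
        synchronized void onMessageReceived(long size) {
            pendingBytes += size;
            if (pendingBytes > maxBytes) {
                autoRead = false; // stop reading from the connection
            }
        }

        /** A message finished processing and its buffer was released. */
        synchronized void onMessageProcessed(long size) {
            pendingBytes -= size;
            long available = maxBytes - pendingBytes;
            if (!autoRead && available > maxBytes / 2) {
                autoRead = true; // resume reading from the connection
            }
        }

        synchronized boolean isAutoRead() {
            return autoRead;
        }
    }
    ```

    Resuming only once more than half of the buffer is free gives some hysteresis, so reading does not flip on and off around the limit.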
    
    (cherry picked from commit 91dfa1a)
    codelipenghui authored and tuteng committed Mar 21, 2020
    b535ac4
  34. Namespace level offloader (apache#6183)

    ### Motivation
    
    Currently, offloading can only be configured at the cluster level; the offload configuration cannot be set at the namespace level, which is inflexible.
    
    ### Modifications
    
    Add the namespace offload policies.
    
    (cherry picked from commit fd03be5)
    gaoran10 authored and tuteng committed Mar 21, 2020
    88dda60
  35. [Issue 5904]Support unload all partitions of a partitioned topic (apache#6187)
    
    Fixes apache#5904 
    
    ### Motivation
    Pulsar supports unloading a non-partitioned topic or a single partition of a partitioned topic. If a partitioned topic has many partitions, users need to list all partitions and unload them one by one. We need to support unloading all partitions of a partitioned topic.
    
    (cherry picked from commit d35e6c1)
    ltamber authored and tuteng committed Mar 21, 2020
    e33238f
  36. Create managed ledger path on local zookeeper when create partitions (apache#6189)
    
    ### Motivation
    
    Create managed ledger path on local zookeeper when creating partitions for a partitioned topic.
    
    ### Modifications
    
    Change globalZk() to localZk() when creating partitions.
    
    ### Verifying this change
    
    PartitionCreationTest covers this change. The test passed before this change only because the unit tests in ProducerConsumerBase use the same ZooKeeper.
    
    (cherry picked from commit 43d89f2)
    codelipenghui authored and tuteng committed Mar 21, 2020
    c6d59d9
  37. Corrected the method name for source implementation (apache#6190)

    Motivation
    Corrected the method name for source implementation in io-develop.md
    
    (cherry picked from commit 46bc412)
    abhilashmandaliya authored and tuteng committed Mar 21, 2020
    4c19484
  38. [deployement] make kubernetes yamls for aws operational (apache#6192)

    ### Motivation
    
    The supplied Kubernetes YAML files for AWS are outdated and do not work.
    
    ### Modifications
    
    Update the YAML files so that applying them on AWS EKS actually sets up a working Pulsar environment.
    
    ### Verifying this change
    
    This change is a trivial rework / code cleanup without any test coverage.
    
    (cherry picked from commit d631156)
    trexinc authored and tuteng committed Mar 21, 2020
    4c86357
  39. Fix get schema version in HttpLookupService. (apache#6193)

    ### Motivation
    
    Fix getting the schema version in HttpLookupService. The com.yahoo.sketches.Util.bytesToLong method needs the byte[] to be flipped; otherwise it returns a wrong long value. So use ByteBuffer to convert the byte[] version to a long.
    
    This issue happens when users use the HTTP protocol client with schemas that have multiple versions.
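    
    To show why byte order matters here, the following sketch reads the same eight bytes in both orders. It assumes the version is encoded as eight big-endian bytes; the class `SchemaVersionBytesSketch` is hypothetical and only uses standard `java.nio` calls (`ByteBuffer` defaults to big-endian order).
    
    ```java
    import java.nio.ByteBuffer;
    import java.nio.ByteOrder;
    
    // Hypothetical illustration, not Pulsar code: the same 8 bytes decode to
    // different longs depending on the byte order used to read them.
    class SchemaVersionBytesSketch {
    
        // Reads 8 bytes as a big-endian long (ByteBuffer's default order).
        static long bigEndian(byte[] version) {
            return ByteBuffer.wrap(version).getLong();
        }
    
        // Reads the same 8 bytes as a little-endian long, for comparison.
        static long littleEndian(byte[] version) {
            return ByteBuffer.wrap(version).order(ByteOrder.LITTLE_ENDIAN).getLong();
        }
    
        public static void main(String[] args) {
            byte[] version = {0, 0, 0, 0, 0, 0, 0, 1};
            System.out.println("big-endian:    " + bigEndian(version));
            System.out.println("little-endian: " + littleEndian(version));
        }
    }
    ```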
    
    ### Verifying this change
    
    New tests added for HttpLookupService and BinaryLookupService.
    
    
    (cherry picked from commit 44dd412)
    codelipenghui authored and tuteng committed Mar 21, 2020
    4324c5b
  40. Pin the netty-transport-native-epoll to avoid conflicts (apache#6194)

    ### Motivation
    
    Currently, the version pinning for `netty-transport-native-epoll` does not include the native library artifact.
    
    As a result, depending on the Maven version, an earlier artifact, `transport-native-epoll-4.1.33.Final-linux-x86_64.jar`, may be picked up: version 4.1.33 instead of the expected 4.1.43.
    
    This results in the Java NIO based transport being used instead of the more efficient epoll based one.
    
    This affects 2.5.0 as well.
    
    (cherry picked from commit 857d63b)
    merlimat authored and tuteng committed Mar 21, 2020
    4f3f5f8
  41. [ISSUE-6131]: Ensure JVM memory and GC options are set for bookie (apache#6201)
    
    ### Motivation
    Fixes apache#6131 (caused by apache#5675):
    
    When upgrading an existing 2.4.1 bookie cluster to 2.5.0 on Kubernetes, the bookie fails to start with the following exception during initialization: io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 2147483648, max: 2147483648). This happens because the bookie environment variables `BOOKIE_MEM` and `BOOKIE_GC` defined in conf/bkenv.sh have no effect, so the bookie always uses the default values.
    
    #### Proposed solution:
    Set `BOOKIE_MEM` and `BOOKIE_GC` in the Helm deployment charts. If the `BOOKIE` settings are not set, fall back to `PULSAR_MEM`; if none of those environment variables are set, use the default settings.
    
    #### Changes made
    Helm chart deployment `values.yaml` and `values-mini.yaml` along with the `bkenv.sh` configuration script.
    
    ### Documentation
    Currently, the documentation explaining the deployment process and how to change settings is lacking and needs to be updated.
    
    (cherry picked from commit 28875d5)
    roman-popenov authored and tuteng committed Mar 21, 2020
    c40ccf8
  42. [functions] Default functionAuthProvider when running in k8s (apache#6203)
    
    In 2.4.x, when running with the KubernetesRuntime, it defaulted to always
    using the KubernetesSecretAuthProvider class. With the change in 2.5 to
    make this behavior pluggable, there is currently a bug: it
    doesn't keep this behavior and requires a new configuration option to be
    passed.
    
    This commit changes the config so that it defaults to the correct class
    when we are running with a Kubernetes runtime. This restores the
    behavior of earlier versions.
    
    This also moves the WorkerConfig test to the same package where
    WorkerConfig resides after the refactor, and rearranges the resource
    files so that they are copied via a Maven task.
    
    Co-authored-by: Addison Higham <[email protected]>
    (cherry picked from commit 3a3174b)
    addisonj authored and tuteng committed Mar 21, 2020
    a89bd2d
  43. Fix bug that backlog message that has not yet expired could be deleted due to TTL (apache#6211)
    
    Fixes apache#5579 
    
    ### Motivation
    
    In Pulsar 2.4.1 and later versions, if message TTL is enabled, `PersistentMessageExpiryMonitor` always deletes one non-expired message every 5 minutes.
    
    The cause of this bug is apache#4744. `PersistentMessageExpiryMonitor` expects `ManagedCursor#asyncFindNewestMatching()` to pass null as the found position to its callback if no expired messages exist.
    https://github.com/apache/pulsar/blob/c5ba52983fee994de61984aae7d1757e9b738caf/pulsar-broker/src/main/java/org/apache/pulsar/broker/service/persistent/PersistentMessageExpiryMonitor.java#L119-L130
    
    However, due to the change in apache#4744, if no entry is found that matches the search condition, the callback will be passed `startPosition` instead of null now. For this reason, the earliest backlog message is always deleted by `PersistentMessageExpiryMonitor`.
    
    This means that unexpected message loss can occur.
    
    ### Modifications
    
    Revert the apache#4744 changes. The motivation for apache#4744 was to avoid an NPE in pulsar-sql, but that seems to have been fixed in apache#4757.
    https://github.com/apache/pulsar/blob/2069f761753940ed6a1faca8999af70036f20fd6/pulsar-sql/presto-pulsar/src/main/java/org/apache/pulsar/sql/presto/PulsarSplitManager.java#L363-L382
    
    (cherry picked from commit 54b39e6)
    Masahiro Sakamoto authored and tuteng committed Mar 21, 2020
    31e8e59
  44. [authentication] Validate tokens for binary connections (apache#6233)

    Currently, binary connections aren't checked to see if they provide a
    token.
    
    This results in an NPE in the JWT validation as well as a whole bunch of
    log spam. By explicitly checking for a null/empty token here, we can
    avoid some exceptions and clean up log spam.
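    
    A minimal sketch of such an up-front check is shown below. The class `TokenCheckSketch`, the method `requireToken`, and the use of `IllegalArgumentException` are hypothetical and chosen only for illustration; they are not the names or exception types used in Pulsar's authentication provider.
    
    ```java
    // Hypothetical sketch, not Pulsar's provider code: rejects a missing token
    // before it reaches JWT parsing, so no NullPointerException is thrown later.
    class TokenCheckSketch {
    
        // Returns the token unchanged, or fails fast with a clear message
        // when the connection supplied no token at all.
        static String requireToken(String token) {
            if (token == null || token.isEmpty()) {
                throw new IllegalArgumentException("No token credentials passed");
            }
            return token;
        }
    
        public static void main(String[] args) {
            System.out.println(requireToken("header.payload.signature"));
            try {
                requireToken(null);
            } catch (IllegalArgumentException e) {
                System.out.println("rejected: " + e.getMessage());
            }
        }
    }
    ```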
    
    Co-authored-by: Addison Higham <[email protected]>
    (cherry picked from commit 00ce81f)
    addisonj authored and tuteng committed Mar 21, 2020
    bb3f0bb
  45. Use fully qualified hostname as default to advertise brokers (apache#6235)
    
    (cherry picked from commit 4018d0b)
    merlimat authored and tuteng committed Mar 21, 2020
    bd439b3
  46. [Issue 6173][compaction] Fix log compaction for flow control/empty topic/last deletion (apache#6237)
    
    Fixes apache#6173
    
    ### Motivation
    
    Fixes problems for log compaction found in issue apache#6173 :
    
    1. Compaction fails for an empty topic. 
    2. Compaction never ends if the value of the last message is an empty batch message when the compaction is triggered. 
    3. Compaction fails for a topic with batch messages because RawReader flow control doesn't handle batch messages properly.
    
    ### Modifications
    
    1. Check whether any message is available before the compaction phases, and finish the compaction immediately if there are no messages to read, to avoid a timeout exception.
    2. Add the missing check for an empty batch message to the condition that ends the phase 2 loop.
    3. Increase the available permits in RawConsumer by the correct number for batch messages.
    
    ### Verifying this change
    
    The corresponding tests produce messages in both batch and non-batch mode.
    
    (cherry picked from commit d3f6c55)
    fantapsody authored and tuteng committed Mar 21, 2020
    4f1bedc
  47. Fix deploy of WindowFunctions (apache#6246)

    ### Motivation
    
    In Pulsar 2.5.0, deploying window functions fails because the function class doesn't pass validation.
    The behavior looks the same in current master.
    
    ### Modifications
    
    Add `WindowFunction.class` to the list of allowed function classes
    
    (cherry picked from commit 47b944b)
    seeday authored and tuteng committed Mar 21, 2020
    ba60a3f
  48. [C++] Fixed memory corruption on ExecutorService destructor (apache#6270)
    
    ### Motivation
    
    In the C++ test CI jobs, tests fail spuriously with segfaults.
    
    Analyzing the test execution with valgrind shows that the thread running the boost asio event loop accesses the `io_service` after it has already been destroyed.
    
    To ensure that the `io_service` stays valid until the thread exits, we pass a `shared_ptr` to the thread, which keeps the object alive.
    
    Example of valgrind errors: 
    
    ```
    ==10034== Invalid read of size 4
    ==10034==    at 0x4BCB784: __pthread_mutex_unlock_usercnt (pthread_mutex_unlock.c:40)
    ==10034==    by 0x4BCB784: pthread_mutex_unlock (pthread_mutex_unlock.c:357)
    ==10034==    by 0x197DB9: boost::asio::detail::posix_mutex::unlock() (posix_mutex.hpp:58)
    ==10034==    by 0x199492: boost::asio::detail::conditionally_enabled_mutex::scoped_lock::~scoped_lock() (conditionally_enabled_mutex.hpp:66)
    ==10034==    by 0x4F03895: boost::asio::detail::scheduler::run(boost::system::error_code&) (scheduler.ipp:151)
    ==10034==    by 0x4F03F8B: boost::asio::io_context::run() (io_context.ipp:62)
    ==10034==    by 0x4FDE872: pulsar::ExecutorService::startWorker(std::shared_ptr<boost::asio::io_context>) (ExecutorService.cc:39)
    ==10034==    by 0x4FE99A3: void std::__invoke_impl<void, void (pulsar::ExecutorService::*&)(std::shared_ptr<boost::asio::io_context>), pulsar::ExecutorService*&, decltype(nullptr)&>(std::__invoke_memfun_deref, void (pulsar::ExecutorService::*&)(std::shared_ptr<boost::asio::io_context>), pulsar::ExecutorService*&, decltype(nullptr)&) (invoke.h:73)
    ==10034==    by 0x4FE986D: std::__invoke_result<void (pulsar::ExecutorService::*&)(std::shared_ptr<boost::asio::io_context>), pulsar::ExecutorService*&, decltype(nullptr)&>::type std::__invoke<void (pulsar::ExecutorService::*&)(std::shared_ptr<boost::asio::io_context>), pulsar::ExecutorService*&, decltype(nullptr)&>(void (pulsar::ExecutorService::*&)(std::shared_ptr<boost::asio::io_context>), pulsar::ExecutorService*&, decltype(nullptr)&) (invoke.h:95)
    ==10034==    by 0x4FE9767: void std::_Bind<void (pulsar::ExecutorService::*(pulsar::ExecutorService*, decltype(nullptr)))(std::shared_ptr<boost::asio::io_context>)>::__call<void, , 0ul, 1ul>(std::tuple<>&&, std::_Index_tuple<0ul, 1ul>) (functional:400)
    ==10034==    by 0x4FE94A0: void std::_Bind<void (pulsar::ExecutorService::*(pulsar::ExecutorService*, decltype(nullptr)))(std::shared_ptr<boost::asio::io_context>)>::operator()<, void>() (functional:484)
    ==10034==    by 0x4FE9095: boost::asio::detail::posix_thread::func<std::_Bind<void (pulsar::ExecutorService::*(pulsar::ExecutorService*, decltype(nullptr)))(std::shared_ptr<boost::asio::io_context>)> >::run() (posix_thread.hpp:86)
    ==10034==    by 0x4F03E00: boost_asio_detail_posix_thread_function (posix_thread.ipp:74)
    ==10034==  Address 0x8896d08 is 72 bytes inside a block of size 240 free'd
    ==10034==    at 0x483BFBF: operator delete(void*) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==10034==    by 0x1A0001: boost::asio::detail::scheduler::~scheduler() (scheduler.hpp:38)
    ==10034==    by 0x198E5B: boost::asio::detail::service_registry::destroy(boost::asio::execution_context::service*) (service_registry.ipp:110)
    ==10034==    by 0x198D94: boost::asio::detail::service_registry::destroy_services() (service_registry.ipp:54)
    ==10034==    by 0x199294: boost::asio::execution_context::destroy() (execution_context.ipp:46)
    ==10034==    by 0x199222: boost::asio::execution_context::~execution_context() (execution_context.ipp:35)
    ==10034==    by 0x19B90F: boost::asio::io_context::~io_context() (io_context.ipp:55)
    ==10034==    by 0x1B3B7F: std::_Sp_counted_ptr<boost::asio::io_context*, (__gnu_cxx::_Lock_policy)2>::_M_dispose() (shared_ptr_base.h:377)
    ==10034==    by 0x1A283B: std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release() (shared_ptr_base.h:155)
    ==10034==    by 0x19EC34: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count() (shared_ptr_base.h:730)
    ==10034==    by 0x19D123: std::__shared_ptr<boost::asio::io_context, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr() (shared_ptr_base.h:1169)
    ==10034==    by 0x19D143: std::shared_ptr<boost::asio::io_context>::~shared_ptr() (shared_ptr.h:103)
    ==10034==  Block was alloc'd at
    ==10034==    at 0x483AE63: operator new(unsigned long) (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
    ==10034==    by 0x19B7DA: boost::asio::io_context::io_context() (io_context.ipp:38)
    ==10034==    by 0x4FDE622: pulsar::ExecutorService::ExecutorService() (ExecutorService.cc:31)
    ==10034==    by 0x4FE871C: void __gnu_cxx::new_allocator<pulsar::ExecutorService>::construct<pulsar::ExecutorService>(pulsar::ExecutorService*) (new_allocator.h:147)
    ==10034==    by 0x4FE8570: void std::allocator_traits<std::allocator<pulsar::ExecutorService> >::construct<pulsar::ExecutorService>(std::allocator<pulsar::ExecutorService>&, pulsar::ExecutorService*) (alloc_traits.h:484)
    ==10034==    by 0x4FE807F: std::_Sp_counted_ptr_inplace<pulsar::ExecutorService, std::allocator<pulsar::ExecutorService>, (__gnu_cxx::_Lock_policy)2>::_Sp_counted_ptr_inplace<>(std::allocator<pulsar::ExecutorService>) (shared_ptr_base.h:548)
    ==10034==    by 0x4FE77AD: std::__shared_count<(__gnu_cxx::_Lock_policy)2>::__shared_count<pulsar::ExecutorService, std::allocator<pulsar::ExecutorService>>(pulsar::ExecutorService*&, std::_Sp_alloc_shared_tag<std::allocator<pulsar::ExecutorService> >) (shared_ptr_base.h:679)
    ==10034==    by 0x4FE6E3F: std::__shared_ptr<pulsar::ExecutorService, (__gnu_cxx::_Lock_policy)2>::__shared_ptr<std::allocator<pulsar::ExecutorService>>(std::_Sp_alloc_shared_tag<std::allocator<pulsar::ExecutorService> >) (shared_ptr_base.h:1344)
    ==10034==    by 0x4FE62C8: std::shared_ptr<pulsar::ExecutorService>::shared_ptr<std::allocator<pulsar::ExecutorService>>(std::_Sp_alloc_shared_tag<std::allocator<pulsar::ExecutorService> >) (shared_ptr.h:359)
    ==10034==    by 0x4FE4DEF: std::shared_ptr<pulsar::ExecutorService> std::allocate_shared<pulsar::ExecutorService, std::allocator<pulsar::ExecutorService>>(std::allocator<pulsar::ExecutorService> const&) (shared_ptr.h:702)
    ==10034==    by 0x4FE3608: std::shared_ptr<pulsar::ExecutorService> std::make_shared<pulsar::ExecutorService>() (shared_ptr.h:718)
    ==10034==    by 0x4FDEFBD: pulsar::ExecutorServiceProvider::get() (ExecutorService.cc:90)
    ```
    
    (cherry picked from commit 8262ad9)
    merlimat authored and tuteng committed Mar 21, 2020
    ef40db5
  49. [C++] Fixed handling of canceled timer events on NegativeAcksTracker (apache#6272)
    
    When handling a "timer cancelled" event, we cannot lock the mutex since the object itself might already be destroyed.
    
    This can potentially cause memory corruption or a segfault.
    
    (cherry picked from commit 54a5195)
    merlimat authored and tuteng committed Mar 21, 2020
    8f56884
  50. Fix bug that tenants whose allowed clusters include global cannot be created/updated (apache#6275)
    
    (cherry picked from commit 4264b8d)
    Masahiro Sakamoto authored and tuteng committed Mar 21, 2020
    d25cb39
  51. [Issue 4070][pulsar-client-cpp] Fix for possible deadlock when closing Pulsar client (apache#6277)
    
    * Attempt at fixing deadlock during client.close()
    
    * Fixed formatting
    
    * Detach the worker thread in the destructor of ExecutorService if it is still unable to be joined
    
    * Possible formatting fixes
    
    (cherry picked from commit 2e1c74a)
    heronr authored and tuteng committed Mar 21, 2020
    d3063da
  52. Remove problematic semicolon from conf (apache#6303)

    (cherry picked from commit 50d3599)
    klevy-toast authored and tuteng committed Mar 21, 2020
    bf0f4a7
  53. Enable get precise backlog and backlog without delayed messages. (apache#6310)
    
    Fixes apache#6045 apache#6281 
    
    ### Motivation
    
    Enable getting the precise backlog and the backlog without delayed messages.
    
    ### Verifying this change
    
    Added new unit tests for the change.
    
    (cherry picked from commit df15210)
    codelipenghui authored and tuteng committed Mar 21, 2020
    ad01245
  54. Fixed casting in ZooKeeperCache.getDataIfPresent() (apache#6313)

    * Fixed casting in ZooKeeperCache.getDataIfPresent()
    
    * Missed null check
    
    (cherry picked from commit 7cade48)
    merlimat authored and tuteng committed Mar 21, 2020
    4ba38bb
  55. KeyValue schema support for pulsar sql (apache#6325)

    Fixes apache#5560
    
    ### Motivation
    
    Currently, Pulsar SQL can't read data with a KeyValue schema. This PR adds support for reading messages with a key-value schema in Pulsar SQL.
    
    ### Modifications
    
    Add KeyValue schema support for Pulsar SQL. Add the prefix `__key.` to the key field names.
    
    (cherry picked from commit 3cf6be1)
    gaoran10 authored and tuteng committed Mar 21, 2020
    b3af045
  56. Should flush the last potential duplicated since can't combine potential duplicated messages and non-duplicated messages into a batch. (apache#6326)
    
    Fixes apache#6273
    
    Motivation
    The main cause of apache#6273 is that potentially duplicated messages and non-duplicated messages are combined into one batch. So we need to flush the potentially duplicated messages first and then add the non-duplicated messages to a batch.
    
    (cherry picked from commit b898f49)
    codelipenghui authored and tuteng committed Mar 21, 2020
    c20071b
  57. Upgrade ZooKeeper to 3.5.7 (apache#6329)

    Upgrade ZK to latest stable version. In particular we need to include:
    
    - Split brain on log disk full https://issues.apache.org/jira/browse/ZOOKEEPER-3701
    - Data loss after upgrading standalone ZK server 3.4.14 to 3.5.6 with snapshot.trust.empty=true https://issues.apache.org/jira/browse/ZOOKEEPER-3644
    
    (cherry picked from commit 5a8f420)
    merlimat authored and tuteng committed Mar 21, 2020
    eb17206
  58. Windows CMake corrections (apache#6336)

    * Corrected method of specifying Windows path to LLVM tools
    
    * Fixing windows build
    
    * Corrected the dll install path
    
    * Fixing pulsarShared paths
    
    (cherry picked from commit 9b9e79e)
    heronr authored and tuteng committed Mar 21, 2020
    aaf87f6
  59. client: make SubscriptionMode a member of ConsumerConfigurationData (apache#6337)
    
    Currently, SubscriptionMode is a parameter used to create ConsumerImpl, but it is not exposed, so users cannot set this value for a consumer. This change makes SubscriptionMode a member of ConsumerConfigurationData, so users can set this parameter when creating a consumer.
    
    (cherry picked from commit 208af7c)
    jiazhai authored and tuteng committed Mar 21, 2020
    7112e21
  60. Avoid get partition metadata while the topic name is a partition name. (apache#6339)
    
    Motivation
    
    Avoid getting the partition metadata when the topic name is already a partition name.
    Currently, if users want to skip all messages for a partitioned topic or unload a partitioned topic, the broker gets the topic metadata many times. For a topic name that is a partition name, it is not necessary to get the partitioned topic metadata again.
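    
    One way to picture the check is the sketch below. It relies on Pulsar's `<topic>-partition-<index>` naming for partitions; the class `PartitionNameSketch` and its methods are hypothetical and are not the broker's actual helper.
    
    ```java
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;
    
    // Hypothetical sketch, not the broker's code: decides from the topic name
    // alone whether a partitioned-topic metadata lookup can be skipped.
    class PartitionNameSketch {
        private static final Pattern PARTITION_SUFFIX =
                Pattern.compile("-partition-(\\d{1,9})$");
    
        // Returns the partition index encoded in the name, or -1 if the name
        // does not end with a "-partition-<index>" suffix.
        static int partitionIndex(String topic) {
            Matcher m = PARTITION_SUFFIX.matcher(topic);
            return m.find() ? Integer.parseInt(m.group(1)) : -1;
        }
    
        // A name that already denotes a partition needs no metadata lookup.
        static boolean needsMetadataLookup(String topic) {
            return partitionIndex(topic) < 0;
        }
    
        public static void main(String[] args) {
            String partition = "persistent://public/default/orders-partition-3";
            String parent = "persistent://public/default/orders";
            System.out.println(partition + " needs lookup: " + needsMetadataLookup(partition));
            System.out.println(parent + " needs lookup: " + needsMetadataLookup(parent));
        }
    }
    ```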
    
    (cherry picked from commit 26d569b)
    codelipenghui authored and tuteng committed Mar 21, 2020
    e44b822
  61. explicit statement env 'BOOKIE_MEM' and 'BOOKIE_GC' for values-mini.yaml (apache#6340)
    
    Fixes apache#6338
    
    ### Motivation
    This commit started when, while using Helm in my local minikube, I noticed a mismatch between the `values-mini.yaml` and `values.yaml` files. At first I thought it was a copy/paste error, so I created apache#6338.
    
    Then I looked into how these environment variables [were used](https://github.com/apache/pulsar/blob/28875d5abc4cd13a3e9cc4f59524d2566d9f9f05/conf/bkenv.sh#L36) and found that it is OK to use `PULSAR_MEM` as an alternative. But it introduces problems:
    1. Since `BOOKIE_GC` was not defined, the default [BOOKIE_EXTRA_OPTS](https://github.com/apache/pulsar/blob/28875d5abc4cd13a3e9cc4f59524d2566d9f9f05/conf/bkenv.sh#L39) ends up using the default value of `BOOKIE_GC`, which would override the same JVM parameters defined earlier in `PULSAR_MEM`.
    2. It may cause problems if the bootstrap scripts change in later development, so it is better to make the settings explicit.
    
    So I created this PR to solve the problems above (hidden trouble).
    
    ### Modifications
    
    As mentioned above, I've made the following modifications:
    1. Make `BOOKIE_MEM` and `BOOKIE_GC` explicit in the `values-mini.yaml` file, keeping the same format as the `values.yaml` file.
    2. Remove all args related to printing GC logs, considering the resource constraints of the minikube environment. The removed content is `-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintHeapAtGC -verbosegc -XX:G1LogLevel=finest`
    3. Leave `PULSAR_PREFIX_dbStorage_rocksDB_blockCacheSize` empty as usual, since [conf/standalone.conf#L576](https://github.com/apache/pulsar/blob/df152109415f2b10dd83e8afe50d9db7ab7cbad5/conf/standalone.conf#L576) says it uses 10% of the direct memory size by default.
    
    (cherry picked from commit 7d4df99)
    liyuntao authored and tuteng committed Mar 21, 2020
    e939a51
  62. Fix java doc for key shared policy. (apache#6341)

    The key shared policy does not support setting the maximum key hash range, so fix the java doc.
    
    (cherry picked from commit 77971e4)
    codelipenghui authored and tuteng committed Mar 21, 2020
    08196be
  63. [Java Reader Client] Start reader inside batch result in read first message in batch. (apache#6345)
    
    Fixes apache#6344 
    Fixes apache#6350
    
    The bug was introduced in apache#5622 by an incorrect change to the skip logic.
    
    (cherry picked from commit 63ccd43)
    yjshen authored and tuteng committed Mar 21, 2020
    ac59e09
  64. Fix broker to specify a list of bookie groups. (apache#6349)

    ### Motivation
    
    Fixes apache#6343
    
    ### Modifications
    
    Add a method to cast object value to `String`.
    
    (cherry picked from commit e1f7505)
    murong00 authored and tuteng committed Mar 21, 2020
    37906e3
  65. Independent schema is set for each consumer generated by topic (apache#6356)
    
    ### Motivation
    
    Master Issue: apache#5454 
    
    When one consumer subscribes to multiple topics, the result of setSchemaInfoProvider() is overwritten by the consumer generated for the last topic.
    
    ### Modification
    Clone the schema for each consumer generated per topic.
    ### Verifying this change
    Add a schemaTest for it.
    
    (cherry picked from commit 8003d08)
    congbobo184 authored and tuteng committed Mar 21, 2020
    64690bf
  66. remove future.join() from PulsarSinkEffectivelyOnceProcessor (apache#6361)
    
    (cherry picked from commit 943c903)
    nlu90 authored and tuteng committed Mar 21, 2020
    b4d7dc5
  67. [ClientAPI]Fix hasMessageAvailable() (apache#6362)

    Fixes apache#6333 
    
    Previously, `hasMoreMessages` was tested against:
    ```
    return lastMessageIdInBroker.compareTo(lastDequeuedMessage) == 0
                    && incomingMessages.size() > 0;
    ```
    However, `incomingMessages` could be empty when the consumer/reader has just started and hasn't received any messages yet.
    
    In this PR, the last entry is retrieved and decoded to get the message metadata, which is used to populate the batchIndex field.
    
    (cherry picked from commit baf155f)
    yjshen authored and tuteng committed Mar 21, 2020
    e173f60
  68. Creating a topic does not wait for creating cursor of replicators (apache#6364)
    
    ### Motivation
    
    Creating a topic does not wait for the replicators' cursors to be created.
    
    ### Verifying this change
    
    The existing unit tests cover this change.
    
    (cherry picked from commit 336e971)
    codelipenghui authored and tuteng committed Mar 21, 2020
    e7cf371
  69. [Issue 6355][HELM] autorecovery - could not find or load main class (apache#6373)
    
    This applies the recommended fix from
    apache#6355 (comment)
    
    Fixes apache#6355
    
    ### Motivation
    
    This PR corrects the configmap data which was causing the autorecovery pod to crashloop
    with `could not find or load main class`
    
    ### Modifications
    
    Updated the configmap var data per [this comment](apache#6355 (comment)) from @sijie 
    
    (cherry picked from commit af4773b)
    jharris- authored and tuteng committed Mar 21, 2020
    7fe98b8
  70. [Pulsar-Client] Stop shade snappy-java in pulsar-client-shaded (apache#6375)
    
    Fixes apache#6260 
    
    Snappy, like other compression codecs (LZ4, ZSTD), depends on native libraries to do the actual encoding and decoding. When we shade it into a fat jar, only the Java classes of Snappy are shaded, which leaves the JNI bindings incompatible with the underlying C++ code.
    
    We should just remove the shading for Snappy and let Maven import its library as a dependency.
    
    I've tested the shaded jar generated locally by this PR; it works for all compression codecs.
    
    (cherry picked from commit 3197dcd)
    yjshen authored and tuteng committed Mar 21, 2020
    59f055b
  71. fix duplicate key to send propertys (apache#6390)

    **Motivation**
    When a message is sent with a duplicate key in its properties, the consumer cannot pull the message (apache#6388).
    ```java
    // org.apache.pulsar.client.impl.MessageImpl
    if (msgMetadata.getPropertiesCount() > 0) {
        this.properties = Collections.unmodifiableMap(msgMetadataBuilder.getPropertiesList().stream()
                .collect(Collectors.toMap(KeyValue::getKey, KeyValue::getValue)));
    } else {
        properties = Collections.emptyMap();
    }
    this.schema = schema;
    ```
    The two-argument `Collectors.toMap` does not allow duplicate keys; it throws an `IllegalStateException` when two entries share a key.
    
    **Changes**
    Pass a merge function so that, for a duplicate key, the new value replaces the old value.
    ```java
    if (msgMetadata.getPropertiesCount() > 0) {
        this.properties = Collections.unmodifiableMap(msgMetadataBuilder.getPropertiesList().stream()
                .collect(Collectors.toMap(KeyValue::getKey, KeyValue::getValue,
                        (oldValue, newValue) -> newValue)));
    } else {
        properties = Collections.emptyMap();
    }
    this.schema = schema;
    ```
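
    The difference between the two `Collectors.toMap` overloads can be tried in isolation. The following is a minimal sketch, assuming plain `Map.Entry` pairs as a stand-in for the `KeyValue` list; the class and method names (`DuplicatePropertyDemo`, `kv`, `strict`, `lastWins`) are hypothetical and not part of Pulsar.

    ```java
    import java.util.AbstractMap.SimpleEntry;
    import java.util.Arrays;
    import java.util.List;
    import java.util.Map;
    import java.util.stream.Collectors;

    // Hypothetical demo class; Map.Entry stands in for Pulsar's KeyValue.
    final class DuplicatePropertyDemo {

        static Map.Entry<String, String> kv(String key, String value) {
            return new SimpleEntry<>(key, value);
        }

        // Two-argument toMap: throws IllegalStateException on a duplicate key.
        static Map<String, String> strict(List<Map.Entry<String, String>> pairs) {
            return pairs.stream()
                    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
        }

        // Three-argument toMap: the merge function keeps the newer value.
        static Map<String, String> lastWins(List<Map.Entry<String, String>> pairs) {
            return pairs.stream()
                    .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue,
                            (oldValue, newValue) -> newValue));
        }

        public static void main(String[] args) {
            List<Map.Entry<String, String>> pairs = Arrays.asList(kv("k", "first"), kv("k", "second"));
            System.out.println(lastWins(pairs).get("k"));
            try {
                strict(pairs);
            } catch (IllegalStateException e) {
                System.out.println("duplicate key rejected by two-argument toMap");
            }
        }
    }
    ```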
    
    (cherry picked from commit 79abc88)
    liudezhi2098 authored and tuteng committed Mar 21, 2020
    8f1e0d0
  72. [Issue 6168] Fix Unacked Message Tracker by Using Time Partition on C++ (apache#6391)
    
    ### Motivation
    Fixes apache#6168.
    > In the C++ library, unacked messages are redelivered after about 2 * unAckedMessagesTimeout.
    
    ### Modifications
    As in apache#3118, fix `UnackedMessageTracker` by using TimePartition.
    - Add `TickDurationInMs`
    - Add `redeliverUnacknowledgedMessages`, which requires `MessageIds`, to `ConsumerImpl`, `MultiTopicsConsumerImpl` and `PartitionedConsumerImpl`.
    
    (cherry picked from commit 333888a)
    k2la authored and tuteng committed Mar 21, 2020
    8174617
  73. [Reader] Should set either start message id or start message from roll back duration. (apache#6392)
    
    Currently, when constructing a reader, users can set both start message id and start time. 
    
    Setting both is ambiguous, so this combination should be forbidden.
    
    (cherry picked from commit f862961)
    yjshen authored and tuteng committed Mar 21, 2020
    eaba596
  74. Seek to the first one >= timestamp (apache#6393)

    The current logic for `resetCursor` by timestamp is odd: the first message it returns is the last message published earlier than or at the designated timestamp. This "earlier" message should not be emitted; the seek should land on the first message whose timestamp is >= the designated timestamp.
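
    The intended semantics can be illustrated with a lower-bound binary search. This is only a sketch under the assumption that publish timestamps are available as an ascending array; the class `SeekByTimestamp` and its method are hypothetical, and the broker's actual search over ledger entries is not shown here.

    ```java
    // Hypothetical illustration of "seek to the first message >= timestamp".
    final class SeekByTimestamp {

        // Returns the index of the first publish time that is >= target,
        // or publishTimes.length when every message is older than target.
        static int firstAtOrAfter(long[] publishTimes, long target) {
            int lo = 0;
            int hi = publishTimes.length;
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (publishTimes[mid] < target) {
                    lo = mid + 1; // mid is too early, skip it
                } else {
                    hi = mid;     // mid qualifies, but an earlier index may too
                }
            }
            return lo;
        }

        public static void main(String[] args) {
            long[] publishTimes = {10L, 20L, 30L};
            // The old behaviour would have started at the message published at 20.
            System.out.println(firstAtOrAfter(publishTimes, 25L));
        }
    }
    ```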
    
    (cherry picked from commit 81f8afd)
    yjshen authored and tuteng committed Mar 21, 2020
    3313e6c
  75. [Minor] Fix java code errors reported by lgtm. (apache#6398)

    Four kinds of errors are fixed in this PR:
    
    - Array index out of bounds
    - Inconsistent equals and hashCode
    - Missing format argument
    - Reference equality test of boxed types
    
    According to https://lgtm.com/projects/g/apache/pulsar/alerts/?mode=tree&severity=error&id=&lang=java
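
    As an illustration of one of the four kinds, "Reference equality test of boxed types", here is a minimal sketch. It is not taken from the PR; the class `BoxedEquality` is hypothetical and only shows the value-based comparison that this kind of alert usually asks for.

    ```java
    import java.util.Objects;

    // Hypothetical example of comparing boxed values safely.
    final class BoxedEquality {

        // Compares the wrapped values and tolerates null on either side.
        static boolean sameValue(Long a, Long b) {
            return Objects.equals(a, b);
        }

        public static void main(String[] args) {
            Long first = 100_000L;
            Long second = 100_000L;
            // Writing `first == second` would compare object references,
            // which is the pattern the alert flags.
            System.out.println(sameValue(first, second));
        }
    }
    ```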
    
    (cherry picked from commit 7fb9aff)
    yjshen authored and tuteng committed Mar 21, 2020
    0cbda7f
  76. Close ZK before canceling future with exception (apache#6228) (apache#6399)
    
    Fixes apache#6228
    
    (cherry picked from commit e6a631d)
    pawellozinski authored and tuteng committed Mar 21, 2020
    8cebbbb
  77. Fixed enum package not found (apache#6401)

    Fixes apache#6400
    
    ### Motivation
    This problem blocks the current tests. Version 1.1.8 of `enum34` seems to be broken, and the problem can be reproduced as follows:
    
    Use pulsar latest code:
    ```
    cd pulsar
    mvn clean install -DskipTests
    docker pull apachepulsar/pulsar-build:ubuntu-16.04
    docker run -it -v $PWD:/pulsar --name pulsar apachepulsar/pulsar-build:ubuntu-16.04 /bin/bash
    docker exec -it pulsar /bin/bash
    cmake .
    make -j4 && make install 
    cd python
    python setup.py bdist_wheel
    pip install dist/pulsar_client-*-linux_x86_64.whl
    ```
    `pip show enum34`
    ```
    Name: enum34
    Version: 1.1.8
    Summary: Python 3.4 Enum backported to 3.3, 3.2, 3.1, 2.7, 2.6, 2.5, and 2.4
    Home-page: https://bitbucket.org/stoneleaf/enum34
    Author: Ethan Furman
    Author-email: [email protected]
    License: BSD License
    Location: /usr/local/lib/python2.7/dist-packages
    Requires:
    Required-by: pulsar-client, grpcio
    ```
    
    ```
    root@55e06c5c770f:/pulsar/pulsar-client-cpp/python# python
    Python 2.7.12 (default, Oct  8 2019, 14:14:10)
    [GCC 5.4.0 20160609] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from enum import Enum, EnumMeta
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ImportError: No module named enum
    >>> exit()
    ```
    
    There is no problem with using 1.1.9 in the test.
    
    ### Modifications
    
    * Upgrade enum34 from 1.1.8 to 1.1.9
    
    ### Verifying this change
    
    Local tests pass.
    
    (cherry picked from commit 2f42077)
    tuteng committed Mar 21, 2020
    d36eee2
  78. Consumer received duplicated deplayed messages upon restart

    Fix a case where, after a delayed message is sent, a consumer that restarts pulls duplicate messages (apache#6403).
    
    (cherry picked from commit e71b9fc)
    liudezhi2098 authored and tuteng committed Mar 21, 2020
    3ee06e3
  79. Add verification for SchemaDefinitionBuilderImpl.java (apache#6405)

    ### Motivation
    
    Add verification for SchemaDefinitionBuilderImpl.java
    
    ### Verifying this change
    
    Added a new unit test.
    
    (cherry picked from commit 848ad30)
    codelipenghui authored and tuteng committed Mar 21, 2020
    379bdec
  80. [Issue 3762][Schema] Fix the problem with parsing of an Avro schema related to shading in pulsar-client. (apache#6406)
    
    Motivation
    Avro schemas are quite important for proper data flow and it is a pity that the apache#3762 issue stayed untouched for so long. There were some workarounds on how to make Pulsar use an original avro schema, but in the end, it is pretty hard to run an enterprise solution on workarounds. With this PR I would like to find a solution to the problem caused by shading avro in pulsar-client. As it was discussed in the issue, there are two possible solutions for this problem:
    
    - Unshade the avro library in the pulsar-client library. (IMHO it seems like a proper solution for this problem, but it also brings a risk of unknown side-effects)
    - Use reflection to get original schemas from generated classes. (I went for this solution)

    Could you please comment on whether this is a proper solution for the problem? I will add tests once my approach is confirmed.
    
    Modifications
    First, we try to extract an original avro schema from the "$SCHEMA" field using reflection. If that doesn't work, the process falls back to generating the schema from the POJO.
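
    The reflect-then-fall-back flow described above might look roughly like the sketch below. It is not the PR's code: `StaticFieldLookup`, `GeneratedRecord`, `PlainPojo` and the `derived-from-pojo:` fallback string are hypothetical, and the schema is modelled as a plain `String` to keep the example free of Avro dependencies. The field name is passed as a parameter because it should be checked against the actual generated classes.

    ```java
    import java.lang.reflect.Field;
    import java.util.Optional;

    // Hypothetical sketch: read a static schema field reflectively, else fall back.
    final class StaticFieldLookup {

        // Stand-in for a generated class that carries its schema in a static field.
        static final class GeneratedRecord {
            public static final String $SCHEMA = "{\"type\":\"record\",\"name\":\"GeneratedRecord\"}";
        }

        // Stand-in for a hand-written POJO without such a field.
        static final class PlainPojo {
        }

        static Optional<Object> readStaticField(Class<?> type, String fieldName) {
            try {
                Field field = type.getDeclaredField(fieldName);
                field.setAccessible(true);
                return Optional.ofNullable(field.get(null)); // null target: static field
            } catch (ReflectiveOperationException | RuntimeException e) {
                return Optional.empty(); // missing, inaccessible, or not static
            }
        }

        static String schemaFor(Class<?> type, String fieldName) {
            return readStaticField(type, fieldName)
                    .map(Object::toString)
                    // Placeholder for "generate the schema from the POJO".
                    .orElseGet(() -> "derived-from-pojo:" + type.getSimpleName());
        }

        public static void main(String[] args) {
            System.out.println(schemaFor(GeneratedRecord.class, "$SCHEMA"));
            System.out.println(schemaFor(PlainPojo.class, "$SCHEMA"));
        }
    }
    ```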
    
    (cherry picked from commit dab14ac)
    vzhikserg authored and tuteng committed Mar 21, 2020
    0e9c56a
  81. [Java client] MultiTopics discovery is broken due to discovery task scheduled twice instead of pendingBatchRecei… (apache#6407)
    
    * fix topic discovery task scheduled twice instead of pendingBatchReceiveTask
    
    * remove wildcard imports
    
    Co-authored-by: avim <[email protected]>
    (cherry picked from commit 40995a0)
    avimas authored and tuteng committed Mar 21, 2020
    31e3d19
  82. Update BatchReceivePolicy.java (apache#6423)

    BatchReceivePolicy implements Serializable.
    
    (cherry picked from commit 792ab17)
    codelipenghui authored and tuteng committed Mar 21, 2020
    52d5d7c
  83. Bump netty version to 4.1.45.Final (apache#6424)

    netty 4.1.43 has a bug that prevents it from using the Linux native Epoll transport.
    
    This results in pulsar brokers failing over to NioEventLoopGroup even when running on Linux.
    
    The bug is fixed in netty release 4.1.45.Final.
    
    (cherry picked from commit 760bd1a)
    dzmitryk authored and tuteng committed Mar 21, 2020
    f6fb44d
  84. Fix publish buffer limit does not take effect

    Motivation
    If `maxMessagePublishBufferSizeInMB` is set above `Integer.MAX_VALUE / 1024 / 1024`, the publish buffer limit does not take effect. The reason is that the following calculation is done in `int` arithmetic and overflows, so `maxMessagePublishBufferBytes` does not hold the intended value (the report observed 0):
    
    pulsar.getConfiguration().getMaxMessagePublishBufferSizeInMB() * 1024 * 1024;
    So the calculation is changed to use `long` arithmetic:
    
    pulsar.getConfiguration().getMaxMessagePublishBufferSizeInMB() * 1024L * 1024L;
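
    The effect of the `L` suffix can be checked with a small standalone sketch. The class `PublishBufferBytes` and its method names are hypothetical; only the two multiplication expressions mirror the ones quoted above.

    ```java
    // Hypothetical demo of int overflow versus long arithmetic.
    final class PublishBufferBytes {

        // The product is computed in int and overflows before it is widened to long.
        static long withIntArithmetic(int sizeInMb) {
            return sizeInMb * 1024 * 1024;
        }

        // The first multiplication already promotes to long, so nothing overflows.
        static long withLongArithmetic(int sizeInMb) {
            return sizeInMb * 1024L * 1024L;
        }

        public static void main(String[] args) {
            System.out.println(withIntArithmetic(4096));  // 2^32 wraps around to 0
            System.out.println(withIntArithmetic(2048));  // 2^31 wraps to Integer.MIN_VALUE
            System.out.println(withLongArithmetic(4096)); // 4294967296
        }
    }
    ```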
    
    (cherry picked from commit 75a321d)
    ltamber authored and tuteng committed Mar 21, 2020
    6186aef
  85. [Flink-Connector]Get PulsarClient from cache should always return an open instance (apache#6436)
    
    
    
    (cherry picked from commit 2ed2eb8)
    yjshen authored and tuteng committed Mar 21, 2020
    22fbdc1
  86. fix the bug of authenticationData is't initialized. (apache#6440)

    Motivation
    Fix a bug where `authenticationData` is not initialized.
    
    The method `org.apache.pulsar.proxy.server.ProxyConnection#handleConnect` does not initialize `authenticationData`.
    As a result, an implementation of the `org.apache.pulsar.broker.authorization.AuthorizationProvider` interface
    receives a null value in `org.apache.pulsar.broker.authorization.AuthorizationProvider#canConsumeAsync`.
    
    Modifications
    Initialize `authenticationData` in the method `org.apache.pulsar.proxy.server.ProxyConnection#handleConnect`.
    
    Verifying this change
    Implement the `org.apache.pulsar.broker.authorization.AuthorizationProvider` interface and check that it receives the value of `authenticationData`.
    
    (cherry picked from commit b8f0ca0)
    bilahepan authored and tuteng committed Mar 21, 2020
    0afcf1b
  87. Fixed the max backoff configuration for lookups (apache#6444)

    * Fixed the max backoff configuration for lookups
    
    * Fixed test expectation
    
    * More test fixes
    
    (cherry picked from commit 6ff87ee)
    merlimat authored and tuteng committed Mar 21, 2020
    56c7079
  88. Use System.nanoTime() instead of System.currentTimeMillis() (apache#6454)
    
    Fixes apache#6453 
    
    ### Motivation
    `ConsumerBase` and `ProducerImpl` use `System.currentTimeMillis()` to measure the elapsed time in the 'operations' inner classes (`ConsumerBase$OpBatchReceive` and `ProducerImpl$OpSendMsg`).
    
    An instance variable `createdAt` is initialized with `System.currentTimeMillis()`, but it is not used for reading wall clock time; the variable is only used for computing elapsed time (e.g. the timeout for a batch).
    
    When the variable is used to compute elapsed time, it makes more sense to use `System.nanoTime()`, which is intended for measuring elapsed time and is unaffected by wall-clock adjustments.
    
    ### Modifications
    
    The instance variable `createdAt` in `ConsumerBase$OpBatchReceive` and `ProducerImpl$OpSendMsg` is initialized with `System.nanoTime()`. Usage of the variable is updated to reflect that it holds nano time; computations of elapsed time take the difference between the current system nano time and the `createdAt` variable.
    
    The `createdAt` field is package-private, and is currently only used in the declaring class and the outer class, which limits the chances of unwanted side effects.
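
    A minimal sketch of the pattern, assuming a hypothetical `OpTimer` class rather than the real `OpBatchReceive` or `OpSendMsg` classes: the start reading comes from `System.nanoTime()` and is only ever used in differences, never as a wall-clock timestamp.

    ```java
    import java.util.concurrent.TimeUnit;

    // Hypothetical stand-in for an operation that tracks its own age.
    final class OpTimer {

        // A System.nanoTime() reading; meaningful only relative to another reading.
        final long createdAt;

        OpTimer(long nowNanos) {
            this.createdAt = nowNanos;
        }

        static OpTimer start() {
            return new OpTimer(System.nanoTime());
        }

        long elapsedMillis(long nowNanos) {
            return TimeUnit.NANOSECONDS.toMillis(nowNanos - createdAt);
        }

        boolean hasTimedOut(long nowNanos, long timeoutMillis) {
            // Compare differences, not absolute values, so wrap-around stays harmless.
            return nowNanos - createdAt >= TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        }

        public static void main(String[] args) throws InterruptedException {
            OpTimer op = OpTimer.start();
            Thread.sleep(20);
            System.out.println("elapsed ms: " + op.elapsedMillis(System.nanoTime()));
        }
    }
    ```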
    
    (cherry picked from commit 459ec6e)
    racorn authored and tuteng committed Mar 21, 2020
    eb7aa2a
  89. [Broker] Create namespace failed when TLS is enabled in PulsarStandalone (apache#6457)
    
    When starting Pulsar in standalone mode with TLS enabled, it will fail to create two namespaces during start. 
    
    This is because it's using the unencrypted URL/port while constructing the PulsarAdmin client. 
    
    (cherry picked from commit 3e1b8f6)
    yjshen authored and tuteng committed Mar 21, 2020
    4becef9
  90. Improve cpp-client-lib: provide another libpulsarwithdeps.a in dep/rpm (apache#6458)
    
    Fix apache#6439 
    We shouldn't statically link libssl into libpulsar.a, as this is a security red flag. We should just use whatever libssl the system provides, because if there is a security problem in libssl, all machines can update their own libssl library without rebuilding libpulsar.a.
    As suggested, this change does not alter the old behavior. It mainly provides 2 additional Pulsar C++ client libraries in the deb/rpm packages and adds documentation on how to use the 4 libraries.
    The 2 additional libraries:
    - pulsarSharedNossl (libpulsarnossl.so), similar to pulsarShared (libpulsar.so), but with no ssl statically linked.
    - pulsarStaticWithDeps (libpulsarwithdeps.a), similar to pulsarStatic (libpulsar.a), with the dependency libraries `libboost_regex`, `libboost_system`, `libcurl`, `libprotobuf`, `libzstd` and `libz` archived in statically.
    
    All 4 libraries passed the rpm/deb build and install, and compiled with pulsar-client example code.
    
    * also add libpulsarwithdeps.a together with libpulsar.a into cpp client release
    
    * add documentation for libpulsarwithdeps.a, add g++ build examples
    
    * add pulsarSharedNossl target to build libpulsarnossl.so
    
    * update doc
    
    * verify 4 libs in rpm/deb build, installed, use all good
    
    (cherry picked from commit 33eea88)
    jiazhai authored and tuteng committed Mar 21, 2020
    f94eb89
  91. pulsar-proxy: fix correct name for proxy thread executor name (apache#6460)
    
    ### Motivation
    Fix the name of the proxy thread executor.
    
    (cherry picked from commit 5c2c058)
    rdhabalia authored and tuteng committed Mar 21, 2020
    85257b5
  92. [pulsar-proxy] fix logging for published messages (apache#6474)

    ### Motivation
    Proxy logging fetches an incorrect producerId for the `Send` command. As a result, logging always gets producerId 0 and fetches an invalid topic name.
    
    ### Modification
    Fixed topic logging by fetching correct producerId for `Send` command.
    
    (cherry picked from commit 65cc303)
    sijie authored and tuteng committed Mar 21, 2020
    89e44ec
  93. Fix create partitioned topic with a substring of an existing topic name. (apache#6478)
    
    Fixes apache#6468
    
    Fix creating a partitioned topic whose name is a substring of an existing topic name, and make partitioned topic creation async.
    
    (cherry picked from commit 19ccfd5)
    codelipenghui authored and tuteng committed Mar 21, 2020
    b5322bc
  94. Avoid calling ConsumerImpl::redeliverMessages() when message list is empty (apache#6480)
    
    (cherry picked from commit 6604f54)
    merlimat authored and tuteng committed Mar 21, 2020
    c4902d6
  95. Fix some async method problems at PersistentTopicsBase. (apache#6483)

    (cherry picked from commit 47ca8e6)
    codelipenghui authored and tuteng committed Mar 21, 2020
    1e1dd06
  96. Fix memory leak when running topic compaction. (apache#6485)

    Fixes apache#6482
    
    ### Motivation
    Prevent topic compaction from leaking direct memory
    
    ### Modifications
    
    Several leaks were discovered using Netty leak detection and code review.
    * `CompactedTopicImpl.readOneMessageId` would get an `Enumeration` of `LedgerEntry`, but did not release the underlying buffers. Fix: iterate through the `Enumeration` and release the underlying buffers. Instead of logging the case where the `Enumeration` did not contain any elements, complete the future exceptionally with the message (it will be logged by Caffeine).
    * There were two main sources of leaks in `TwoPhaseCompactor`. The `RawBatchConverter.rebatchMessage` method failed to close/release a `ByteBuf` (`uncompressedPayload`). Also, the `ByteBuf` returned by `RawBatchConverter.rebatchMessage` was not closed. The first one was easy to fix (release the buffer). To fix the second one and make the code easier to read, I decided not to let `RawBatchConverter.rebatchMessage` close the message read from the topic. Instead, the message read from the topic is closed in a try/finally clause surrounding most of the method body that handles a message from a topic (in the phase two loop). Then, if `RawBatchConverter.rebatchMessage` produced a new message, we release that message after adding it to the compact ledger.
    
    ### Verifying this change
    Modified `RawReaderTest.testBatchingRebatch` to show new contract.
    
    One can run the test described to reproduce the issue and verify that no leak is detected.
    
    (cherry picked from commit f2ec1b4)
    racorn authored and tuteng committed Mar 21, 2020
    5911b8b
  97. [proxy] Fix proxy routing to functions worker (apache#6486)

    ### Motivation
    
    
    Currently, the proxy only forwards v1/v2 functions routes to the
    function worker.
    
    ### Modifications
    
    This change makes the proxy forward all routes for the function worker when
    those routes match. At the moment this is still a static list of
    prefixes, but in the future it may be possible to have this list of
    prefixes be dynamically fetched from the REST routes.
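
    Matching a request path against a static list of prefixes could be sketched as follows. The class `FunctionWorkerRoutes` and the prefixes in `PREFIXES` are placeholders chosen for illustration; they are not claimed to be the list used by the proxy.

    ```java
    import java.util.Arrays;
    import java.util.List;

    // Hypothetical sketch of static prefix-based routing.
    final class FunctionWorkerRoutes {

        // Placeholder prefixes; the real list lives in the proxy code.
        private static final List<String> PREFIXES = Arrays.asList(
                "/admin/v2/functions",
                "/admin/v3/functions",
                "/admin/v3/sources",
                "/admin/v3/sinks");

        // True when the path is exactly a prefix or lies underneath one.
        static boolean routesToFunctionWorker(String path) {
            for (String prefix : PREFIXES) {
                if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                    return true;
                }
            }
            return false;
        }

        public static void main(String[] args) {
            System.out.println(routesToFunctionWorker("/admin/v3/sinks/public/default/my-sink"));
            System.out.println(routesToFunctionWorker("/admin/v2/persistent/public/default/t1"));
        }
    }
    ```

    Appending `/` before the `startsWith` check keeps a path such as `/admin/v3/sinksfoo` from matching the `/admin/v3/sinks` prefix by accident.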
    
    ### Verifying this change
    - added some tests to ensure the routing works as expected
    
    (cherry picked from commit 329e231)
    addisonj authored and tuteng committed Mar 21, 2020
    4724659
  98. [pulsar-client] fix deadlock on send failure (apache#6488)

    (cherry picked from commit ad5415a)
    rdhabalia authored and tuteng committed Mar 21, 2020
    e05b786
  99. [broker] Timeout API calls in BrokerService (apache#6489)

    See apache#6416. This change ensures that all futures within BrokerService
    have a guaranteed timeout. As stated in apache#6416, we see cases where it
    appears that loading or creating a topic fails to resolve the future for
    unknown reasons. It appears that these futures *may* not be returning.
    This seems like a sane change to make to ensure that these futures
    finish, however, it still isn't understood under what conditions these
    futures may not be returning, so this fix is mostly a workaround for
    some underlying issues.
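
    One common way to give a `CompletableFuture` a guaranteed timeout is to schedule a task that fails it if nothing else has completed it first. The sketch below shows that general technique only; `FutureTimeouts.withTimeout` is a hypothetical helper and is not claimed to be how BrokerService implements it.

    ```java
    import java.util.concurrent.CompletableFuture;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.ScheduledFuture;
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.TimeoutException;

    // Hypothetical helper, not the BrokerService code.
    final class FutureTimeouts {

        static <T> CompletableFuture<T> withTimeout(CompletableFuture<T> future,
                                                    ScheduledExecutorService scheduler,
                                                    long timeout, TimeUnit unit) {
            // Fail the future if nobody has completed it when the timer fires.
            ScheduledFuture<?> timer = scheduler.schedule(() -> {
                future.completeExceptionally(
                        new TimeoutException("no result after " + timeout + " " + unit));
            }, timeout, unit);
            // Cancel the timer as soon as the future completes by any means.
            future.whenComplete((value, error) -> timer.cancel(false));
            return future;
        }
    }
    ```

    Because `completeExceptionally` has no effect on a future that is already complete, a result that arrives just before the timer fires is not overwritten.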
    
    Co-authored-by: Addison Higham <[email protected]>
    (cherry picked from commit 4a4cce9)
    addisonj authored and tuteng committed Mar 21, 2020
    53b4073
  100. [pulsar-client-cpp] Fix Redelivery of Messages on UnackedMessageTracker When Ack Messages . (apache#6498)
    
    ### Motivation
    Because of apache#6391, acked messages were counted as unacked messages.
    Although messages from brokers were acknowledged, the following log was output.
    
    ```
    2020-03-06 19:44:51.790 INFO  ConsumerImpl:174 | [persistent://public/default/t1, sub1, 0] Created consumer on broker [127.0.0.1:58860 -> 127.0.0.1:6650]
    my-message-0: Fri Mar  6 19:45:05 2020
    my-message-1: Fri Mar  6 19:45:05 2020
    my-message-2: Fri Mar  6 19:45:05 2020
    2020-03-06 19:45:15.818 INFO  UnAckedMessageTrackerEnabled:53 | [persistent://public/default/t1, sub1, 0] : 3 Messages were not acked within 10000 time
    
    ```
    
    This behavior happened on the master branch.
    
    (cherry picked from commit 67f8cf3)
    k2la authored and tuteng committed Mar 21, 2020
    1b36a7a
  101. Start namespace service and schema registry service before start broker. (apache#6499)
    
    ### Motivation
    
    Once the broker service is started, clients can connect to the broker and send requests that depend on the namespace service, so we should create the namespace service before starting the broker service. Otherwise, an NPE occurs.
    
    ![image](https://user-images.githubusercontent.com/12592133/76090515-a9961400-5ff6-11ea-9077-cb8e79fa27c0.png)
    
    ![image](https://user-images.githubusercontent.com/12592133/76099838-b15db480-6006-11ea-8f39-31d820563c88.png)
    
    
    ### Modifications
    
    Move the namespace service creation and the schema registry service creation before starting the broker service.
    
    (cherry picked from commit 5285c68)
    codelipenghui authored and tuteng committed Mar 21, 2020
    77f2c82
  102. Instead of always using admin access for topic, use read/write/admin access for topic (apache#6504)
    
    Co-authored-by: Sanjeev Kulkarni <[email protected]>
    (cherry picked from commit 36ea153)
    srkukarni authored and tuteng committed Mar 21, 2020
    394791d
  103. Fix admin getLastMessageId return batchIndex (apache#6511)

    Fix apache#6462 
    ### Motivation
    Make the admin API `getLastMessageId` return the batchIndex.
    
    (cherry picked from commit 757824f)
    congbobo184 authored and tuteng committed Mar 21, 2020
    b1c9c2f
  104. Disable channel auto read when publish rate or publish buffer exceeded (apache#6550)
    
    ### Motivation
    
    Disable channel auto-read when the publish rate or publish buffer limit is exceeded. Currently, ServerCnx sets channel auto-read to false only when it gets a new message and the publish rate or publish buffer limit is exceeded, so each connection still reads one more message. If there are many ServerCnx instances (many topics or clients), the effective publish rate deviates widely from the configured limit. Here is an example that shows the problem.
    
    Enable publish rate limit in broker.conf
    ```
    brokerPublisherThrottlingTickTimeMillis=1
    brokerPublisherThrottlingMaxByteRate=10000000
    ```
    
    Use Pulsar perf to test 100 partition message publishing:
    ```
    bin/pulsar-perf produce -s 500000 -r 100000 -t 1 100p
    ```
    
    The test result:
    ```
    10:45:28.844 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    367.8  msg/s ---   1402.9 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 710.008 ms - med: 256.969 - 95pct: 2461.439 - 99pct: 3460.255 - 99.9pct: 4755.007 - 99.99pct: 4755.007 - Max: 4755.007
    10:45:38.919 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    456.6  msg/s ---   1741.9 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 2551.341 ms - med: 2347.599 - 95pct: 6852.639 - 99pct: 9630.015 - 99.9pct: 10824.319 - 99.99pct: 10824.319 - Max: 10824.319
    10:45:48.959 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    432.0  msg/s ---   1648.0 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 4373.505 ms - med: 3972.047 - 95pct: 11754.687 - 99pct: 15713.663 - 99.9pct: 17638.527 - 99.99pct: 17705.727 - Max: 17705.727
    10:45:58.996 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    430.6  msg/s ---   1642.6 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 5993.563 ms - med: 4291.071 - 95pct: 18022.527 - 99pct: 21649.663 - 99.9pct: 24885.375 - 99.99pct: 25335.551 - Max: 25335.551
    10:46:09.195 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    403.2  msg/s ---   1538.3 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 7883.304 ms - med: 6184.159 - 95pct: 23625.343 - 99pct: 29524.991 - 99.9pct: 30813.823 - 99.99pct: 31467.775 - Max: 31467.775
    10:46:19.314 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    401.1  msg/s ---   1530.1 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 9587.407 ms - med: 6907.007 - 95pct: 28524.927 - 99pct: 34815.999 - 99.9pct: 36759.551 - 99.99pct: 37581.567 - Max: 37581.567
    10:46:29.389 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    372.8  msg/s ---   1422.0 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 11984.595 ms - med: 10095.231 - 95pct: 34515.967 - 99pct: 40754.175 - 99.9pct: 43553.535 - 99.99pct: 43603.199 - Max: 43603.199
    10:46:39.459 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    374.6  msg/s ---   1429.1 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 12208.459 ms - med: 7807.455 - 95pct: 38799.871 - 99pct: 46936.575 - 99.9pct: 50500.095 - 99.99pct: 50500.095 - Max: 50500.095
    10:46:49.537 [main] INFO  org.apache.pulsar.testclient.PerformanceProducer - Throughput produced:    295.6  msg/s ---   1127.5 Mbit/s --- failure      0.0 msg/s --- Latency: mean: 14503.565 ms - med: 10753.087 - 95pct: 45041.407 - 99pct: 54307.327 - 99.9pct: 57786.623 - 99.99pct: 57786.623 - Max: 57786.623
    ```
    
    The reason for such a large deviation is that the producer sends batched messages and each ServerCnx reads one more message.
    
    This PR cannot completely solve the problem, but it alleviates it. When the message publish rate is exceeded, the broker sets channel auto-read to false for all topics. This prevents some ServerCnx instances from reading one more message.
    
    ### Does this pull request potentially affect one of the following parts:
    
    *If `yes` was chosen, please highlight the changes*
    
      - Dependencies (does it add or upgrade a dependency): (no)
      - The public API: (no)
      - The schema: (no)
      - The default values of configurations: (no)
      - The wire protocol: (no)
      - The rest endpoints: (no)
      - The admin cli options: (no)
      - Anything that affects deployment: (no)
    
    ### Documentation
    
      - Does this pull request introduce a new feature? (no)
    
    (cherry picked from commit ec31d54)
    codelipenghui authored and tuteng committed Mar 21, 2020
    b2df780
  105. Don't increment unacked messages for the consumer with Exclusive/Failover subscription mode. (apache#6558)
    
    Fixes apache#6552
    
    ### Motivation
    
    apache#6552 was introduced by apache#5929, so this PR stops incrementing unacked messages for consumers with the Exclusive/Failover subscription mode.
    
    (cherry picked from commit 2449696)
    codelipenghui authored and tuteng committed Mar 21, 2020
    bf14a08
  106. Fix: topic with one partition cannot be updated (apache#6560)

    * Fix: topic with one partition cannot be updated
    
    (cherry picked from commit 9602c9b)
    jerrypeng authored and tuteng committed Mar 21, 2020
    e7459b4
  107. Fix NPE while call getLastMessageId. (apache#6562)

    ### Motivation
    
    Fixes apache#6561
    
    ### Modifications
    
    Initialize `BatchMessageAckerDisabled` with a `new BitSet()` object.
    
    (cherry picked from commit 2007de6)
    murong00 authored and tuteng committed Mar 21, 2020
    58e52e0
  108. 68e5b79