Issues with "partition.assignment.strategy=cooperative-sticky" #3306
Comments
Thanks @shanson7 - can reproduce both, investigating.
I believe the shuffling is happening because the sticky assignor is not using the members' current (owned) assignment.
The COMMITFAIL is apparently causing the partitions to be lost as well, triggering an additional rebalance. I don't have a good idea how/why this could be happening; will investigate more Tuesdayish.
Also possibly related: 1b40aad
rkgm_assignment, which is the member assignment after running the assignor, was mixed up with rkgm_owned, which is the current member assignment before running the assignor. This resulted in the sticky assignor not taking the current assignment into consideration on rebalance and thus not being able to provide the stickiness.
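To make the distinction concrete, here is a simplified sketch. The field names match librdkafka's internal rd_kafka_group_member_s, but the struct and comments are illustrative, not the library's actual code:

```c
#include <librdkafka/rdkafka.h>

/* Illustrative sketch only: the per-member partition lists as seen by
 * an assignor. Field names follow librdkafka's rd_kafka_group_member_s. */
struct group_member {
        /* Input to the assignor: the partitions this member currently
         * owns, i.e. its assignment before the rebalance. */
        rd_kafka_topic_partition_list_t *rkgm_owned;
        /* Output of the assignor: the partitions this member receives
         * in the new assignment. */
        rd_kafka_topic_partition_list_t *rkgm_assignment;
};

/* A sticky assignor must seed its decision from rkgm_owned so that
 * existing ownership is preserved. With the two fields swapped, the
 * assignor sees no current ownership and produces a fresh,
 * non-sticky assignment on every rebalance. */
```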
Is this fixed in 1.7.0?
@shanson7 Yes!
Description
We are making changes to support cooperative/incremental rebalance. We are seeing errors during rebalance and what seems to be sub-optimal rebalancing, both when using the default `rebalance_cb` and our custom one, which just logs the partitions that are passed in and calls `incremental_(un)assign`. During rebalance it is not uncommon to get errors such as the COMMITFAIL error referenced in the comments above.
Additionally, it seems that when a consumer leaves the group, partitions are shuffled between the remaining consumers (I would expect a cooperative rebalance to only assign the partitions owned by the consumer that left, without revoking any from the remaining members).
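For reference, a cooperative-aware rebalance callback with the librdkafka C API typically looks like the sketch below; this is a minimal illustration of the pattern, not our exact callback (ours additionally logs the partition lists):

```c
#include <stdio.h>
#include <string.h>
#include <librdkafka/rdkafka.h>

static void rebalance_cb(rd_kafka_t *rk, rd_kafka_resp_err_t err,
                         rd_kafka_topic_partition_list_t *partitions,
                         void *opaque) {
        rd_kafka_error_t *error = NULL;
        (void)opaque;

        if (err == RD_KAFKA_RESP_ERR__ASSIGN_PARTITIONS) {
                /* Under the COOPERATIVE protocol the incremental calls
                 * must be used instead of rd_kafka_assign(). */
                if (!strcmp(rd_kafka_rebalance_protocol(rk), "COOPERATIVE"))
                        error = rd_kafka_incremental_assign(rk, partitions);
                else
                        rd_kafka_assign(rk, partitions);
        } else if (err == RD_KAFKA_RESP_ERR__REVOKE_PARTITIONS) {
                if (!strcmp(rd_kafka_rebalance_protocol(rk), "COOPERATIVE"))
                        error = rd_kafka_incremental_unassign(rk, partitions);
                else
                        rd_kafka_assign(rk, NULL);
        }

        if (error) {
                fprintf(stderr, "incremental rebalance failed: %s\n",
                        rd_kafka_error_string(error));
                rd_kafka_error_destroy(error);
        }
}
```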
How to reproduce
Bring up a consumer and let it start consuming. Bring up a second and let it rebalance. Bring up a third. Bring down the third.
When the third consumer is added, we see the above error. When the third consumer leaves, we see partitions shuffled unexpectedly. I can reproduce this consistently, so I can upload logs with whatever debug settings are relevant (at least "cgrp", I imagine).
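A minimal consumer skeleton for the repro; the broker address, group id, and topic name are placeholders:

```c
#include <librdkafka/rdkafka.h>

int main(void) {
        char errstr[512];
        rd_kafka_conf_t *conf = rd_kafka_conf_new();

        rd_kafka_conf_set(conf, "bootstrap.servers", "localhost:9092",
                          errstr, sizeof(errstr));
        rd_kafka_conf_set(conf, "group.id", "coop-repro",
                          errstr, sizeof(errstr));
        rd_kafka_conf_set(conf, "partition.assignment.strategy",
                          "cooperative-sticky", errstr, sizeof(errstr));
        rd_kafka_conf_set(conf, "debug", "cgrp", errstr, sizeof(errstr));
        /* Optionally install the rebalance_cb sketched above:
         * rd_kafka_conf_set_rebalance_cb(conf, rebalance_cb); */

        rd_kafka_t *rk = rd_kafka_new(RD_KAFKA_CONSUMER, conf,
                                      errstr, sizeof(errstr));
        rd_kafka_poll_set_consumer(rk);

        rd_kafka_topic_partition_list_t *topics =
                rd_kafka_topic_partition_list_new(1);
        rd_kafka_topic_partition_list_add(topics, "test-topic",
                                          RD_KAFKA_PARTITION_UA);
        rd_kafka_subscribe(rk, topics);
        rd_kafka_topic_partition_list_destroy(topics);

        /* Poll forever; run several instances of this binary and
         * start/stop them to trigger rebalances. */
        while (1) {
                rd_kafka_message_t *msg = rd_kafka_consumer_poll(rk, 1000);
                if (msg)
                        rd_kafka_message_destroy(msg);
        }
}
```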
Checklist
IMPORTANT: We will close issues where the checklist has not been completed.
Please provide the following information:
- librdkafka version (release number or git tag): 1.6.1
- Apache Kafka version: kafka_2.13-2.6.0
- librdkafka client configuration: partition.assignment.strategy=cooperative-sticky
- Operating system: rhel7
- Provide logs (with debug=.. as necessary) from librdkafka