-
Notifications
You must be signed in to change notification settings - Fork 605
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redundant set/unset throttling during the rebalance #1972
Comments
@CCisGG I would like to contribute on this. |
@CCisGG Thank you Akarshit! I'd recommend to take a look at this https://github.com/linkedin/cruise-control/blob/migrate_to_kafka_2_5/CONTRIBUTING.md before start coding. |
Hey, I am new to Open Source Contributor. @AKARSHITJOSHI Are you still looking into this? If not I wouldd like to take it. |
Hi @CCisGG, thanks for sharing the context. looking into it, will appreciate any help or guidance |
Hi @the-sky7, have you had a chance to make any progress on this? If not, I’d be happy to take a look |
@CCisGG When you have a moment, could you please take a look at this PR? I would greatly appreciate your insights and any feedback you might have. Thank you very much for your time! |
Today, Cruise Control are setting/resetting replication rate throttles on a per-task base. E.g. it set throttling before a task executes:
cruise-control/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/executor/Executor.java
Line 1415 in 155bf3c
Then clear the throttling for the broker after task finished and if there is no ongoing-task on the same broker:
cruise-control/cruise-control/src/main/java/com/linkedin/kafka/cruisecontrol/executor/Executor.java
Line 1437 in 155bf3c
Note that each set/clear throttle will be an individual Kafka admin request, will may take a couple of seconds.
The problem is: there can be a lot of tasks in the backlog that is involving the same broker. It doesn't make senses to set/clear the throttling for the same broker again and again on every task execution batch. From local unit test running, it takes ~20 seconds for adminClient to handle each set/clear throttle request.
If we set the throttling for broker at the beginning of the execution, and only clean up the throttling after all tasks finished execution, that should speed up the rebalance.
Note: this is not the concurrency throttling, but the replication bytes throttling. This throttle is set by Cruise Control sending request to kafka brokers to cap the max replication bytes rate.
The text was updated successfully, but these errors were encountered: