-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
[improve] [pip] PIP-354: apply topK mechanism to ModularLoadManagerIm…
…pl (#22765)
- Loading branch information
1 parent
aa230bc
commit 3534cdd
Showing
1 changed file
with
50 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
# PIP-354: apply topK mechanism to ModularLoadManagerImpl | ||
|
||
# Background knowledge | ||
|
||
There are mainly two `LoadManager` implementation in Pulsar broker: `ExtensibleLoadManager` and `ModularLoadManagerImpl`. `ModularLoadManagerImpl` is the default load manager, and `ExtensibleLoadManager` is a new load manager which is proposed after 3.0.0 version. | ||
|
||
## ModularLoadManagerImpl | ||
`ModularLoadManagerImpl` rely on zk to store and synchronize metadata about load, which pose greate pressure on zk, threatening the stability of system. Every broker will upload its `LocalBrokerData` to zk, and leader broker will retrieve all `LocalBrokerData` from zk, generate all `BundleData` from each `LocalBrokerData`, and update all `BundleData` to zk. | ||
|
||
## ExtensibleLoadManager | ||
`ExtensibleLoadManager` depends on system topics and table views for load balance metadata store and replication. Though not using zk to store and synchronize metadata about load, it is still necessary to control the number of bundles that need to be updated, for which there is a `loadBalancerMaxNumberOfBundlesInBundleLoadReport` configuration in `ExtensibleLoadManager` that select the top k bundles. | ||
|
||
|
||
# Motivation | ||
|
||
As every bundle in the cluster corresponds to a zk node, it is common that there are thousands of zk nodes in a cluster, which results into thousands of read/update operations to zk. This will cause a lot of pressure on zk. | ||
|
||
**As All Load Shedding Algorithm pick bundles from top to bottom based on throughput/msgRate, bundles with low throughput/msgRate are rarely be selected for shedding. So that we don't need to contain these bundles in the bundle load report.** | ||
|
||
|
||
|
||
# Goals | ||
|
||
Reuse the configuration `loadBalancerMaxNumberOfBundlesInBundleLoadReport` in `ExtensibleLoadManager`, apply the topK mechanism to `ModularLoadManagerImpl`. | ||
|
||
# Detailed Design | ||
|
||
If `loadBalancerMaxNumberOfBundlesInBundleLoadReport` is set to a positive number, `ModularLoadManagerImpl` will only select the `topK * brokerCount` bundles based on throughput/msgRate to update to zk. | ||
|
||
If `loadBalancerMaxNumberOfBundlesInBundleLoadReport` <= 0, `ModularLoadManagerImpl` will update all bundles to zk. | ||
|
||
As the default value of `loadBalancerMaxNumberOfBundlesInBundleLoadReport` is 10, `ModularLoadManagerImpl` will only update the top 10 * brokerCount bundles to zk by default. | ||
|
||
WARNING: too small `loadBalancerMaxNumberOfBundlesInBundleLoadReport` could result in a long load balance time. | ||
|
||
For users who don't want to use this feature, they can set `loadBalancerMaxNumberOfBundlesInBundleLoadReport` to 0. | ||
|
||
# Backward & Forward Compatibility | ||
|
||
This is a modular load balancer behavior change(not forward-compatible). | ||
|
||
User can set `loadBalancerMaxNumberOfBundlesInBundleLoadReport` to 0 to disable this feature. | ||
|
||
# Links | ||
|
||
<!-- | ||
Updated afterwards | ||
--> | ||
* Mailing List discussion thread: https://lists.apache.org/thread/wwybg8og80yz9gvj6bfdbv1znx2dfp4w | ||
* Mailing List voting thread: https://lists.apache.org/thread/67r3nv33gfoxhvo74ql41dydh2rmyvjw |