Change status of intra broker balancing from beta to GA #855

kbatuigas · 2024-11-13T18:26:57Z

Description

Removes the admonition stating that intra-broker balancing is in beta and that you must explicitly enable the node_local_core_assignment feature flag. The flag is enabled by default as of 24.3.
Adds a separate admonition stating that decreasing the core count (see the comment in PR 850) is available in >=24.2. We state this explicitly because once the 24.3 beta docs version becomes current, readers might not realize that this wasn't possible before 24.2. Decreasing the core count is a separate capability that is supported only if node_local_core_assignment is enabled, the same feature flag that enables intra-broker partition balancing.

Resolves https://github.com/redpanda-data/documentation-private/issues/
Review deadline: 15 Nov

Page previews

24.3 > Cluster balancing > Intra-broker partition balancing

Checks

New feature
Content gap
Support Follow-up
Small fix (typos, links, copyedits, etc)

netlify · 2024-11-13T18:27:13Z

✅ Deploy Preview for redpanda-docs-preview ready!

Name	Link
🔨 Latest commit	`47cd169`
🔍 Latest deploy log	https://app.netlify.com/sites/redpanda-docs-preview/deploys/673b5f0e1ed0e800086ee420
😎 Deploy Preview	https://deploy-preview-855--redpanda-docs-preview.netlify.app

kbatuigas · 2024-11-13T18:42:59Z

@daisukebe @wzzzrd86 added you as reviewers so you can eyeball and compare with #850, thanks!

ztlpn · 2024-11-13T21:59:14Z

modules/manage/pages/cluster-maintenance/cluster-balancing.adoc

 In Redpanda, every partition replica is assigned to a CPU core on a broker. While Redpanda's default <<partition-replica-balancing,partition balancing>> monitors cluster-level events, such as the addition of new brokers or broker failure to balance partition assignments, it does not account for the distribution of partitions _within_ an individual broker. 

 Prior to Redpanda version 24.2, this meant that some cores on a broker could inadvertently host many partitions of heavily-used topics and cause the CPU to be xref:manage:monitoring.adoc#cpu-usage[overburdened]. Additionally, when the partition rebalance moved some partitions away from a broker, the remaining partitions did not necessarily rebalance across the broker's cores. Or, if a broker's core count was increased, Redpanda did not assign any partitions to the new cores until new partitions were created or old partitions were moved out.

 Starting in v24.2, topic-aware intra-broker partition balancing allows for dynamically reassigning partitions within a broker.  Redpanda prioritizes an even distribution of a topic's partition replicas across all cores in a broker. If a broker's core count changes, when the broker starts back up, Redpanda can check partition assignments across the broker's cores and reassign partitions, so that a balanced assignment is maintained across all cores. Redpanda can also check partition assignments when partitions are added to or removed from a broker, and rebalance the remaining partitions between cores.

+NOTE: Decreasing the number of CPU cores in a running cluster is supported from v24.2 only.


Will the 24.2 docs still contain info about the curl command? Otherwise there is no way for users to enable it in 24.2.

Yes, 24.2 is currently what's in main. Before we release 24.3, we cut a new maintenance branch off main for 24.2 and then merge v-WIP/24.3 into main.

I think we can remove this note. Instead, add the ability to decrease core count in production to 'What's New'.

Agreed. End users might not realize that the ability to decrease core count is tied to intra-broker partition balancing. Additionally, this is a significant improvement and deserves a mention in the 'What's New' section.

Thank you @JakeSCahill @daisukebe , we'll add a note in our upcoming What's New: https://github.com/redpanda-data/docs/pull/865/files#diff-7e9028daec2320182888e36f1be6d5a941c79362fab72135786f5b0cb0456a30R20
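
As a rough illustration of the even spread that the quoted balancing paragraph describes, the sketch below computes how many replicas of one topic land on each core when they are divided as evenly as possible. It is a toy model under stated assumptions and not Redpanda's balancer; `replicas_per_core` and its arguments are hypothetical names made up for this example.

```shell
#!/bin/sh
# Toy model only: divide one topic's partition replicas evenly across a
# broker's cores. This is NOT Redpanda's balancing algorithm.
# replicas_per_core is a hypothetical helper invented for illustration.

# Usage: replicas_per_core CORES PARTITIONS
# Prints one count per core, separated by spaces.
replicas_per_core() {
  cores="$1"
  partitions="$2"
  c=0
  out=""
  while [ "$c" -lt "$cores" ]; do
    # Every core gets the base share ...
    n=$((partitions / cores))
    # ... and the first (partitions mod cores) cores get one extra replica.
    if [ "$c" -lt $((partitions % cores)) ]; then
      n=$((n + 1))
    fi
    out="${out}${out:+ }${n}"
    c=$((c + 1))
  done
  echo "$out"
}

# Six replicas on a 4-core broker.
replicas_per_core 4 6   # prints: 2 2 1 1

# The same six replicas after the broker restarts with 2 cores.
replicas_per_core 2 6   # prints: 3 3
```

In both runs the per-core counts differ by at most one, which is the property the paragraph calls a balanced assignment; the second call mimics a broker coming back up with a lower core count.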

JakeSCahill · 2024-11-15T13:39:58Z

modules/manage/pages/cluster-maintenance/cluster-balancing.adoc

-curl -X PUT -d '{"state": "active"}' http://127.0.0.1:9644/v1/features/node_local_core_assignment
-```
-====
-
 In Redpanda, every partition replica is assigned to a CPU core on a broker. While Redpanda's default <<partition-replica-balancing,partition balancing>> monitors cluster-level events, such as the addition of new brokers or broker failure to balance partition assignments, it does not account for the distribution of partitions _within_ an individual broker. 

 Prior to Redpanda version 24.2, this meant that some cores on a broker could inadvertently host many partitions of heavily-used topics and cause the CPU to be xref:manage:monitoring.adoc#cpu-usage[overburdened]. Additionally, when the partition rebalance moved some partitions away from a broker, the remaining partitions did not necessarily rebalance across the broker's cores. Or, if a broker's core count was increased, Redpanda did not assign any partitions to the new cores until new partitions were created or old partitions were moved out.


I think we should remove this. Anything prior to 24.2 is irrelevant for 24.3 users.

JakeSCahill · 2024-11-15T14:02:05Z

modules/manage/pages/cluster-maintenance/cluster-balancing.adoc

-curl -X PUT -d '{"state": "active"}' http://127.0.0.1:9644/v1/features/node_local_core_assignment
-```
-====
-
 In Redpanda, every partition replica is assigned to a CPU core on a broker. While Redpanda's default <<partition-replica-balancing,partition balancing>> monitors cluster-level events, such as the addition of new brokers or broker failure to balance partition assignments, it does not account for the distribution of partitions _within_ an individual broker. 

 Prior to Redpanda version 24.2, this meant that some cores on a broker could inadvertently host many partitions of heavily-used topics and cause the CPU to be xref:manage:monitoring.adoc#cpu-usage[overburdened]. Additionally, when the partition rebalance moved some partitions away from a broker, the remaining partitions did not necessarily rebalance across the broker's cores. Or, if a broker's core count was increased, Redpanda did not assign any partitions to the new cores until new partitions were created or old partitions were moved out.

 Starting in v24.2, topic-aware intra-broker partition balancing allows for dynamically reassigning partitions within a broker.  Redpanda prioritizes an even distribution of a topic's partition replicas across all cores in a broker. If a broker's core count changes, when the broker starts back up, Redpanda can check partition assignments across the broker's cores and reassign partitions, so that a balanced assignment is maintained across all cores. Redpanda can also check partition assignments when partitions are added to or removed from a broker, and rebalance the remaining partitions between cores.


Again, referencing the version in versioned docs is distracting.

kbatuigas · 2024-11-15T17:25:42Z

@ztlpn per Jake's comment the 24.2 version of the docs will retain the info about enabling the flag. We actually have a separate PR to make some changes to the 24.2 version of the page: #850
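
For reference, a minimal sketch of what enabling the flag on 24.2 could look like, wrapped so that it only prints the request by default. The endpoint and payload are taken from the curl command removed in this PR's diff; `ADMIN_URL`, `DRY_RUN`, and `activate_feature` are hypothetical names used only for this example.

```shell
#!/bin/sh
# Sketch: activate the node_local_core_assignment feature flag on a 24.2 broker.
# The URL path and JSON body come from the command removed in this PR.
# ADMIN_URL and DRY_RUN are hypothetical variables for this example only.

ADMIN_URL="${ADMIN_URL:-http://127.0.0.1:9644}"
FEATURE="node_local_core_assignment"

activate_feature() {
  url="${ADMIN_URL}/v1/features/${FEATURE}"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Default: show the request without sending it.
    echo "curl -X PUT -d '{\"state\": \"active\"}' ${url}"
  else
    # Send the request to the broker's Admin API.
    curl -X PUT -d '{"state": "active"}' "${url}"
  fi
}

activate_feature
```

Running it with `DRY_RUN=0` sends the request; on 24.3 this step should not be needed because the flag is enabled by default.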

Feediver1

lgtm

Remove beta note for intra-broker balancing

1649ab3

kbatuigas requested a review from a team as a code owner November 13, 2024 18:26

kbatuigas mentioned this pull request Nov 13, 2024

Clarify when changing number of cores is supported #850

Merged

4 tasks

kbatuigas requested review from ztlpn, mattschumpert, daisukebe and wzzzrd86 November 13, 2024 18:41

ztlpn reviewed Nov 13, 2024

JakeSCahill reviewed Nov 15, 2024

kbatuigas added 2 commits November 15, 2024 12:16

Limitation on decreasing core count no longer applies

f1849ff

Remove explicit mention of version number

f41619d

kbatuigas requested a review from ztlpn November 15, 2024 17:20

kbatuigas requested a review from JakeSCahill November 15, 2024 17:25

Add decrease core count to What's New instead

47cd169

kbatuigas mentioned this pull request Nov 18, 2024

DOC-758 Update What's New for 24.3 GA #865

Merged

4 tasks

ztlpn approved these changes Nov 18, 2024

Feediver1 approved these changes Nov 18, 2024

kbatuigas merged commit ac16003 into v-WIP/24.3 Nov 19, 2024
7 checks passed

kbatuigas deleted the DOC-673-Change-status-of-intra-broker-balancing-from-beta-to-GA branch November 19, 2024 12:11

Deflaimun pushed a commit that referenced this pull request Nov 19, 2024

Change status of intra broker balancing from beta to GA (#855)

332aeed

ztlpn Nov 13, 2024

JakeSCahill Nov 15, 2024

JakeSCahill Nov 16, 2024

daisukebe Nov 18, 2024

kbatuigas Nov 19, 2024

daisukebe Nov 20, 2024
