From d619bfffee9a5b52e35d53b415f6705749f2d95a Mon Sep 17 00:00:00 2001
From: enomott
Date: Sun, 11 Jan 2026 16:30:10 +0900
Subject: [PATCH] docs: add shard allocation report for v3.4.0

---
 docs/features/index.md                        |   1 +
 docs/features/opensearch/shard-allocation.md  | 118 ++++++++++++++++++
 .../features/opensearch/shard-allocation.md   | 105 ++++++++++++++++
 docs/releases/v3.4.0/index.md                 |   1 +
 4 files changed, 225 insertions(+)
 create mode 100644 docs/features/opensearch/shard-allocation.md
 create mode 100644 docs/releases/v3.4.0/features/opensearch/shard-allocation.md

diff --git a/docs/features/index.md b/docs/features/index.md
index 3c6ae879d..f7ed553b5 100644
--- a/docs/features/index.md
+++ b/docs/features/index.md
@@ -153,6 +153,7 @@
 - [Segment Warmer](opensearch/segment-warmer.md)
 - [Semantic Version Field Type](opensearch/semantic-version-field-type.md)
 - [Settings Management](opensearch/settings-management.md)
+- [Shard Allocation](opensearch/shard-allocation.md)
 - [Skip List](opensearch/skip-list.md)
 - [Snapshot Restore Enhancements](opensearch/snapshot-restore-enhancements.md)
 - [Star Tree Index](opensearch/star-tree-index.md)
diff --git a/docs/features/opensearch/shard-allocation.md b/docs/features/opensearch/shard-allocation.md
new file mode 100644
index 000000000..a0a074f28
--- /dev/null
+++ b/docs/features/opensearch/shard-allocation.md
@@ -0,0 +1,118 @@
+# Shard Allocation
+
+## Summary
+
+Shard allocation in OpenSearch determines how shards are distributed across cluster nodes. The `BalancedShardsAllocator` uses a `WeightFunction` to calculate optimal node weights based on configurable balance factors. This feature includes settings for primary shard balancing, which helps distribute primary shards evenly across nodes for better performance and fault tolerance.
+
+## Details
+
+### Architecture
+
+```mermaid
+graph TB
+    subgraph "Cluster Manager"
+        CS[Cluster Settings] --> BSA[BalancedShardsAllocator]
+        BSA --> WF[WeightFunction]
+        WF --> AC[AllocationConstraints]
+        WF --> RC[RebalanceConstraints]
+    end
+
+    subgraph "Constraint Types"
+        AC --> ISPN[INDEX_SHARD_PER_NODE_BREACH]
+        AC --> IPSB[INDEX_PRIMARY_SHARD_BALANCE]
+        AC --> CPSB[CLUSTER_PRIMARY_SHARD_BALANCE]
+        RC --> IPSB2[INDEX_PRIMARY_SHARD_BALANCE]
+        RC --> CPSR[CLUSTER_PRIMARY_SHARD_REBALANCE]
+    end
+
+    subgraph "Allocation Decision"
+        WF --> |weight calculation| AD[Allocation Decision]
+        AD --> N1[Node 1]
+        AD --> N2[Node 2]
+        AD --> N3[Node N]
+    end
+```
+
+### Weight Calculation
+
+The `WeightFunction` calculates node weights using the formula:
+
+```
+weight(node, index) = θ₀ × (node.numShards - avgShardsPerNode) +
+                      θ₁ × (node.numShards(index) - avgShardsPerNode(index))
+```
+
+Where:
+- `θ₀ = shardBalance / (indexBalance + shardBalance)`
+- `θ₁ = indexBalance / (indexBalance + shardBalance)`
+
+### Components
+
+| Component | Description |
+|-----------|-------------|
+| `BalancedShardsAllocator` | Main allocator that orchestrates shard distribution |
+| `WeightFunction` | Calculates node weights for allocation decisions |
+| `AllocationConstraints` | Constraints applied during initial shard allocation |
+| `RebalanceConstraints` | Constraints applied during shard rebalancing |
+| `LocalShardsBalancer` | Performs actual allocation and rebalancing operations |
+
+### Configuration
+
+| Setting | Description | Default |
+|---------|-------------|---------|
+| `cluster.routing.allocation.balance.shard` | Weight factor for total shards per node | `0.45` |
+| `cluster.routing.allocation.balance.index` | Weight factor for shards per index per node | `0.55` |
+| `cluster.routing.allocation.balance.threshold` | Minimum optimization value for operations | `1.0` |
+| `cluster.routing.allocation.balance.prefer_primary` | Enable primary shard balancing | `false` |
+| `cluster.routing.allocation.rebalance.primary.enable` | Enable primary shard rebalancing | `false` |
+| `cluster.routing.allocation.rebalance.primary.buffer` | Buffer for primary shard rebalancing | `0.10` |
+| `cluster.routing.allocation.primary_constraint.threshold` | Threshold for primary constraint | `10` |
+
+### Usage Example
+
+Enable primary shard balancing for segment replication workloads:
+
+```json
+PUT /_cluster/settings
+{
+  "persistent": {
+    "cluster.routing.allocation.balance.prefer_primary": true,
+    "cluster.routing.allocation.rebalance.primary.enable": true,
+    "cluster.routing.allocation.rebalance.primary.buffer": 0.10
+  }
+}
+```
+
+Adjust balance factors for specific workloads:
+
+```json
+PUT /_cluster/settings
+{
+  "persistent": {
+    "cluster.routing.allocation.balance.shard": 0.50,
+    "cluster.routing.allocation.balance.index": 0.50
+  }
+}
+```
+
+## Limitations
+
+- Primary shard balancing is best-effort and may not achieve perfect distribution in all scenarios
+- Enabling primary shard balance does not guarantee equal primary shards on each node, especially during failover
+- Changing `prefer_primary` to `false` after enabling does not trigger redistribution
+
+## Related PRs
+
+| Version | PR | Description |
+|---------|-----|-------------|
+| v3.4.0 | [#19012](https://github.com/opensearch-project/OpenSearch/pull/19012) | Fix WeightFunction constraint reset bug |
+
+## References
+
+- [Issue #13429](https://github.com/opensearch-project/OpenSearch/issues/13429): Bug report for constraint reset issue
+- [Cluster Settings Documentation](https://docs.opensearch.org/3.0/install-and-configure/configuring-opensearch/cluster-settings/): Official cluster routing allocation settings
+- [Segment Replication Documentation](https://docs.opensearch.org/3.0/tuning-your-cluster/availability-and-recovery/segment-replication/index/): Recommended settings for segment replication
+
+## Change History
+
+- **v3.4.0** (2025-10-10): Fixed bug where allocation and rebalance constraints were incorrectly reset when updating balance factors
diff --git a/docs/releases/v3.4.0/features/opensearch/shard-allocation.md b/docs/releases/v3.4.0/features/opensearch/shard-allocation.md
new file mode 100644
index 000000000..9ec76315d
--- /dev/null
+++ b/docs/releases/v3.4.0/features/opensearch/shard-allocation.md
@@ -0,0 +1,105 @@
+# Shard Allocation
+
+## Summary
+
+This release fixes a bug where the `WeightFunction` allocation and rebalance constraints for primary shard balancing were incorrectly reset to default values when updating certain cluster settings. The fix ensures that primary shard balance settings (`cluster.routing.allocation.balance.prefer_primary` and `cluster.routing.allocation.rebalance.primary.enable`) remain effective even when other balance-related settings are modified.
+
+## Details
+
+### What's New in v3.4.0
+
+The `BalancedShardsAllocator` uses a `WeightFunction` to calculate node weights for shard allocation decisions. This function includes constraints that control primary shard balancing behavior. Prior to this fix, updating settings like `indexBalanceFactor`, `shardBalanceFactor`, or `preferPrimaryShardRebalanceBuffer` would create a new `WeightFunction` instance that lost the previously configured primary shard balance constraints.
+
+### Technical Changes
+
+#### Root Cause
+
+The bug occurred because:
+
+1. Settings like `PREFER_PRIMARY_SHARD_BALANCE` and `PREFER_PRIMARY_SHARD_REBALANCE` updated constraints on the existing `WeightFunction` instance
+2. Settings like `INDEX_BALANCE_FACTOR_SETTING`, `SHARD_BALANCE_FACTOR_SETTING`, and `PRIMARY_SHARD_REBALANCE_BUFFER` triggered `updateWeightFunction()` which created a new `WeightFunction`
+3. The new `WeightFunction` was constructed without the current primary balance constraint states
+
+```mermaid
+graph TB
+    subgraph "Before Fix"
+        A[Set prefer_primary=true] --> B[Update weightFunction constraints]
+        C[Update shard_balance_factor] --> D[Create NEW weightFunction]
+        D --> E[Constraints reset to defaults]
+    end
+
+    subgraph "After Fix"
+        F[Set prefer_primary=true] --> G[Store in instance variable]
+        H[Update shard_balance_factor] --> I[Create NEW weightFunction]
+        I --> J[Pass stored constraint values]
+        J --> K[Constraints preserved]
+    end
+```
+
+#### Code Changes
+
+The fix modifies the `WeightFunction` constructor to accept the current constraint states:
+
+| Component | Change |
+|-----------|--------|
+| `WeightFunction` constructor | Added `preferPrimaryShardBalance` and `preferPrimaryShardRebalance` parameters |
+| `updateWeightFunction()` | Now passes current constraint values to new `WeightFunction` |
+| `WeightFunction` initialization | Applies constraint settings during construction |
+
+#### Modified Files
+
+| File | Description |
+|------|-------------|
+| `BalancedShardsAllocator.java` | Extended `WeightFunction` constructor and `updateWeightFunction()` |
+| `SegmentReplicationAllocationIT.java` | Added integration test to verify fix |
+| `BalanceConfigurationTests.java` | Added unit test for settings update scenario |
+| `OpenSearchAllocationTestCase.java` | Added test helper method |
+
+### Usage Example
+
+The bug manifested when settings were updated in a specific order:
+
+```json
+// Step 1: Enable primary shard balance
+PUT /_cluster/settings
+{
+  "persistent": {
+    "cluster.routing.allocation.balance.prefer_primary": true
+  }
+}
+
+// Step 2: Update shard balance factor (this previously reset prefer_primary)
+PUT /_cluster/settings
+{
+  "persistent": {
+    "cluster.routing.allocation.balance.shard": 0.5
+  }
+}
+
+// After fix: prefer_primary constraint remains active
+```
+
+### Migration Notes
+
+No migration required. This is a bug fix that ensures existing settings work as documented.
+
+## Limitations
+
+- The fix only addresses the constraint reset issue; it does not change the fundamental behavior of primary shard balancing
+- Primary shard balance is still a best-effort optimization and may not achieve perfect balance in all scenarios
+
+## Related PRs
+
+| PR | Description |
+|----|-------------|
+| [#19012](https://github.com/opensearch-project/OpenSearch/pull/19012) | Fix Allocation and Rebalance Constraints of WeightFunction are incorrectly reset |
+
+## References
+
+- [Issue #13429](https://github.com/opensearch-project/OpenSearch/issues/13429): Original bug report
+- [Cluster Settings Documentation](https://docs.opensearch.org/3.0/install-and-configure/configuring-opensearch/cluster-settings/): Official cluster routing allocation settings
+- [Segment Replication Documentation](https://docs.opensearch.org/3.0/tuning-your-cluster/availability-and-recovery/segment-replication/index/): Recommended settings for segment replication
+
+## Related Feature Report
+
+- [Full feature documentation](../../../../features/opensearch/shard-allocation.md)
diff --git a/docs/releases/v3.4.0/index.md b/docs/releases/v3.4.0/index.md
index 1459e4fb3..f64cd5b64 100644
--- a/docs/releases/v3.4.0/index.md
+++ b/docs/releases/v3.4.0/index.md
@@ -38,6 +38,7 @@
 - [Pull-based Ingestion Bugfixes](features/opensearch/pull-based-ingestion-bugfixes.md) - Fix out-of-bounds offset handling and remove persisted pointers for at-least-once guarantees
 - [Query Bugfixes](features/opensearch/query-bugfixes.md) - Fix crashes in wildcard queries, aggregations, highlighters, and script score queries
 - [Reactor Netty Transport](features/opensearch/reactor-netty-transport.md) - Fix HTTP channel tracking and release during node shutdown
+- [Shard Allocation](features/opensearch/shard-allocation.md) - Fix WeightFunction constraint reset when updating balance factors
 - [Shard & Segment Bugfixes](features/opensearch/shard-segment-bugfixes.md) - Fix merged segment warmer exceptions, ClusterService state assertion, and EngineConfig builder
 - [Snapshot & Restore Bugfixes](features/opensearch/snapshot-restore-bugfixes.md) - Fix NullPointerException when restoring remote snapshot with missing shard size information