HDDS-13922. Increase available space requirement during pipeline/container allocation #9362
Conversation
peterxcli left a comment
Thanks @rich7420 for this patch! Left some comments.
.github/workflows/check.yml
Outdated
  cat target/${{ inputs.script }}/summary.md >> $GITHUB_STEP_SUMMARY
  fi
- hadoop-ozone/dev-support/checks/_summary.sh target/${{ inputs.script }}/summary.txt
+ hadoop-ozone/dev-support/checks/_summary.sh target/${{ inputs.script }}/summary.txt || true
Suggested change:
- hadoop-ozone/dev-support/checks/_summary.sh target/${{ inputs.script }}/summary.txt || true
+ hadoop-ozone/dev-support/checks/_summary.sh target/${{ inputs.script }}/summary.txt
  DEVELOCITY_ACCESS_KEY: ${{ secrets.DEVELOCITY_ACCESS_KEY }}
  - name: Summary of failures
- run: hadoop-ozone/dev-support/checks/_summary.sh target/unit/summary.txt
+ run: hadoop-ozone/dev-support/checks/_summary.sh target/unit/summary.txt || true
Suggested change:
- run: hadoop-ozone/dev-support/checks/_summary.sh target/unit/summary.txt || true
+ run: hadoop-ozone/dev-support/checks/_summary.sh target/unit/summary.txt
  continue-on-error: true
  - name: Summary of failures
- run: hadoop-ozone/dev-support/checks/_summary.sh target/${{ github.job }}/summary.txt
+ run: hadoop-ozone/dev-support/checks/_summary.sh target/${{ github.job }}/summary.txt || true
Suggested change:
- run: hadoop-ozone/dev-support/checks/_summary.sh target/${{ github.job }}/summary.txt || true
+ run: hadoop-ozone/dev-support/checks/_summary.sh target/${{ github.job }}/summary.txt
| "ozone.scm.container.size"; | ||
| public static final String OZONE_SCM_CONTAINER_SIZE_DEFAULT = "5GB"; | ||
|
|
||
| public static final String OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER = |
I’m thinking this patch only needs to add a config to set the required data size (dataSizeRequired), which we can then read from ReplicationManagerUtil here: https://github.com/peterxcli/ozone/blob/7868a862ff616154e4bd9ab52dada929e15042ec/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManagerUtil.java#L93-L94.
@siddhantsangwan does this approach look correct?
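For illustration, a minimal sketch of that idea, assuming a hypothetical config key (ozone.scm.container.data.size.required is made up here) and the standard Hadoop Configuration API; the real wiring in ReplicationManagerUtil would look different:

```java
import org.apache.hadoop.conf.Configuration;

// Minimal sketch of a configurable required-data-size, as suggested above.
// The key name below is hypothetical, not an existing Ozone config.
public final class DataSizeRequiredSketch {

  static final String DATA_SIZE_REQUIRED_KEY =
      "ozone.scm.container.data.size.required"; // illustrative only

  // Falls back to the plain container size when the key is unset.
  static long getDataSizeRequired(Configuration conf, long containerSize) {
    return conf.getLong(DATA_SIZE_REQUIRED_KEY, containerSize);
  }
}
```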
Thanks @peterxcli! I've tried to simplify the approach: removed the new config entirely and directly used HddsServerUtil.requiredReplicationSpace(), which already provides the 2x multiplier.
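A sketch of what that simplification could look like; the exact signature of requiredReplicationSpace() is assumed here, while the ScmConfigKeys constants are the ones visible in the diff above:

```java
import org.apache.hadoop.conf.StorageUnit;
import org.apache.hadoop.hdds.conf.OzoneConfiguration;
import org.apache.hadoop.hdds.scm.ScmConfigKeys;
import org.apache.hadoop.hdds.utils.HddsServerUtil;

// Sketch: derive the space requirement from the configured container size via
// the existing utility instead of a new multiplier config. Assumes
// requiredReplicationSpace(long) takes the container size in bytes.
public final class RequiredSpaceSketch {
  static long dataSizeRequired(OzoneConfiguration conf) {
    long containerSize = (long) conf.getStorageSize(
        ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE,
        ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE_DEFAULT,
        StorageUnit.BYTES);
    // Per the discussion above, this already applies the 2x buffer.
    return HddsServerUtil.requiredReplicationSpace(containerSize);
  }
}
```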
  </property>
  <property>
    <name>ozone.scm.container.space.requirement.multiplier</name>
    <value>2.0</value>
It would solve the problem with 2 concurrent allocations, which would both pass, but with increased concurrency we could expect the same failure.
I don't think we even need to add one more config here; instead this value can be 5x or 10x of the container size, since concurrency here is unpredictable. Even if concurrency is very high on the same DN at the same time, the client always has a mechanism to retry on another pipeline.

I'd rather suggest graceful handling of the out-of-space event from the DN during container creation: if creation fails, fall back to one of the already opened containers. The other approach would be serializing container creation when the DN has less than 5-10% free space.

Correct me if I'm wrong, but I haven't looked into the related code, so I'm not sure whether we can do the same thing on the SCM side.

On the DN we have committed bytes, so this scenario is simple and easy to handle there. But in SCM we don't track committed bytes, so we can't apply the same control as on the DN, and adding committed-bytes tracking to SCM would be too complex.

You're right that 2x only provides a buffer for limited concurrency. However, implementing the more robust solutions you suggested (graceful out-of-space handling, serialized allocation) would require significant architectural changes. I think the current approach, using the existing HddsServerUtil.requiredReplicationSpace(), is a reasonable trade-off.

@rich7420, thanks for working on this task. IIUC, SCM doesn't track the new containers allocated on a specific DN, so with concurrent container allocation requests the space check in SCM will pass even when there is not enough space on the DN to serve all the new containers. (DN-side container allocation space reservation is now serialized (https://issues.apache.org/jira/browse/HDDS-12810), so the DN will fail the requests when there is not enough space.) Therefore, increasing the available space requirement check above 5GB in SCM cannot guarantee to solve the problem: since concurrency is not a fixed quantity, it's hard to decide which number is enough in SCM.
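To make that race concrete, a toy sketch (illustrative numbers, not actual SCM code) of how every concurrent request can pass a threshold check against the same free-space snapshot, while the DN-side serialized reservation admits only what actually fits:

```java
// Toy illustration of the race described above. Numbers are hypothetical:
// 12GB free on the DN, 10GB required per allocation (2 x 5GB container size).
public final class ConcurrentCheckSketch {
  public static void main(String[] args) {
    long freeOnDn = 12L << 30;   // 12GB free-space snapshot seen by SCM
    long required = 10L << 30;   // 10GB requirement with a 2x buffer

    // SCM side: both concurrent requests read the same snapshot and pass.
    for (int request = 1; request <= 2; request++) {
      System.out.println("SCM check, request " + request + ": "
          + (freeOnDn >= required)); // true, true
    }

    // DN side (HDDS-12810): reservations are serialized, so space is
    // actually deducted and the second request is rejected.
    long remaining = freeOnDn;
    for (int request = 1; request <= 2; request++) {
      boolean admitted = remaining >= required;
      if (admitted) {
        remaining -= required;     // request 1: 12GB -> 2GB
      }
      System.out.println("DN reservation, request " + request + ": " + admitted);
    }
  }
}
```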
siddhantsangwan left a comment
I think this will not completely fix the problem, but it can improve the situation. If everyone agrees to go ahead with this change, we can make it configurable and have at least 5x, that is 25 GB, as the default. This will need to be checked at both points: when creating a pipeline and when creating a new container.
Do we have any alternative ways to solve this problem?

I'm not sure what the client retry behaviour is. We should have some retry mechanism to pick a different pipeline. But write performance is already degraded at that point, so we need to make better decisions at the SCM level about which pipeline is given to a client in the first place. I think we need to test the correctness and performance of CapacityPipelineChoosePolicy.
.github/workflows/check.yml
Outdated
- hadoop-ozone/dev-support/checks/_summary.sh target/${{ inputs.script }}/summary.txt
+ if [[ -f "hadoop-ozone/dev-support/checks/_summary.sh" ]]; then
+   hadoop-ozone/dev-support/checks/_summary.sh target/${{ inputs.script }}/summary.txt
+ fi
These look unrelated. Why are CI workflow files being changed?
Thanks for asking @adoroszlai! This PR changed this part because I got an error when _summary.sh was missing.
I can remove it.
> I got an error when _summary.sh was missing.
_summary.sh does exist in the source repo. Please check for the root cause instead of masking it.
summary.md is checked because it's generated by some CI checks, but not all.
Thanks for all the valuable feedback.
Pull request overview
This PR introduces a configurable multiplier for space requirements during container allocation to prevent concurrent allocation failures. The implementation adds ozone.scm.container.space.requirement.multiplier (default 5.0) to increase the space check threshold, though there's a discrepancy with the PR description which states the default is 2.0.
Key Changes:
- Added configurable space requirement multiplier with default value of 5.0
- Updated container allocation logic to use multiplied space requirement (25GB with default 5GB container size)
- Modified pipeline providers (Ratis and EC) to use the new space calculation
Reviewed changes
Copilot reviewed 8 out of 14 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| hadoop-hdds/common/src/main/resources/ozone-default.xml | Adds configuration property for space requirement multiplier with default value 5.0 |
| hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/ScmConfigKeys.java | Defines the new configuration key and default constant (5.0) |
| hadoop-hdds/framework/src/main/java/org/apache/hadoop/hdds/utils/HddsServerUtil.java | Adds overloaded method to calculate required space using the configurable multiplier |
| hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/ContainerManagerImpl.java | Updates container allocation to use the new space requirement calculation |
| hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/RatisPipelineProvider.java | Updates Ratis pipeline creation to use multiplied space requirement |
| hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/pipeline/ECPipelineProvider.java | Updates EC pipeline creation to use multiplied space requirement |
| hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/container/TestContainerManagerImpl.java | Adds test to verify space requirement calculation is used correctly |
| hadoop-hdds/server-scm/src/test/java/org/apache/hadoop/hdds/scm/pipeline/TestECPipelineProvider.java | Updates test to use the new space requirement calculation |
| hadoop-ozone/dist/src/main/k8s/examples/*/config-configmap.yaml | Adds container size configuration (1GB) to K8s deployment examples |
+ public static final String OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER =
+     "ozone.scm.container.space.requirement.multiplier";
+ public static final double OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER_DEFAULT = 5.0;
Copilot AI (Dec 6, 2025)
The PR description states the default multiplier is 2.0, but the implementation sets it to 5.0. This inconsistency should be resolved. Either update the PR description to reflect the actual default of 5.0, or change the default value in the code to 2.0 to match the description.
Suggested change:
- public static final double OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER_DEFAULT = 5.0;
+ public static final double OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER_DEFAULT = 2.0;
  <description>
    Multiplier for container space requirement when checking if a datanode
    has enough space for container allocation. The required space is calculated
    as container size multiplied by this value. This prevents concurrent clients
    from all passing the space check when there's only enough space for one
    container. For example, with default container size of 5GB and multiplier
    of 5.0, the system will require 25GB of available space before allocating
    a new container. This ensures that if only 6GB is remaining, the check will
    fail, preventing multiple clients from attempting to create containers
    concurrently when there's only space for one.
  </description>
Copilot AI (Dec 6, 2025)
With a default container size of 5GB and a multiplier of 5.0, the required space would be 25GB (5GB * 5.0), not 10GB as mentioned in the PR description. The documentation correctly states 25GB, but this contradicts the PR description, which mentions increasing from 5GB to 10GB (and would thus require a multiplier of 2.0).
  </property>
  <property>
    <name>ozone.scm.container.space.requirement.multiplier</name>
    <value>5.0</value>
Copilot AI (Dec 6, 2025)
The PR description states the default multiplier is 2.0, but the implementation sets it to 5.0. This inconsistency should be resolved. Either update the PR description to reflect the actual default of 5.0, or change the default value in the code to 2.0 to match the description.
Changed to draft due to failures in acceptance tests: https://github.com/rich7420/ozone/actions/runs/19981844525

CI runs with limited space and a limited number of nodes. We should either set multiplier=1 or reduce the container size.
I had a separate discussion with @ChenSammi and @ashishkumar50. It was decided it's better to pause this for now, test CapacityPipelineChoosePolicy, and then decide if we should go ahead with this. The reason is that we already have a lot of configurations in this area and it's complicated; adding another configuration or buffer becomes hard to explain to Ozone users. I'm working on testing and exploring CapacityPipelineChoosePolicy and will create a JIRA soon. Based on the testing we can decide whether to go ahead with this pull request (or a better approach). @rich7420 what do you think?
Sounds good to me, thanks for driving this. Let's close this PR and proceed with testing CapacityPipelineChoosePolicy. I'm happy to help review the new JIRA/patch or adjust this PR based on the results. Thanks @siddhantsangwan!
What changes were proposed in this pull request?
This PR added a configurable multiplier (ozone.scm.container.space.requirement.multiplier, default 2.0) that increases the space check from 5GB to 10GB during container allocation. This gives a buffer so multiple allocations don't fail when there's barely enough space.
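As a rough sketch of the arithmetic this describes (names are illustrative, not the exact patch contents):

```java
// Sketch of the multiplied space check: with the default 5GB container size
// and a 2.0 multiplier, 10GB of available space is required. Names below are
// illustrative, not the exact patch contents.
public final class SpaceRequirementSketch {
  static final double MULTIPLIER_DEFAULT = 2.0; // per this description

  static long requiredSpace(long containerSizeBytes, double multiplier) {
    return (long) (containerSizeBytes * multiplier);
  }

  public static void main(String[] args) {
    long containerSize = 5L << 30; // 5GB default container size
    // Prints 10737418240, i.e. 10GB.
    System.out.println(requiredSpace(containerSize, MULTIPLIER_DEFAULT));
  }
}
```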
What is the link to the Apache JIRA?
HDDS-13922
How was this patch tested?
All tests passed in:
https://github.com/rich7420/ozone/actions/runs/19655213061