HDDS-8254. Close containers when volume reaches utilisation threshold #4583
Conversation
sumitagrawl left a comment
@sadanand48 Thanks for working on this. IMO, the flow in this PR is currently:
- if the condition is met, the datanode notifies SCM to close the container, and SCM triggers the close after some time
- writes continue in the meantime
Meanwhile, hdds.datanode.du.reserved will always allow a new container to be created and written to, so this will have two impacts:
- many small containers get created
- the client may face failures against closed containers
Writes will still happen until hdds.datanode.du.reserved is hit during container allocation, so this property will not provide much value in that regard.
Thanks @sumitagrawl for the review. I have now aligned the config to also take effect during container allocation on a volume, at creation time. So the flow would be: on client retry, new containers should be allocated in this case.
For bigger volumes like 20TB, the default soft limit of 0.9 still leaves 2TB of space available for writes. I'm wondering if it's better to define the limit in a different format, such as raw available capacity - something like 5GB?
Hi @sadanand48, can we reuse the "hdds.datanode.storage.utilization.critical.threshold" property?
Thanks @siddhantsangwan for the comment. This makes sense; I have updated my patch to use capacity instead of percentage. Please take a look.
Thanks @ChenSammi for the comment. Now that I have changed the current patch to use capacity, should I still switch to this property, given that hdds.datanode.storage.utilization.critical.threshold takes a float that represents a percentage?
Hi @sadanand48, we already have the following properties in Ozone now.
From the user's point of view, "hdds.datanode.volume.min.free.space" looks functionally very similar to "hdds.datanode.dir.du.reserved", like another kind of reserved space. Maybe we can use "hdds.datanode.dir.du.reserved" directly? Currently the default values of "hdds.datanode.dir.du.reserved" and "hdds.datanode.dir.du.reserved.percent" are 0. We could change their default values (5GB, like in this patch, and 0.95 for percent). What do you think?
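A minimal sketch contrasting the two knobs under discussion. The property names come from this thread; the `OzoneConfiguration` usage and the "5GB" value syntax are illustrative assumptions, not the merged defaults, so check ozone-default.xml for the authoritative format:

```java
import org.apache.hadoop.hdds.conf.OzoneConfiguration;

// Sketch only: contrasts the two "reserved space" style properties.
public class ReservedSpaceConfigSketch {
  public static void main(String[] args) {
    OzoneConfiguration conf = new OzoneConfiguration();
    // Hard reserve: space Ozone should never consume on the volume.
    conf.set("hdds.datanode.dir.du.reserved", "5GB");
    // This PR's floor: close containers once available space drops to it.
    conf.set("hdds.datanode.volume.min.free.space", "5GB");
  }
}
```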
I just gave this a thought and realised that we had introduced this property because the flow is such that the reserved space can be crossed by a little during writes.
If we are okay with crossing the reserved space by a little, then we can set a default for reserved space; otherwise we need a small buffer, like the one defined in this PR, that kicks in before the reserved space is reached.
Had an offline discussion with @sadanand48; here are the agreed points:
@sadanand48 is this PR ready for review now, or are you planning to push more commits?
...iner-service/src/main/java/org/apache/hadoop/ozone/container/common/impl/HddsDispatcher.java
@siddhantsangwan, this is ready for review now. I have added options for both a percentage and a value, and the user can choose either.
siddhantsangwan left a comment
@sadanand48 Thanks for working on this. In general, I think we should use available space instead of `capacity - used`. When reserved space hasn't been configured, `capacity - used` just subtracts the space used by Ozone from the total capacity; it doesn't take into account space used by other applications besides Ozone. `VolumeInfo#getAvailable` does take this into account, so it's likely to be more accurate. What do you think?
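To illustrate the distinction, a hedged sketch with made-up fields, not the actual `VolumeInfo` internals:

```java
// Illustrative only: contrasts "capacity - used" with what
// VolumeInfo#getAvailable approximates. Field names are hypothetical.
final class VolumeSpaceExample {
  long capacity;      // total capacity of the volume
  long usedByOzone;   // space consumed by Ozone containers
  long usedByOthers;  // space consumed by other applications on the disk
  long reserved;      // configured reserved space, 0 if unset

  // Naive measure: ignores space taken by non-Ozone applications.
  long capacityMinusUsed() {
    return capacity - usedByOzone;
  }

  // Closer to VolumeInfo#getAvailable: actual free space, so usage by
  // other applications (and any reserve) is accounted for.
  long available() {
    return capacity - reserved - usedByOzone - usedByOthers;
  }
}
```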
...-service/src/test/java/org/apache/hadoop/ozone/container/common/impl/TestHddsDispatcher.java
Thanks, good catch. Updated the patch.
sumitagrawl left a comment
@sadanand48 LGTM +1
siddhantsangwan left a comment
LGTM, thanks for the work @sadanand48
```java
long volumeFreeSpaceToSpare =
    VolumeUsage.getMinVolumeFreeSpace(conf, volumeCapacity);
long volumeAvailable = volume.getAvailable();
return (volumeAvailable <= volumeFreeSpaceToSpare);
```
@sadanand48 , "- vol.getCommittedBytes()" is missing here.
Done.
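For reference, a sketch of the check with that fix applied. Variable names follow the snippet quoted above; this is illustrative, not necessarily the exact merged code:

```java
// Sketch: subtract bytes already committed to open containers so the
// free-space check does not over-count what is actually writable.
long volumeFreeSpaceToSpare =
    VolumeUsage.getMinVolumeFreeSpace(conf, volumeCapacity);
long volumeAvailable = volume.getAvailable() - volume.getCommittedBytes();
return (volumeAvailable <= volumeFreeSpaceToSpare);
```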
Thanks @sadanand48. The last patch LGTM, +1.
…apache#4583) (cherry picked from commit ab5265b)
What changes were proposed in this pull request?
Close containers when a volume reaches its utilisation threshold. If the volume is configured with reserved space, the soft limit is hit when `(capacity - reserved) - used <= minFreeSpaceOnVolume`. By default the value is 5GB.
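As a hedged sketch, the condition reads roughly like this; the class and method names are hypothetical, not the merged code:

```java
// Illustrative form of the soft limit described above.
final class VolumeSoftLimitSketch {
  // True once usable space (capacity minus the configured reserve and
  // current usage) falls to the minimum free space (default 5GB).
  static boolean volumeFull(long capacity, long reserved, long used,
      long minFreeSpaceOnVolume) {
    return (capacity - reserved) - used <= minFreeSpaceOnVolume;
  }
}
```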
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-8254
How was this patch tested?
Unit tests
PushReplicator and DownloadAndImportReplicator, when replicating a container to a target datanode, respect the VolumeChoosingPolicy, which checks whether a volume has enough space to place the container there. If there is no space available (required space < available, where available accounts for reserved space), the volume is not chosen.
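A minimal sketch of that eligibility check; the names are hypothetical, and the real logic lives in the VolumeChoosingPolicy implementations:

```java
// Illustrative: a volume can host the replicated container only if the
// required space fits in its available space (which already accounts
// for the reserve).
final class VolumeChoiceSketch {
  static boolean canPlaceContainer(long requiredSpace, long available) {
    return requiredSpace < available;
  }
}
```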