HDDS-13286. Fail stream write when the volume is full. #8644
Conversation
Pull Request Overview
This PR introduces a check for volume full conditions during stream write operations, ensuring that a write failure is triggered with a proper error message and exception when disk space is exhausted.
- Added a new method assertSpaceAvailability in the base stream data channel class
- Updated the KeyValueStreamDataChannel to perform a space availability check before writing
- Introduced unit tests to validate the disk full failure scenario
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| TestKeyValueStreamDataChannel.java | Added unit tests to simulate and assert behavior when the volume is full |
| StreamDataChannelBase.java | Implemented assertSpaceAvailability method to throw StorageContainerException on insufficient space |
| KeyValueStreamDataChannel.java | Invoked assertSpaceAvailability before writing to the channel |
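To make the flow concrete, here is a minimal sketch of the pattern the PR follows; the class and method names below are simplified stand-ins for illustration, not the actual StreamDataChannelBase / KeyValueStreamDataChannel code.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Simplified stand-in for the stream data channel classes; the real code lives
// in org.apache.hadoop.ozone.container.keyvalue.impl.
abstract class StreamChannelSketch {
  // Added by this PR (in the base class): throws when the backing volume
  // cannot absorb the write.
  abstract void assertSpaceAvailability() throws IOException;

  // Actual write to the underlying file channel.
  abstract void writeToFile(ByteBuffer data) throws IOException;

  // The key-value channel now checks space before writing.
  final void write(ByteBuffer data) throws IOException {
    assertSpaceAvailability();
    writeToFile(data);
  }
}
```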
szetszwo left a comment
@sumitagrawl, thanks for working on this! Please see the inline comments.
```java
      throws IOException {
    getMetrics().incContainerOpsMetrics(getType());
    assertOpen();
    assertSpaceAvailability();
```
It should account for referenceCounted.get().remaining().
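As a rough illustration of this suggestion (using an AtomicReference as a stand-in for the reference-counted buffer, and a hypothetical requiredSpace helper), the space check should cover the incoming data plus the bytes still pending in that buffer:

```java
import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicReference;

final class SpaceAccountingSketch {
  // Bytes the volume must absorb: the incoming chunk plus whatever is still
  // pending in the reference-counted putBlock buffer.
  static long requiredSpace(ByteBuffer incoming, AtomicReference<ByteBuffer> referenceCounted) {
    return (long) incoming.remaining() + referenceCounted.get().remaining();
  }
}
```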
done
```java
if (containerData.getVolume().isVolumeFull()) {
  throw new StorageContainerException("write failed for container " + containerData.getContainerID()
      + " due to volume " + containerData.getVolume().getStorageID() + " out of space "
      + containerData.getVolume().getCurrentUsage(), DISK_OUT_OF_SPACE);
```
We should also print the reserved space and the required space; otherwise, the message will be confusing, since it will show space as available.
- Please move the method to `HddsVolume` as `assertSpaceForWrite`.
- Accept the required size as a parameter, as @szetszwo mentioned.
- In the implementation, add local variables for the values (`currentUsage`, `minFreeSpace`) and use them in both the calculation and the log message. Do not call `boolean isVolumeFull()`, because then the message needs to obtain the same values again. (See the sketch after this list.)
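A minimal sketch of what such a method could look like, assuming the `getCurrentUsage()` / `getFreeSpaceToSpare()` accessors shown later in this thread; the types below are simplified stand-ins, not the real HddsVolume, SpaceUsageSource, or StorageContainerException:

```java
import java.io.IOException;

// Simplified stand-ins for SpaceUsageSource and HddsVolume; the real classes
// live in the HDDS container-service module.
record SpaceUsage(long capacity, long available) { }

abstract class VolumeSketch {
  abstract String getStorageID();
  abstract SpaceUsage getCurrentUsage();
  abstract long getFreeSpaceToSpare(long capacity);

  // Sketch of assertSpaceForWrite(required): read the values once into locals
  // and reuse them in both the check and the message, instead of calling
  // isVolumeFull() and then fetching the same values again for logging.
  final void assertSpaceForWrite(long required) throws IOException {
    final SpaceUsage currentUsage = getCurrentUsage();
    final long minFreeSpace = getFreeSpaceToSpare(currentUsage.capacity());
    if (currentUsage.available() - minFreeSpace < required) {
      throw new IOException("Out of space on volume " + getStorageID()
          + ": available=" + currentUsage.available()
          + ", reserved=" + minFreeSpace
          + ", required=" + required);
    }
  }
}
```

Note that building the message from the same locals also addresses the earlier point about printing the reserved and required space.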
done
@adoroszlai the other points are done, but I am unable to move the method to HddsVolume, as the log message needs the containerId, which is not available in HddsVolume. For a complete log message, the check would have to be done at an outer layer, or the exception would have to be wrapped again to add more detail at the outer layer.
Pass the container-specific part of the message as a `Supplier<String>`.
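A small sketch of that idea (the method and `messagePrefix` parameter names are illustrative): the volume-level check stays ignorant of containers, and the container-specific text is only built when the check fails.

```java
import java.io.IOException;
import java.util.function.Supplier;

final class SupplierMessageSketch {
  // The volume-level check takes the container-specific prefix lazily.
  static void assertSpaceForWrite(long available, long required,
      Supplier<String> messagePrefix) throws IOException {
    if (available < required) {
      throw new IOException(messagePrefix.get()
          + ": required=" + required + ", available=" + available);
    }
  }
}
```

A call site in the data channel, where the container ID is known, could then pass something like `() -> "Write failed for container " + containerData.getContainerID()`.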
Also, please rename it to `assertSpaceForWrite` for clarity and simplicity, and rename the parameter `remaining` (which is in terms of the buffer) to `required`.
szetszwo left a comment
@sumitagrawl, thanks for the update! Please see the inline comment.
On a related note: I don't think it's good practice to have two or more pull requests in parallel with similar code changes handling different cases (stream vs. non-stream write path). You inevitably get duplication in one or more areas:
Depending on the size of the changes, please either:
```java
final SpaceUsageSource currentUsage = volume.getCurrentUsage();
final long spared = volume.getFreeSpaceToSpare(currentUsage.getCapacity());

if (currentUsage.getAvailable() - spared <= sizeRequested) {
```
@sumitagrawl, it should be `<`, since `==` means that the space is enough. Sorry that I did not see it earlier.
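In other words, only the comparison changes; a tiny sketch of the corrected predicate:

```java
final class SpaceCheckSketch {
  // Equality means there is exactly enough room, so only a strict shortfall
  // should fail the write.
  static boolean outOfSpace(long available, long spared, long sizeRequested) {
    return available - spared < sizeRequested;
  }
}
```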
updated
szetszwo left a comment
+1, the change looks good.
@sumitagrawl, thanks for working on this! @adoroszlai, thanks for reviewing this!
…239-container-reconciliation

Commits: 9
- 05567e6 HDDS-13317. Table should support empty array/String (apache#8676)
- 803fa31 HDDS-13313. Bump maven-javadoc-plugin to 3.11.2 (apache#8672)
- 774b3ee HDDS-13315. Bump log4j2 to 2.25.0 (apache#8670)
- b86fad8 HDDS-13316. Bump commons-text to 1.13.1 (apache#8669)
- c043be7 HDDS-13312. Bump maven-patch-plugin to 1.3 (apache#8673)
- db6ed84 HDDS-13294. Remove the Table.close() method. (apache#8658)
- e1365a7 HDDS-13286. Fail stream write when the volume is full. (apache#8644)
- af3a0f0 HDDS-13307. integration (flaky) fails with all tests passing (apache#8667)
- d207c18 HDDS-13301. Fix TestSchemaOneBackwardsCompatibility (apache#8666)

Conflicts:
- hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java
…pache#8644) (cherry picked from commit e1365a7)

Conflicts:
- ContainerUtils: Had to use volume.getVolumeInfo() for getting SpaceUsageSource. Added a volume.getVolumeInfo().isPresent() check. Used ReferenceCountedObject.wrap(putBlockBuf, () -> {}, () -> {}) instead of ReferenceCountedObject.wrap(putBlockBuf), as the single-arg method is not present in this version.
- hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/helpers/ContainerUtils.java
- hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/KeyValueStreamDataChannel.java
- hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/keyvalue/impl/StreamDataChannelBase.java
- hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/keyvalue/impl/TestKeyValueStreamDataChannel.java
What changes were proposed in this pull request?
When the volume is full, meaning no space is available for data writes, the write should fail with a Disk Out of Space error.
This needs to be done for write operations on the stream write path.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13286
How was this patch tested?