@errose28
Contributor

@errose28 errose28 commented Apr 19, 2024

What changes were proposed in this pull request?

Currently there are two ways to reserve space in datanode volumes:

  1. hdds.datanode.dir.du.reserved.percent allows specifying a percentage of the volume's space to be left unused. It applies to all volumes.
  2. hdds.datanode.dir.du.reserved allows specifying a map of volume name to bytes reserved. Since it depends on a volume path, it cannot have a default value.

By default, Ozone should not allow datanode volumes to become 100% full. A full volume can cause the drive to "lock up": operations that would free up space, such as block delete, must first append to the RocksDB WAL and therefore need some extra disk space before they can complete. Once encountered, such issues are difficult to resolve. This PR adds a default value for hdds.datanode.dir.du.reserved.percent to prevent this from happening.

A default value of 0.0001f is currently chosen. This is 0.01%, which reserves 1GB out of a 10TB volume, 100MB out of a 1TB volume, etc. Ideally we could reserve a fixed size (like 1GB) regardless of drive size, but we would need to rework the configs before we can do that, which might need more discussion. See HDDS-10721.
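To make the arithmetic behind a 0.0001 fraction easy to check, here is a minimal standalone sketch. It is not Ozone's implementation: the class `ReservedPercentSketch` and the method `reservedBytes` are hypothetical names, volume sizes are assumed to be decimal (1 TB = 10^12 bytes), and `BigDecimal` is used only to keep the example free of floating-point rounding.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class ReservedPercentSketch {

    // Hypothetical helper (not an Ozone API): bytes reserved on a volume
    // when the configured value is treated as a fraction of capacity.
    static long reservedBytes(long capacityBytes, String fraction) {
        return new BigDecimal(fraction)
            .multiply(BigDecimal.valueOf(capacityBytes))
            .setScale(0, RoundingMode.DOWN)   // drop any fractional byte
            .longValueExact();
    }

    public static void main(String[] args) {
        long tb = 1_000_000_000_000L; // assumption: decimal terabyte
        // 0.0001 is 0.01% of the volume.
        System.out.println(reservedBytes(10 * tb, "0.0001")); // 1000000000 bytes (1 GB)
        System.out.println(reservedBytes(tb, "0.0001"));      // 100000000 bytes (100 MB)
    }
}
```

Because the reservation scales with capacity, small volumes reserve proportionally little, which is the limitation the fixed-size idea in HDDS-10721 is meant to address.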

This PR also fixes a few other bugs that prevented tests from passing after the change:

  • A non-zero default value for hdds.datanode.dir.du.reserved.percent would not be used.
  • Canonicalization was not done on paths in hdds.datanode.dir.du.reserved. This may have passed in CI but was failing due to my local filesystem setup.
  • Invalid space reserved configurations would fall back to 0 (hardcoded) instead of the default value.
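The last two fixes can be illustrated with a small standalone sketch. It is not the code from this patch: `ReservedConfigSketch`, `fractionOrDefault`, and `canonicalKey` are hypothetical names, and the rule that a valid value lies in [0, 1] is an assumption made for the example. Only `java.io.File#getCanonicalPath` is a real API; it resolves `.` and `..` segments and, on most platforms, symbolic links.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;

public class ReservedConfigSketch {

    // Default taken from the PR description (0.0001 = 0.01%).
    static final float DEFAULT_RESERVED_FRACTION = 0.0001f;

    // Hypothetical parser: an invalid value falls back to the default,
    // not to a hardcoded 0. The [0, 1] range check is an assumption.
    static float fractionOrDefault(String raw) {
        if (raw == null) {
            return DEFAULT_RESERVED_FRACTION;
        }
        try {
            float value = Float.parseFloat(raw.trim());
            if (value >= 0f && value <= 1f) { // NaN fails both comparisons
                return value;
            }
        } catch (NumberFormatException e) {
            // fall through to the default below
        }
        return DEFAULT_RESERVED_FRACTION;
    }

    // Hypothetical helper: canonicalize a configured volume path so that
    // different spellings of the same directory compare equal.
    static String canonicalKey(String path) {
        try {
            return new File(path).getCanonicalPath();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fractionOrDefault("not-a-number") == DEFAULT_RESERVED_FRACTION);
        System.out.println(canonicalKey("data/./vol1").equals(canonicalKey("data/vol1")));
    }
}
```

Comparing canonical forms on both sides (the configured key and the volume's own path) is what makes the lookup independent of how the local filesystem spells the path.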

What is the link to the Apache JIRA?

HDDS-10720

How was this patch tested?

Unit test added.

@errose28 errose28 marked this pull request as draft April 20, 2024 00:19
@errose28
Contributor Author

Looks like there are other tests that were depending on 0 being the default value. Let me get them fixed on my fork before running CI here.

@errose28
Contributor Author

Green CI on my fork, ready for review.

@errose28 errose28 marked this pull request as ready for review April 22, 2024 17:36
Contributor

@adoroszlai adoroszlai left a comment

Thanks @errose28 for the patch, LGTM.

@adoroszlai adoroszlai merged commit 1324e95 into apache:master May 3, 2024
@errose28
Contributor Author

errose28 commented May 3, 2024

Thanks for the review @adoroszlai

jojochuang pushed a commit to jojochuang/ozone that referenced this pull request May 29, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 17, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Jul 18, 2024
swamirishi pushed a commit to swamirishi/ozone that referenced this pull request Dec 3, 2025
…ve a non-zero default value. (apache#6561)

(cherry picked from commit 1324e95)

Conflicts:
TestReservedVolumeSpace#testPathsCanonicalized: Some junit TemporaryFolder methods were unavailable, so did the following instead:
    Path symlink = new File(temp.getRoot(), "link").toPath();
    Files.createSymbolicLink(symlink, folder.getRoot().toPath());

Other conflicts:
	hadoop-hdds/common/src/main/java/org/apache/hadoop/hdds/scm/ScmConfigKeys.java
	hadoop-hdds/container-service/src/main/java/org/apache/hadoop/ozone/container/common/volume/VolumeUsage.java
	hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/common/volume/TestCapacityVolumeChoosingPolicy.java
	hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/common/volume/TestReservedVolumeSpace.java
	hadoop-hdds/container-service/src/test/java/org/apache/hadoop/ozone/container/common/volume/TestRoundRobinVolumeChoosingPolicy.java