-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-12622. Refactor minFreeSpace calculation #8119
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -112,7 +112,6 @@ public class HddsDispatcher implements ContainerDispatcher, Auditor { | |
| private ContainerMetrics metrics; | ||
| private final TokenVerifier tokenVerifier; | ||
| private long slowOpThresholdNs; | ||
| private VolumeUsage.MinFreeSpaceCalculator freeSpaceCalculator; | ||
|
|
||
| /** | ||
| * Constructs an OzoneContainer that receives calls from | ||
|
|
@@ -146,7 +145,6 @@ public HddsDispatcher(ConfigurationSource config, ContainerSet contSet, | |
| LOG, | ||
| HddsUtils::processForDebug, | ||
| HddsUtils::processForDebug); | ||
| this.freeSpaceCalculator = new VolumeUsage.MinFreeSpaceCalculator(conf); | ||
| } | ||
|
|
||
| @Override | ||
|
|
@@ -619,7 +617,7 @@ private boolean isVolumeFull(Container container) { | |
| if (isOpen) { | ||
| HddsVolume volume = container.getContainerData().getVolume(); | ||
| SpaceUsageSource usage = volume.getCurrentUsage(); | ||
| long volumeFreeSpaceToSpare = freeSpaceCalculator.get(usage.getCapacity()); | ||
| long volumeFreeSpaceToSpare = volume.getFreeSpaceToSpare(usage.getCapacity()); | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's fine, just not looking that straightforward that need a capacity parameter as the volume has the capacity value.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes, I also don't like to pass capacity to volume, so I plan to further refactor in follow-up. But wanted to avoid bloating this patch.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Created HDDS-12670 for follow-up.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @ChenSammi @peterxcli #8167 addresses this. |
||
| return !VolumeUsage.hasVolumeEnoughSpace(usage.getAvailable(), volume.getCommittedBytes(), 0, | ||
| volumeFreeSpaceToSpare); | ||
| } | ||
|
|
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -18,7 +18,11 @@ | |
| package org.apache.hadoop.ozone.container.common.statemachine; | ||
|
|
||
| import static java.util.concurrent.TimeUnit.MICROSECONDS; | ||
| import static org.apache.hadoop.hdds.conf.ConfigTag.CONTAINER; | ||
| import static org.apache.hadoop.hdds.conf.ConfigTag.DATANODE; | ||
| import static org.apache.hadoop.hdds.conf.ConfigTag.MANAGEMENT; | ||
| import static org.apache.hadoop.hdds.conf.ConfigTag.OZONE; | ||
| import static org.apache.hadoop.hdds.conf.ConfigTag.STORAGE; | ||
| import static org.apache.hadoop.ozone.container.common.statemachine.DatanodeConfiguration.CONFIG_PREFIX; | ||
|
|
||
| import java.time.Duration; | ||
|
|
@@ -28,6 +32,7 @@ | |
| import org.apache.hadoop.hdds.conf.ConfigType; | ||
| import org.apache.hadoop.hdds.conf.PostConstruct; | ||
| import org.apache.hadoop.hdds.conf.ReconfigurableConfig; | ||
| import org.apache.hadoop.hdds.conf.StorageSize; | ||
| import org.slf4j.Logger; | ||
| import org.slf4j.LoggerFactory; | ||
|
|
||
|
|
@@ -68,6 +73,13 @@ public class DatanodeConfiguration extends ReconfigurableConfig { | |
| "hdds.datanode.disk.check.min.gap"; | ||
| public static final String DISK_CHECK_TIMEOUT_KEY = | ||
| "hdds.datanode.disk.check.timeout"; | ||
| public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE = | ||
| "hdds.datanode.volume.min.free.space"; | ||
| public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_DEFAULT = | ||
| "5GB"; | ||
| public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_PERCENT = | ||
| "hdds.datanode.volume.min.free.space.percent"; | ||
| static final byte MIN_FREE_SPACE_UNSET = -1; | ||
|
|
||
| public static final String WAIT_ON_ALL_FOLLOWERS = | ||
| "hdds.datanode.wait.on.all.followers"; | ||
|
|
@@ -319,6 +331,25 @@ public void setBlockDeletionLimit(int limit) { | |
| this.blockLimitPerInterval = limit; | ||
| } | ||
|
|
||
| @Config(key = "hdds.datanode.volume.min.free.space", | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. DatanodeConfiguration has the key prefix "hdds.datanode" defined.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This is now valid after HDDS-12424. It makes searching for config key easier. |
||
| defaultValue = "-1", | ||
| type = ConfigType.SIZE, | ||
| tags = { OZONE, CONTAINER, STORAGE, MANAGEMENT }, | ||
| description = "This determines the free space to be used for closing containers" + | ||
| " When the difference between volume capacity and used reaches this number," + | ||
| " containers that reside on this volume will be closed and no new containers" + | ||
| " would be allocated on this volume." | ||
| ) | ||
| private long minFreeSpace = MIN_FREE_SPACE_UNSET; | ||
|
|
||
| @Config(key = "hdds.datanode.volume.min.free.space.percent", | ||
| defaultValue = "-1", | ||
| type = ConfigType.FLOAT, | ||
| tags = { OZONE, CONTAINER, STORAGE, MANAGEMENT }, | ||
| description = "" // not documented | ||
| ) | ||
| private float minFreeSpaceRatio = MIN_FREE_SPACE_UNSET; | ||
|
|
||
| @Config(key = "periodic.disk.check.interval.minutes", | ||
| defaultValue = "60", | ||
| type = ConfigType.LONG, | ||
|
|
@@ -719,6 +750,39 @@ public void validate() { | |
| rocksdbDeleteObsoleteFilesPeriod = | ||
| ROCKSDB_DELETE_OBSOLETE_FILES_PERIOD_MICRO_SECONDS_DEFAULT; | ||
| } | ||
|
|
||
| validateMinFreeSpace(); | ||
| } | ||
|
|
||
| /** | ||
| * If 'hdds.datanode.volume.min.free.space' is defined, | ||
| * it will be honored first. If it is not defined and | ||
| * 'hdds.datanode.volume.min.free.space.percent' is defined, it will honor this | ||
| * else it will fall back to 'hdds.datanode.volume.min.free.space.default' | ||
| */ | ||
| private void validateMinFreeSpace() { | ||
| if (minFreeSpaceRatio > 1) { | ||
| LOG.warn("{} = {} is invalid, should be between 0 and 1", | ||
| HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_PERCENT, minFreeSpaceRatio); | ||
| minFreeSpaceRatio = MIN_FREE_SPACE_UNSET; | ||
| } | ||
|
|
||
| final boolean minFreeSpaceConfigured = minFreeSpace >= 0; | ||
| final boolean minFreeSpaceRatioConfigured = minFreeSpaceRatio >= 0; | ||
|
|
||
| if (minFreeSpaceConfigured && minFreeSpaceRatioConfigured) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This previous logic doesn't make all sense, based on the L758 comments, I think we should honors min.free.space first if it's defined, then honor min.free.space.percent if min.free.space is not defined, then fallback to default 5GB if none of them are defined.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't really want to mix logic change with refactoring. We could tweak it in a dedicated task. While it would be a small code change in this place, tests also need to be updated, and the title "refactoring" would hide it.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Filed HDDS-12676. |
||
| LOG.warn("Only one of {}={} and {}={} should be set. With both set, default value ({}) will be used.", | ||
| HDDS_DATANODE_VOLUME_MIN_FREE_SPACE, | ||
| minFreeSpace, | ||
| HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_PERCENT, | ||
| minFreeSpaceRatio, | ||
| HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_DEFAULT); | ||
| } | ||
|
|
||
| if (minFreeSpaceConfigured == minFreeSpaceRatioConfigured) { | ||
| minFreeSpaceRatio = MIN_FREE_SPACE_UNSET; | ||
| minFreeSpace = getDefaultFreeSpace(); | ||
| } | ||
| } | ||
|
|
||
| public void setContainerDeleteThreads(int containerDeleteThreads) { | ||
|
|
@@ -737,6 +801,20 @@ public int getContainerCloseThreads() { | |
| return containerCloseThreads; | ||
| } | ||
|
|
||
| public long getMinFreeSpace(long capacity) { | ||
| return minFreeSpaceRatio >= 0 | ||
| ? ((long) (capacity * minFreeSpaceRatio)) | ||
| : minFreeSpace; | ||
| } | ||
|
|
||
| public long getMinFreeSpace() { | ||
| return minFreeSpace; | ||
| } | ||
|
|
||
| public float getMinFreeSpaceRatio() { | ||
| return minFreeSpaceRatio; | ||
| } | ||
|
|
||
| public long getPeriodicDiskCheckIntervalMinutes() { | ||
| return periodicDiskCheckIntervalMinutes; | ||
| } | ||
|
|
@@ -966,4 +1044,10 @@ public void setAutoCompactionSmallSstFileThreads( | |
| this.autoCompactionSmallSstFileThreads = | ||
| autoCompactionSmallSstFileThreads; | ||
| } | ||
|
|
||
| static long getDefaultFreeSpace() { | ||
| final StorageSize measure = StorageSize.parse(HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_DEFAULT); | ||
| return Math.round(measure.getUnit().toBytes(measure.getValue())); | ||
| } | ||
|
|
||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -35,7 +35,6 @@ | |
| import org.apache.hadoop.hdds.conf.ConfigurationSource; | ||
| import org.apache.hadoop.hdds.upgrade.HDDSLayoutFeature; | ||
| import org.apache.hadoop.hdfs.server.datanode.checker.VolumeCheckResult; | ||
| import org.apache.hadoop.ozone.container.common.statemachine.DatanodeConfiguration; | ||
| import org.apache.hadoop.ozone.container.common.utils.DatanodeStoreCache; | ||
| import org.apache.hadoop.ozone.container.common.utils.HddsVolumeUtil; | ||
| import org.apache.hadoop.ozone.container.common.utils.RawDB; | ||
|
|
@@ -142,7 +141,6 @@ private HddsVolume(Builder b) throws IOException { | |
| volumeIOStats = null; | ||
| volumeInfoMetrics = new VolumeInfoMetrics(b.getVolumeRootStr(), this); | ||
| } | ||
|
|
||
| } | ||
|
|
||
| @Override | ||
|
|
@@ -264,14 +262,13 @@ public synchronized VolumeCheckResult check(@Nullable Boolean unused) | |
| throws Exception { | ||
| VolumeCheckResult result = super.check(unused); | ||
|
|
||
| DatanodeConfiguration df = getConf().getObject(DatanodeConfiguration.class); | ||
| if (isDbLoadFailure()) { | ||
| LOG.warn("Volume {} failed to access RocksDB: RocksDB parent directory is null, " + | ||
| "the volume might not have been loaded properly.", getStorageDir()); | ||
| return VolumeCheckResult.FAILED; | ||
| } | ||
| if (result != VolumeCheckResult.HEALTHY || | ||
| !df.getContainerSchemaV3Enabled() || !isDbLoaded()) { | ||
| !getDatanodeConfig().getContainerSchemaV3Enabled() || !isDbLoaded()) { | ||
| return result; | ||
| } | ||
|
|
||
|
|
@@ -305,6 +302,10 @@ public long getCommittedBytes() { | |
| return committedBytes.get(); | ||
| } | ||
|
|
||
| public long getFreeSpaceToSpare(long volumeCapacity) { | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This function would get same result in each volume if input capacity is same?
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
If you mean input capacity: yes. I think this is the same question as #8119 (comment)
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Sorry for the typo😅 |
||
| return getDatanodeConfig().getMinFreeSpace(volumeCapacity); | ||
| } | ||
|
|
||
| public void setDbVolume(DbVolume dbVolume) { | ||
| this.dbVolume = dbVolume; | ||
| } | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please keep this in ozone-default.xml. Also we should add the other property here too. I prefer a centralized place ozone-default.xml over looking around the code base to find the property, which is not very friendly for normal Ozone users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ChenSammi for the review.
We should not duplicate property between
ozone-default.xmland annotated config objects. While looking atozone-default.xmlis convenient, searching (not looking) for property in code is not much more difficult (find in file vs. find in project). Both kinds of properties are listed in/configand/confendpoints. (I've just found that descriptions are missing, also for both kind of properties. Filed HDDS-12656.)Config objects have several benefits, which were described in the design doc accepted several years ago (HDDS-1466). We also try to further improve them (see HDDS-12424 as the most recent example).