@@ -59,14 +59,6 @@ public final class HddsConfigKeys {
public static final String HDDS_DATANODE_VOLUME_CHOOSING_POLICY =
"hdds.datanode.volume.choosing.policy";

public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE =
"hdds.datanode.volume.min.free.space";
public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_DEFAULT =
"5GB";

public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_PERCENT =
"hdds.datanode.volume.min.free.space.percent";

public static final String HDDS_DB_PROFILE = "hdds.db.profile";

// Once a container usage crosses this threshold, it is eligible for
11 changes: 0 additions & 11 deletions hadoop-hdds/common/src/main/resources/ozone-default.xml
@@ -218,17 +218,6 @@
This volume choosing policy randomly chooses two volumes with remaining space and then picks the one with lower utilization.
</description>
</property>
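For context on the policy described above, here is a minimal illustrative sketch of the "pick two volumes at random, keep the less utilized one" idea. It is not the Ozone implementation; the class and method names are invented for the example.

import java.util.List;
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Illustrative sketch only, not the Ozone policy class.
final class TwoRandomLowUtilizationSketch<V> {
  private final Random random = new Random();

  // Assumes 'candidates' has already been filtered to volumes with enough remaining space.
  V choose(List<V> candidates, ToDoubleFunction<V> utilization) {
    V first = candidates.get(random.nextInt(candidates.size()));
    V second = candidates.get(random.nextInt(candidates.size()));
    // Keep whichever of the two random picks is less utilized (the picks may coincide).
    return utilization.applyAsDouble(first) <= utilization.applyAsDouble(second)
        ? first : second;
  }
}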
<property>
Contributor: Please keep this in ozone-default.xml. Also, we should add the other property here too. I prefer a centralized place (ozone-default.xml) over looking around the code base to find the property, which is not very friendly for normal Ozone users.

Contributor Author: Thanks @ChenSammi for the review.

We should not duplicate properties between ozone-default.xml and annotated config objects. While looking at ozone-default.xml is convenient, searching (not just looking) for a property in code is not much more difficult (find in file vs. find in project). Both kinds of properties are listed in the /config and /conf endpoints. (I've just found that descriptions are missing, for both kinds of properties. Filed HDDS-12656.)

Config objects have several benefits, which were described in the design doc accepted several years ago (HDDS-1466). We also keep improving them (see HDDS-12424 as the most recent example). A minimal sketch of such an annotated config object follows this file's diff.

<name>hdds.datanode.volume.min.free.space</name>
<value>5GB</value>
<tag>OZONE, CONTAINER, STORAGE, MANAGEMENT</tag>
<description>
This determines the free space to be used for closing containers.
When the difference between volume capacity and used reaches this number,
containers that reside on this volume will be closed and no new containers
would be allocated on this volume.
</description>
</property>
<property>
<name>hdds.container.ratis.enabled</name>
<value>false</value>
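Regarding the review discussion above on annotated config objects vs. ozone-default.xml: below is a minimal sketch of the annotation pattern, modeled on the @Config fields this PR adds to DatanodeConfiguration. The ExampleConfig class, its prefix, and its key are invented for illustration.

import static org.apache.hadoop.hdds.conf.ConfigTag.DATANODE;

import org.apache.hadoop.hdds.conf.Config;
import org.apache.hadoop.hdds.conf.ConfigGroup;
import org.apache.hadoop.hdds.conf.ConfigType;

// Sketch of an annotated config object; real ones live next to the code they configure
// and are exposed through the /config and /conf endpoints.
@ConfigGroup(prefix = "hdds.example")
public class ExampleConfig {

  @Config(key = "sample.size",
      defaultValue = "5GB",
      type = ConfigType.SIZE,
      tags = { DATANODE },
      description = "Sample storage-size setting; resolves to the key hdds.example.sample.size.")
  private long sampleSizeBytes;

  public long getSampleSizeBytes() {
    return sampleSizeBytes;
  }
}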
@@ -168,6 +168,18 @@ void set(ConfigurationTarget target, String key, Object value,
Config config) {
target.setDouble(key, (double) value);
}
},
FLOAT {
@Override
Float parse(String value, Config config, Class<?> type, String key) {
return Float.parseFloat(value);
}

@Override
void set(ConfigurationTarget target, String key, Object value,
Config config) {
target.setFloat(key, (float) value);
}
};

abstract Object parse(String value, Config config, Class<?> type, String key)
@@ -39,6 +39,10 @@ default void setDouble(String name, double value) {
set(name, Double.toString(value));
}

default void setFloat(String name, float value) {
set(name, Float.toString(value));
}

default void setBoolean(String name, boolean value) {
set(name, Boolean.toString(value));
}
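The two hunks above add float support to the config framework: ConfigType.FLOAT handles parsing, and ConfigurationTarget.setFloat handles serialization. Below is a minimal sketch of the round trip, assuming the usual OzoneConfiguration entry point; the key and getter follow the DatanodeConfiguration fields added later in this PR.

import org.apache.hadoop.hdds.conf.OzoneConfiguration;
import org.apache.hadoop.ozone.container.common.statemachine.DatanodeConfiguration;

public class FloatConfigRoundTrip {
  public static void main(String[] args) {
    OzoneConfiguration conf = new OzoneConfiguration();
    // ConfigurationTarget.setFloat stores the value as a string.
    conf.setFloat("hdds.datanode.volume.min.free.space.percent", 0.05f);
    // ConfigType.FLOAT parses it back when the annotated config object is built.
    DatanodeConfiguration dnConf = conf.getObject(DatanodeConfiguration.class);
    System.out.println(dnConf.getMinFreeSpaceRatio()); // expected: 0.05
  }
}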
@@ -112,7 +112,6 @@ public class HddsDispatcher implements ContainerDispatcher, Auditor {
private ContainerMetrics metrics;
private final TokenVerifier tokenVerifier;
private long slowOpThresholdNs;
private VolumeUsage.MinFreeSpaceCalculator freeSpaceCalculator;

/**
* Constructs an OzoneContainer that receives calls from
@@ -146,7 +145,6 @@ public HddsDispatcher(ConfigurationSource config, ContainerSet contSet,
LOG,
HddsUtils::processForDebug,
HddsUtils::processForDebug);
this.freeSpaceCalculator = new VolumeUsage.MinFreeSpaceCalculator(conf);
}

@Override
@@ -619,7 +617,7 @@ private boolean isVolumeFull(Container container) {
if (isOpen) {
HddsVolume volume = container.getContainerData().getVolume();
SpaceUsageSource usage = volume.getCurrentUsage();
long volumeFreeSpaceToSpare = freeSpaceCalculator.get(usage.getCapacity());
long volumeFreeSpaceToSpare = volume.getFreeSpaceToSpare(usage.getCapacity());
Contributor: It's fine; it just doesn't look that straightforward to need a capacity parameter when the volume already has the capacity value.

Contributor Author: Yes, I also don't like passing capacity to the volume, so I plan to refactor further in a follow-up. I just wanted to avoid bloating this patch.

Contributor Author: Created HDDS-12670 for the follow-up.

Contributor Author: @ChenSammi @peterxcli #8167 addresses this.

return !VolumeUsage.hasVolumeEnoughSpace(usage.getAvailable(), volume.getCommittedBytes(), 0,
volumeFreeSpaceToSpare);
}
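As a worked example of the check above (numbers invented, and assuming from the argument names that hasVolumeEnoughSpace compares available space minus committed bytes against the spare threshold): on a 100 GB volume with min.free.space.percent = 0.05, volumeFreeSpaceToSpare is 5 GB. If the volume reports 7 GB available with 3 GB already committed, only 4 GB is effectively free, so the volume is treated as full, its open containers become eligible for closing, and no new containers are allocated on it.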
@@ -19,12 +19,10 @@

import java.io.IOException;
import org.apache.hadoop.fs.StorageType;
import org.apache.hadoop.hdds.conf.ConfigurationSource;
import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.MetadataStorageReportProto;
import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.StorageReportProto;
import org.apache.hadoop.hdds.protocol.proto.StorageContainerDatanodeProtocolProtos.StorageTypeProto;
import org.apache.hadoop.ozone.container.common.interfaces.StorageLocationReportMXBean;
import org.apache.hadoop.ozone.container.common.volume.VolumeUsage;

/**
* Storage location stats of datanodes that provide back store for containers.
@@ -168,11 +166,6 @@ private static StorageType getStorageType(StorageTypeProto proto) throws
* @throws IOException In case, the storage type specified is invalid.
*/
public StorageReportProto getProtoBufMessage() throws IOException {
return getProtoBufMessage(null);
}

public StorageReportProto getProtoBufMessage(ConfigurationSource conf)
throws IOException {
StorageReportProto.Builder srb = StorageReportProto.newBuilder();
return srb.setStorageUuid(getId())
.setCapacity(getCapacity())
@@ -182,8 +175,7 @@ public StorageReportProto getProtoBufMessage(ConfigurationSource conf)
.setStorageType(getStorageTypeProto())
.setStorageLocation(getStorageLocation())
.setFailed(isFailed())
.setFreeSpaceToSpare(conf != null ?
new VolumeUsage.MinFreeSpaceCalculator(conf).get(getCapacity()) : 0)
.setFreeSpaceToSpare(getFreeSpaceToSpare())
.build();
}

@@ -18,7 +18,11 @@
package org.apache.hadoop.ozone.container.common.statemachine;

import static java.util.concurrent.TimeUnit.MICROSECONDS;
import static org.apache.hadoop.hdds.conf.ConfigTag.CONTAINER;
import static org.apache.hadoop.hdds.conf.ConfigTag.DATANODE;
import static org.apache.hadoop.hdds.conf.ConfigTag.MANAGEMENT;
import static org.apache.hadoop.hdds.conf.ConfigTag.OZONE;
import static org.apache.hadoop.hdds.conf.ConfigTag.STORAGE;
import static org.apache.hadoop.ozone.container.common.statemachine.DatanodeConfiguration.CONFIG_PREFIX;

import java.time.Duration;
@@ -28,6 +32,7 @@
import org.apache.hadoop.hdds.conf.ConfigType;
import org.apache.hadoop.hdds.conf.PostConstruct;
import org.apache.hadoop.hdds.conf.ReconfigurableConfig;
import org.apache.hadoop.hdds.conf.StorageSize;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@@ -68,6 +73,13 @@ public class DatanodeConfiguration extends ReconfigurableConfig {
"hdds.datanode.disk.check.min.gap";
public static final String DISK_CHECK_TIMEOUT_KEY =
"hdds.datanode.disk.check.timeout";
public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE =
"hdds.datanode.volume.min.free.space";
public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_DEFAULT =
"5GB";
public static final String HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_PERCENT =
"hdds.datanode.volume.min.free.space.percent";
static final byte MIN_FREE_SPACE_UNSET = -1;

public static final String WAIT_ON_ALL_FOLLOWERS =
"hdds.datanode.wait.on.all.followers";
@@ -319,6 +331,25 @@ public void setBlockDeletionLimit(int limit) {
this.blockLimitPerInterval = limit;
}

@Config(key = "hdds.datanode.volume.min.free.space",
Contributor: DatanodeConfiguration has the key prefix "hdds.datanode" defined.

Contributor Author: This is now valid after HDDS-12424. It makes searching for the config key easier.

defaultValue = "-1",
type = ConfigType.SIZE,
tags = { OZONE, CONTAINER, STORAGE, MANAGEMENT },
description = "This determines the free space to be used for closing containers" +
" When the difference between volume capacity and used reaches this number," +
" containers that reside on this volume will be closed and no new containers" +
" would be allocated on this volume."
)
private long minFreeSpace = MIN_FREE_SPACE_UNSET;

@Config(key = "hdds.datanode.volume.min.free.space.percent",
defaultValue = "-1",
type = ConfigType.FLOAT,
tags = { OZONE, CONTAINER, STORAGE, MANAGEMENT },
description = "" // not documented
)
private float minFreeSpaceRatio = MIN_FREE_SPACE_UNSET;

@Config(key = "periodic.disk.check.interval.minutes",
defaultValue = "60",
type = ConfigType.LONG,
@@ -719,6 +750,39 @@ public void validate() {
rocksdbDeleteObsoleteFilesPeriod =
ROCKSDB_DELETE_OBSOLETE_FILES_PERIOD_MICRO_SECONDS_DEFAULT;
}

validateMinFreeSpace();
}

/**
 * If 'hdds.datanode.volume.min.free.space' is defined, it is honored first.
 * If it is not defined and 'hdds.datanode.volume.min.free.space.percent' is
 * defined, that is honored; otherwise it falls back to
 * 'hdds.datanode.volume.min.free.space.default'.
 */
private void validateMinFreeSpace() {
if (minFreeSpaceRatio > 1) {
LOG.warn("{} = {} is invalid, should be between 0 and 1",
HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_PERCENT, minFreeSpaceRatio);
minFreeSpaceRatio = MIN_FREE_SPACE_UNSET;
}

final boolean minFreeSpaceConfigured = minFreeSpace >= 0;
final boolean minFreeSpaceRatioConfigured = minFreeSpaceRatio >= 0;

if (minFreeSpaceConfigured && minFreeSpaceRatioConfigured) {
Contributor: The previous logic doesn't quite make sense. Based on the L758 comments, I think we should honor min.free.space first if it's defined, then honor min.free.space.percent if min.free.space is not defined, and fall back to the default 5GB if neither is defined.

Contributor Author: I don't really want to mix a logic change with refactoring; we can tweak it in a dedicated task. While it would be a small code change here, tests would also need to be updated, and the title "refactoring" would hide it.

ChenSammi (Contributor, Mar 24, 2025): Filed HDDS-12676.

LOG.warn("Only one of {}={} and {}={} should be set. With both set, default value ({}) will be used.",
HDDS_DATANODE_VOLUME_MIN_FREE_SPACE,
minFreeSpace,
HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_PERCENT,
minFreeSpaceRatio,
HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_DEFAULT);
}

if (minFreeSpaceConfigured == minFreeSpaceRatioConfigured) {
minFreeSpaceRatio = MIN_FREE_SPACE_UNSET;
minFreeSpace = getDefaultFreeSpace();
}
}

public void setContainerDeleteThreads(int containerDeleteThreads) {
@@ -737,6 +801,20 @@ public int getContainerCloseThreads() {
return containerCloseThreads;
}

public long getMinFreeSpace(long capacity) {
return minFreeSpaceRatio >= 0
? ((long) (capacity * minFreeSpaceRatio))
: minFreeSpace;
}

public long getMinFreeSpace() {
return minFreeSpace;
}

public float getMinFreeSpaceRatio() {
return minFreeSpaceRatio;
}

public long getPeriodicDiskCheckIntervalMinutes() {
return periodicDiskCheckIntervalMinutes;
}
@@ -966,4 +1044,10 @@ public void setAutoCompactionSmallSstFileThreads(
this.autoCompactionSmallSstFileThreads =
autoCompactionSmallSstFileThreads;
}

static long getDefaultFreeSpace() {
final StorageSize measure = StorageSize.parse(HDDS_DATANODE_VOLUME_MIN_FREE_SPACE_DEFAULT);
return Math.round(measure.getUnit().toBytes(measure.getValue()));
}

}
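To make the resolution in getMinFreeSpace(capacity) above concrete, here is a small standalone sketch of the same arithmetic (constants and numbers are illustrative, not taken from a real cluster):

// Mirrors the getter above: a non-negative ratio wins, otherwise the absolute value applies.
public class MinFreeSpaceMath {

  static long minFreeSpace(long capacity, long absoluteBytes, float ratio) {
    return ratio >= 0 ? (long) (capacity * ratio) : absoluteBytes;
  }

  public static void main(String[] args) {
    long oneTib = 1L << 40;
    long fiveGib = 5L << 30;
    System.out.println(minFreeSpace(oneTib, fiveGib, -1f));   // absolute: 5 GiB to spare
    System.out.println(minFreeSpace(oneTib, fiveGib, 0.02f)); // ratio: 2% of 1 TiB
  }
}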
@@ -44,8 +44,7 @@ public boolean test(HddsVolume vol) {
long free = usage.getAvailable();
long committed = vol.getCommittedBytes();
long available = free - committed;
long volumeFreeSpaceToSpare =
new VolumeUsage.MinFreeSpaceCalculator(vol.getConf()).get(volumeCapacity);
long volumeFreeSpaceToSpare = vol.getFreeSpaceToSpare(volumeCapacity);
boolean hasEnoughSpace = VolumeUsage.hasVolumeEnoughSpace(free, committed,
requiredSpace, volumeFreeSpaceToSpare);

@@ -35,7 +35,6 @@
import org.apache.hadoop.hdds.conf.ConfigurationSource;
import org.apache.hadoop.hdds.upgrade.HDDSLayoutFeature;
import org.apache.hadoop.hdfs.server.datanode.checker.VolumeCheckResult;
import org.apache.hadoop.ozone.container.common.statemachine.DatanodeConfiguration;
import org.apache.hadoop.ozone.container.common.utils.DatanodeStoreCache;
import org.apache.hadoop.ozone.container.common.utils.HddsVolumeUtil;
import org.apache.hadoop.ozone.container.common.utils.RawDB;
@@ -142,7 +141,6 @@ private HddsVolume(Builder b) throws IOException {
volumeIOStats = null;
volumeInfoMetrics = new VolumeInfoMetrics(b.getVolumeRootStr(), this);
}

}

@Override
@@ -264,14 +262,13 @@ public synchronized VolumeCheckResult check(@Nullable Boolean unused)
throws Exception {
VolumeCheckResult result = super.check(unused);

DatanodeConfiguration df = getConf().getObject(DatanodeConfiguration.class);
if (isDbLoadFailure()) {
LOG.warn("Volume {} failed to access RocksDB: RocksDB parent directory is null, " +
"the volume might not have been loaded properly.", getStorageDir());
return VolumeCheckResult.FAILED;
}
if (result != VolumeCheckResult.HEALTHY ||
!df.getContainerSchemaV3Enabled() || !isDbLoaded()) {
!getDatanodeConfig().getContainerSchemaV3Enabled() || !isDbLoaded()) {
return result;
}

@@ -305,6 +302,10 @@ public long getCommittedBytes() {
return committedBytes.get();
}

public long getFreeSpaceToSpare(long volumeCapacity) {
peterxcli (Member, Mar 22, 2025): This function would get the same result for each volume if the input capacity is the same? But I guess it's intentional, to reduce the test code updates?

Contributor Author: "same result in each volume if input volume is same?" If you mean input capacity: yes. I think this is the same question as #8119 (comment).

peterxcli: Sorry for the typo 😅

return getDatanodeConfig().getMinFreeSpace(volumeCapacity);
}

public void setDbVolume(DbVolume dbVolume) {
this.dbVolume = dbVolume;
}
@@ -469,6 +469,7 @@ public StorageLocationReport[] getStorageReport() {
long remaining = 0;
long capacity = 0;
long committed = 0;
long spare = 0;
String rootDir = "";
failed = true;
if (volumeInfo.isPresent()) {
@@ -478,8 +479,9 @@
scmUsed = usage.getUsedSpace();
remaining = usage.getAvailable();
capacity = usage.getCapacity();
committed = (volume instanceof HddsVolume) ?
((HddsVolume) volume).getCommittedBytes() : 0;
HddsVolume hddsVolume = volume instanceof HddsVolume ? (HddsVolume) volume : null;
committed = hddsVolume != null ? hddsVolume.getCommittedBytes() : 0;
spare = hddsVolume != null ? hddsVolume.getFreeSpaceToSpare(capacity) : 0;
failed = false;
} catch (UncheckedIOException ex) {
LOG.warn("Failed to get scmUsed and remaining for container " +
@@ -500,6 +502,7 @@
.setRemaining(remaining)
.setScmUsed(scmUsed)
.setCommitted(committed)
.setFreeSpaceToSpare(spare)
.setStorageType(volume.getStorageType());
StorageLocationReport r = builder.build();
reports[counter++] = r;
@@ -115,7 +115,8 @@ public enum VolumeState {
private long cTime; // creation time of the file system state
private int layoutVersion; // layout version of the storage data

private ConfigurationSource conf;
private final ConfigurationSource conf;
private final DatanodeConfiguration dnConf;

private final File storageDir;
private String workingDirName;
@@ -150,10 +151,9 @@ protected StorageVolume(Builder<?> b) throws IOException {
this.state = VolumeState.NOT_INITIALIZED;
this.clusterID = b.clusterID;
this.datanodeUuid = b.datanodeUuid;
this.conf = b.conf;

DatanodeConfiguration dnConf =
conf.getObject(DatanodeConfiguration.class);
this.conf = b.conf;
this.dnConf = conf.getObject(DatanodeConfiguration.class);
this.ioTestCount = dnConf.getVolumeIOTestCount();
this.ioFailureTolerance = dnConf.getVolumeIOFailureTolerance();
this.ioTestSlidingWindow = new LinkedList<>();
@@ -167,6 +167,8 @@ protected StorageVolume(Builder<?> b) throws IOException {
this.state = VolumeState.FAILED;
this.ioTestCount = 0;
this.ioFailureTolerance = 0;
this.conf = null;
this.dnConf = null;
}
}

@@ -529,6 +531,10 @@ public ConfigurationSource getConf() {
return conf;
}

public DatanodeConfiguration getDatanodeConfig() {
return dnConf;
}

public void failVolume() {
setState(VolumeState.FAILED);
volumeInfo.ifPresent(VolumeInfo::shutdownUsageThread);