@@ -359,6 +359,10 @@ public final class ScmConfigKeys {
"ozone.scm.container.size";
public static final String OZONE_SCM_CONTAINER_SIZE_DEFAULT = "5GB";

public static final String OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER =
peterxcli (Member):
I’m thinking this patch only needs to add a config to set the required data size (dataSizeRequired), which we can then read from ReplicationManagerUtil here: https://github.com/peterxcli/ozone/blob/7868a862ff616154e4bd9ab52dada929e15042ec/hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/container/replication/ReplicationManagerUtil.java#L93-L94.
@siddhantsangwan does this approach look correct?
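
A minimal sketch of what that could look like (illustrative only — the key name and fallback below are assumptions, not something this PR defines):

    // Hypothetical: read the required data size directly from configuration in
    // ReplicationManagerUtil instead of hard-coding it. The key name is made up
    // for illustration.
    long dataSizeRequired = (long) conf.getStorageSize(
        "ozone.scm.replication.data.size.required",      // hypothetical key
        ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE_DEFAULT,  // falls back to "5GB"
        StorageUnit.BYTES);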

Contributor Author:
Thanks @peterxcli! I've tried to simplify the approach: I removed the new config entirely and directly use HddsServerUtil.requiredReplicationSpace(), which already provides the 2x multiplier.
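
For reference, a minimal sketch of that call pattern, assuming the existing one-argument helper shown in the HddsServerUtil hunk below:

    // requiredReplicationSpace(long) returns 2 * defaultContainerSize, so the
    // 2x factor comes from the helper rather than from a new config key.
    long requiredSpace = HddsServerUtil.requiredReplicationSpace(maxContainerSize);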

"ozone.scm.container.space.requirement.multiplier";
public static final double OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER_DEFAULT = 5.0;
Copilot AI Dec 6, 2025

The PR description states the default multiplier is 2.0, but the implementation sets it to 5.0. This inconsistency should be resolved. Either update the PR description to reflect the actual default of 5.0, or change the default value in the code to 2.0 to match the description.

Suggested change
public static final double OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER_DEFAULT = 5.0;
public static final double OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER_DEFAULT = 2.0;


public static final String OZONE_SCM_CONTAINER_LOCK_STRIPE_SIZE =
"ozone.scm.container.lock.stripes";
public static final int OZONE_SCM_CONTAINER_LOCK_STRIPE_SIZE_DEFAULT = 512;
16 changes: 16 additions & 0 deletions hadoop-hdds/common/src/main/resources/ozone-default.xml
@@ -1035,6 +1035,22 @@
balances the amount of metadata.
</description>
</property>
<property>
<name>ozone.scm.container.space.requirement.multiplier</name>
<value>5.0</value>
Copilot AI Dec 6, 2025

The PR description states the default multiplier is 2.0, but the implementation sets it to 5.0. This inconsistency should be resolved. Either update the PR description to reflect the actual default of 5.0, or change the default value in the code to 2.0 to match the description.

<tag>OZONE, SCM, MANAGEMENT</tag>
<description>
Multiplier for container space requirement when checking whether a datanode
has enough space for container allocation. The required space is calculated
as the container size multiplied by this value. This prevents concurrent
clients from all passing the space check when there is only enough space
for one container. For example, with the default container size of 5GB and
a multiplier of 5.0, the system requires 25GB of available space before
allocating a new container. So if only 6GB remains, the check fails,
preventing multiple clients from attempting to create containers
concurrently when there is only space for one.
</description>
Comment on lines +1042 to +1052
Copilot AI Dec 6, 2025

The example here contradicts the PR description. With a default container size of 5GB and a multiplier of 5.0, the required space is 25GB (5GB * 5.0), not 10GB as the PR description implies. The documentation correctly states 25GB, but the PR description mentions increasing the requirement from 5GB to 10GB, which would correspond to a multiplier of 2.0.

</property>
<property>
<name>ozone.scm.container.lock.stripes</name>
<value>512</value>
@@ -432,6 +432,13 @@ public static long requiredReplicationSpace(long defaultContainerSize) {
return 2 * defaultContainerSize;
}

public static long requiredReplicationSpace(long defaultContainerSize, ConfigurationSource conf) {
double multiplier = conf.getDouble(
ScmConfigKeys.OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER,
ScmConfigKeys.OZONE_SCM_CONTAINER_SPACE_REQUIREMENT_MULTIPLIER_DEFAULT);
return (long) (multiplier * defaultContainerSize);
}

public static Collection<String> getDatanodeStorageDirs(ConfigurationSource conf) {
Collection<String> rawLocations = conf.getTrimmedStringCollection(HDDS_DATANODE_DIR_KEY);
if (rawLocations.isEmpty()) {
@@ -34,6 +34,7 @@
import org.apache.hadoop.conf.StorageUnit;
import org.apache.hadoop.hdds.client.ECReplicationConfig;
import org.apache.hadoop.hdds.client.ReplicationConfig;
import org.apache.hadoop.hdds.conf.OzoneConfiguration;
import org.apache.hadoop.hdds.protocol.proto.HddsProtos;
import org.apache.hadoop.hdds.protocol.proto.HddsProtos.ContainerInfoProto;
import org.apache.hadoop.hdds.protocol.proto.HddsProtos.LifeCycleEvent;
@@ -46,6 +47,7 @@
import org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator;
import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
import org.apache.hadoop.hdds.scm.pipeline.PipelineManager;
import org.apache.hadoop.hdds.utils.HddsServerUtil;
import org.apache.hadoop.hdds.utils.db.Table;
import org.apache.hadoop.ozone.common.statemachine.InvalidStateTransitionException;
import org.apache.hadoop.util.Time;
@@ -71,6 +73,7 @@ public class ContainerManagerImpl implements ContainerManager {

private final SCMHAManager haManager;
private final SequenceIdGenerator sequenceIdGen;
private final OzoneConfiguration conf;

// TODO: Revisit this.
// Metrics related to operations should be moved to ProtocolServer
@@ -99,6 +102,11 @@ public ContainerManagerImpl(
this.pipelineManager = pipelineManager;
this.haManager = scmHaManager;
this.sequenceIdGen = sequenceIdGen;
if (conf instanceof OzoneConfiguration) {
this.conf = (OzoneConfiguration) conf;
} else {
this.conf = new OzoneConfiguration(conf);
}
this.containerStateManager = ContainerStateManagerImpl.newBuilder()
.setConfiguration(conf)
.setPipelineManager(pipelineManager)
@@ -352,23 +360,25 @@ public ContainerInfo getMatchingContainer(final long size, final String owner,
synchronized (pipeline.getId()) {
containerIDs = getContainersForOwner(pipeline, owner);
if (containerIDs.size() < getOpenContainerCountPerPipeline(pipeline)) {
if (pipelineManager.hasEnoughSpace(pipeline, maxContainerSize)) {
long requiredSpace = HddsServerUtil.requiredReplicationSpace(maxContainerSize, conf);
if (pipelineManager.hasEnoughSpace(pipeline, requiredSpace)) {
allocateContainer(pipeline, owner);
containerIDs = getContainersForOwner(pipeline, owner);
} else {
LOG.debug("Cannot allocate a new container because pipeline {} does not have the required space {}.",
pipeline, maxContainerSize);
pipeline, requiredSpace);
}
}
containerIDs.removeAll(excludedContainerIDs);
containerInfo = containerStateManager.getMatchingContainer(
size, owner, pipeline.getId(), containerIDs);
if (containerInfo == null) {
if (pipelineManager.hasEnoughSpace(pipeline, maxContainerSize)) {
long requiredSpace = HddsServerUtil.requiredReplicationSpace(maxContainerSize, conf);
if (pipelineManager.hasEnoughSpace(pipeline, requiredSpace)) {
containerInfo = allocateContainer(pipeline, owner);
} else {
LOG.debug("Cannot allocate a new container because pipeline {} does not have the required space {}.",
pipeline, maxContainerSize);
pipeline, requiredSpace);
}
}
return containerInfo;
@@ -35,6 +35,7 @@
import org.apache.hadoop.hdds.scm.node.NodeManager;
import org.apache.hadoop.hdds.scm.node.NodeStatus;
import org.apache.hadoop.hdds.scm.node.states.NodeNotFoundException;
import org.apache.hadoop.hdds.utils.HddsServerUtil;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@@ -86,9 +87,10 @@ public synchronized Pipeline create(ECReplicationConfig replicationConfig)
protected Pipeline create(ECReplicationConfig replicationConfig,
List<DatanodeDetails> excludedNodes, List<DatanodeDetails> favoredNodes)
throws IOException {
long requiredSpace = HddsServerUtil.requiredReplicationSpace(containerSizeBytes, conf);
List<DatanodeDetails> dns = placementPolicy
.chooseDatanodes(excludedNodes, favoredNodes,
replicationConfig.getRequiredNodes(), 0, this.containerSizeBytes);
replicationConfig.getRequiredNodes(), 0, requiredSpace);
return create(replicationConfig, dns);
}

@@ -41,6 +41,7 @@
import org.apache.hadoop.hdds.scm.pipeline.leader.choose.algorithms.LeaderChoosePolicy;
import org.apache.hadoop.hdds.scm.pipeline.leader.choose.algorithms.LeaderChoosePolicyFactory;
import org.apache.hadoop.hdds.server.events.EventPublisher;
import org.apache.hadoop.hdds.utils.HddsServerUtil;
import org.apache.hadoop.ozone.protocol.commands.ClosePipelineCommand;
import org.apache.hadoop.ozone.protocol.commands.CommandForDatanode;
import org.apache.hadoop.ozone.protocol.commands.CreatePipelineCommand;
@@ -162,10 +163,11 @@ public synchronized Pipeline create(RatisReplicationConfig replicationConfig,

final ReplicationFactor factor =
replicationConfig.getReplicationFactor();
long requiredSpace = HddsServerUtil.requiredReplicationSpace(containerSizeBytes, conf);
switch (factor) {
case ONE:
dns = pickNodesNotUsed(replicationConfig, minRatisVolumeSizeBytes,
containerSizeBytes, conf);
requiredSpace, conf);
break;
case THREE:
List<DatanodeDetails> excludeDueToEngagement = filterPipelineEngagement();
@@ -178,7 +180,7 @@
}
dns = placementPolicy.chooseDatanodes(excludedNodes,
favoredNodes, factor.getNumber(), minRatisVolumeSizeBytes,
containerSizeBytes);
requiredSpace);
break;
default:
throw new IllegalStateException("Unknown factor: " + factor.name());
@@ -27,8 +27,11 @@
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.ArgumentMatchers.anyLong;
import static org.mockito.ArgumentMatchers.eq;
import static org.mockito.Mockito.atLeast;
import static org.mockito.Mockito.doReturn;
import static org.mockito.Mockito.mock;
import static org.mockito.Mockito.reset;
import static org.mockito.Mockito.spy;
import static org.mockito.Mockito.times;
import static org.mockito.Mockito.verify;
@@ -57,6 +60,7 @@
import org.apache.hadoop.hdds.scm.pipeline.MockPipelineManager;
import org.apache.hadoop.hdds.scm.pipeline.Pipeline;
import org.apache.hadoop.hdds.scm.pipeline.PipelineManager;
import org.apache.hadoop.hdds.utils.HddsServerUtil;
import org.apache.hadoop.hdds.utils.db.DBStore;
import org.apache.hadoop.hdds.utils.db.DBStoreBuilder;
import org.apache.hadoop.ozone.common.statemachine.InvalidStateTransitionException;
@@ -179,6 +183,40 @@ public void testGetMatchingContainerReturnsContainerWhenEnoughSpaceInDatanodes()
assertNotNull(container);
}

@Test
public void testContainerSpaceRequirement() throws IOException {
long sizeRequired = 256 * 1024 * 1024;
long containerSize = 5L * 1024 * 1024 * 1024;

PipelineManager spyPipelineManager = spy(pipelineManager);
File tempDir = new File(testDir, "tempDir");
OzoneConfiguration conf = SCMTestUtils.getConf(tempDir);
long expectedSpaceRequirement = HddsServerUtil.requiredReplicationSpace(containerSize, conf);

ContainerManager manager = new ContainerManagerImpl(conf,
scmhaManager, sequenceIdGen, spyPipelineManager,
SCMDBDefinition.CONTAINERS.getTable(dbStore), pendingOpsMock);

Pipeline pipeline = spyPipelineManager.getPipelines().iterator().next();

doReturn(false).when(spyPipelineManager)
.hasEnoughSpace(any(Pipeline.class), anyLong());
ContainerInfo container = manager
.getMatchingContainer(sizeRequired, "test", pipeline, Collections.emptySet());
assertNull(container);
verify(spyPipelineManager, atLeast(1))
.hasEnoughSpace(eq(pipeline), eq(expectedSpaceRequirement));

reset(spyPipelineManager);
doReturn(true).when(spyPipelineManager)
.hasEnoughSpace(any(Pipeline.class), anyLong());
container = manager
.getMatchingContainer(sizeRequired, "test", pipeline, Collections.emptySet());
assertNotNull(container);
verify(spyPipelineManager, atLeast(1))
.hasEnoughSpace(eq(pipeline), eq(expectedSpaceRequirement));
}

@Test
void testUpdateContainerState() throws Exception {
final ContainerInfo container = containerManager.allocateContainer(
@@ -55,6 +55,7 @@
import org.apache.hadoop.hdds.scm.node.NodeManager;
import org.apache.hadoop.hdds.scm.node.NodeStatus;
import org.apache.hadoop.hdds.scm.node.states.NodeNotFoundException;
import org.apache.hadoop.hdds.utils.HddsServerUtil;
import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

@@ -68,17 +69,18 @@ public class TestECPipelineProvider {
private PipelineStateManager stateManager =
mock(PipelineStateManager.class);
private PlacementPolicy placementPolicy = mock(PlacementPolicy.class);
private long containerSizeBytes;
private long containerSpaceRequirement;

@BeforeEach
public void setup() throws IOException, NodeNotFoundException {
OzoneConfiguration conf = new OzoneConfiguration();
provider = new ECPipelineProvider(
nodeManager, stateManager, conf, placementPolicy);
this.containerSizeBytes = (long) conf.getStorageSize(
long containerSizeBytes = (long) conf.getStorageSize(
ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE,
ScmConfigKeys.OZONE_SCM_CONTAINER_SIZE_DEFAULT,
StorageUnit.BYTES);
this.containerSpaceRequirement = HddsServerUtil.requiredReplicationSpace(containerSizeBytes, conf);
// Placement policy will always return EC number of random nodes.
when(placementPolicy.chooseDatanodes(anyList(),
anyList(), anyInt(), anyLong(),
@@ -200,7 +202,7 @@ public void testExcludedAndFavoredNodesPassedToPlacementPolicy()
assertEquals(ecConf.getData() + ecConf.getParity(), pipeline.getNodes().size());

verify(placementPolicy).chooseDatanodes(excludedNodes, favoredNodes,
ecConf.getRequiredNodes(), 0, containerSizeBytes);
ecConf.getRequiredNodes(), 0, containerSpaceRequirement);
}

private Set<ContainerReplica> createContainerReplicas(int number) {
Expand Down
@@ -31,6 +31,7 @@ data:
OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
OZONE-SITE.XML_ozone.datanode.pipeline.limit: "1"
OZONE-SITE.XML_hdds.datanode.volume.min.free.space: "1GB"
OZONE-SITE.XML_ozone.scm.container.size: "1GB"
OZONE-SITE.XML_ozone.metadata.dirs: "/data/metadata"
OZONE-SITE.XML_ozone.om.address: "om-0.om"
OZONE-SITE.XML_ozone.recon.address: "recon-0.recon"
@@ -31,6 +31,7 @@ data:
OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
OZONE-SITE.XML_ozone.datanode.pipeline.limit: "1"
OZONE-SITE.XML_hdds.datanode.volume.min.free.space: "1GB"
OZONE-SITE.XML_ozone.scm.container.size: "1GB"
OZONE-SITE.XML_ozone.metadata.dirs: /data/metadata
OZONE-SITE.XML_ozone.om.address: om-0.om
OZONE-SITE.XML_ozone.recon.address: recon-0.recon
@@ -31,6 +31,7 @@ data:
OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
OZONE-SITE.XML_ozone.datanode.pipeline.limit: "1"
OZONE-SITE.XML_hdds.datanode.volume.min.free.space: "1GB"
OZONE-SITE.XML_ozone.scm.container.size: "1GB"
OZONE-SITE.XML_ozone.metadata.dirs: /data/metadata
OZONE-SITE.XML_ozone.om.address: om-0.om
OZONE-SITE.XML_ozone.recon.address: recon-0.recon
@@ -31,6 +31,7 @@ data:
OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
OZONE-SITE.XML_ozone.datanode.pipeline.limit: "1"
OZONE-SITE.XML_hdds.datanode.volume.min.free.space: "1GB"
OZONE-SITE.XML_ozone.scm.container.size: "1GB"
OZONE-SITE.XML_ozone.metadata.dirs: /data/metadata
OZONE-SITE.XML_ozone.om.address: om-0.om
OZONE-SITE.XML_ozone.recon.address: recon-0.recon
@@ -31,6 +31,7 @@ data:
OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
OZONE-SITE.XML_ozone.datanode.pipeline.limit: "1"
OZONE-SITE.XML_hdds.datanode.volume.min.free.space: "1GB"
OZONE-SITE.XML_ozone.scm.container.size: "1GB"
OZONE-SITE.XML_ozone.metadata.dirs: /data/metadata
OZONE-SITE.XML_ozone.om.address: om-0.om
OZONE-SITE.XML_ozone.recon.address: recon-0.recon
@@ -31,6 +31,7 @@ data:
OZONE-SITE.XML_hdds.scm.safemode.min.datanode: "3"
OZONE-SITE.XML_ozone.datanode.pipeline.limit: "1"
OZONE-SITE.XML_hdds.datanode.volume.min.free.space: "1GB"
OZONE-SITE.XML_ozone.scm.container.size: "1GB"
OZONE-SITE.XML_ozone.metadata.dirs: /data/metadata
OZONE-SITE.XML_ozone.om.address: om-0.om
OZONE-SITE.XML_ozone.recon.address: recon-0.recon