@siddhantsangwan commented on Apr 4, 2023

What changes were proposed in this pull request?

ContainerBalancerSelectionCriteria has a Comparator, orderContainersByUsedBytes, which compares containers by used space. This comparator calls isContainerMoreUsed, which behaves inconsistently when two containers have equal used space.
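
For illustration only, here is a minimal sketch of one way to make a used-space comparator consistent when two containers use the same number of bytes: break the tie on a unique ID. This is not the code from this PR. The `Container` class, its fields, and the tie-breaking rule are assumptions made for the example.

```java
import java.util.Comparator;
import java.util.TreeSet;

public class UsedBytesComparatorSketch {

  // Hypothetical stand-in for a container: a unique ID plus used space in bytes.
  static final class Container {
    final long id;
    final long usedBytes;

    Container(long id, long usedBytes) {
      this.id = id;
      this.usedBytes = usedBytes;
    }
  }

  // Orders by used bytes first, then by ID. Because IDs are unique, two
  // distinct containers never compare as equal, and compare(a, b) always has
  // the opposite sign of compare(b, a), as the Comparator contract requires.
  static final Comparator<Container> BY_USED_BYTES_THEN_ID =
      Comparator.<Container>comparingLong(c -> c.usedBytes)
          .thenComparingLong(c -> c.id);

  public static void main(String[] args) {
    Container a = new Container(1, 500);
    Container b = new Container(2, 500); // same used space as a, different ID
    Container c = new Container(3, 100);

    // A sorted set keeps both a and b because the tie is broken on ID.
    TreeSet<Container> sorted = new TreeSet<>(BY_USED_BYTES_THEN_ID);
    sorted.add(a);
    sorted.add(b);
    sorted.add(c);

    System.out.println("containers kept: " + sorted.size());
    System.out.println("least used container ID: " + sorted.first().id);
  }
}
```

A comparator that returns an arbitrary non-zero value for equal used space can give different answers for `compare(a, b)` and `compare(b, a)`, which sorted collections are not designed to tolerate. Breaking ties on a stable, unique key is one common remedy.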

This bug was exposed while fixing the setup of TestContainerBalancerTask: createReplicasForContainers() in TestContainerBalancerTask creates additional container replicas but does not add them to datanodeToContainersMap, the map used to track them. This PR fixes both the comparator bug and the test setup.
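
As a rough sketch of the bookkeeping the test setup needs, the snippet below records every newly created replica in a datanode-to-containers map. The key and value types (`String` datanode names, `Long` container IDs) and the `addReplica` helper are hypothetical simplifications, not the types or methods used in TestContainerBalancerTask.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ReplicaTrackingSketch {

  // Hypothetical helper: record that `datanode` now holds a replica of
  // `containerId`, creating the per-datanode set on first use.
  static void addReplica(Map<String, Set<Long>> datanodeToContainersMap,
                         String datanode, long containerId) {
    datanodeToContainersMap
        .computeIfAbsent(datanode, k -> new HashSet<>())
        .add(containerId);
  }

  public static void main(String[] args) {
    Map<String, Set<Long>> datanodeToContainersMap = new HashMap<>();

    // Every replica that the setup creates is also added to the map, so the
    // map stays in sync with the replicas that actually exist.
    addReplica(datanodeToContainersMap, "dn1", 1L);
    addReplica(datanodeToContainersMap, "dn2", 1L);
    addReplica(datanodeToContainersMap, "dn1", 2L);

    System.out.println("containers on dn1: "
        + datanodeToContainersMap.get("dn1").size());
  }
}
```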

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-8358

@adoroszlai left a comment

Thanks @siddhantsangwan for the fixes.

When merging, I'd update the commit message to indicate that a bug in the container space comparator (which may affect production usage) is being fixed, rather than the test case fix that exposed it.

@siddhantsangwan changed the title from "HDDS-8358. Fix the test setup in TestContainerBalancerTask" to "HDDS-8358. Fix the space usage comparator in ContainerBalancerSelectionCriteria" on Apr 4, 2023

@siddhantsangwan

Thanks for reviewing @adoroszlai. That's a good suggestion - I updated the title in the PR and the jira as well. Also updated javadoc.

@adoroszlai merged commit 34de64f into apache:master on Apr 4, 2023
errose28 added a commit to errose28/ozone that referenced this pull request Apr 6, 2023
* master: (155 commits)
  update readme (apache#4535)
  HDDS-8374. Disable flaky unit test: TestContainerStateCounts
  HDDS-8016. updated the ozone doc for linked bucket and deletion async limitation (apache#4526)
  HDDS-8237. [Snapshot] loadDb() used by SstFiltering service creates extraneous directories. (apache#4446)
  HDDS-8035. Intermittent timeout in TestOzoneManagerHAWithData.testOMHAMetrics (apache#4362)
  HDDS-8039. Allow container inspector to run from ozone debug. (apache#4337)
  HDDS-8304. [Snapshot] Reduce flakiness in testSkipTrackingWithZeroSnapshot (apache#4487)
  HDDS-7974. [Snapshot] KeyDeletingService to be aware of Ozone snapshots (apache#4486)
  HDDS-8368. ReplicationManager: Create ContainerReplicaOp with correct target Datanode (apache#4532)
  HDDS-8358. Fix the space usage comparator in ContainerBalancerSelectionCriteria (apache#4527)
  HDDS-8359. ReplicationManager: Fix getContainerReplicationHealth() so that it builds ContainerCheckRequest correctly (apache#4528)
  HDDS-8361. Useless object in TestOzoneBlockTokenIdentifier (apache#4517)
  HDDS-8325. Consolidate and refine RocksDB metrics of services (apache#4506)
  HDDS-8135. Incorrect synchronization during certificate renewal in DefaultCertificateClient. (apache#4381)
  HDDS-8127. Exclude deleted containers from Recon container count (apache#4440)
  HDDS-8364. ReadReplicas may give wrong results with topology-aware read enabled (apache#4522)
  HDDS-8354. Avoid WARNING about ObjectEndpoint#get (apache#4515)
  HDDS-8324. DN data cache gets removed randomly asking for data from disk (apache#4499)
  HDDS-8291. Upgrade to Hadoop 3.3.5 (apache#4484)
  HDDS-8355. Mark TestOMRatisSnapshots#testInstallSnapshot as flaky
  ...