Skip to content

Conversation

@sodonnel
Copy link
Contributor

What changes were proposed in this pull request?

In ContainerStateMap, there are several maps to hold various details, eg:

private final Map<ContainerID, ContainerInfo> containerMap;
private final Map<ContainerID, Set<ContainerReplica>> replicaMap;

When we add a new container, we add an entry to both of these sets. When a container is removed, we don’t see to remove from replicaMap (see below). There doesn’t seem to be any way to remove the replicas later once the containerMap entry is gone, so removing the container is leaking the replicas.

  public void removeContainer(final ContainerID id) {
    Preconditions.checkNotNull(id, "ContainerID cannot be null");
    if (contains(id)) {
      // Should we revert back to the original state if any of the below
      // remove operation fails?
      final ContainerInfo info = containerMap.remove(id);
      lifeCycleStateMap.remove(info.getState(), id);
      ownerMap.remove(info.getOwner(), id);
      repConfigMap.remove(info.getReplicationConfig(), id);
      typeMap.remove(info.getReplicationType(), id);
      // Flush the cache of this container type.
      flushCache(info);
      LOG.trace("Container {} removed from ContainerStateMap.", id);
    }
  } 

You cannot remove the replicas anyway later, as the methods check if the container exists first, which it no longer will, eg:

public void removeContainerReplica(final ContainerID containerID,
    final ContainerReplica replica) {
  Preconditions.checkNotNull(containerID);
  Preconditions.checkNotNull(replica);
  if (contains(containerID)) {
    replicaMap.get(containerID).remove(replica);
  }
} 

Note that deleting a container seems to be a rare operation (eg delete it manually from the CLI). Empty containers are currently marked as deleted, but as far as I can tell, they are not actually removed from SCM.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-6313

How was this patch tested?

Existing tests

Copy link
Contributor

@swagle swagle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 LGTM

@JacksonYao287
Copy link
Contributor

LGTM +1!

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice catch.

@sodonnel sodonnel merged commit 41651e8 into apache:master Feb 15, 2022
errose28 added a commit to errose28/ozone that referenced this pull request Feb 17, 2022
* master: (43 commits)
  HDDS-6212. SCM Container DB bootstrap on Recon startup for secure cluster (apache#3027)
  HDDS-6234. Repair containers affected by incorrect used bytes and block count. (apache#3042)
  HDDS-6262. ozone insight log stops working after OM DBUpdates message (apache#3044)
  HDDS-6290. operational-state and node-state options in datanode list CLI not working correctly (apache#3105)
  HDDS-6314. ConcurrentModificationException getting SCMContainerMetrics (apache#3101)
  HDDS-6284. Add BlockDeletingService worker size config (apache#3056)
  HDDS-6324. Do not trigger CI by reopening PR (apache#3092)
  HDDS-6283. Change ContainerStateMachine ContainerOpExecutor name (apache#3055)
  HDDS-6331. Remove toString in debug log parameters within SCMCommonPlacementPolicy (apache#3098)
  HDDS-6330. Remove unnecessary duplicate semicolons (apache#3097)
  HDDS-6305: Add metrics - number of FSO bucket creates (apache#3077)
  HDDS-6311. Fix number of keys displayed in Recon Overview. (apache#3081)
  HDDS-6325. Fix interface ClientProtocol methods typo setThreadLocalS3Auth and clearThreadLocalS3Auth (apache#3093)
  HDDS-6322. Fix Recon getting inccorrect sequenceNumber from OM (apache#3090)
  HDDS-5913. Avoid integer overflow when setting dfs.container.ratis.lo… (apache#2785)
  HDDS-6313. Remove replicas in ContainerStateMap when a container is deleted (apache#3086)
  HDDS-6186. Selective checks: skip integration check for unit test changes (apache#3061)
  HDDS-6310. Update json-smart to 2.4.7. (apache#3080)
  HDDS-6190. Cleanup unnecessary datanode id path checks. (apache#2993)
  HDDS-6304. Add enforcer to make sure ozone.version equals project.version (apache#3075)
  ...
errose28 added a commit to errose28/ozone that referenced this pull request Mar 30, 2022
* HDDS-4944: (268 commits)
  HDDS-6366. [Multi-Tenant] Disallow specifying custom accessId in OzoneManager (apache#3166)
  HDDS-6275. [Multi-Tenant] Add feature documentation and CLI quick start guide (apache#3095)
  HDDS-6063. [Multi-Tenant] Use VOLUME_LOCK in read and write requests, and some minor refactoring (apache#3051)
  HDDS-6214. [Multi-Tenant] Fix KMS Encryption/Decryption (apache#3010)
  HDDS-6322. Fix Recon getting inccorrect sequenceNumber from OM (apache#3090)
  HDDS-5913. Avoid integer overflow when setting dfs.container.ratis.lo… (apache#2785)
  HDDS-6313. Remove replicas in ContainerStateMap when a container is deleted (apache#3086)
  HDDS-6186. Selective checks: skip integration check for unit test changes (apache#3061)
  HDDS-6310. Update json-smart to 2.4.7. (apache#3080)
  HDDS-6190. Cleanup unnecessary datanode id path checks. (apache#2993)
  HDDS-6304. Add enforcer to make sure ozone.version equals project.version (apache#3075)
  HDDS-6309. Update ozone-runner version to 20220212-1 (apache#3079)
  HDDS-6293. Allow using custom ozone-runner image (apache#3072)
  HDDS-4126. Freon key generator should support >2GB files. (apache#3054)
  HDDS-6088. Implement O3FS/OFS getFileChecksum() using file checksum helpers - addendum: fix checkstyle
  HDDS-6088. Implement O3FS/OFS getFileChecksum() using file checksum helpers. (apache#2935)
  HDDS-6084. [Multi-Tenant] Handle upgrades to version supporting S3 multi-tenancy (apache#3018)
  HDDS-6257. Wrong stack trace for S3 errors (apache#3066)
  HDDS-6278 Improve memory profile for listStatus API call. (apache#3053)
  HDDS-6285. ozonesecure-mr intermittently failing with timeout downloading packages (apache#3057)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants