HDDS-13639. Optimize container iterator for frequent operation #9147

Merged
sumitagrawl merged 4 commits into apache:master from sarvekshayr:HDDS-13639
Oct 24, 2025

Conversation

@sarvekshayr (Contributor)

What changes were proposed in this pull request?

In a datanode environment with 900k containers, frequent operations like ContainerController.getContainerCount(HddsVolume) and iterating through containers per volume were extremely slow. Looping 1k times took 2.5 minutes.

Added two new data structures to ContainerSet:

  • volumeToContainersMap: ConcurrentHashMap<String, ConcurrentSkipListSet>
    Maps volume UUID -> sorted set of container IDs on that volume
    Updated on every addContainer() and removeContainer().

  • volumeContainerCountCache: ConcurrentHashMap<String, AtomicLong>
    Caches the count of containers per volume
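As a rough illustration of how these two structures work together, here is a minimal sketch; the class and method names below are hypothetical and simplified, not the actual ContainerSet code from the patch:

```java
import java.util.Collections;
import java.util.NavigableSet;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentSkipListSet;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of the per-volume index described above.
class VolumeContainerIndex {
  // volume UUID -> sorted set of container IDs on that volume
  private final ConcurrentHashMap<String, ConcurrentSkipListSet<Long>>
      volumeToContainersMap = new ConcurrentHashMap<>();
  // volume UUID -> cached count of containers on that volume
  private final ConcurrentHashMap<String, AtomicLong>
      volumeContainerCountCache = new ConcurrentHashMap<>();

  void addContainer(String volumeUuid, long containerId) {
    // Both structures are updated on every add.
    if (volumeToContainersMap
        .computeIfAbsent(volumeUuid, k -> new ConcurrentSkipListSet<>())
        .add(containerId)) {
      volumeContainerCountCache
          .computeIfAbsent(volumeUuid, k -> new AtomicLong())
          .incrementAndGet();
    }
  }

  void removeContainer(String volumeUuid, long containerId) {
    Set<Long> ids = volumeToContainersMap.get(volumeUuid);
    if (ids != null && ids.remove(containerId)) {
      volumeContainerCountCache.get(volumeUuid).decrementAndGet();
    }
  }

  long getContainerCount(String volumeUuid) {
    // Constant time: read the cached count instead of scanning everything.
    AtomicLong count = volumeContainerCountCache.get(volumeUuid);
    return count == null ? 0 : count.get();
  }

  NavigableSet<Long> getContainerIds(String volumeUuid) {
    // Per-volume iteration touches only this volume's IDs, in sorted order.
    NavigableSet<Long> ids = volumeToContainersMap.get(volumeUuid);
    return ids == null ? Collections.emptyNavigableSet() : ids;
  }
}
```

With this index, a per-volume count or iteration no longer needs to scan the full container set.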

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-13639

How was this patch tested?

Added testContainerCountPerVolume and testContainerIteratorPerVolume in TestContainerSet.

@priyeshkaratha (Contributor)

priyeshkaratha commented Oct 15, 2025

Hi @sarvekshayr, thanks for working on this. The overall code looks good to me.
I have a few suggestions:
I noticed that two new maps have been introduced; the first one's size will be proportional to the number of containers. It might be helpful to include a note in the PR about the expected memory overhead. For example, with around 900k containers and 50 volumes, the additional heap usage could be roughly 50-70 MB.

From my understanding, since the computation primarily happens during container additions, the performance impact should be minimal. However, it would be good to document these details in the PR description, covering both the heap memory footprint and performance considerations.

@sumitagrawl what are your thoughts on this?

@sumitagrawl (Contributor) left a comment

@sarvekshayr We need to change the implementation to keep only the containerId in HddsVolume itself, populated and managed via ContainerSet.

@sumitagrawl (Contributor)

@priyeshkaratha Good point on considering the memory footprint.
We can use ConcurrentHashSet, whose memory footprint is 56 bytes for every Long element added, so the overall overhead is 900k * 56 bytes in this example => 50.4 MB.

The performance impact applies in the following cases:

  1. publishing the container count metric for a volume
  2. iterating over the containers of a volume
  3. getting the used bytes of a volume

Previously, getting any volume-specific information required iterating over all 900k containers, even though a single volume holds only about 10k of them.

In local testing, iterating 10k containers to collect data is approximately 3 times faster. And for the size lookup used in the volume container count metric, it is now constant time.
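The back-of-envelope heap estimate above can be reproduced directly. Note the 56-bytes-per-entry figure is an assumption from this discussion (per-entry node plus boxed Long plus table slot), not a measured value:

```java
public class MemoryEstimate {
  public static void main(String[] args) {
    // Assumed overhead per Long element in a concurrent hash-based set.
    long containers = 900_000L;
    long bytesPerEntry = 56L;
    double megabytes = containers * bytesPerEntry / 1_000_000.0;
    // 900,000 * 56 B = 50,400,000 B = 50.4 MB
    System.out.printf("~%.1f MB for %d container IDs%n", megabytes, containers);
  }
}
```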

@sumitagrawl (Contributor) left a comment

LGTM

@sumitagrawl sumitagrawl merged commit 65fb295 into apache:master Oct 24, 2025
43 checks passed
swamirishi pushed a commit to swamirishi/ozone that referenced this pull request Dec 3, 2025
jojochuang pushed a commit to jojochuang/ozone that referenced this pull request Jan 12, 2026
3 participants