@devmadhuu
Contributor

What changes were proposed in this pull request?

This PR addresses a potential memory overflow when there are millions of containers. ContainerHealthTask currently loads every available container into memory through containerManager.getContainers(), which returns all containers at once. To avoid the overflow, this PR switches to the paginated variant of the API.
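
As a rough illustration of the paginated approach described above, here is a minimal, self-contained sketch. It is not the patch itself: ContainerSource, scanInBatches and inMemorySource are hypothetical stand-ins, and the only idea taken from this PR is calling getContainers with a start ID and a count instead of loading everything at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;
import java.util.function.LongConsumer;

final class PagedContainerScanSketch {

  /** Hypothetical stand-in for a paginated container listing API. */
  interface ContainerSource {
    /** Returns up to count container IDs that are >= startId, in ascending order. */
    List<Long> getContainers(long startId, int count);
  }

  /** Visits every container ID while holding at most batchSize IDs in memory. */
  static long scanInBatches(ContainerSource source, int batchSize,
      LongConsumer visitor) {
    long visited = 0;
    long startId = 1;
    List<Long> batch = source.getContainers(startId, batchSize);
    while (!batch.isEmpty()) {
      for (long id : batch) {
        visitor.accept(id);
        visited++;
      }
      // Resume right after the last ID seen in this batch.
      startId = batch.get(batch.size() - 1) + 1;
      batch = source.getContainers(startId, batchSize);
    }
    return visited;
  }

  /** In-memory source used only to exercise the sketch. */
  static ContainerSource inMemorySource(long... ids) {
    TreeSet<Long> sorted = new TreeSet<>();
    for (long id : ids) {
      sorted.add(id);
    }
    return (startId, count) -> {
      List<Long> page = new ArrayList<>();
      for (Long id : sorted.tailSet(startId, true)) {
        if (page.size() == count) {
          break;
        }
        page.add(id);
      }
      return page;
    };
  }

  public static void main(String[] args) {
    ContainerSource source = inMemorySource(1, 2, 5, 7, 8);
    long visited = scanInBatches(source, 2,
        id -> System.out.println("checking container " + id));
    System.out.println("visited " + visited + " containers");
  }
}
```

With a batch size of 2, the loop never holds more than two IDs at a time, regardless of how many containers the source knows about.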

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-9819

How was this patch tested?

This patch was tested with the existing JUnit test cases to confirm that it does not break anything.

@devmadhuu
Contributor Author

@sumitagrawl @adoroszlai pls review.

Contributor

@adoroszlai adoroszlai left a comment

Thanks @devmadhuu for the patch.

Comment on lines 299 to 300
  Map<String, Long>>
- unhealthyContainerStateStatsMap) {
+ unhealthyContainerStateStatsMap) {
Contributor

nit: Please avoid unnecessary space changes.

Contributor Author

done.

  if (container.isUnderReplicated()
  && !recordForStateExists.contains(
- UnHealthyContainerStates.UNDER_REPLICATED.toString())) {
+ UnHealthyContainerStates.UNDER_REPLICATED.toString())) {
Contributor

nit: Please avoid unnecessary space changes.

Contributor Author

done.


private void checkAndProcessContainers(
Map<UnHealthyContainerStates, Map<String, Long>>
unhealthyContainerStateStatsMap, long start, long currentTime) {
Contributor

start should not be a parameter; it should be set at the start of each iteration.

Contributor Author

Ok, I made the changes. Thanks.
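
One way to read the suggestion in this thread: the start timestamp is taken inside the iteration, so every pass reports its own elapsed time instead of inheriting a value from the caller. A minimal sketch follows; timeOnePass and the injected clock are hypothetical names used only for illustration, and the real task may simply read a monotonic clock directly.

```java
import java.util.function.LongSupplier;

final class IterationTimingSketch {

  /**
   * Runs one pass and returns how long it took. The start time is read
   * here, at the beginning of the pass, rather than passed in as a parameter.
   */
  static long timeOnePass(LongSupplier clock, Runnable pass) {
    long start = clock.getAsLong();
    pass.run();
    return clock.getAsLong() - start;
  }

  public static void main(String[] args) {
    LongSupplier clock = System::nanoTime;
    for (int i = 0; i < 3; i++) {
      long elapsedNanos = timeOnePass(clock, () -> { /* process one batch */ });
      System.out.println("pass " + i + " took " + elapsedNanos + " ns");
    }
  }
}
```

Because start is local to each call, a slow earlier pass cannot inflate the duration reported for a later one.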

@adoroszlai
Contributor

@adoroszlai adoroszlai changed the title from "HDDS-9819.Recon - Potential memory overflow in Container Health Task." to "HDDS-9819. Recon - Potential memory overflow in Container Health Task." on Jan 3, 2024
@devmadhuu
Contributor Author

> @devmadhuu please check if test failure in TestReconTasks is related:
>
> https://github.com/devmadhuu/ozone/actions/runs/7283607925/job/19848007957#step:6:167

Thanks @adoroszlai for review. I have resolved the test case failure.

@adoroszlai
Contributor

Thanks @devmadhuu for updating the patch.

@ArafatKhan2198 please review

@adoroszlai adoroszlai requested a review from dombizita January 5, 2024 09:09
Contributor

@ArafatKhan2198 ArafatKhan2198 left a comment

Thanks for the work on this, @devmadhuu. I have left a few comments.

unhealthyContainerStateStatsMap, long currentTime) {
ContainerID startID = ContainerID.valueOf(1);
List<ContainerInfo> containers = containerManager.getContainers(startID,
Integer.parseInt(DEFAULT_FETCH_COUNT));
Contributor

@ArafatKhan2198 ArafatKhan2198 Jan 8, 2024

Could we replace the repeated Integer.parseInt(DEFAULT_FETCH_COUNT) calls with a named constant, since DEFAULT_FETCH_COUNT is itself a constant?

private static final int FETCH_COUNT = Integer.parseInt(DEFAULT_FETCH_COUNT);

Contributor Author

Ok.
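
For context, a minimal sketch of the suggestion in this thread: parse the string once into a static final int and reuse it at every call site. The value "1000" and the class name are assumptions made for illustration; the real DEFAULT_FETCH_COUNT lives elsewhere in the codebase.

```java
final class FetchCountConstantSketch {

  // Assumed value, for illustration only; the real constant is defined elsewhere.
  static final String DEFAULT_FETCH_COUNT = "1000";

  // Parsed once at class initialization instead of on every call.
  private static final int FETCH_COUNT = Integer.parseInt(DEFAULT_FETCH_COUNT);

  static int fetchCount() {
    return FETCH_COUNT;
  }

  public static void main(String[] args) {
    System.out.println("fetch count = " + fetchCount());
  }
}
```

Besides avoiding repeated parsing, a malformed value then fails once, at class initialization, rather than in the middle of a scan.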

containers = containerManager.getContainers(startID,
Integer.parseInt(DEFAULT_FETCH_COUNT));
}
containers.clear();
Contributor

@ArafatKhan2198 ArafatKhan2198 Jan 8, 2024

The containers.clear() statement inside the if block is meant to clear the list before fetching the next batch. However, the last containers.clear() statement just outside the if block seems redundant, since the loop condition (while (!containers.isEmpty())) ensures that it is only executed when containers is empty. Is it needed?

Contributor Author

@ArafatKhan2198 I couldn't understand this comment, as I don't see containers.clear() outside the loop. Can you pls clarify?

Contributor

Apologies for the earlier mistake in the comment. I have now corrected it. Please take a look.

Contributor Author

As discussed, there is a case where the last iteration over a batch does not enter the if block, so we need to clear whatever is still held in memory.
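
To make the case discussed in this thread concrete, here is a sketch of one possible loop shape. It is an assumption, not the code from the patch: a full batch triggers the next fetch, while a partial batch is necessarily the last one, so the list is cleared to release it and let the while condition end the loop. The fetch helper and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class LastBatchClearSketch {

  /** Hypothetical paginated fetch over container IDs 1..total. */
  static List<Long> fetch(long startId, int count, long total) {
    List<Long> page = new ArrayList<>();
    for (long id = startId; id <= total && page.size() < count; id++) {
      page.add(id);
    }
    return page;
  }

  /** Processes all containers in batches and returns how many were seen. */
  static long processAll(long total, int fetchCount) {
    long processed = 0;
    long startId = 1;
    List<Long> containers = fetch(startId, fetchCount, total);
    while (!containers.isEmpty()) {
      processed += containers.size();
      if (containers.size() >= fetchCount) {
        // Full batch: there may be more, so fetch the next page.
        startId = containers.get(containers.size() - 1) + 1;
        containers = fetch(startId, fetchCount, total);
      } else {
        // Partial batch: this was the last page. Without clearing here,
        // the list would stay non-empty and the loop would never end.
        containers.clear();
      }
    }
    return processed;
  }

  public static void main(String[] args) {
    System.out.println(processAll(5, 2) + " containers processed");
  }
}
```

In this shape the clear on the final partial batch is what terminates the loop, and it also saves one extra fetch that would only return an empty page.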

@ArafatKhan2198
Contributor

Thanks for the changes, @devmadhuu. LGTM!

Contributor

@sumitagrawl sumitagrawl left a comment

@devmadhuu Thanks for working on this. I have a few comments.

Contributor

@sumitagrawl sumitagrawl left a comment

LGTM +1

Contributor

@adoroszlai adoroszlai left a comment

Thanks @devmadhuu for updating the patch. Please be careful when merging changes from master (sorry for the conflict).


  List<UnhealthyContainers> all = unHealthyContainersTableHandle.findAll();
- assertThat(all).isEmpty();
+ assertTrue(all.isEmpty());
Contributor

Please keep assertThat, added by HDDS-10034.

Suggested change
- assertTrue(all.isEmpty());
+ assertThat(all).isEmpty();

Comment on lines 335 to 336
assertTrue(taskStatus.getLastUpdatedTimestamp() >
currentTime);
Contributor

Please keep assertThat, added by HDDS-10034.

Suggested change
- assertTrue(taskStatus.getLastUpdatedTimestamp() >
-     currentTime);
+ assertThat(taskStatus.getLastUpdatedTimestamp()).isGreaterThan(currentTime);

Comment on lines 198 to 199
assertTrue(taskStatus.getLastUpdatedTimestamp() >
currentTime);
Contributor

Please keep assertThat, added by HDDS-10034.

Suggested change
- assertTrue(taskStatus.getLastUpdatedTimestamp() >
-     currentTime);
+ assertThat(taskStatus.getLastUpdatedTimestamp()).isGreaterThan(currentTime);


  List<UnhealthyContainers> all = unHealthyContainersTableHandle.findAll();
- assertThat(all).isEmpty();
+ assertTrue(all.isEmpty());
Contributor

Please keep assertThat, added by HDDS-10034.

Suggested change
- assertTrue(all.isEmpty());
+ assertThat(all).isEmpty();

Comment on lines 22 to 26
import static org.assertj.core.api.Assertions.assertThat;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNotNull;
import static org.junit.jupiter.api.Assertions.assertTrue;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.ArgumentMatchers.anyInt;
Contributor

Please keep assertThat instead of restoring assertTrue.

@adoroszlai adoroszlai merged commit 1398f58 into apache:master Jan 14, 2024
@adoroszlai
Contributor

Thanks @devmadhuu for the patch, @ArafatKhan2198, @sumitagrawl for the review.
