Releasing child request builder memory from BulkRequestBuilder#105194

Merged
masseyke merged 6 commits intoelastic:mainfrom
masseyke:release-bulk-request-builder-memory
Feb 7, 2024

Conversation

masseyke (Member) commented Feb 6, 2024

One disadvantage of the builder changes made in #104927 is higher peak memory usage: when BulkRequestBuilder's request() method is called, we have to hold all of the request builders for the child requests in memory, as well as all of the child requests themselves. This PR lowers that peak by releasing each child request builder as soon as its child request has been created, so we are never in a situation where we hold all child request builders and all child requests at once. A side effect is that the request() method can only be called once. That seems consistent with the idea of a builder and with our usage of it; it's a pattern we've discussed enforcing on all of our builders.
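As a rough sketch of the idea (simplified, hypothetical names — SingleUseBulkBuilder and ChildBuilder stand in for the actual Elasticsearch classes, and a String stands in for a request object): request() drains the child builders as it converts them, so each builder becomes garbage-collectible as soon as its request exists, and a second call to request() fails.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified sketch of the pattern described above, not the actual PR code.
class SingleUseBulkBuilder {
    interface ChildBuilder { String request(); } // stand-in for a child request builder

    private List<ChildBuilder> childBuilders = new ArrayList<>();

    void add(ChildBuilder builder) {
        childBuilders.add(builder);
    }

    List<String> request() {
        if (childBuilders == null) {
            throw new IllegalStateException("request() may only be called once");
        }
        List<String> requests = new ArrayList<>(childBuilders.size());
        for (Iterator<ChildBuilder> it = childBuilders.iterator(); it.hasNext(); ) {
            requests.add(it.next().request());
            it.remove(); // the child builder can now be garbage collected
        }
        childBuilders = null; // enforce single use
        return requests;
    }
}
```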

@masseyke masseyke added >non-issue :Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. labels Feb 6, 2024
@masseyke masseyke marked this pull request as ready for review February 6, 2024 16:46
elasticsearchmachine (Collaborator):

Pinging @elastic/es-distributed (Team:Distributed)

@elasticsearchmachine elasticsearchmachine added the Team:Distributed Meta label for distributed team. label Feb 6, 2024
joegallo (Contributor) commented Feb 6, 2024

The code looks fine to me, but I'd like the formal 👍/👎 review to come from somebody with more knowledge of the memory profile of a running Elasticsearch process and thus the consequences of this PR (i.e. somebody from @elastic/es-distributed).

for (var requestsIter = requestBuilders.iterator(); requestsIter.hasNext();) {
    DocWriteRequest<?> childRequest = requestsIter.next().request();
    request.add(childRequest);
    requestsIter.remove(); // The inner request builder can now be garbage collected
}
Member:
I believe removing items one-by-one via an ArrayList's iterator is an O(N²) operation, because each .remove() call has to shuffle all the remaining elements down by one. Maybe just set them to null instead and then clear the list at the end? Or maybe use a queue instead?
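The suggested null-then-clear alternative could look like this (a hypothetical sketch; ChildBuilder and the String requests are stand-ins, not the PR's code). Each slot is nulled as soon as its request is built, so the builder is collectible immediately, and the list is cleared once at the end in a single O(N) pass instead of N shuffling removals.

```java
import java.util.ArrayList;
import java.util.List;

class NullThenClearSketch {
    interface ChildBuilder { String request(); } // stand-in for a request builder

    static List<String> drain(ArrayList<ChildBuilder> builders) {
        List<String> requests = new ArrayList<>(builders.size());
        for (int i = 0; i < builders.size(); i++) {
            requests.add(builders.get(i).request());
            builders.set(i, null); // O(1); the builder is collectible now
        }
        builders.clear(); // one O(N) pass, instead of O(N) per removal
        return requests;
    }
}
```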

Member:

Or maybe use a queue instead?

Yeah ArrayDeque is essentially an ArrayList with a moveable head so it'd work better here.

Contributor:

I think you can avoid explicitly using the iterator and just clear the ArrayList after iterating over it.

for (var requestBuilder : requestBuilders) {
    request.add(requestBuilder.request());
}
requestBuilders.clear();

Member:

Waiting until the end and then clearing all the builders still has the 2x peak memory usage problem that we're trying to solve.

masseyke (Member Author) commented Feb 7, 2024

Good point. I'll rework this. According to the ArrayDeque javadocs:

Most ArrayDeque operations run in amortized constant time. Exceptions include remove, removeFirstOccurrence, removeLastOccurrence, contains, iterator.remove(), and the bulk operations, all of which run in linear time.

So it might not be the way to go either. Maybe null-and-clear (or creating a new list — I've seen clear be slow for large lists) is the way to go.

Member:

But you wouldn't need to use any of those O(N) operations with ArrayDeque, you'd be calling pollFirst() which is definitely O(1).
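Under that suggestion, the drain loop might look like the following (a hypothetical sketch, not the merged code; Supplier&lt;String&gt; stands in for a request builder). pollFirst() removes and returns the head in constant time, so the whole drain is O(N), and each builder reference is dropped as soon as its request is built.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class DequeDrainSketch {
    // Drains the deque of builders, building each request and releasing
    // the builder reference immediately afterwards.
    static List<String> drain(ArrayDeque<Supplier<String>> builders) {
        List<String> requests = new ArrayList<>(builders.size());
        for (Supplier<String> b = builders.pollFirst(); b != null; b = builders.pollFirst()) {
            requests.add(b.get()); // pollFirst() is O(1); builder now unreferenced
        }
        return requests;
    }
}
```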

Member Author:

Oops, right. :)

DaveCTurner (Member) left a comment:

LGTM

* In the following loop we intentionally remove the builders from requestBuilders so that they can be garbage collected. This is
* so that we don't require double the memory of all of the inner requests, which could be really bad for a large bulk request.
*/
for (ActionRequestLazyBuilder<? extends DocWriteRequest<?>, ? extends DocWriteResponse> builder = requestBuilders
Member:
nit: may as well not waste half a line spelling out the type signature :)

Suggested change
for (ActionRequestLazyBuilder<? extends DocWriteRequest<?>, ? extends DocWriteResponse> builder = requestBuilders
for (final var builder = requestBuilders

@masseyke masseyke merged commit d8fdf6f into elastic:main Feb 7, 2024
@masseyke masseyke deleted the release-bulk-request-builder-memory branch February 7, 2024 16:58

Labels

:Distributed/CRUD A catch all label for issues around indexing, updating and getting a doc by id. Not search. >non-issue Team:Distributed Meta label for distributed team. v8.13.0

5 participants