Revert "Unwrap lazy blocks before expensive remote and local exchange operations" PR#16773
Conversation
pettyjamesm
left a comment
The changes in the revert look fine and safe to me, approved.
That said, I’m curious to know any details you can provide about the problems you observed that were resolved by reverting these changes. I haven’t seen similar issues in our test environments, and nothing in the PR being reverted should fundamentally require any more or less query memory, so this seems like an edge case that needs to be chased down.
    }
    Page pageSplit;
    if (partitionSize == page.getPositionCount()) {
        pageSplit = page; // entire page will be sent to this partition, no copies necessary
Is it possible that the issue is that these pages were retaining more memory as a result of not having all of their positions explicitly copied out? If this partitioning exchanger was used to do a local exchange after a remote exchange (fairly common), then the issue with VariableWidthBlocks retaining their entire SerializedPage slice could indeed yield a much higher retained size than before.
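The retention issue described above can be illustrated with a minimal sketch. This is not Presto's actual `VariableWidthBlock` or `Page` API; `SlicedBlock`, `retainedSizeInBytes`, and `copyPositions` here are simplified stand-ins showing how a block that is a view over a shared deserialized buffer pins the whole buffer until its positions are copied out:

```java
// Minimal sketch (hypothetical classes, not Presto's) of why an uncopied
// block view can retain far more memory than the data it logically holds.
public class RetainedSizeSketch {
    // Stand-in for a VariableWidthBlock view over a shared SerializedPage buffer.
    static final class SlicedBlock {
        final byte[] buffer; // the entire deserialized page data
        final int offset;
        final int length;

        SlicedBlock(byte[] buffer, int offset, int length) {
            this.buffer = buffer;
            this.offset = offset;
            this.length = length;
        }

        // Retained size counts the full backing buffer, not just this slice.
        long retainedSizeInBytes() {
            return buffer.length;
        }

        // Copying the positions out drops the reference to the big buffer.
        SlicedBlock copyPositions() {
            byte[] copy = new byte[length];
            System.arraycopy(buffer, offset, copy, 0, length);
            return new SlicedBlock(copy, 0, length);
        }
    }

    public static void main(String[] args) {
        byte[] serializedPage = new byte[1 << 20]; // 1 MiB of deserialized page data
        // A partition's split only needs 100 bytes of it...
        SlicedBlock view = new SlicedBlock(serializedPage, 0, 100);
        // ...but while uncopied, it pins the entire 1 MiB buffer:
        System.out.println("view retains: " + view.retainedSizeInBytes());   // 1048576
        SlicedBlock copied = view.copyPositions();
        System.out.println("copy retains: " + copied.retainedSizeInBytes()); // 100
    }
}
```

If the reverted change let whole pages flow through the exchange without copying, many such views could outlive the exchange and keep their source buffers alive, which is consistent with the `EXCEEDED_LOCAL_MEMORY_LIMIT` failures.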
Sorry about the delayed reply. @aweisberg do you have more context on this since you mentioned this PR in our discussion about the local memory issues?
I think I talked about this with James in Slack, but we never settled on the right way to follow up and keep this performance improvement without also sometimes regressing on memory.
Since I had also made the regressing change to Trino, I followed up there with a fix and a longer description of the problem / alternatives. See: trinodb/trino#9327 (with a follow up PR in trinodb/trino#9379)
As for what to do in PrestoDB, there are some unknowns about reintroducing this change with a fix similar to the Trino one, as I do not have a comparable testing environment to verify that calling Page#compact() explicitly will perform comparably to Page#copyPositions in terms of throughput or reported memory usage across all workloads. Trino has made a variety of other localized changes to the related classes (adding explicit calls to Page#compact() in other places, revised operator and memory tracking implementations, etc.). On the Athena side, we chose to take the performance hit of eagerly copying VariableWidthBlock instances out of their input slice during deserialization, which means we didn't have the same issue, but that's an expensive hit to throughput.
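The compaction mitigation can be sketched as follows. This is a hypothetical illustration, not Presto's or Trino's actual implementation: the `Block` class and the `MAX_WASTE_RATIO` threshold are assumptions. The idea is to copy a block out of its shared buffer only when the retained size exceeds the logical size by enough to be worth the copy, which avoids copyPositions' unconditional throughput cost:

```java
// Hypothetical sketch of "compact before buffering": copy a block out of its
// shared buffer only when it wastes enough memory to justify the copy.
public class CompactSketch {
    static final double MAX_WASTE_RATIO = 2.0; // assumed threshold, not a real constant

    static final class Block {
        final byte[] buffer;
        final int offset;
        final int length;

        Block(byte[] buffer, int offset, int length) {
            this.buffer = buffer;
            this.offset = offset;
            this.length = length;
        }

        long retainedSizeInBytes() { return buffer.length; }
        long sizeInBytes() { return length; }

        Block compact() {
            if (retainedSizeInBytes() <= sizeInBytes() * MAX_WASTE_RATIO) {
                return this; // already tight enough; skip the copy
            }
            byte[] copy = new byte[length];
            System.arraycopy(buffer, offset, copy, 0, length);
            return new Block(copy, 0, length);
        }
    }

    public static void main(String[] args) {
        Block tight = new Block(new byte[128], 0, 100);
        Block loose = new Block(new byte[1 << 20], 0, 100);
        System.out.println(tight.compact() == tight);              // true: no copy made
        System.out.println(loose.compact().retainedSizeInBytes()); // 100
    }
}
```

The open question above is exactly whether this conditional copy is cheap enough in practice; that needs validation against realistic production workloads.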
So as I see it, we can either:
- Take a harsh performance hit to VariableWidthBlock deserialization throughput, but address this basic memory tracking bug more thoroughly (i.e., right now the memory tracking bug isn't typically reported because there is usually a partitioning local exchange after a remote exchange that forces the copy, but not always)
- I can put together a PR with this change reintroduced, but with fixes comparable to the ones I made on the Trino side to mitigate the potential issue, and let someone with time and realistic production workloads test them out
- Leave the change reverted (I have no objection to this option)
This reverts #16617
The commits were causing previously successful queries to fail with EXCEEDED_LOCAL_MEMORY_LIMIT errors.