Aggregation ORDER BY & DISTINCT spilling #14527

Merged
highker merged 1 commit into prestodb:master from sachdevs:agg-orderby-spilling on Jul 2, 2020

@sachdevs sachdevs commented May 14, 2020

This PR implements ORDER BY and DISTINCT spilling for use in aggregation functions. Will be publishing docs on implementation details and updating this PR in the future.

== RELEASE NOTES ==

General Changes
* Add local disk spilling support for aggregation functions with `ORDER BY` or `DISTINCT` syntax.

@sachdevs sachdevs requested a review from highker May 14, 2020 05:52
@highker highker left a comment

As discussed offline, let's spill RowBlocks for intermediate states.

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from eb2fdeb to a69f43e Compare May 15, 2020 20:55
@sachdevs sachdevs requested a review from highker May 15, 2020 20:56
@highker highker left a comment

The overall idea is in the right direction. Here is a high-level design suggestion to unify the solution for both the ordering and distincting cases:

  1. Neither the Ordering nor the Distincting (Grouped)Accumulators implement the xxxIntermediateXxx() interfaces, because they do not support partial aggregation. So let's create a new class called FinalOnly(Grouped)Accumulator. This class should implement all xxxIntermediateXxx() interfaces by throwing UnsupportedOperationException. Have both the Ordering and Distincting (Grouped)Accumulators inherit from FinalOnly(Grouped)Accumulator.
  2. Create new classes inheriting from (Grouped)Accumulator called SpillableFinalOnly(Grouped)Accumulator. These classes take a FinalOnly(Grouped)Accumulator as a delegate. SpillableFinalOnly(Grouped)Accumulator is responsible for maintaining the hashtable state. Use ObjectBigArray for the hashtable (check my inline comment).
  3. SpillableFinalOnly(Grouped)Accumulator should implement the xxxIntermediateXxx() interfaces. Together with the addInput interface, they should build the hashtable for the RowBlock <-> Page mapping. As we discussed offline, the hashtable only accumulates the original data, grouping it by group ID so that we can spill by group. (We do not lose information at this step, though it may use quite a lot of memory.)
  4. The prepare/evalFinal interfaces of SpillableFinalOnly(Grouped)Accumulator should call the addInput/prepareFinal/evalFinal interfaces of its FinalOnly(Grouped)Accumulator delegate.
  5. Directly create the corresponding FinalOnly(Grouped)Accumulator if spilling is not enabled. Otherwise, create a SpillableFinalOnly(Grouped)Accumulator wrapping the FinalOnly(Grouped)Accumulator.
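
To make the five steps concrete, here is a minimal sketch of the proposed shape. It is not the code in this PR: `SketchAccumulator` is a hypothetical, heavily simplified interface (plain `long` values instead of Pages and Blocks), a `HashMap` stands in for ObjectBigArray, and every class name is invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical, heavily simplified stand-in for a grouped accumulator (not Presto's real interface).
interface SketchAccumulator {
    void addInput(int groupId, long value);
    void addIntermediate(int groupId, List<Long> rows);
    List<Long> evaluateIntermediate(int groupId);
    long evaluateFinal(int groupId);
}

// Step 1: no partial aggregation, so every intermediate method throws.
abstract class FinalOnlySketchAccumulator implements SketchAccumulator {
    @Override
    public final void addIntermediate(int groupId, List<Long> rows) {
        throw new UnsupportedOperationException("no partial aggregation");
    }

    @Override
    public final List<Long> evaluateIntermediate(int groupId) {
        throw new UnsupportedOperationException("no partial aggregation");
    }
}

// Example delegate: count(DISTINCT value) per group.
final class DistinctCountSketch extends FinalOnlySketchAccumulator {
    private final Map<Integer, Set<Long>> seen = new HashMap<>();

    @Override
    public void addInput(int groupId, long value) {
        seen.computeIfAbsent(groupId, id -> new HashSet<>()).add(value);
    }

    @Override
    public long evaluateFinal(int groupId) {
        Set<Long> values = seen.get(groupId);
        return values == null ? 0 : values.size();
    }
}

// Steps 2-4: buffer raw rows per group; the "intermediate state" is simply those rows.
final class SpillableFinalOnlySketch implements SketchAccumulator {
    private final FinalOnlySketchAccumulator delegate;
    private final Map<Integer, List<Long>> rowsByGroup = new HashMap<>();

    SpillableFinalOnlySketch(FinalOnlySketchAccumulator delegate) {
        this.delegate = delegate;
    }

    @Override
    public void addInput(int groupId, long value) {
        rowsByGroup.computeIfAbsent(groupId, id -> new ArrayList<>()).add(value);
    }

    @Override
    public void addIntermediate(int groupId, List<Long> rows) {
        rowsByGroup.computeIfAbsent(groupId, id -> new ArrayList<>()).addAll(rows);
    }

    @Override
    public List<Long> evaluateIntermediate(int groupId) {
        return new ArrayList<>(rowsByGroup.getOrDefault(groupId, new ArrayList<>()));
    }

    @Override
    public long evaluateFinal(int groupId) {
        // Replay the buffered rows into the delegate once, then let it produce the final value.
        List<Long> rows = rowsByGroup.remove(groupId);
        if (rows != null) {
            for (long value : rows) {
                delegate.addInput(groupId, value);
            }
        }
        return delegate.evaluateFinal(groupId);
    }
}

final class SpillDesignSketch {
    // Step 5: wrap the final-only accumulator only when spilling is enabled.
    static SketchAccumulator create(boolean spillEnabled) {
        FinalOnlySketchAccumulator distinctCount = new DistinctCountSketch();
        return spillEnabled ? new SpillableFinalOnlySketch(distinctCount) : distinctCount;
    }

    public static void main(String[] args) {
        SketchAccumulator beforeSpill = create(true);
        beforeSpill.addInput(0, 1L);
        SketchAccumulator afterUnspill = create(true);
        afterUnspill.addIntermediate(0, beforeSpill.evaluateIntermediate(0));
        afterUnspill.addInput(0, 1L);
        afterUnspill.addInput(0, 2L);
        System.out.println(afterUnspill.evaluateFinal(0));
    }
}
```

In this toy version the wrapper keeps the raw rows until final evaluation, so the intermediate output loses no information and the distinct set is only computed when the delegate sees all rows. That mirrors the trade-off noted in step 3: correctness is preserved, at the cost of holding the original data in memory until it is spilled.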

cc @arhimondr in case there is better design.

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch 2 times, most recently from 3912702 to 9909c3d Compare May 19, 2020 03:04
@sachdevs sachdevs requested a review from highker May 19, 2020 03:09
@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from 9909c3d to 6639b91 Compare May 19, 2020 20:06
@highker highker left a comment

Let's clean up the PR with the design proposed. That will make the review easier.

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from 6639b91 to 2073f62 Compare May 29, 2020 22:04

sachdevs commented May 29, 2020

@highker planning on doing a full refactor with proposed design once distinct is working. I pushed my work so far, having some trouble with array_agg, will update when it works.

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from 2073f62 to 2e869d8 Compare June 1, 2020 17:35
@sachdevs sachdevs left a comment

Distinct and order by spilling work now. There is one main glaring issue to iron out before we continue: we need to figure out how to reliably detect whether unspill is happening. Due to the current logic, things like array_agg(DISTINCT x ORDER BY y) don't work as well as simple things like "SELECT count(distinct x), y FROM t GROUP BY y", since we look at the last block of the unspilled page to see if it is an array block. See the attached comment for details.

@sachdevs sachdevs requested a review from highker June 1, 2020 17:41
@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from 2e869d8 to ac1d598 Compare June 1, 2020 18:47
@highker highker left a comment

Did a fast skim through DistinctingGroupedAccumulator. The logic looks legit. (Didn't get into details line by line).

Regarding `if (page.getBlock(page.getChannelCount() - 1) instanceof ArrayBlock)`, we might be able to add a new interface to GroupedAccumulator to hint whether the input is new or unspilled. But I would hold off on this new interface until we have the abstraction ready. Then we can better evaluate how to make the interface change.
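
As an illustration of why an explicit hint can be more robust than inspecting block types, here is a small sketch. It is not a diagnosis of the actual bug: the `InputKind` enum, the `HintedAccumulator` interface, and the list-of-objects "page" are all hypothetical stand-ins, not Presto types.

```java
import java.util.Arrays;
import java.util.List;

final class UnspillHintSketch {
    // Hypothetical hint type; the thread only suggests that some such signal could be added.
    enum InputKind { NEW_INPUT, UNSPILLED_INPUT }

    // Type-based guess, loosely modeled on the instanceof check quoted above. A "page" here is
    // just a list of channel values, and a List-valued last channel is read as "unspilled".
    static InputKind guessFromLastChannel(List<Object> page) {
        Object lastChannel = page.get(page.size() - 1);
        return lastChannel instanceof List ? InputKind.UNSPILLED_INPUT : InputKind.NEW_INPUT;
    }

    // Hypothetical hinted interface: the caller states where the page came from.
    interface HintedAccumulator {
        void addInput(List<Object> page, InputKind kind);
    }

    // Toy implementation that only counts pages by kind.
    static final class CountingAccumulator implements HintedAccumulator {
        int newPages;
        int unspilledPages;

        @Override
        public void addInput(List<Object> page, InputKind kind) {
            if (kind == InputKind.UNSPILLED_INPUT) {
                unspilledPages++;
            } else {
                newPages++;
            }
        }
    }

    public static void main(String[] args) {
        // A new input page whose last channel happens to hold an array value.
        List<Object> page = Arrays.<Object>asList("key", Arrays.asList(1L, 2L));
        System.out.println(guessFromLastChannel(page)); // UNSPILLED_INPUT, although the page is new

        CountingAccumulator accumulator = new CountingAccumulator();
        accumulator.addInput(page, InputKind.NEW_INPUT);
        System.out.println(accumulator.newPages); // 1
    }
}
```

The point of the sketch is only that the caller already knows whether a page is fresh or unspilled, so passing that knowledge along avoids inferring it from the data's shape.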

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch 4 times, most recently from 7dc0fb3 to ecd5f18 Compare June 4, 2020 20:10
@sachdevs sachdevs requested a review from highker June 4, 2020 20:10
@sachdevs sachdevs left a comment

Summarized the main issues with the design so far.

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from ecd5f18 to ec43556 Compare June 4, 2020 20:26
@sachdevs sachdevs changed the title from "[WIP] Aggregation order by spilling draft" to "[WIP] Aggregation ORDER BY & DISTINCT spilling" on Jun 4, 2020
@highker highker changed the title from "[WIP] Aggregation ORDER BY & DISTINCT spilling" to "Aggregation ORDER BY & DISTINCT spilling" on Jun 5, 2020
@sachdevs sachdevs force-pushed the agg-orderby-spilling branch 3 times, most recently from 3691c6e to e7b5eb5 Compare June 16, 2020 20:09

sachdevs commented Jun 17, 2020

Will address these comments + look into page compaction for next iteration.

EDIT: as discussed offline, page compaction/memory reduction for distinct has been made into its own task.

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from e7b5eb5 to 14268e4 Compare June 17, 2020 21:16
@sachdevs sachdevs requested a review from highker June 17, 2020 21:18
@highker highker left a comment

minor comments only

Comment on lines 168 to 173

Is this part necessary?

Contributor Author

Yes. Otherwise updateMemory() sets the memory usage to the size of the empty hash aggregation builder, which is incorrect because we never spilled in the first place in startMemoryRevoke.

Contributor Author

Adding comments.

Comment on lines 158 to 162

If I understand correctly, the original logic only applies when an operator that has never spilled has just started building final results and a revoke request then comes in. Am I right? If so, shall we add a comment here?

Contributor Author

Yes, essentially we decline memory revoking when the hash aggregation builder has already completed. This flag is only set to true when InMemoryHashAggregationBuilder.buildResult is called, NOT InMemoryHashAggregationBuilder.buildHashSortedResult. After buildResult, spilling should be impossible because the builder can no longer process any more input anyway.
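
The two replies above describe a guard that can be sketched roughly as follows. This is a toy model under stated assumptions, not the operator code in this PR: the class, the `resultBuilt` flag name, and the byte counter are all hypothetical.

```java
final class RevokeGuardSketch {
    private long bufferedBytes;
    // Hypothetical name for the flag described above: set by buildResult(), never by a hash-sorted build.
    private boolean resultBuilt;

    void addInput(long bytes) {
        if (resultBuilt) {
            throw new IllegalStateException("no more input after buildResult");
        }
        bufferedBytes += bytes;
    }

    void buildResult() {
        resultBuilt = true;
    }

    // Returns true if the buffered state was spilled, false if the request was declined.
    boolean startMemoryRevoke() {
        if (resultBuilt) {
            // Declined: nothing is spilled, so the memory accounting must stay untouched.
            return false;
        }
        bufferedBytes = 0; // stand-in for writing the buffered state to disk
        return true;
    }

    long reportedMemoryBytes() {
        return bufferedBytes;
    }

    public static void main(String[] args) {
        RevokeGuardSketch builder = new RevokeGuardSketch();
        builder.addInput(100);
        builder.buildResult();
        System.out.println(builder.startMemoryRevoke());   // false: revoke declined
        System.out.println(builder.reportedMemoryBytes()); // 100: accounting unchanged
    }
}
```

The behavior to notice is that a declined revoke leaves the reported memory at its real value instead of resetting it to the size of an empty builder.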

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch 2 times, most recently from f69f151 to 0086104 Compare June 24, 2020 22:38
@sachdevs sachdevs requested a review from highker June 25, 2020 01:37
@highker highker left a comment

I would be surprised if spillEnabled cannot be tunneled from LocalExecutionPlanner... Can we give it a try and see what will happen?

@jainxrohit jainxrohit self-requested a review June 25, 2020 23:28
@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from 0086104 to 8f9f6ea Compare June 30, 2020 18:06
@sachdevs sachdevs requested review from highker and rschlussel June 30, 2020 18:06
@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from 8f9f6ea to 58dddc4 Compare June 30, 2020 23:28
@sachdevs
Contributor Author

Last update should fix any failing checks.

Comment on lines 241 to 247

Is this right? The function is meant to create an IntermediateAccumulator, but here it delegates to GroupedAccumulator and then uses createDefaultGroupedAccumulator instead of createDefaultGroupedIntermediateAccumulator?

Contributor Author

Yes. The logic is the same in either case, because in our FinalOnlyGroupedAccumulator/SpillableFinalOnlyGroupedAccumulator delegate we do not care about the underlying accumulator working on intermediate values. If I were to separate these functions, the resulting logic would be the same. They were originally separate, but I noticed they could be simplified.

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch 2 times, most recently from 25d84bd to 6107334 Compare July 1, 2020 19:20
@sachdevs sachdevs requested a review from highker July 1, 2020 19:21
@highker highker left a comment

minor comments

Actually, is it even possible we need createSpillableGroupedIntermediateAccumulator()? Can we check if the following is good enough?

    @Override
    public GroupedAccumulator createGroupedIntermediateAccumulator()
    {
        checkState(!hasDistinct() || !hasOrderBy(), "distinct or order by cannot have partial aggregation");
        
        try {
            return groupedAccumulatorConstructor.newInstance(stateDescriptors, ImmutableList.of(), Optional.empty(), lambdaProviders);
        }
        catch (InstantiationException | IllegalAccessException | InvocationTargetException e) {
            throw new RuntimeException(e);
        }
    }

@sachdevs sachdevs Jul 2, 2020

We do need this, for the following reasons:

  1. Before this change, createGroupedIntermediateAccumulator was never called in the ORDER BY or DISTINCT case, since order by and distinct did not support intermediate results. Now, when spill is enabled, we use createGroupedIntermediateAccumulator to recreate the intermediate version of the accumulator with the spillable wrapper when unspilling and creating intermediate accumulators.

  2. I am not sure what checkState(!hasDistinct() || !hasOrderBy()) does exactly in this context, since having both DISTINCT and ORDER BY shouldn't be a state failure. createGroupedIntermediateAccumulator can be called with hasDistinct/hasOrderBy set to true.

  3. return groupedAccumulatorConstructor.newInstance would be a bug, since it references the underlying accumulator (NOT order by or distinct, but the accumulator inside of order by or distinct). This means that during intermediate accumulation we would get non-distinct, non-ordered values.

I actually tried writing this code without this section in a previous iteration, but I realized we need the spillable wrapper in the intermediate case.
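
On point 2, it may help to spell out what the suggested condition actually accepts. The sketch below only evaluates the boolean expression. The second method is an assumption about the intent (rejecting either modifier, as the check's error message suggests), and the thread does not confirm it.

```java
final class PartialAggregationCheckSketch {
    // The condition exactly as written in the suggestion: fails only when BOTH flags are true.
    static boolean passesAsWritten(boolean hasDistinct, boolean hasOrderBy) {
        return !hasDistinct || !hasOrderBy;
    }

    // Assumed intent (hypothetical): pass only when NEITHER flag is true.
    static boolean passesIfNeither(boolean hasDistinct, boolean hasOrderBy) {
        return !hasDistinct && !hasOrderBy;
    }

    public static void main(String[] args) {
        boolean[] flags = {false, true};
        for (boolean hasDistinct : flags) {
            for (boolean hasOrderBy : flags) {
                System.out.println("distinct=" + hasDistinct
                        + " orderBy=" + hasOrderBy
                        + " asWritten=" + passesAsWritten(hasDistinct, hasOrderBy)
                        + " ifNeither=" + passesIfNeither(hasDistinct, hasOrderBy));
            }
        }
    }
}
```

Either way, with spilling enabled the intermediate accumulator is created for DISTINCT and ORDER BY aggregations, so neither form of the check would fit this code path.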

@sachdevs sachdevs force-pushed the agg-orderby-spilling branch from 6107334 to b53600f Compare July 2, 2020 18:49
@sachdevs sachdevs requested a review from highker July 2, 2020 18:49
@highker highker left a comment

lgtm
