Skip to content

Spill large blocks to separate spill files in distinct aggregates#17096

Merged
highker merged 1 commit intoprestodb:masterfrom
pgupta2:spill_large_blocks
Feb 7, 2022
Merged

Spill large blocks to separate spill files in distinct aggregates#17096
highker merged 1 commit intoprestodb:masterfrom
pgupta2:spill_large_blocks

Conversation

@pgupta2
Copy link
Copy Markdown
Contributor

@pgupta2 pgupta2 commented Dec 14, 2021

Distinct Aggregates spilling logic compacts all values corresponding
to the same groupId into a single array block and put it in the output
block of the aggregator. This is done for each aggregate function
in the HashAggregationBuilder. If multiple distinct aggregate functions
are present, then we perform same kind of compaction into
corresponding blocks in the same position within a page.

Due to this, a single row within the page can become really huge and
when this page is spilled, it can fail with 'integer overflow' error during
serialization. Instead of compressing all values within a single array
block, we can spill them into a separate spill file, if the block size goes
beyond a certain threshold.

When large block spill is enabled, we use a hybrid approach for prepping
page for spill. If an array block for a groupId goes beyond a threshold, we
spill it into a separate spill file and store the serialized file handle in the
output block. If the size is within the threshold, we will directly store the
values in the output block. This approach enables us to perform a spill
only iff a block is huge. Since, this will happen rarely, most of the time,
we will be storing the data directly in the output block of the aggregate.

Test plan -

  • Unit Tests
  • Tested ~350 production pipelines that have distinct aggregates for correctness and performance.
== RELEASE NOTES ==

General Changes
* Add a new configuration property ``experimental.distinct-aggregation-large-block-spill-enabled`` to enable spilling of blocks that are larger than ``experimental.distinct-aggregation-large-block-size-threshold`` bytes into a separate spill file.  This can be overridden by ``distinct_aggregation_large_block_spill_enabled`` session property.
* Add a new configuration property ``experimental.distinct-aggregation-large-block-size-threshold`` to define the threshold size beyond which the block will be spilled into a separate spill file.  This can be overridden by ``distinct_aggregation_large_block_size_threshold`` session property.

@pgupta2 pgupta2 marked this pull request as draft December 14, 2021 08:09
@pgupta2 pgupta2 force-pushed the spill_large_blocks branch 4 times, most recently from cec2290 to 89188c3 Compare December 17, 2021 04:35
@pgupta2 pgupta2 marked this pull request as ready for review December 17, 2021 05:59
@pgupta2 pgupta2 requested review from a team and highker December 17, 2021 05:59
@highker
Copy link
Copy Markdown

highker commented Jan 5, 2022

The PR is on my radar. I will review it next week.

Copy link
Copy Markdown

@highker highker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • High-level comments + nits only. Didn't dig deep into logic details.
  • Do we have unit tests coverage?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generic comment: this class has become way too big. It would be good to split it into multiple files in a followup PR

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove else {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/page -> page.getPositionCount()/Page::getPositionCount

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to add a javadoc to explain the difference between this and Spiller

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would recommend to use a class to represent file handle like SpillerHandle or SpilledFileHandle or whatever the name making sense. byte[] could be confusing. My first glance on it made me think it's the returning stream.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: TempStorageStandaloneSpiller

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ioException == null is always true

Comment on lines 131 to 135
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we reuse or create spilling errors?

Comment on lines 156 to 157
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

directly return

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need this line?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before exiting this method, finally will be called and we will try to close tempDataSink. If we get an exception during this operation, then we catch the exception and throw it again. We do need this line to throw the exception else it will be swallowed. Right?

@pgupta2 pgupta2 force-pushed the spill_large_blocks branch from 89188c3 to c66ae3b Compare February 3, 2022 05:27
@pgupta2
Copy link
Copy Markdown
Contributor Author

pgupta2 commented Feb 3, 2022

@highker : Sorry for getting late on this PR. I was busy with other stuff. I have addressed the comments so far.

Do we have unit tests coverage?

TestSpilledAggregationWithLargeBlockSpillingEnabled.java enables large block spilling and some of the queries trigger the large block spill path as well. So, we do have unit test coverage for this.

@pgupta2 pgupta2 requested a review from highker February 3, 2022 05:30
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this interface going to take SerializedTempStorageHandle?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call this SerializedStorageHandle?

@highker highker self-assigned this Feb 3, 2022
Distinct Aggregates spilling logic compacts all values corresponding
to the same groupId into a single array block and put it in the output
block of the aggregator. This is done for each aggregate function
in the HashAggregationBuilder. If multiple distinct aggregate functions
are present, then we perform same kind of compaction into
corresponding blocks in the same position within a page.

Due to this, a single row within the page can become really huge and
when this page is spilled, it can fail with 'integer overflow' error during
serialization. Instead of compressing all values within a single array
block, we can spill them into a separate spill file, if the block size goes
beyond a certain threshold.

When large block spill is enabled, we use a hybrid approach for prepping
page for spill. If an array block for a groupId goes beyond a threshold, we
spill it into a separate spill file and store the serialized file handle in the
output block. If the size is within the threshold, we will directly store the
values in the output block. This approach enables us to perform a spill
only iff a block is huge. Since, this will happen rarely, most of the time,
we will be storing the data directly in the output block of the aggregate.
@pgupta2 pgupta2 force-pushed the spill_large_blocks branch from c66ae3b to 4e6c726 Compare February 6, 2022 19:26
@pgupta2 pgupta2 requested a review from highker February 6, 2022 19:29
@highker highker merged commit d37ac07 into prestodb:master Feb 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants