Skip to content

Short circuit page source creation in Iceberg#11976

Merged
findepi merged 1 commit intotrinodb:masterfrom
findinpath:short-circuit-pruned-iceberg-page
Apr 21, 2022
Merged

Short circuit page source creation in Iceberg#11976
findepi merged 1 commit intotrinodb:masterfrom
findinpath:short-circuit-pruned-iceberg-page

Conversation

@findinpath
Copy link
Contributor

@findinpath findinpath commented Apr 15, 2022

Description

In case that the dynamic filter completes after scheduling of split
on the worker the results in the split will be getting pruned.

Is this change a fix, improvement, new feature, refactoring, or other?

Fix

Is this a change to the core query engine, a connector, client library, or the SPI interfaces? (be specific)

Iceberg connector

How would you describe this change to a non-technical end user or system administrator?

See https://trinodb.slack.com/archives/CJ6UC075E/p1649977257483049 Slack discussion

Related issues, pull requests, and links

Documentation

(x) No documentation is needed.
( ) Sufficient documentation is included in this PR.
( ) Documentation PR is available with #prnumber.
( ) Documentation issue #issuenumber is filed, and can be handled later.

Release notes

( ) No release notes entries required.
(x) Release notes entries required with the following suggested text:

# Iceberg
* Fix query failures with iceberg when a dynamic filter prunes a split on worker node

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, can you try taking similar approach as TestIcebergSplitSource to create IcebergTableHandle ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the suggestion. Indeed the creation of the table schema was much too verbose. I opted for using the iceberg Schema object for this purpose.

Adding the overhead of using a query runner just for retrieving the table schema seems not that good to me because the maintainer of the code might think that this is an integration test when it is a simple unit test.

@findinpath findinpath force-pushed the short-circuit-pruned-iceberg-page branch from 266592f to a583b61 Compare April 19, 2022 09:23
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of assertion on page source class, please assert on the output page which comes from calling ConnectorPageSource#getNextPage.

@findinpath findinpath force-pushed the short-circuit-pruned-iceberg-page branch from a583b61 to 3ebc62e Compare April 19, 2022 20:12
Copy link
Member

@raunaqmorarka raunaqmorarka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm % minor comments

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no need for this to be a partitioned column, we will be able to prune splits even with DF on a data column

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed all references to "partition" which may be indeed misleading for the functionality tested.

@findinpath findinpath force-pushed the short-circuit-pruned-iceberg-page branch from 3ebc62e to d0bef4a Compare April 20, 2022 09:13
In case that the dynamic filter completes after scheduling of split
on the worker, the results in the split will be getting pruned.
@findinpath findinpath force-pushed the short-circuit-pruned-iceberg-page branch from d0bef4a to a3bb77d Compare April 20, 2022 09:15
@findepi
Copy link
Member

findepi commented Apr 20, 2022

@findinpath @raunaqmorarka please formulate a RN entry for this, i..e. what are the triggering conditions for the bug this is fixing.

@findinpath
Copy link
Contributor Author

RN updated.

@findepi
Copy link
Member

findepi commented Apr 21, 2022

Thanks @findinpath @raunaqmorarka

@findepi findepi merged commit a3f5996 into trinodb:master Apr 21, 2022
@findepi findepi mentioned this pull request Apr 21, 2022
@github-actions github-actions bot added this to the 378 milestone Apr 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

3 participants