
Conversation

@alexjo2144
Member

@alexjo2144 alexjo2144 commented May 28, 2021

This implementation only applies to tables using the ORC file format.

This is very much a WIP PR, opened to get initial feedback.

Fixes #5179

TODO:

  • Test with tables that don't have the Iceberg column id field

@cla-bot cla-bot bot added the cla-signed label May 28, 2021
@martint martint requested a review from phd3 May 28, 2021 16:33
Member

Why do we need to store projected columns here?

Member Author

That was based on the Hive implementation. Seems like the only place it's used is to avoid duplicating work if the same projections are given to the Metadata multiple times? #7360

It's also used for testing, but maybe we can skip it

Member

@losipiuk losipiuk Jul 1, 2021

Seems like the only place it's used is to avoid duplicating work if the same projections are given to the Metadata multiple times

This is actually important. The contract is that a call to applyProjection that is a no-op should return Optional.empty().
#7750 (comment)
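As a rough sketch of that contract (the types here are simplified, hypothetical stand-ins; the real Trino SPI types such as ConnectorTableHandle and ProjectionApplicationResult differ), a connector can compare the requested projection against what the handle already carries and return Optional.empty() for a no-op:

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Simplified stand-ins for illustration only; not the actual Trino SPI.
public class ProjectionContract
{
    record TableHandle(Set<String> projectedColumns) {}

    // Per the contract discussed above: return Optional.empty() when the
    // requested projection is already applied, so the engine knows the call
    // was a no-op and the optimizer does not loop.
    static Optional<TableHandle> applyProjection(TableHandle handle, List<String> requestedColumns)
    {
        Set<String> requested = Set.copyOf(requestedColumns);
        if (requested.equals(handle.projectedColumns())) {
            return Optional.empty();
        }
        return Optional.of(new TableHandle(requested));
    }

    public static void main(String[] args)
    {
        TableHandle handle = new TableHandle(Set.of("a", "b"));
        System.out.println(applyProjection(handle, List.of("b", "a")).isEmpty()); // no-op
        System.out.println(applyProjection(handle, List.of("a")).isPresent());    // narrowed projection
    }
}
```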

Member

I think the optimization in #7360 could also be achieved without looking at the table handle, but I don't mind keeping it as is, given that other connectors do the same.

@alexjo2144 alexjo2144 force-pushed the iceberg/orc-projection-pushdown branch from f04e2ab to 75dba60 Compare June 28, 2021 21:36
@alexjo2144
Member Author

Finally had a chance to get back to this PR. I added the PageSource-level projection adapter for cases where two column handles overlap, like a.b and a.b.c. I'm still working on getting Parquet working, but I should have some more time later this week.
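One possible shape for that adapter's column selection (a hypothetical helper, not the code in this PR): keep only the shortest covering dereference paths, so an overlapping path like a.b.c is derived from the already-read a.b rather than read twice.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of handling overlapping projected paths.
public class ProjectionAdapter
{
    // True when prefix is a strict prefix of path (e.g. [a, b] covers [a, b, c]).
    static boolean isStrictPrefix(List<String> prefix, List<String> path)
    {
        return prefix.size() < path.size()
                && path.subList(0, prefix.size()).equals(prefix);
    }

    // Drop any path that is covered by a shorter projected path; the dropped
    // paths can be produced later by dereferencing the covering column.
    static List<List<String>> coveringPaths(List<List<String>> paths)
    {
        List<List<String>> result = new ArrayList<>();
        for (List<String> candidate : paths) {
            boolean covered = paths.stream().anyMatch(other -> isStrictPrefix(other, candidate));
            if (!covered && !result.contains(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args)
    {
        List<List<String>> paths = List.of(List.of("a", "b"), List.of("a", "b", "c"));
        System.out.println(coveringPaths(paths)); // only a.b is read from the file
    }
}
```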

@alexjo2144 alexjo2144 requested a review from phd3 June 30, 2021 13:52
@alexjo2144
Copy link
Member Author

I added a second commit here with support for Parquet Iceberg tables. It was a little different because the OrcReader accepts nested OrcColumns and knows how to read them back, but parts of the ParquetReader column descriptors/types needed to be set up from the root of the column.

@alexjo2144 alexjo2144 requested a review from losipiuk June 30, 2021 13:54
Member

>= or >? Is x a parent of x?

Member Author

Hmm, for the one place this is used, it is important that x is a parent of x. I could probably inline this method, though; I thought it was going to be more reusable than it was.
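The distinction can be sketched like this (a hypothetical helper on dereference paths, not the method under review): with >= semantics a path counts as a parent of itself, while with > it does not.

```java
import java.util.List;

// Hypothetical helper illustrating the ">= vs >" question for dereference
// paths (here represented as lists of field indices).
public class DereferencePaths
{
    // Inclusive semantics: x is considered a parent of x.
    static boolean isPrefix(List<Integer> parent, List<Integer> child)
    {
        return parent.size() <= child.size()
                && child.subList(0, parent.size()).equals(parent);
    }

    // Strict semantics: x is not a parent of x.
    static boolean isProperPrefix(List<Integer> parent, List<Integer> child)
    {
        return parent.size() < child.size()
                && child.subList(0, parent.size()).equals(parent);
    }

    public static void main(String[] args)
    {
        List<Integer> ab = List.of(1, 2);     // e.g. the path to a.b
        List<Integer> abc = List.of(1, 2, 3); // e.g. the path to a.b.c
        System.out.println(isPrefix(ab, ab));       // true: x is a parent of x
        System.out.println(isProperPrefix(ab, ab)); // false under strict semantics
        System.out.println(isPrefix(ab, abc));      // true either way
    }
}
```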

Member

indexOfSublist is an option to avoid comparing sizes, but I'd leave it up to you. Also, we should just inline this.

Member

This is preexisting, but the check for emptiness is not obvious to me. Should we rather check whether we are missing a mapping for some columns?

Member Author

I think the assumption is that if the id field exists for one column, it must exist for all of them. I haven't yet seen a case where it doesn't.

Member

Can we add a checkState that this holds, so that the result map is either empty or its size matches fileColumns?
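A minimal sketch of the suggested invariant (hypothetical types and a local checkState helper standing in for Guava's Preconditions.checkState, which is the usual Trino idiom): the id map is either empty, meaning the file carries no Iceberg ids, or it covers every file column.

```java
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the suggested checkState; not the code in this PR.
public class ColumnIdMapping
{
    static void checkState(boolean condition, String message)
    {
        if (!condition) {
            throw new IllegalStateException(message);
        }
    }

    // The id-to-column map must be empty (no Iceberg ids in the file at all)
    // or have one entry per file column.
    static Map<Integer, String> validateMapping(Map<Integer, String> idToColumn, List<String> fileColumns)
    {
        checkState(idToColumn.isEmpty() || idToColumn.size() == fileColumns.size(),
                "Iceberg column ids must be present on all file columns or on none");
        return idToColumn;
    }

    public static void main(String[] args)
    {
        validateMapping(Map.of(), List.of("a", "b"));               // OK: no ids at all
        validateMapping(Map.of(1, "a", 2, "b"), List.of("a", "b")); // OK: ids everywhere
        try {
            validateMapping(Map.of(1, "a"), List.of("a", "b"));     // partial ids: rejected
        }
        catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```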

Member Author

Well, columns can be missing if they've been added since this data file was created.

Member

If an old table is migrated to Iceberg without rewriting data (which is going to be the most common scenario initially), files won't contain ids at all.

We should probably refactor this to inspect one field to see if it has an id, and set a flag. If true, everything is expected to have an id. That'd be easier to reason about than looking at the emptiness of a map. But it doesn't need to be in scope here.
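The suggested refactoring could look roughly like this (hypothetical types, a sketch only): inspect one field to set the flag, then require the rest of the file to be consistent with it.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch of the flag-based approach; not the code in this PR.
public class IcebergIdDetection
{
    record FileField(String name, Optional<Integer> icebergId) {}

    // Look at the first field to decide whether the file carries Iceberg ids
    // (a migrated file won't have any), then verify the rest are consistent.
    static boolean hasIcebergIds(List<FileField> fields)
    {
        if (fields.isEmpty()) {
            return false;
        }
        boolean hasIds = fields.get(0).icebergId().isPresent();
        for (FileField field : fields) {
            if (field.icebergId().isPresent() != hasIds) {
                throw new IllegalStateException("Mixed presence of Iceberg ids in file: " + field.name());
            }
        }
        return hasIds;
    }

    public static void main(String[] args)
    {
        var migrated = List.of(new FileField("a", Optional.empty()), new FileField("b", Optional.empty()));
        var withIds = List.of(new FileField("a", Optional.of(1)), new FileField("b", Optional.of(2)));
        System.out.println(hasIcebergIds(migrated)); // false: fall back to matching by name
        System.out.println(hasIcebergIds(withIds));  // true: match by id
    }
}
```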

Member Author

If an old table is migrated to Iceberg without rewriting data (which is going to be the most common scenario initially), files won't contain ids at all.

Are there any tests for that situation? I would like to add a projected test case.

Member Author

I added some migrated cases to the Spark compatibility tests. @phd3 are you familiar with the Iceberg schema.name-mapping.default property? Looks like a more robust way we could be mapping column names for migrated tables.

Member Author

If we had the full Iceberg schema here as suggested in #8754 I think we could get the mapping from schema.findField(String name)

Member

@losipiuk losipiuk Jul 1, 2021

Is that possible at this point? It feels like we rebuild it above if it was empty.

Member Author

It looks like it builds a backup mapping by name for the columns where the id is empty.
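That fallback could be sketched as follows (hypothetical types, illustration only): prefer the Iceberg id when a file column carries one, and build a name-keyed map only for the id-less columns.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch of the name-based backup mapping described above.
public class ColumnMappings
{
    record FileColumn(String name, Optional<Integer> icebergId) {}

    // Only columns without an id end up in the fallback map; columns with an
    // id are resolved through the id-based mapping instead.
    static Map<String, FileColumn> buildNameFallback(List<FileColumn> columns)
    {
        Map<String, FileColumn> byName = new HashMap<>();
        for (FileColumn column : columns) {
            if (column.icebergId().isEmpty()) {
                byName.put(column.name().toLowerCase(Locale.ENGLISH), column);
            }
        }
        return byName;
    }

    public static void main(String[] args)
    {
        List<FileColumn> columns = List.of(
                new FileColumn("a", Optional.of(1)),    // resolved by id
                new FileColumn("B", Optional.empty())); // falls back to its name
        System.out.println(buildNameFallback(columns).keySet()); // only the id-less column
    }
}
```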

Member

Is there a chance of a naming conflict here, i.e. getting the same projectedColumnName for different handles? It should not be possible if path elements and names cannot contain dots. It is a corner case, but can you still verify whether that is possible?

Member Author

Looks like there's a validation for this in the Iceberg schema code:

trino:default> CREATE TABLE foo ("a.b" INT, a ROW (b INT));
Query 20210707_163341_00003_xkr2b failed: Invalid schema: multiple fields for name a.b: 1 and 3
org.apache.iceberg.exceptions.ValidationException: Invalid schema: multiple fields for name a.b: 1 and 3

Member

I missed this comment, so I checked this myself. The code should be adorned with a comment informing the reader about such limitations/assumptions where we leverage them.

@alexjo2144 alexjo2144 force-pushed the iceberg/orc-projection-pushdown branch from c1ed4b3 to 32d6705 Compare October 25, 2021 17:42
@alexjo2144
Member Author

Rebased to resolve merge conflicts.

@alexjo2144 alexjo2144 force-pushed the iceberg/orc-projection-pushdown branch from 32d6705 to d043e30 Compare October 27, 2021 15:09
Member

if we end up projecting all of root.getNestedColumns(), each of them fully, could we return fullyProjectedLayout?

(OK to address in follow-up)
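The suggested short-circuit could be expressed roughly like this (hypothetical types; the real OrcColumn/ProjectedLayout APIs differ): if the projection names every nested column of the root and asks for each one in full, the caller can return the equivalent of fullyProjectedLayout instead of enumerating fields.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not the OrcReader layout code in this PR.
public class LayoutShortCircuit
{
    record Column(String name, List<Column> nestedColumns) {}

    // True when every nested column of the root is projected, each in full
    // (the set contains exactly the root's direct children, no sub-paths).
    static boolean isFullProjection(Column root, Set<String> fullyProjectedNames)
    {
        return fullyProjectedNames.size() == root.nestedColumns().size()
                && root.nestedColumns().stream()
                        .map(Column::name)
                        .allMatch(fullyProjectedNames::contains);
    }

    public static void main(String[] args)
    {
        Column root = new Column("root", List.of(
                new Column("a", List.of()),
                new Column("b", List.of())));
        System.out.println(isFullProjection(root, Set.of("a", "b"))); // could use fullyProjectedLayout
        System.out.println(isFullProjection(root, Set.of("a")));      // needs a partial layout
    }
}
```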

Comment on lines 677 to 680
Member

It seems we're (effectively) doing a tree traversal here. I wonder whether we can improve the representation of dereferences so that we don't need to copy all the elements when recursing. (OK to address in a follow-up)

Consistently use the OrcColumn name between the FieldMapper and
FieldLayouts in the Orc reader. This fixes the case where a field
has been renamed and the Trino type name does not match the Orc
column name.

This is tested in following commit using Iceberg projection pushdown.
@alexjo2144 alexjo2144 force-pushed the iceberg/orc-projection-pushdown branch from d043e30 to 3aa8126 Compare November 4, 2021 15:20
@alexjo2144
Member Author

Had to rebase to resolve merge conflicts, but the latest set of comments is addressed in the last fixup commit. Thanks!

@alexjo2144
Member Author

Test failures are in trino-jdbc, unrelated

@findepi
Member

findepi commented Nov 8, 2021

TODO:

  • Test with tables that don't have the Iceberg column id field

please make sure there is an issue for this

Test failures are in trino-jdbc, unrelated

please make sure there is an issue for this

Member

@phd3 phd3 left a comment

Thanks for your work on this, @alexjo2144.

With Iceberg, pushdown on the metadata side alone gives us huge benefits because of pruning at split generation time. However, I think there are some missing pieces on the page source side, as mentioned in the comments. Please let me know if there's an issue with my understanding.

Member

Are we intentionally not pushing predicates into the Parquet reader? (line 591)

Member Author

Probably just because it's relatively new. The change allowing a Predicate to be passed to the ParquetReader was only merged in August. If it's useful here, I'd rather do it in a follow-up, since it's related to all the column index changes to the Parquet reader, and I don't really understand those yet. 6eb42f2

Member

I was mainly thinking of the pushdown into row groups (the ORC stripe equivalent, even before the newly added code), but it looks like it's missing in the Hive connector too. cc @JamesRTaylor

Member

I am not sure I understand the conclusion. Should we have a follow-up issue for this?

Member

Exclusion of nested predicates was added initially, when only ORC support existed for dereference pushdown. However, it seems that didn't change in #3396 while adding Parquet support. I'm not sure whether that was intentional, but you're right that this deserves a separate issue: #9928

For the Iceberg connector, I don't see changes corresponding to #3396 here for propagating the projected layout, so I assumed we're tackling that separately.

Member

@phd3 phd3 left a comment

I think we can merge this one PR with the ORC improvements once #8129 (comment) is resolved, and handle the Parquet work separately. Can you please squash the commits?

@findepi
Member

findepi commented Nov 10, 2021

CI #8611

Refactor needed for Iceberg projection pushdown. Iceberg
uses field ids to specify columns rather than column names.
Annotate nullable fields with the @Nullable annotation
and allow for any ColumnHandle implementation.
@alexjo2144 alexjo2144 force-pushed the iceberg/orc-projection-pushdown branch from 3aa8126 to 01c38ed Compare November 10, 2021 17:41
@alexjo2144
Member Author

alexjo2144 commented Nov 10, 2021

Created a follow-up issue, #9931, and squashed.

Member

@phd3 phd3 left a comment

Thanks for your work on this, @alexjo2144! I'll merge this unless someone else has further comments.

@phd3 phd3 merged commit 5fc2d0d into trinodb:master Nov 20, 2021
@phd3
Member

phd3 commented Nov 20, 2021

Merged, thanks!


Development

Successfully merging this pull request may close these issues.

Dereference Pushdown for Iceberg Connector
