Implement Dereference pushdown for the Iceberg connector #8129

alexjo2144 · 2021-05-28T16:30:21Z

~~This implementation only applies to tables using the ORC file format.~~

~~This is very much a WIP PR to get initial feedback~~

Fixes #5179

TODO:

Test with tables that don't have the Iceberg column id field

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java

phd3 · 2021-05-28T21:21:57Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java

why do we need to store projected columns here?

That was based on the Hive implementation. Seems like the only place it's used is to avoid duplicating work if the same projections are given to the Metadata multiple times? #7360

It's also used for testing, but maybe we can skip it

Seems like the only place it's used is to avoid duplicating work if the same projections are given to the Metadata multiple times

This is actually important. The contract is that if a call to applyProjection is a no-op, it should return Optional.empty()
#7750 (comment)

I think the optimization in #7360 could also be achieved without looking at the table handle, but I don't mind keeping it as is, given that other connectors do the same.
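
The no-op contract mentioned in this thread can be illustrated with a minimal sketch. It does not use the Trino SPI: `SketchTableHandle` and its `applyProjection` method are hypothetical stand-ins, and projections are modeled as plain strings. The only point is that a repeated request with the same projections yields `Optional.empty()`.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical stand-in for a connector table handle; not the real IcebergTableHandle.
final class SketchTableHandle
{
    private final Set<String> projectedColumns;

    SketchTableHandle(Set<String> projectedColumns)
    {
        this.projectedColumns = Set.copyOf(projectedColumns);
    }

    Set<String> getProjectedColumns()
    {
        return projectedColumns;
    }

    // Returns Optional.empty() when the requested projections are already
    // recorded in this handle, which signals that nothing changed.
    Optional<SketchTableHandle> applyProjection(List<String> requested)
    {
        Set<String> newProjections = new LinkedHashSet<>(requested);
        if (projectedColumns.equals(newProjections)) {
            return Optional.empty();
        }
        return Optional.of(new SketchTableHandle(newProjections));
    }
}
```

Storing the projected columns in the handle is what makes the comparison possible in this sketch; whether the same result can be reached without the handle is the open question raised above.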

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java

alexjo2144 · 2021-06-28T21:38:23Z

Finally had a chance to get back to this PR. I added the PageSource-level projection adapter for the case where two column handles overlap, such as a.b and a.b.c. I'm still working on getting Parquet working, but I should have some more time later this week.
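
As a rough illustration of the overlap case (a.b and a.b.c), the sketch below reduces a list of requested dereference paths to the paths that actually need to be read. It is an assumption-laden simplification: `OverlapSketch`, `isPrefix` and `minimalReadSet` are hypothetical names, and paths are plain lists of field names rather than column handles or pages.

```java
import java.util.ArrayList;
import java.util.List;

final class OverlapSketch
{
    private OverlapSketch() {}

    // True when `prefix` is a leading sub-path of `path` (a path is a prefix of itself).
    static boolean isPrefix(List<String> prefix, List<String> path)
    {
        return prefix.size() <= path.size()
                && path.subList(0, prefix.size()).equals(prefix);
    }

    // Keeps only the paths that are not covered by another, shorter requested path.
    static List<List<String>> minimalReadSet(List<List<String>> requested)
    {
        List<List<String>> result = new ArrayList<>();
        for (List<String> candidate : requested) {
            boolean covered = false;
            for (List<String> other : requested) {
                if (!other.equals(candidate) && isPrefix(other, candidate)) {
                    covered = true;
                    break;
                }
            }
            // Skip exact duplicates as well as covered paths
            if (!covered && !result.contains(candidate)) {
                result.add(candidate);
            }
        }
        return result;
    }
}
```

With the requested paths a.b, a.b.c and d, only a.b and d remain; a.b.c would then be derived from the a.b value that was already read.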

alexjo2144 · 2021-06-30T13:54:39Z

I added a second commit here with support for Parquet Iceberg tables. It was a little different because the OrcReader accepts nested OrcColumns and knows how to read them back, but parts of the ParquetReader column descriptions/types needed to be set up from the root of the column.

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergColumnHandle.java

losipiuk · 2021-07-01T10:12:43Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergColumnHandle.java

>= or >? Is x parent of x

Hmm, for the one place this is used it is important that x is a parent of x. Could probably inline this method though, I thought it was going to be more re-usable than it was.

Collections.indexOfSubList is an option to avoid comparing sizes, but I'd leave it up to you. Also, we should just inline this
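
The two variants discussed here can be compared in a small sketch. The class and method names (`PathPrefixSketch`, `isParentBySize`, `isParentByIndex`) are hypothetical, and a path is modeled as a list of integer field ids, which is an assumption about the handle's representation. In both variants x counts as a parent of x.

```java
import java.util.Collections;
import java.util.List;

final class PathPrefixSketch
{
    private PathPrefixSketch() {}

    // Size comparison with >=, so a path is treated as a parent of itself.
    static boolean isParentBySize(List<Integer> parent, List<Integer> child)
    {
        return child.size() >= parent.size()
                && child.subList(0, parent.size()).equals(parent);
    }

    // Collections.indexOfSubList returns the first position at which `parent`
    // occurs inside `child`, or -1; position 0 means `parent` is a prefix.
    static boolean isParentByIndex(List<Integer> parent, List<Integer> child)
    {
        return Collections.indexOfSubList(child, parent) == 0;
    }
}
```

The second form needs no explicit size check because a longer list can never occur inside a shorter one.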

losipiuk · 2021-07-01T15:16:48Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java

This is pre-existing, but the check for empty is not obvious to me. Should we instead check whether we are missing a mapping for some columns?

I think the assumption is that if the Id field exists for one column it must exist for all of them. I haven't seen a case where they don't exist yet.

Can we add a checkState that it holds, so that the result map is either empty or its size matches fileColumns?

Well, columns can be missing if they've been added since this data file was created.

if an old table is migrated to iceberg without rewriting data (which is going to be most common scenario initially), files won't contain ids at all.

We should probably refactor this to inspect one field to see if it has an id, and set a flag. If true, everything is expected to have an id. That'd be easier to reason about than looking at emptiness of a map. But it doesn't need to be in scope here.
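
The suggested refactoring could look roughly like the sketch below. Everything in it is hypothetical: `FileColumn` is a simplified model of a column read from a data file, not the ORC or Parquet reader's type, and `fileHasFieldIds`, `mapById` and `mapByName` are illustrative names. It assumes, as stated above, that if one file column carries an id then all of them do.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.OptionalInt;

final class FieldMappingSketch
{
    private FieldMappingSketch() {}

    // Hypothetical model of a column as found in a data file.
    static final class FileColumn
    {
        final String name;
        final OptionalInt fieldId;

        FileColumn(String name, OptionalInt fieldId)
        {
            this.name = name;
            this.fieldId = fieldId;
        }
    }

    // Inspect a single column to decide whether the file carries field ids at all.
    static boolean fileHasFieldIds(List<FileColumn> columns)
    {
        return !columns.isEmpty() && columns.get(0).fieldId.isPresent();
    }

    // Id-based mapping; fails loudly if the "all or nothing" assumption is violated.
    static Map<Integer, FileColumn> mapById(List<FileColumn> columns)
    {
        Map<Integer, FileColumn> result = new HashMap<>();
        for (FileColumn column : columns) {
            if (!column.fieldId.isPresent()) {
                throw new IllegalStateException("Expected a field id on column " + column.name);
            }
            result.put(column.fieldId.getAsInt(), column);
        }
        return result;
    }

    // Name-based fallback for files written before the table was migrated.
    static Map<String, FileColumn> mapByName(List<FileColumn> columns)
    {
        Map<String, FileColumn> result = new HashMap<>();
        for (FileColumn column : columns) {
            result.put(column.name, column);
        }
        return result;
    }
}
```

A caller would branch once on `fileHasFieldIds` and pick one of the two maps. A table column added after the file was written is simply absent from whichever map is chosen, so a failed lookup is expected rather than an error.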

if an old table is migrated to iceberg without rewriting data (which is going to be most common scenario initially), files won't contain ids at all.

Are there any tests for that situation? Would like to add a projected test case

I added some migrated cases to the Spark compatibility tests. @phd3 are you familiar with the Iceberg schema.name-mapping.default property? Looks like a more robust way we could be mapping column names for migrated tables.

If we had the full Iceberg schema here as suggested in #8754 I think we could get the mapping from schema.findField(String name)

losipiuk · 2021-07-01T15:18:22Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java

Is that possible at this point? It feels we are rebuilding it above if it was empty.

It looks like it builds a backup mapping by name for the columns where the id is empty

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java

losipiuk · 2021-07-06T11:29:17Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergMetadata.java

Is there a chance of a naming conflict here, i.e. getting the same projectedColumnName for different handles? It should not be possible if path elements and names cannot contain dots. It is a corner case, but can you still verify whether that is possible?

Looks like there's a validation for this in the Iceberg schema code:

trino:default> CREATE TABLE foo ("a.b" INT, a ROW (b INT));
Query 20210707_163341_00003_xkr2b failed: Invalid schema: multiple fields for name a.b: 1 and 3
org.apache.iceberg.exceptions.ValidationException: Invalid schema: multiple fields for name a.b: 1 and 3

I missed this comment, so I checked this myself.
The code should carry a comment informing the reader about such limitations/assumptions wherever we rely on them.
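
To make the corner case concrete, here is a minimal sketch of how a dot-joined projected name could collide. `ProjectedNameSketch` and `projectedName` are hypothetical and only assume that the name is built by joining the base column name and the dereference path with dots; the actual naming code in the PR may differ.

```java
import java.util.ArrayList;
import java.util.List;

final class ProjectedNameSketch
{
    private ProjectedNameSketch() {}

    // Joins the base column name and the nested field names with dots.
    // This is unambiguous only if no two fields share the same dotted name,
    // which the Iceberg schema validation quoted above enforces.
    static String projectedName(String baseColumn, List<String> path)
    {
        List<String> parts = new ArrayList<>();
        parts.add(baseColumn);
        parts.addAll(path);
        return String.join(".", parts);
    }
}
```

A top-level column named "a.b" with an empty path and a column "a" with path [b] both produce "a.b", which is exactly the schema that the CREATE TABLE statement above fails to create.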

...in/trino-iceberg/src/test/java/io/trino/plugin/iceberg/TestIcebergOrcProjectionPushdown.java

alexjo2144 · 2021-10-25T18:57:06Z

Rebased for merge conflicts

findepi · 2021-11-03T09:58:44Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java

if we end up projecting all of root.getNestedColumns(), each of them fully, could we return fullyProjectedLayout?

(OK to address in follow-up)

findepi · 2021-11-03T10:01:22Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java

It seems we're doing tree traversal here (effectively).
wonder whether we can improve representation of dereferences so that we don't need to copy all the elements when recursing (OK to address in follow-up)
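
One possible direction for such a follow-up, sketched under assumptions: dereference paths are inserted into a tree by passing a depth index along with the original list, so no element is copied while recursing. `DereferenceTreeSketch` and its members are hypothetical names and do not correspond to classes in the ORC reader or the Iceberg connector.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class DereferenceTreeSketch
{
    // Child nodes keyed by nested field name, in insertion order
    final Map<String, DereferenceTreeSketch> children = new LinkedHashMap<>();
    // True when some requested path ends at this node, i.e. the whole subtree is needed
    boolean fullyProjected;

    // Walks `path` starting at `depth`; the list itself is never copied or sliced.
    void add(List<String> path, int depth)
    {
        if (depth == path.size()) {
            fullyProjected = true;
            return;
        }
        children.computeIfAbsent(path.get(depth), name -> new DereferenceTreeSketch())
                .add(path, depth + 1);
    }

    static DereferenceTreeSketch of(List<List<String>> paths)
    {
        DereferenceTreeSketch root = new DereferenceTreeSketch();
        for (List<String> path : paths) {
            root.add(path, 0);
        }
        return root;
    }
}
```

Building the tree once also groups paths by their first element, so each level can be handled in a single pass instead of re-filtering the full list of dereferences.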

Consistently use the OrcColumn name between the FieldMapper and FieldLayouts in the Orc reader. This fixes the case where a field has been renamed and the Trino type name does not match the Orc column name. This is tested in following commit using Iceberg projection pushdown.

alexjo2144 · 2021-11-04T18:04:57Z

Had to rebase to resolve merge conflicts but the last set of comments is in the last fixup commit. Thanks

alexjo2144 · 2021-11-04T19:30:30Z

Test failures are in trino-jdbc, unrelated

findepi · 2021-11-08T09:21:27Z

TODO:

Test with tables that don't have the Iceberg column id field

please make sure there is an issue for this

Test failures are in trino-jdbc, unrelated

please make sure there is an issue for this

phd3

thanks for your work on this @alexjo2144.

With Iceberg, pushdown on metadata side itself gives us huge benefits because of split generation time pruning. However, I think there're some missing pieces on pagesource side as mentioned in comments. Please let me know if there's an issue with my understanding.

phd3 · 2021-11-09T02:06:30Z

plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/IcebergPageSourceProvider.java

are we intentionally not pushing predicates into parquet reader? (line 591)

Probably just because it's relatively new. The change allowing for passing a Predicate to ParquetReader was just merged in August. If it's useful here, I'd rather do it in followup, since it's related to all the column index changes to the parquet reader and I don't really understand those yet. 6eb42f2

I was mainly thinking of the pushdown into row groups (the ORC stripe equivalent, even before the newly added code), but it looks like it's missing in the Hive connector too. cc @JamesRTaylor

i am not sure i understand the conclusion. should we have a follow-up issue for this?

exclusion of nested predicates was added initially when only ORC support was added for dereference pushdown. However, seems like it didn't change in #3396 while adding parquet support. I'm not sure if that was intentional. but you're right that this deserves a separate issue. #9928

For iceberg connector, I don't see corresponding changes like #3396 here for propagating projected layout, so assumed that we're tackling it separately.

phd3

I think we can merge this one PR with ORC improvements once #8129 (comment) is resolved, and handle Parquet stuff separately. Can you please squash commits?

findepi · 2021-11-10T08:53:41Z

CI #8611

Refactor needed for Iceberg projection pushdown. Iceberg uses field ids to specify columns rather than column names.

Annotate nullable fields with the @Nullable annotation and allow for any ColumnHandle implementation.

alexjo2144 · 2021-11-10T17:51:11Z

Created a follow up Issue: #9931 and squashed

phd3

Thanks for your work on this @alexjo2144 ! I'll merge this unless someone else has further comments.

phd3 · 2021-11-20T00:31:13Z

Merged, thanks!

cla-bot bot added the cla-signed label May 28, 2021

martint requested a review from phd3 May 28, 2021 16:33

phd3 reviewed May 28, 2021

alexjo2144 force-pushed the iceberg/orc-projection-pushdown branch from f04e2ab to 75dba60 Compare June 28, 2021 21:36

alexjo2144 requested a review from phd3 June 30, 2021 13:52

alexjo2144 requested a review from losipiuk June 30, 2021 13:54