Support S3FileIO in Hadoop and Nessie backed Iceberg tables. by hmadison · Pull Request #20352 · prestodb/presto

hmadison · 2023-07-20T20:43:28Z

When a non-Hive catalog is used to back Iceberg, Presto does not set up the s3, s3a, or s3n file systems for Hadoop. This prevents non-Hive catalogs from accessing files in object storage unless a Hive configuration is passed in manually.

As a small usability improvement, I've updated IcebergResourceFactory to apply S3ConfigurationUpdater#updateConfiguration when iceberg.hadoop.config.resources is not set. This has the effect of configuring the object storage file systems for Hadoop by default, with the same configuration properties as the Hive connector.
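The conditional described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the actual Presto code: a plain `Map` stands in for the Hadoop `Configuration`, the `ConfigurationUpdater` interface and `PLACEHOLDER_FS_IMPL` value are hypothetical, and loading of resource files is left out. Only the `fs.<scheme>.impl` key convention is taken from Hadoop itself.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class S3DefaultsSketch {
    /** Hypothetical stand-in for S3ConfigurationUpdater; the real one mutates a Hadoop Configuration. */
    interface ConfigurationUpdater {
        void updateConfiguration(Map<String, String> configuration);
    }

    // Placeholder class name for illustration only, not the real Presto file system class.
    static final String PLACEHOLDER_FS_IMPL = "example.PlaceholderS3FileSystem";

    /** Registers one file system implementation for each of the three S3 URI schemes. */
    static final ConfigurationUpdater S3_UPDATER = configuration -> {
        for (String scheme : List.of("s3", "s3a", "s3n")) {
            configuration.put("fs." + scheme + ".impl", PLACEHOLDER_FS_IMPL);
        }
    };

    /** Mirrors the conditional in this PR: S3 defaults apply only when no resource files are configured. */
    static Map<String, String> buildConfiguration(List<String> hadoopConfigResources) {
        Map<String, String> configuration = new LinkedHashMap<>();
        if (hadoopConfigResources.isEmpty()) {
            S3_UPDATER.updateConfiguration(configuration);
        }
        // Loading of the listed resource files is omitted in this sketch.
        return configuration;
    }

    public static void main(String[] args) {
        // No resources configured: the three S3 schemes are registered.
        System.out.println(buildConfiguration(List.of()).keySet());
        // Resources configured: the S3 defaults are skipped.
        System.out.println(buildConfiguration(List.of("/etc/hadoop/core-site.xml")).keySet());
    }
}
```

In this sketch, a catalog with no resource files gets all three S3 schemes registered, while a catalog that lists any resource file gets none of the defaults and has to provide them itself.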

Test plan

I ran (via Docker Compose) a test query which created a new table from tpch.tiny.region in a Nessie/MinIO-backed catalog. I then read the table back and compared it to the source tpch.tiny.region table.

Test Resources

Docker compose file:

---
services:
  nessie:
    image: "ghcr.io/projectnessie/nessie"
    ports:
      - "19120:19120"
  minio:
    image: "quay.io/minio/minio"
    ports:
      - "9001:9001"
      - "9000:9000"
    command: |
      server /data --console-address ":9001"

Test Query:

create schema test;
create table test.region as (select * from tpch.tiny.region);

== RELEASE NOTES ==

General Changes
* Iceberg tables backed by a Hadoop or Nessie catalog now support reading from and writing to S3 with the same configuration options as the Hive catalog.

tdcmeehan · 2023-07-21T00:43:26Z

FYI the issue linked seems irrelevant

hmadison · 2023-07-21T02:51:17Z

> FYI the issue linked seems irrelevant

My apologies, I meant to link to the Iceberg issue which discusses, in part, the error message.

tdcmeehan · 2023-07-21T15:12:24Z

No problem, do you want to update the commit message with apache/iceberg#3546 so it correctly links?

Resolves apache/iceberg#3546

agrawalreetika · 2023-08-29T18:07:06Z

presto-iceberg/src/main/java/com/facebook/presto/iceberg/IcebergResourceFactory.java

        Configuration configuration = new Configuration(false);
+
        if (hadoopConfigResources.isEmpty()) {
+            s3ConfigurationUpdater.updateConfiguration(configuration);


Thank you @hmadison for fixing up support for S3-backed Iceberg tables.

I have a question: is there any reason we don't always update the S3 configs?
For example, if an Iceberg table is backed by an S3 file system and we also want to supply an extra Hadoop config resource in the catalog, how would the S3 configs be applied in that case?
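To make the question concrete, here is a rough sketch of the alternative ordering: apply the S3 defaults unconditionally, then overlay whatever the extra resource files provide so that explicit settings win. This is only an illustration of the idea and not what the merged change does. A plain `Map` stands in for the Hadoop `Configuration`, `resourceEntries` stands in for already-parsed resource files, and `PLACEHOLDER_FS_IMPL` and `custom.S3AImpl` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class AlwaysApplyS3Sketch {
    // Placeholder class name for illustration only, not the real Presto file system class.
    static final String PLACEHOLDER_FS_IMPL = "example.PlaceholderS3FileSystem";

    /** Default registrations for the three S3 URI schemes. */
    static Map<String, String> s3Defaults() {
        Map<String, String> defaults = new LinkedHashMap<>();
        for (String scheme : List.of("s3", "s3a", "s3n")) {
            defaults.put("fs." + scheme + ".impl", PLACEHOLDER_FS_IMPL);
        }
        return defaults;
    }

    /** Applies the S3 defaults first, then overlays entries from user-supplied resources. */
    static Map<String, String> buildConfiguration(Map<String, String> resourceEntries) {
        Map<String, String> configuration = new LinkedHashMap<>(s3Defaults());
        // Entries from the resources override defaults that share a key.
        configuration.putAll(resourceEntries);
        return configuration;
    }

    public static void main(String[] args) {
        Map<String, String> resourceEntries = new LinkedHashMap<>();
        resourceEntries.put("fs.s3a.impl", "custom.S3AImpl"); // hypothetical override
        resourceEntries.put("dfs.replication", "2");
        System.out.println(buildConfiguration(resourceEntries));
    }
}
```

With this ordering, a catalog that only adds unrelated Hadoop settings would still get working S3 file systems, while a catalog that deliberately overrides an S3 key keeps its override.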

hmadison requested a review from a team as a code owner July 20, 2023 20:43

hmadison requested a review from presto-oss July 20, 2023 20:43

tdcmeehan approved these changes Jul 21, 2023


Support S3FileIO in Hadoop and Nessie backed Iceberg tables.

a3e6d50

Resolves apache/iceberg#3546

hmadison force-pushed the hm/iceberg-nessie-s3 branch from fcd4548 to a3e6d50 Compare July 21, 2023 15:36

tdcmeehan merged commit 532e77f into prestodb:master Jul 21, 2023

hmadison deleted the hm/iceberg-nessie-s3 branch July 21, 2023 16:49

wanglinsong mentioned this pull request Jul 27, 2023

Add release notes for 0.283 #20402


agrawalreetika reviewed Aug 29, 2023
