Add Glacier skipping logic to S3FileSystem #18237
ericlgoodman wants to merge 1 commit into trinodb:master
Conversation
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
4383d43 to b20b5de
```
new S3Location(location);
}

private boolean shouldReadObject(S3Object object)
```
NIT: Maybe move the connected methods to S3ObjectStorageClassFilter and return the function from the context:

```java
.filter(context.s3ObjectStorageClassFilter())
```

e.g.:

```java
public enum S3ObjectStorageClassFilter
{
    READ_ALL(o -> true),
    SKIP_ALL_GLACIER(S3ObjectStorageClassFilter::isNotGlacierObject),
    READ_RESTORED_GLACIER_OBJECTS(S3ObjectStorageClassFilter::isCompletedRestoredObject);

    private final Predicate<S3Object> filter;

    S3ObjectStorageClassFilter(Predicate<S3Object> filter)
    {
        this.filter = filter;
    }

    public Predicate<S3Object> toPredicate()
    {
        return filter;
    }

    private static boolean isCompletedRestoredObject(S3Object object)
    {
        /* There are 3 cases for the restore status:
         *
         * 1. The object is not restored, and has not been requested to be restored.
         *    restoreStatus is null.
         * 2. The object is in the process of being restored. isRestoreInProgress is true,
         *    but restoreExpiryDate is null.
         * 3. The object has completed its restore. isRestoreInProgress is false, and
         *    restoreExpiryDate holds a date.
         *
         * Since we only need to distinguish case 3, checking restoreExpiryDate for a
         * non-null value is sufficient. */
        return isNotGlacierObject(object) || Optional.ofNullable(object.restoreStatus())
                .map(RestoreStatus::restoreExpiryDate)
                .isPresent();
    }

    private static boolean isNotGlacierObject(S3Object object)
    {
        return !GLACIER_STORAGE_CLASSES.contains(object.storageClass());
    }
}
```
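As a concrete illustration of how the three modes behave, here is a self-contained, runnable variant of the sketch above. `FakeS3Object` and the string storage classes are stand-ins invented for this example; the real code would use the AWS SDK's `S3Object` and `RestoreStatus` types and the `GLACIER_STORAGE_CLASSES` set:

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

public class FilterSketch
{
    // Hypothetical stand-in for the SDK's S3Object: just the key, the storage
    // class, and the (optional) restore expiry date.
    public record FakeS3Object(String key, String storageClass, Optional<Instant> restoreExpiryDate) {}

    public enum Filter
    {
        READ_ALL(o -> true),
        SKIP_ALL_GLACIER(FilterSketch::isNotGlacier),
        READ_RESTORED_GLACIER_OBJECTS(o -> isNotGlacier(o) || o.restoreExpiryDate().isPresent());

        private final Predicate<FakeS3Object> filter;

        Filter(Predicate<FakeS3Object> filter)
        {
            this.filter = filter;
        }

        public Predicate<FakeS3Object> toPredicate()
        {
            return filter;
        }
    }

    private static boolean isNotGlacier(FakeS3Object object)
    {
        // The real code checks membership in a set of Glacier storage classes
        return !(object.storageClass().equals("GLACIER") || object.storageClass().equals("DEEP_ARCHIVE"));
    }

    public static void main(String[] args)
    {
        List<FakeS3Object> listing = List.of(
                new FakeS3Object("standard", "STANDARD", Optional.empty()),
                new FakeS3Object("archived", "GLACIER", Optional.empty()),
                new FakeS3Object("restored", "GLACIER", Optional.of(Instant.parse("2030-01-01T00:00:00Z"))));

        // Only the standard object and the restored Glacier object pass the filter
        long readable = listing.stream()
                .filter(Filter.READ_RESTORED_GLACIER_OBJECTS.toPredicate())
                .count();
        System.out.println(readable); // prints 2
    }
}
```

The `READ_RESTORED_GLACIER_OBJECTS` mode reads everything except Glacier objects whose restore has not completed, which is why checking `restoreExpiryDate` for a present value is sufficient.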
```
return switch (context.s3ObjectStorageClassFilter()) {
    case READ_ALL -> true;
    case SKIP_ALL_GLACIER -> !isGlacierObject(object);
    case READ_RESTORED_GLACIER_OBJECTS -> !isGlacierObject(object) || isCompletedRestoredObject(object);
```
That will read ALL non-Glacier objects as well as RESTORED objects? Should we adjust the name? Maybe:

READ_ALL
READ_NON_GLACIER
READ_NON_GLACIER_AND_RESTORED
Agreed, this is clearer. Will update.
Facing some dependency issues: it looks like the version of Netty used in the SDK conflicts with the version pinned in the Trino build:

```xml
<!-- Netty 4.1.94.Final breaks Apache Arrow -->
<dep.netty.version>4.1.93.Final</dep.netty.version>
```

Per this comment, we need to wait for the next release of Apache Arrow.
```
return s3ObjectStorageClassFilter;
}

@Config("s3.object-storage-class-filter")
```
If I understand correctly, this is applicable to the Hive connector only and should be a config there.

Also, I don't know why anyone would want to silently skip Glacier objects. That's just returning wrong data, isn't it?

Delta and Iceberg use listings for cleanups (e.g. DROP TABLE); in such cases I don't think we should be skipping Glacier objects.
That's a good point, although this is actually a problem with the legacy S3 code. Thoughts on how to improve this?
> if i understand correctly, this is applicable to hive connector only and should be a config there. delta and iceberg use listings for cleanups (eg DROP TABLE); in such case I don't think we should be skipping glacier objects.

Good point. What about overloading listFiles() and adding a signature which accepts a filter (defaulting to the filesystem configuration for the regular listFiles(Location location) signature)? The S3AFileSystem does something similar.
> Also, i don't know why anyone would want to silently skip glacier objects. that's just returning wrong data, isn't it?

This is the default behavior in Athena. Lots of people have lifecycle policies on S3 buckets that transition their objects in place from standard storage to Glacier storage. This allows them to continue to read from that table.
electrum left a comment
Can you explain the use case for this feature? Silently skipping data is a correctness issue.
```
ListObjectsV2Iterable iterable = client.listObjectsV2Paginator(request);
return new S3FileIterator(s3Location, iterable.contents().iterator());
Iterator<S3Object> iterator = client.listObjectsV2Paginator(request).contents()
        .stream()
```
Nit: put stream on the previous line.
```
private static boolean isCompletedRestoredObject(S3Object object)
{
    /* There are 3 cases for the restore status:
```
How about simplifying this to

```java
// Only restored objects will have the restoreExpiryDate set.
// Ignore not-restored objects and in-progress restores.
```
```
private static boolean isGlacierObject(S3Object object)
{
    return GLACIER_STORAGE_CLASSES.contains(object.storageClass());
```
Let's simplify this to

```java
return (object.storageClass() == GLACIER) || (object.storageClass() == DEEP_ARCHIVE);
```

Or we could do

```java
return switch (object.storageClass()) {
    case GLACIER, DEEP_ARCHIVE -> true;
    default -> false;
};
```
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
👋 @ericlgoodman @findepi @electrum - this PR has become inactive. We hope you are still interested in working on it. Please let us know, and we can try to get reviewers to help with that. We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.
Apologies for the delay on following up on this. As @findepi mentioned, putting the skipping logic directly in the file system is problematic for deletes, and we really only want this to apply to Hive file listing. Adding a new listing method just for this is ugly and breaks the abstractions. Looking at this with fresh eyes, I'm thinking we can implement this by adding a

Sounds good to me. Sorry for the delay on this - we've had a lot of reshuffling lately. Planning on having someone on my team pick this up in the next few weeks.
Hi @electrum, I'll be picking this back up. I was wondering how we should go about implementing the skipping logic in TrinoFileStatusRemoteIterator. If we were to pass the filter there, how would we handle the hasNext() and next() logic? hasNext() only lets us see whether there is a next object, whereas in next() we could check whether the object matches our filter, but we couldn't just call next() again because we aren't sure whether there is a next element. Any help would be appreciated, thank you.
Hi @bangtim, take a look at
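For the hasNext()/next() question above, one common pattern (a sketch, not necessarily what Trino ended up using) is a look-ahead iterator: hasNext() advances the underlying iterator until it finds a matching element and caches it, and next() hands out the cached element. Guava's AbstractIterator encapsulates the same idea.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class FilteringIterator<T>
        implements Iterator<T>
{
    private final Iterator<T> delegate;
    private final Predicate<T> filter;
    private T next; // look-ahead slot; null when not yet fetched or exhausted

    public FilteringIterator(Iterator<T> delegate, Predicate<T> filter)
    {
        this.delegate = delegate;
        this.filter = filter;
    }

    @Override
    public boolean hasNext()
    {
        // Advance the underlying iterator until a matching element is found
        while (next == null && delegate.hasNext()) {
            T candidate = delegate.next();
            if (filter.test(candidate)) {
                next = candidate;
            }
        }
        return next != null;
    }

    @Override
    public T next()
    {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = next;
        next = null;
        return result;
    }

    public static void main(String[] args)
    {
        // Keep only even numbers; the caller never sees the filtered-out elements
        Iterator<Integer> it = new FilteringIterator<>(List.of(1, 2, 3, 4, 5).iterator(), n -> n % 2 == 0);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Because next() itself calls hasNext(), the caller can alternate the two methods freely without ever observing a skipped element.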
To be completed via #21164 |
Description

`TrinoS3FileSystem` contains logic/configuration for skipping Glacier objects found in S3. This PR adds the missing functionality/configuration from `TrinoS3FileSystem` and also adds the capability to configure the filesystem to read restored Glacier objects.

Additional context and related issues

Recently, S3 added the `RestoreStatus` to the result of `ListObjectsV2` - see https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-s3-restore-status-s3-glacier-objects-list-api/ for more details.

Release notes
( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: