Add Glacier skipping logic to S3FileSystem #18237
ericlgoodman wants to merge 1 commit into trinodb:master
Conversation
Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla
4383d43 to b20b5de
```
new S3Location(location);
}

private boolean shouldReadObject(S3Object object)
```
NIT: Maybe move the connected methods to S3ObjectStorageClassFilter and return the function from the context:

```java
.filter(context.s3ObjectStorageClassFilter())
```

e.g.:

```java
public enum S3ObjectStorageClassFilter
{
    READ_ALL(o -> true),
    SKIP_ALL_GLACIER(S3ObjectStorageClassFilter::isNotGlacierObject),
    READ_RESTORED_GLACIER_OBJECTS(S3ObjectStorageClassFilter::isCompletedRestoredObject);

    private final Predicate<S3Object> filter;

    S3ObjectStorageClassFilter(Predicate<S3Object> filter)
    {
        this.filter = filter;
    }

    public Predicate<S3Object> toPredicate()
    {
        return filter;
    }

    private static boolean isCompletedRestoredObject(S3Object object)
    {
        /* There are 3 cases for the restore status:
         *
         * 1. The object is not restored, and has not been requested to be restored.
         *    restoreStatus is null.
         * 2. The object is in the process of being restored. isRestoreInProgress is true,
         *    but restoreExpiryDate is null.
         * 3. The object has completed its restore. isRestoreInProgress is false, and
         *    restoreExpiryDate holds a date.
         *
         * Since we only need to distinguish case 3, checking restoreExpiryDate for a
         * non-null value is sufficient. */
        return isNotGlacierObject(object) || Optional.ofNullable(object.restoreStatus())
                .map(RestoreStatus::restoreExpiryDate)
                .isPresent();
    }

    private static boolean isNotGlacierObject(S3Object object)
    {
        return !GLACIER_STORAGE_CLASSES.contains(object.storageClass());
    }
}
```
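As a concrete illustration of how the three modes behave, here is a self-contained, runnable variant of the sketch above. `FakeS3Object` and the string storage classes are stand-ins invented for this example; the real code would use the AWS SDK's `S3Object` and `RestoreStatus` types and the `GLACIER_STORAGE_CLASSES` set:

```java
import java.time.Instant;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

public class FilterSketch
{
    // Hypothetical stand-in for the SDK's S3Object: just the key, the storage
    // class, and the (optional) restore expiry date.
    public record FakeS3Object(String key, String storageClass, Optional<Instant> restoreExpiryDate) {}

    public enum Filter
    {
        READ_ALL(o -> true),
        SKIP_ALL_GLACIER(FilterSketch::isNotGlacier),
        READ_RESTORED_GLACIER_OBJECTS(o -> isNotGlacier(o) || o.restoreExpiryDate().isPresent());

        private final Predicate<FakeS3Object> filter;

        Filter(Predicate<FakeS3Object> filter)
        {
            this.filter = filter;
        }

        public Predicate<FakeS3Object> toPredicate()
        {
            return filter;
        }
    }

    private static boolean isNotGlacier(FakeS3Object object)
    {
        // The real code checks membership in a set of Glacier storage classes
        return !(object.storageClass().equals("GLACIER") || object.storageClass().equals("DEEP_ARCHIVE"));
    }

    public static void main(String[] args)
    {
        List<FakeS3Object> listing = List.of(
                new FakeS3Object("standard", "STANDARD", Optional.empty()),
                new FakeS3Object("archived", "GLACIER", Optional.empty()),
                new FakeS3Object("restored", "GLACIER", Optional.of(Instant.parse("2030-01-01T00:00:00Z"))));

        // Only the standard object and the restored Glacier object pass the filter
        long readable = listing.stream()
                .filter(Filter.READ_RESTORED_GLACIER_OBJECTS.toPredicate())
                .count();
        System.out.println(readable); // prints 2
    }
}
```

The `READ_RESTORED_GLACIER_OBJECTS` mode reads everything except Glacier objects whose restore has not completed, which is why checking `restoreExpiryDate` for a present value is sufficient.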
```
return switch (context.s3ObjectStorageClassFilter()) {
    case READ_ALL -> true;
    case SKIP_ALL_GLACIER -> !isGlacierObject(object);
    case READ_RESTORED_GLACIER_OBJECTS -> !isGlacierObject(object) || isCompletedRestoredObject(object);
```
That will read ALL non-Glacier objects as well as RESTORED objects? Should we adjust the name? Maybe:

READ_ALL
READ_NON_GLACIER
READ_NON_GLACIER_AND_RESTORED
Agreed, this is clearer. Will update.
Facing some dependency issues: it looks like the version of Netty used in the SDK conflicts with the version pinned in the Trino build:

```xml
<!-- Netty 4.1.94.Final breaks Apache Arrow -->
<dep.netty.version>4.1.93.Final</dep.netty.version>
```

Per this comment, we need to wait for the next release of Apache Arrow.
```
return s3ObjectStorageClassFilter;
}

@Config("s3.object-storage-class-filter")
```
If I understand correctly, this is applicable to the Hive connector only and should be a config there.

Also, I don't know why anyone would want to silently skip Glacier objects. That's just returning wrong data, isn't it?

Delta and Iceberg use listings for cleanups (e.g. DROP TABLE); in such cases I don't think we should be skipping Glacier objects.
That's a good point, although this is actually a problem with the legacy S3 code. Thoughts on how to improve this?
> if i understand correctly, this is applicable to hive connector only and should be a config there. delta and iceberg use listings for cleanups (eg DROP TABLE); in such case I don't think we should be skipping glacier objects.

Good point. What about overloading listFiles() and adding a signature which accepts a filter (defaulting to the filesystem configuration for the regular listFiles(Location location) signature)? The S3AFileSystem does something similar.
> Also, i don't know why anyone would want to silently skip glacier objects. that's just returning wrong data, isn't it?

This is the default behavior in Athena. Lots of people have lifecycle policies on S3 buckets that transition their objects in place from standard storage to Glacier storage. This allows them to continue to read from that table.
electrum left a comment
Can you explain the use case for this feature? Silently skipping data is a correctness issue.
```
ListObjectsV2Iterable iterable = client.listObjectsV2Paginator(request);
return new S3FileIterator(s3Location, iterable.contents().iterator());
Iterator<S3Object> iterator = client.listObjectsV2Paginator(request).contents()
        .stream()
```
Nit: put stream on the previous line.
```
private static boolean isCompletedRestoredObject(S3Object object)
{
    /* There are 3 cases for the restore status:
```
How about simplifying this to

```java
// Only restored objects will have the restoreExpiryDate set.
// Ignore not-restored objects and in-progress restores.
```
```
private static boolean isGlacierObject(S3Object object)
{
    return GLACIER_STORAGE_CLASSES.contains(object.storageClass());
```
Let's simplify this to

```java
return (object.storageClass() == GLACIER) || (object.storageClass() == DEEP_ARCHIVE);
```

Or we could do

```java
return switch (object.storageClass()) {
    case GLACIER, DEEP_ARCHIVE -> true;
    default -> false;
};
```
This pull request has gone a while without any activity. Tagging the Trino developer relations team: @bitsondatadev @colebow @mosabua
👋 @ericlgoodman @findepi @electrum - this PR has become inactive. We hope you are still interested in working on it. Please let us know, and we can try to get reviewers to help with that. We're working on closing out old and inactive PRs, so if you're too busy or this has too many merge conflicts to be worth picking back up, we'll be making another pass to close it out in a few weeks.
Apologies for the delay on following up on this. As @findepi mentioned, putting the skipping logic directly in the file system is problematic for deletes, and we really only want this to apply to Hive file listing. Adding a new listing method just for this is ugly and breaks the abstractions. Looking at this with fresh eyes, I'm thinking we can implement this by adding a

Sounds good to me. Sorry for the delay on this - we've had a lot of reshuffling lately. Planning on having someone on my team pick this up in the next few weeks.
Hi @electrum, I'll be picking this back up. I was wondering how we should go about implementing the skipping logic in TrinoFileStatusRemoteIterator. If we were to pass the filter there, how would we handle the hasNext() and next() logic? hasNext() only lets us see whether there is a next object, whereas in next() we could check whether the object matches our filter, but we couldn't just call next() again because we aren't sure whether there is a next element. Any help would be appreciated, thank you.
Hi @bangtim, take a look at
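For the hasNext()/next() question above, one common pattern (a sketch, not necessarily what Trino ended up using) is a look-ahead iterator: hasNext() advances the underlying iterator until it finds a matching element and caches it, and next() hands out the cached element. Guava's AbstractIterator encapsulates the same idea.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

public class FilteringIterator<T>
        implements Iterator<T>
{
    private final Iterator<T> delegate;
    private final Predicate<T> filter;
    private T next; // look-ahead slot; null when not yet fetched or exhausted

    public FilteringIterator(Iterator<T> delegate, Predicate<T> filter)
    {
        this.delegate = delegate;
        this.filter = filter;
    }

    @Override
    public boolean hasNext()
    {
        // Advance the underlying iterator until a matching element is found
        while (next == null && delegate.hasNext()) {
            T candidate = delegate.next();
            if (filter.test(candidate)) {
                next = candidate;
            }
        }
        return next != null;
    }

    @Override
    public T next()
    {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        T result = next;
        next = null;
        return result;
    }

    public static void main(String[] args)
    {
        // Keep only even numbers; the caller never sees the filtered-out elements
        Iterator<Integer> it = new FilteringIterator<>(List.of(1, 2, 3, 4, 5).iterator(), n -> n % 2 == 0);
        while (it.hasNext()) {
            System.out.println(it.next());
        }
    }
}
```

Because next() itself calls hasNext(), the caller can alternate the two methods freely without ever observing a skipped element.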
To be completed via #21164 |
Description

`TrinoS3FileSystem` contains logic/configuration for skipping Glacier objects found in S3. This PR adds the missing functionality/configuration from `TrinoS3FileSystem` and also adds the capability to configure the filesystem to read restored Glacier objects.

Additional context and related issues

Recently, S3 added the `RestoreStatus` to the result of `ListObjectsV2` - see https://aws.amazon.com/about-aws/whats-new/2023/06/amazon-s3-restore-status-s3-glacier-objects-list-api/ for more details.

Release notes
( ) This is not user-visible or docs only and no release notes are required.
(x) Release notes are required, please propose a release note for me.
( ) Release notes are required, with the following suggested text: