Closed
Description
Context
- We are testing the sorted_by table property with large Iceberg tables in Trino.
- Our Iceberg tables are stored in the S3 Tables REST catalog managed by AWS.
- This catalog does not give me direct access to the data or metadata files of an Iceberg table; the service hides those details.
- However, when writing to a table that has this property set, I get an error.
Trino
- Version: 476
- Deployed in EKS with the official helm chart
- Catalog configuration (some strings were redacted):
my_catalog: |
connector.name=iceberg
iceberg.catalog.type=rest
iceberg.rest-catalog.uri=https://glue.us-east-1.amazonaws.com/iceberg
iceberg.rest-catalog.warehouse=********:s3tablescatalog/my-s3-table-bucket
iceberg.rest-catalog.sigv4-enabled=true
iceberg.rest-catalog.view-endpoints-enabled=false
iceberg.rest-catalog.signing-name=glue
fs.hadoop.enabled=false
fs.native-s3.enabled=true
fs.cache.directories=/tmp/trino/data/cache
fs.cache.enabled=true
fs.cache.max-disk-usage-percentages=70
fs.cache.ttl=1d
s3.region=us-east-1
s3.endpoint=https://s3.us-east-1.amazonaws.com
s3.use-web-identity-token-credentials-provider=true
s3.iam-role=arn:aws:iam::********:role/my-trino-role
iceberg.writer-sort-buffer-size=1MB
iceberg.allowed-extra-properties=*
iceberg.unique-table-location=true
iceberg.add-files-procedure.enabled=true
iceberg.register-table-procedure.enabled=true
iceberg.metadata-cache.enabled=false
iceberg.expire-snapshots.min-retention=1m
iceberg.remove-orphan-files.min-retention=1m
Error
This is the trace. I've redacted a few strings there.
io.trino.spi.TrinoException: Error committing write to Hive
at io.trino.plugin.hive.SortingFileWriter.commit(SortingFileWriter.java:162)
at io.trino.plugin.iceberg.IcebergSortingFileWriter.commit(IcebergSortingFileWriter.java:92)
at io.trino.plugin.iceberg.IcebergPageSink.closeWriter(IcebergPageSink.java:416)
at io.trino.plugin.iceberg.IcebergPageSink.finish(IcebergPageSink.java:230)
at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSink.finish(ClassLoaderSafeConnectorPageSink.java:84)
at io.trino.operator.TableWriterOperator.finish(TableWriterOperator.java:235)
at io.trino.operator.Driver.processInternal(Driver.java:421)
at io.trino.operator.Driver.lambda$process$0(Driver.java:306)
at io.trino.operator.Driver.tryWithLock(Driver.java:709)
at io.trino.operator.Driver.process(Driver.java:298)
at io.trino.operator.Driver.processForDuration(Driver.java:269)
at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:889)
at io.trino.execution.executor.dedicated.SplitProcessor.run(SplitProcessor.java:77)
at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.lambda$run$0(TaskEntry.java:201)
at io.trino.$gen.Trino_476____20250625_222347_2.run(Unknown Source)
at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.run(TaskEntry.java:202)
at io.trino.execution.executor.scheduler.FairScheduler.runTask(FairScheduler.java:177)
at io.trino.execution.executor.scheduler.FairScheduler.lambda$submit$0(FairScheduler.java:164)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:545)
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:128)
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:80)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1095)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:619)
at java.base/java.lang.Thread.run(Thread.java:1447)
Caused by: java.io.UncheckedIOException: io.trino.filesystem.TrinoFileSystemException: Failed to delete file: s3://******-***-****-**********************--table-s3/data/trino-tmp-files/sorting-file-writer-***********.1
at io.trino.plugin.hive.SortingFileWriter.mergeFiles(SortingFileWriter.java:256)
at io.trino.plugin.hive.SortingFileWriter.writeSorted(SortingFileWriter.java:215)
at io.trino.plugin.hive.SortingFileWriter.commit(SortingFileWriter.java:158)
... 24 more
Caused by: io.trino.filesystem.TrinoFileSystemException: Failed to delete file: s3://*******-****-****-**********************--table-s3/data/trino-tmp-files/sorting-file-writer-***********.1
at io.trino.filesystem.s3.S3FileSystem.deleteFile(S3FileSystem.java:152)
at io.trino.filesystem.switching.SwitchingFileSystem.deleteFile(SwitchingFileSystem.java:85)
at io.trino.filesystem.tracing.TracingFileSystem.lambda$deleteFile$0(TracingFileSystem.java:79)
at io.trino.filesystem.tracing.Tracing.lambda$withTracing$0(Tracing.java:42)
at io.trino.filesystem.tracing.Tracing.withTracing(Tracing.java:51)
at io.trino.filesystem.tracing.Tracing.withTracing(Tracing.java:41)
at io.trino.filesystem.tracing.TracingFileSystem.deleteFile(TracingFileSystem.java:79)
at io.trino.filesystem.cache.CacheFileSystem.deleteFile(CacheFileSystem.java:81)
at io.trino.plugin.hive.SortingFileWriter.mergeFiles(SortingFileWriter.java:252)
... 26 more
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403, Request ID: ***********, Extended Request ID: ***********) (SDK Attempt Count: 1)
at software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:113)
at software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:61)
at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.retryPolicyDisallowedRetryException(RetryableStageHelper.java:168)
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:73)
at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:53)
at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:35)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:82)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:43)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:210)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
at software.amazon.awssdk.services.s3.DefaultS3Client.deleteObject(DefaultS3Client.java:3379)
at io.trino.filesystem.s3.S3FileSystem.deleteFile(S3FileSystem.java:149)
... 34 more
Investigation
- This is the line that fails, if I'm not mistaken:
trino/lib/trino-filesystem-s3/src/main/java/io/trino/filesystem/s3/S3FileSystem.java
Line 149 in 9d4221f
client.deleteObject(request);
- This AWS SDK client method should not be used with S3 Tables, if I'm not mistaken: https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/services/s3/S3Client.html#deleteObject(software.amazon.awssdk.services.s3.model.DeleteObjectRequest)
- As a result, I believe that the following happens:
- Trino tries to delete the temporary sort file (under data/trino-tmp-files/) with the regular S3 API.
- However, it shouldn't, because we are using the S3 Tables catalog, which hides this lower-level S3 API for our Iceberg tables.
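To make the suspected failure path easier to follow, here is a minimal, self-contained sketch of the exception chain seen in the trace: a delete of a temporary sort file is rejected, the merge step wraps it in an UncheckedIOException, and the commit step wraps that again. This is not Trino's code. The names SortedWriteSketch, TempFileStore, and DeleteDenyingStore are hypothetical stand-ins, and the real SortingFileWriter and S3FileSystem are considerably more involved.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

public class SortedWriteSketch {
    /** Hypothetical stand-in for the file system used by the writer; not Trino's API. */
    interface TempFileStore {
        void deleteFile(String location) throws IOException;
    }

    /** Hypothetical store that rejects every delete, mimicking the 403 in the trace. */
    static final class DeleteDenyingStore implements TempFileStore {
        @Override
        public void deleteFile(String location) throws IOException {
            throw new IOException("Failed to delete file: " + location);
        }
    }

    /** After merging, the temporary sort files are deleted; a failed delete aborts the merge. */
    static void mergeFiles(TempFileStore store, List<String> tempFiles) {
        for (String tempFile : tempFiles) {
            try {
                store.deleteFile(tempFile);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    /** The commit step surfaces any merge failure as a write error. */
    static void commit(TempFileStore store, List<String> tempFiles) {
        try {
            mergeFiles(store, tempFiles);
        } catch (UncheckedIOException e) {
            throw new RuntimeException("Error committing write", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder location; the real bucket and file names are redacted in the trace.
        List<String> tempFiles = List.of("s3://example-bucket/data/trino-tmp-files/sorting-file-writer-example.1");
        try {
            commit(new DeleteDenyingStore(), tempFiles);
        } catch (RuntimeException e) {
            // Print the cause chain, outermost first.
            for (Throwable t = e; t != null; t = t.getCause()) {
                System.out.println(t.getClass().getSimpleName() + ": " + t.getMessage());
            }
        }
    }
}
```

If this reading is right, the data itself is written and merged successfully, and the query fails only because cleanup of the temporary file is denied.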