"Failed to delete file" exception in a large managed iceberg S3 Table when using sorted_by #26077

@J0hnG4lt

Description

Context

  • We are testing the sorted_by table property with large Iceberg tables in Trino.
  • Our Iceberg tables are stored in the S3 Tables REST catalog managed by AWS.
  • This catalog does not give me direct access to the data or metadata files of the Iceberg table. The service hides those details.
  • However, writing to a table that has this property set fails with the error below.

Trino

  • Version: 476
  • Deployed in EKS with the official helm chart
  • Catalog configuration (some strings were redacted):
  my_catalog: |
    connector.name=iceberg
    iceberg.catalog.type=rest
    iceberg.rest-catalog.uri=https://glue.us-east-1.amazonaws.com/iceberg
    iceberg.rest-catalog.warehouse=********:s3tablescatalog/my-s3-table-bucket
    iceberg.rest-catalog.sigv4-enabled=true
    iceberg.rest-catalog.view-endpoints-enabled=false
    iceberg.rest-catalog.signing-name=glue
    fs.hadoop.enabled=false
    fs.native-s3.enabled=true
    fs.cache.directories=/tmp/trino/data/cache
    fs.cache.enabled=true
    fs.cache.max-disk-usage-percentages=70
    fs.cache.ttl=1d
    s3.region=us-east-1
    s3.endpoint=https://s3.us-east-1.amazonaws.com
    s3.use-web-identity-token-credentials-provider=true
    s3.iam-role=arn:aws:iam::********:role/my-trino-role
    iceberg.writer-sort-buffer-size=1MB
    iceberg.allowed-extra-properties=*
    iceberg.unique-table-location=true
    iceberg.add-files-procedure.enabled=true
    iceberg.register-table-procedure.enabled=true
    iceberg.metadata-cache.enabled=false
    iceberg.expire-snapshots.min-retention=1m
    iceberg.remove-orphan-files.min-retention=1m

Error

This is the trace. I've redacted a few strings there.

io.trino.spi.TrinoException: Error committing write to Hive
	at io.trino.plugin.hive.SortingFileWriter.commit(SortingFileWriter.java:162)
	at io.trino.plugin.iceberg.IcebergSortingFileWriter.commit(IcebergSortingFileWriter.java:92)
	at io.trino.plugin.iceberg.IcebergPageSink.closeWriter(IcebergPageSink.java:416)
	at io.trino.plugin.iceberg.IcebergPageSink.finish(IcebergPageSink.java:230)
	at io.trino.plugin.base.classloader.ClassLoaderSafeConnectorPageSink.finish(ClassLoaderSafeConnectorPageSink.java:84)
	at io.trino.operator.TableWriterOperator.finish(TableWriterOperator.java:235)
	at io.trino.operator.Driver.processInternal(Driver.java:421)
	at io.trino.operator.Driver.lambda$process$0(Driver.java:306)
	at io.trino.operator.Driver.tryWithLock(Driver.java:709)
	at io.trino.operator.Driver.process(Driver.java:298)
	at io.trino.operator.Driver.processForDuration(Driver.java:269)
	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:889)
	at io.trino.execution.executor.dedicated.SplitProcessor.run(SplitProcessor.java:77)
	at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.lambda$run$0(TaskEntry.java:201)
	at io.trino.$gen.Trino_476____20250625_222347_2.run(Unknown Source)
	at io.trino.execution.executor.dedicated.TaskEntry$VersionEmbedderBridge.run(TaskEntry.java:202)
	at io.trino.execution.executor.scheduler.FairScheduler.runTask(FairScheduler.java:177)
	at io.trino.execution.executor.scheduler.FairScheduler.lambda$submit$0(FairScheduler.java:164)
	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:545)
	at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:128)
	at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:74)
	at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:80)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1095)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:619)
	at java.base/java.lang.Thread.run(Thread.java:1447)
Caused by: java.io.UncheckedIOException: io.trino.filesystem.TrinoFileSystemException: Failed to delete file: s3://******-***-****-**********************--table-s3/data/trino-tmp-files/sorting-file-writer-***********.1
	at io.trino.plugin.hive.SortingFileWriter.mergeFiles(SortingFileWriter.java:256)
	at io.trino.plugin.hive.SortingFileWriter.writeSorted(SortingFileWriter.java:215)
	at io.trino.plugin.hive.SortingFileWriter.commit(SortingFileWriter.java:158)
	... 24 more
Caused by: io.trino.filesystem.TrinoFileSystemException: Failed to delete file: s3://*******-****-****-**********************--table-s3/data/trino-tmp-files/sorting-file-writer-***********.1
	at io.trino.filesystem.s3.S3FileSystem.deleteFile(S3FileSystem.java:152)
	at io.trino.filesystem.switching.SwitchingFileSystem.deleteFile(SwitchingFileSystem.java:85)
	at io.trino.filesystem.tracing.TracingFileSystem.lambda$deleteFile$0(TracingFileSystem.java:79)
	at io.trino.filesystem.tracing.Tracing.lambda$withTracing$0(Tracing.java:42)
	at io.trino.filesystem.tracing.Tracing.withTracing(Tracing.java:51)
	at io.trino.filesystem.tracing.Tracing.withTracing(Tracing.java:41)
	at io.trino.filesystem.tracing.TracingFileSystem.deleteFile(TracingFileSystem.java:79)
	at io.trino.filesystem.cache.CacheFileSystem.deleteFile(CacheFileSystem.java:81)
	at io.trino.plugin.hive.SortingFileWriter.mergeFiles(SortingFileWriter.java:252)
	... 26 more
Caused by: software.amazon.awssdk.services.s3.model.S3Exception: Access Denied (Service: S3, Status Code: 403, Request ID: ***********, Extended Request ID: ***********) (SDK Attempt Count: 1)
	at software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:113)
	at software.amazon.awssdk.services.s3.model.S3Exception$BuilderImpl.build(S3Exception.java:61)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.retryPolicyDisallowedRetryException(RetryableStageHelper.java:168)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:73)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.RetryableStage.execute(RetryableStage.java:36)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:53)
	at software.amazon.awssdk.core.internal.http.StreamManagingStage.execute(StreamManagingStage.java:35)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.executeWithTimer(ApiCallTimeoutTrackingStage.java:82)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:62)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallTimeoutTrackingStage.execute(ApiCallTimeoutTrackingStage.java:43)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:50)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ApiCallMetricCollectionStage.execute(ApiCallMetricCollectionStage.java:32)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:37)
	at software.amazon.awssdk.core.internal.http.pipeline.stages.ExecutionFailureExceptionReportingStage.execute(ExecutionFailureExceptionReportingStage.java:26)
	at software.amazon.awssdk.core.internal.http.AmazonSyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonSyncHttpClient.java:210)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.invoke(BaseSyncClientHandler.java:103)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.doExecute(BaseSyncClientHandler.java:173)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.lambda$execute$1(BaseSyncClientHandler.java:80)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.measureApiCallSuccess(BaseSyncClientHandler.java:182)
	at software.amazon.awssdk.core.internal.handler.BaseSyncClientHandler.execute(BaseSyncClientHandler.java:74)
	at software.amazon.awssdk.core.client.handler.SdkSyncClientHandler.execute(SdkSyncClientHandler.java:45)
	at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53)
	at software.amazon.awssdk.services.s3.DefaultS3Client.deleteObject(DefaultS3Client.java:3379)
	at io.trino.filesystem.s3.S3FileSystem.deleteFile(S3FileSystem.java:149)
	... 34 more
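
To make the failure sequence in the trace easier to follow, here is a minimal Python sketch of the pattern it suggests: sorted runs are spilled to temporary files, the runs are merged, and the temporary files are then deleted, so a denied delete surfaces as a commit error. This is not Trino's code. Every name in it (`write_sorted_runs`, `merge_runs`, `commit`, `DeleteDenied`) is hypothetical, it uses a local directory instead of S3, and the exact point at which Trino deletes each file is an assumption.

```python
import heapq
import os


class DeleteDenied(Exception):
    """Hypothetical stand-in for the 403 Access Denied on DeleteObject."""


def write_sorted_runs(rows, run_size, tmp_dir):
    """Split rows into sorted runs and spill each run to its own temp file."""
    paths = []
    for start in range(0, len(rows), run_size):
        run = sorted(rows[start:start + run_size])
        path = os.path.join(tmp_dir, f"sorting-file-writer-{len(paths)}")
        with open(path, "w") as f:
            f.writelines(f"{value}\n" for value in run)
        paths.append(path)
    return paths


def merge_runs(paths, delete_file):
    """Merge the sorted runs, then remove each temp file via delete_file."""
    files = [open(p) for p in paths]
    try:
        merged = [int(line) for line in heapq.merge(*files, key=int)]
    finally:
        for f in files:
            f.close()
    for path in paths:
        # Assumption: a failed delete is not tolerated and aborts the write.
        delete_file(path)
    return merged


def commit(rows, run_size, tmp_dir, delete_file=os.remove):
    """Write sorted output; wrap a denied delete the way the trace is wrapped."""
    paths = write_sorted_runs(rows, run_size, tmp_dir)
    try:
        return merge_runs(paths, delete_file)
    except DeleteDenied as e:
        raise RuntimeError("Error committing write") from e
```

With the default `os.remove` the commit succeeds and leaves no temp files behind. Passing a `delete_file` callable that raises `DeleteDenied` reproduces the shape of the failure: the data was sorted and merged, but the cleanup step makes the whole write fail.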

Investigation
