Support skip archive for Glue in Iceberg#14336
Conversation
|
@electrum Can you add |
|
For what it's worth, I think we should just get rid of glue archiving altogether. Either that, or at some point we should make it default skip archiving. Otherwise, every Trino Iceberg cluster will blow up after so many thousands of transactions in Glue (which shouldn't be the default behavior out of the box). |
9f653a8 to
affa7c8
Compare
|
To support removing old archives, we need two more API calls (getTableVersions & batchDeleteTableVersion{Async}) & two more IAM permissions (GetTableVersions & BatchDeleteTableVersion). I lean toward enabling the flag by default in the future. |
There was a problem hiding this comment.
For backward compatibility. I can enable by default if there's no objection.
There was a problem hiding this comment.
it's unlikely that most people care about the Glue table versions for Iceberg tables, but it's also not unreasonable that someone uses that.
i think we should have it disabled (false) for now
There was a problem hiding this comment.
Agreed, disabled by default makes sense.
|
Do you plan any follow-ups on hive/delta as well? |
48e5dfa to
616e5b7
Compare
There was a problem hiding this comment.
Looks like we want to unify this with plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/catalog/glue/TrinoGlueCatalogFactory.java by providing some Glue client provider interface, especially that createAsyncGlueClient has this parameter Optional
There was a problem hiding this comment.
Added a preparatory commit to unify them. Let me know if it's different from your intention.
616e5b7 to
81d22c7
Compare
| public class GlueClientProvider | ||
| implements Provider<AWSGlueAsync> | ||
| { | ||
| private final AWSGlueAsync glueClient; |
There was a problem hiding this comment.
i think the Provider.get should do the instantiation so that the provided instance can be bound with various scopes.
ie. this works correctly only because the Module uses SINGLETON.
There was a problem hiding this comment.
it's unlikely that most people care about the Glue table versions for Iceberg tables, but it's also not unreasonable that someone uses that.
i think we should have it disabled (false) for now
| if (request instanceof UpdateTableRequest updateTableRequest) { | ||
| return updateTableRequest.withSkipArchive(skipArchive); | ||
| } | ||
| return request; |
There was a problem hiding this comment.
Glue Catalog uses only a couple of different requests.
Can we enumerate all these, to ensure this class gets updated if any new request variant is added?
for example, UpdateTableRequestV2 or something?
|
@ebyhr sorry for the delay. Done now. |
152f88c to
bbb3182
Compare
bbb3182 to
fbcc74b
Compare
|
@ebyhr does this need to be included in docs as a new property, or is this just propagation of a property that already exists elsewhere? |
This one needs to be documented. (follow-up PR) |

Description
TestIcebergGlueCatalogSkipArchive.testSkipArchivefails until updating IAM policy.Fixes #13413
Release notes
( ) This is not user-visible or docs only and no release notes are required.
( ) Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text: