Disallow dropping Iceberg schema that contains external files#9767
jirassimok wants to merge 9 commits into trinodb:master
The tests previously failed because Tempto thought the tests were using HDP2 (from config-default), but the Spark environments used HDP3 instead, which uses a different port, so the [...]. To fix this, I've made the Spark environments use the same Tempto configuration as [...].
Shouldn't be merged in current shape -- #9740 (comment)
Fixed the tests; this should work properly now.
...src/main/java/io/trino/tests/product/launcher/env/environment/EnvSinglenodeSparkIceberg.java
CONTAINER_TEMPTO_PROFILE_CONFIG is deprecated. What's the current way to accomplish this?
It lists EnvironmentContainers.configureTempto as its replacement, but that method doesn't work in this case (because that only works when the tempto configuration has a specific name and is in the environment's config directory).
Quite a few things around the environment configuration need to be refactored, and I avoided changing it as much as possible.
(This is also copied verbatim from EnvSinglenodeHdp3, so if we change it here, we should probably also change it there.)
So CONTAINER_TEMPTO_PROFILE_CONFIG is deprecated, but we cannot use the replacement?
Was it deprecated too early? cc @kokosing
The replacement works in most cases, but there are a few where it doesn't. To address that, we should probably add an overload like configureTempto(Environment.Builder, String) to specify which file to use rather than taking a file from a ResourceProvider.
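To make the proposed overload concrete, here is a minimal sketch of how configureTempto(Environment.Builder, String) could sit next to the ResourceProvider-based variant. Everything in it is a stand-in written for illustration: the Builder and ResourceProvider types, the container path, and the fixed file name are assumptions, not the product-test launcher's actual classes or values.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TemptoConfigSketch
{
    // Hypothetical stand-in for Environment.Builder: it only records container mounts.
    static class Builder
    {
        final Map<String, String> mounts = new LinkedHashMap<>();

        Builder mount(String hostPath, String containerPath)
        {
            mounts.put(containerPath, hostPath);
            return this;
        }
    }

    // Hypothetical stand-in for ResourceProvider: resolves a file inside an environment's config directory.
    interface ResourceProvider
    {
        String getResource(String name);
    }

    // Assumed values; the real container path and file name are not given in this thread.
    static final String CONTAINER_TEMPTO_CONFIG = "/docker/tempto-configuration.yaml";
    static final String DEFAULT_FILE_NAME = "tempto-configuration.yaml";

    // Existing-style variant as described above: fixed file name, taken from the config directory.
    static void configureTempto(Builder builder, ResourceProvider configDir)
    {
        configureTempto(builder, configDir.getResource(DEFAULT_FILE_NAME));
    }

    // Proposed overload: the caller names the exact file to mount.
    static void configureTempto(Builder builder, String temptoConfigFile)
    {
        builder.mount(temptoConfigFile, CONTAINER_TEMPTO_CONFIG);
    }

    public static void main(String[] args)
    {
        Builder builder = new Builder();
        // An environment that wants to reuse another environment's Tempto file passes its path directly.
        configureTempto(builder, "conf/environment/singlenode-hdp3/tempto.yaml");
        System.out.println(builder.mounts);
    }
}
```

With this shape the directory-based variant becomes a thin wrapper around the explicit-file variant, so environments that share a Tempto file would not need the deprecated constant.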
...g/trino-product-tests/src/main/java/io/trino/tests/product/iceberg/TestCreateDropSchema.java
I'd use fully qualified names instead, but whatever
I had that at first, but then if I (or any future developer) forgot to qualify the names anywhere, the tests might pass without actually using Iceberg.
It also makes the variable declarations in the tests a little less nice, because you need the unqualified schema name for the default schema location.
> I had that at first, but then if I (or any future developer) forgot to qualify the names anywhere, the tests might pass without actually using Iceberg.
We should remove the default catalog/schema from the Tempto configuration here.
@jirassimok can you work on that, separately?
Please add an escape hatch like #10067.
Updated to be based on #10146. Only the last two commits are part of this PR.
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoCatalogFactory.java
Could you please add the database information in the exception?
That would go in #10146, but this was just copied from the other "could not write" errors in the class, which also don't give more detailed messages.
Instead of /catalog/database/.trinoSchema, the database schemas in FileHiveMetastore now go in /catalog/.trinoSchema.database.
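The layout change can be illustrated with a small path calculation. This is only a sketch: the helper names and the example database name are made up for illustration and are not FileHiveMetastore's actual methods.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class SchemaFileLayoutSketch
{
    static final String SCHEMA_FILE = ".trinoSchema";

    // Old layout: the schema file lived inside the database directory.
    static Path oldSchemaFile(Path catalogDir, String database)
    {
        return catalogDir.resolve(database).resolve(SCHEMA_FILE);
    }

    // New layout: the schema file sits next to the database directory, suffixed with the database name.
    static Path newSchemaFile(Path catalogDir, String database)
    {
        return catalogDir.resolve(SCHEMA_FILE + "." + database);
    }

    public static void main(String[] args)
    {
        Path catalog = Paths.get("/catalog");
        System.out.println(oldSchemaFile(catalog, "sales"));
        System.out.println(newSchemaFile(catalog, "sales"));
    }
}
```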
plugin/trino-iceberg/src/main/java/io/trino/plugin/iceberg/TrinoHiveCatalog.java
> If we see no files
then we delete the directory
> or can't see the location at all, use fallback.
Only in this latter case do we use the fallback.
Did you mean "when location is not set"?
If we see no files, delete. If we can't see the location, or if getDatabase didn't return one (it always should in practice, even when using the default location), use the fallback. I'll update the comment.
How should we test this logic with respect to upcoming Iceberg Glue support? (#10151)
No change requested in this PR, but still worth discussing.
cc @jirassimok @losipiuk @jackye1995 @phd3
In SemiTransactionalHiveMetastore, check for files before dropping the schema. Do not request deletion (via HiveMetastore) if files are visible in the schema location. A new config property, hive.delete-schema-locations-fallback, determines the behavior when Trino can't check the file location. False (the default) will not request deletion in that case.
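The decision described above reduces to a three-way choice, sketched below. This is a simplified illustration, not the code in SemiTransactionalHiveMetastore: the method name and the use of Optional<Boolean> to model "could not check" are assumptions made for the example.

```java
import java.util.Optional;

class DropSchemaDecisionSketch
{
    /**
     * Decides whether to request deletion of the schema location.
     *
     * @param locationHasFiles Optional.empty() when the location could not be checked
     *        (hypothetical modelling of "can't see the location")
     * @param deleteFallback value of hive.delete-schema-locations-fallback (false by default)
     */
    static boolean shouldDeleteLocation(Optional<Boolean> locationHasFiles, boolean deleteFallback)
    {
        return locationHasFiles
                .map(hasFiles -> !hasFiles) // visible files: keep the location; no files: delete it
                .orElse(deleteFallback);    // cannot check: defer to the configured fallback
    }

    public static void main(String[] args)
    {
        System.out.println(shouldDeleteLocation(Optional.of(false), false));
        System.out.println(shouldDeleteLocation(Optional.of(true), true));
        System.out.println(shouldDeleteLocation(Optional.empty(), false));
    }
}
```

Note that the fallback only matters in the third case; when the file listing succeeds, its result decides the outcome regardless of the property.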
Merged as ccee7b6, thanks.
Based on #10146
Adds same logic for Iceberg