Disallow dropping Hive schema that contains external files#10146
Disallow dropping Hive schema that contains external files#10146findepi merged 7 commits intotrinodb:masterfrom
Conversation
findepi
left a comment
There was a problem hiding this comment.
i understand the missing piece is the logic which would drive the value for the deleteData parameter.
plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/glue/GlueHiveMetastore.java
Outdated
Show resolved
Hide resolved
|
@jirassimok can you please verify Glue |
811fe45 to
71571a0
Compare
|
@jirassimok did you look at the CI's red? |
|
It doesn't look related. |
e773947 to
c3e09be
Compare
.../trino-hive/src/main/java/io/trino/plugin/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
.../trino-hive/src/main/java/io/trino/plugin/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
losipiuk
left a comment
There was a problem hiding this comment.
reviewed. Tiny things + question ragarding default behaviour for case when we fail to list schema directory.
c3e09be to
e3abc7a
Compare
plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/file/FileHiveMetastore.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Why is it needed in Don't put database schema files in the database directories commit?
There was a problem hiding this comment.
It's ensuring that the database directory exists when the database is created. It's not strictly necessary, but I think it makes sense.
There was a problem hiding this comment.
it's an error when deleteData && location.isEmpty().
Either checkState or add a comment why we're not failing
There was a problem hiding this comment.
This can theoretically occur if database.getLocation() returns empty.
In this case, it only occurs if Glue's API returns null for the database location, which seems unlikely, but that method is not documented as non-null.
.../trino-hive/src/main/java/io/trino/plugin/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestCreateDropSchema.java
Outdated
Show resolved
Hide resolved
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestCreateDropSchema.java
Outdated
Show resolved
Hide resolved
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestCreateDropSchema.java
Outdated
Show resolved
Hide resolved
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestCreateDropSchema.java
Outdated
Show resolved
Hide resolved
.../trino-hive/src/main/java/io/trino/plugin/hive/metastore/SemiTransactionalHiveMetastore.java
Outdated
Show resolved
Hide resolved
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveConfig.java
Outdated
Show resolved
Hide resolved
e3abc7a to
07ed54f
Compare
2fd7051 to
6116ac0
Compare
|
Green tests before new push. |
6116ac0 to
73c0880
Compare
|
Changes in last update: auto-close the |
There was a problem hiding this comment.
Why true as default. You probably talked this through.
There was a problem hiding this comment.
It's the current behavior, so it's less likely to cause problems for anyone upgrading. But it's also the less-safe behavior for someone who isn't familiar with Trino who might , especially if they aren't very familiar with it. I think we did discussed it last week, but maybe we should change it?
What do you think, @findepi?
There was a problem hiding this comment.
The point of the change is "do not delete those random files when deleting schema directory".
false feels more appropriate in case we cannot verify whether there are any files in there.
let's also add code comment discussion the rationale behind the choice we put in the default
There was a problem hiding this comment.
Switched the default and added a comment.
73c0880 to
ec3c615
Compare
|
Rebased forward, and avoided including |
ec3c615 to
877cc04
Compare
Instead of /catalog/database/.trinoSchema, the database schemas in FileHiveMetastore now go in /catalog/.trinoSchema.database.
877cc04 to
11055a1
Compare
|
@jirassimok mind the CI |
In SemiTransactionalHiveMetastore, check for files before dropping the schema. Do not request deletion (via HiveMetastore) if files are visible in the schema location. A new config property, hive.delete-schema-locations-fallback, determines the behavior when Trino can't check the file location. False (the default) will not request deletion in that case.
11055a1 to
703098e
Compare
|
All tests have passed, something is just dangling in this one so it isn't finishing the job. |
This will attempt to reimplement the behavior from #9740 in a way that doesn't break for some configurations.