Disallow dropping Hive schemas that contain external files #9740
losipiuk merged 1 commit into trinodb:master
plugin/trino-hive/src/main/java/io/trino/plugin/hive/HiveMetadata.java
testing/trino-product-tests/src/main/java/io/trino/tests/product/hive/TestCreateDropSchema.java
514304f to 2d197c5
1aa0008 to e993519
plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/file/FileHiveMetastore.java
can you just static import those (actually I am not sure if you really can) :)
If I make the enum non-private, I could.
Yeah - would make more sense to me :)
Made it package-private with a comment.
plugin/trino-hive/src/main/java/io/trino/plugin/hive/metastore/file/FileHiveMetastore.java
nit: put path in exception message
losipiuk left a comment
LGTM. A few nits. To be merged after.
fc68747 to daf9981
This shouldn't be merged yet, pending offline discussion.
Per offline discussion, we should unregister such a schema and leave the files on the storage intact.
daf9981 to 412d29e
The more I think about it, the less sure I am that this is the right place to delete the files.
I think it might be better to include a deleteData parameter for HiveMetastore.dropDatabase, and pass true when there are no external files. That argument is already present on HiveMetastore.dropTable and dropPartition, too, so it wouldn't be entirely out of place.
I guess if we delete the file here, though, HiveMetastore.dropDatabase should note that it's not supposed to delete non-metadata files.
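A minimal sketch of the deleteData idea suggested above. The nested Metastore interface, RecordingMetastore, and the hasExternalFiles flag are hypothetical stand-ins written for illustration; this is not Trino's actual HiveMetastore API, only the shape of the proposal: the caller decides whether data may be deleted and passes that decision down.

```java
import java.util.ArrayList;
import java.util.List;

class DropDatabaseSketch {
    // Hypothetical, simplified stand-in; not Trino's actual HiveMetastore interface.
    interface Metastore {
        void dropDatabase(String databaseName, boolean deleteData);
    }

    // Test double that records each call instead of touching any storage.
    static class RecordingMetastore implements Metastore {
        final List<String> calls = new ArrayList<>();

        @Override
        public void dropDatabase(String databaseName, boolean deleteData) {
            calls.add(databaseName + " deleteData=" + deleteData);
        }
    }

    // The caller asks for data deletion only when it found no external files.
    static void dropSchema(Metastore metastore, String schemaName, boolean hasExternalFiles) {
        metastore.dropDatabase(schemaName, !hasExternalFiles);
    }

    public static void main(String[] args) {
        RecordingMetastore metastore = new RecordingMetastore();
        dropSchema(metastore, "empty_schema", false);
        dropSchema(metastore, "schema_with_files", true);
        metastore.calls.forEach(System.out::println);
    }
}
```

With this shape, a metastore implementation never has to guess whether the files under the schema location belong to it.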
I'm not completely sure how this case still works (well, it seemed to work locally; maybe it doesn't actually work). If the location property is not set, HiveMetadata shouldn't be deleting anything, and the HiveMetastore implementations shouldn't be deleting anything, either.
(Though I don't believe this test runs with every HiveMetastore we have, which would probably be good coverage to have.)
Extracted #9902 from here.
This should be done in SemiTransactionalHiveMetastore, and only when commit is invoked.
Add a test with:
- DROP SCHEMA
- rollback
- try to use that schema
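The requested behavior can be sketched roughly as follows. DeferredDropSketch is a hypothetical in-memory model, not the real SemiTransactionalHiveMetastore: it only illustrates that a drop is recorded as pending, takes effect on commit, and is discarded on rollback. As a simplification, a schema with a pending drop is still reported as existing until commit.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory model; not the real SemiTransactionalHiveMetastore.
class DeferredDropSketch {
    private final Set<String> schemas = new HashSet<>();
    private final List<String> pendingDrops = new ArrayList<>();

    void createSchema(String name) {
        schemas.add(name);
    }

    // Records the intent only; nothing is unregistered or deleted yet.
    void dropSchema(String name) {
        if (!schemas.contains(name)) {
            throw new IllegalArgumentException("Schema not found: " + name);
        }
        pendingDrops.add(name);
    }

    // The drop takes effect here, and only here.
    void commit() {
        schemas.removeAll(pendingDrops);
        pendingDrops.clear();
    }

    // Discards pending drops, so the schema stays usable.
    void rollback() {
        pendingDrops.clear();
    }

    boolean schemaExists(String name) {
        return schemas.contains(name);
    }

    public static void main(String[] args) {
        DeferredDropSketch metastore = new DeferredDropSketch();
        metastore.createSchema("test_schema");

        // DROP SCHEMA, rollback, then try to use the schema.
        metastore.dropSchema("test_schema");
        metastore.rollback();
        System.out.println(metastore.schemaExists("test_schema"));

        // The same drop followed by commit removes the schema.
        metastore.dropSchema("test_schema");
        metastore.commit();
        System.out.println(metastore.schemaExists("test_schema"));
    }
}
```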
This depends on file systems.
We should have a copy of that test in io.trino.plugin.hive.AbstractTestHive, so it's run for supported storages too.
#9902 merged, @jirassimok can you please rebase & squash?
aab6657 to 1b56cc6
fb8d1c4 to 85a8348
@electrum this slightly changes the semantics of
fe78787 to 7b2e2d3
That update added some logging.
Still LGTM. Maybe:
And we can merge this one?
Sounds good. @alexjo2144 also suggested logging a warning instead of throwing an exception if there's an error while trying to delete the files, because the schema has already been dropped at that point.
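A minimal sketch of that suggestion, assuming a local file system. It uses java.nio.file and java.util.logging as stand-ins for whatever file system and logging classes the plugin actually uses, and the class and method names are hypothetical. The point is only the control flow: once the schema is gone from the metastore, a failed cleanup is reported and not rethrown.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper; the real code works against a different file system API.
class DropCleanupSketch {
    private static final Logger log = Logger.getLogger(DropCleanupSketch.class.getName());

    // Call only after the schema has been dropped from the metastore.
    // Returns true if the location was deleted; logs a warning and returns false otherwise.
    static boolean deleteLocationAfterDrop(Path location) {
        try {
            // Files.delete fails for a missing path or a non-empty directory.
            Files.delete(location);
            return true;
        }
        catch (IOException e) {
            // The drop itself already succeeded, so failing the statement here would be misleading.
            log.log(Level.WARNING, "Could not delete schema location " + location, e);
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempDirectory("dropped_schema");
        System.out.println(deleteLocationAfterDrop(empty));

        Path nonEmpty = Files.createTempDirectory("dropped_schema");
        Files.createFile(nonEmpty.resolve("external_file"));
        System.out.println(deleteLocationAfterDrop(nonEmpty));
    }
}
```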
7b2e2d3 to 149992b
In HiveMetadata, delete an empty schema location after dropping it from the metastore. In ThriftHiveMetastore and FileHiveMetastore, do not delete data.
149992b to 11ba2d2
Last push just fixed a copy/paste error in the tests that made one of them fail.
This avoids potential data loss when a schema is created and dropped in a location that already has files.
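The behavior described in the commit message might look roughly like this on a local file system. Everything here is an illustrative assumption: the class and method names are hypothetical, java.nio.file replaces the plugin's own file system API, and the Runnable stands in for the metastore call. The sketch unregisters the schema first, then removes the location only if it is an empty directory.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical sketch; not the actual HiveMetadata code.
class EmptyLocationSketch {
    static boolean isEmptyDirectory(Path location) {
        if (!Files.isDirectory(location)) {
            return false;
        }
        try (Stream<Path> entries = Files.list(location)) {
            return !entries.findAny().isPresent();
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if the location was removed, false if it was left in place.
    static boolean dropSchema(Runnable unregisterFromMetastore, Path location) {
        // Step 1: drop the schema from the metastore (stand-in callback).
        unregisterFromMetastore.run();

        // Step 2: leave any location that still holds files untouched.
        if (!isEmptyDirectory(location)) {
            return false;
        }
        try {
            Files.delete(location);
            return true;
        }
        catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path location = Files.createTempDirectory("schema_location");
        Files.createFile(location.resolve("pre_existing_file"));

        // The schema is unregistered, but the pre-existing file survives.
        System.out.println(dropSchema(() -> {}, location));
        System.out.println(Files.exists(location));
    }
}
```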