Skip to content

[native] Enable hive.allow-drop-table in NativeQueryRunner#21641

Merged
majetideepak merged 1 commit intoprestodb:masterfrom
pdabre12:add-drop-table-system-property
Jan 18, 2024
Merged

[native] Enable hive.allow-drop-table in NativeQueryRunner#21641
majetideepak merged 1 commit intoprestodb:masterfrom
pdabre12:add-drop-table-system-property

Conversation

@pdabre12
Copy link
Contributor

@pdabre12 pdabre12 commented Jan 5, 2024

Allow NativeQueryRunner to drop tables. We need this for writing CTAS tests via the NativeQueryRunner.

Resolves: #21659

@pdabre12 pdabre12 requested a review from a team as a code owner January 5, 2024 10:27
@pdabre12 pdabre12 requested review from a team and shrinidhijoshi as code owners January 5, 2024 19:36
@pdabre12 pdabre12 requested a review from presto-oss January 5, 2024 19:36
Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdabre12 Are you new to Presto? Please, review contributing guildelines and make sure to create independently buildable commits with nice commit messages that follow the guidelines.

@pdabre12 pdabre12 marked this pull request as draft January 5, 2024 19:41
@pdabre12
Copy link
Contributor Author

pdabre12 commented Jan 5, 2024

@mbasmanova Sorry for the inconvenience, I missed to tag this PR as a draft, it's still in progress. I will clean it up and make sure its according to the community guidelines before its ready for review.

@mbasmanova
Copy link
Contributor

@pdabre12 Sounds good. Would you also create a GitHub issue to describe the problem you are solving and the approach you use to solve it?

@pdabre12
Copy link
Contributor Author

pdabre12 commented Jan 5, 2024

@mbasmanova Yes, I will.

@pdabre12 pdabre12 force-pushed the add-drop-table-system-property branch 4 times, most recently from 2affa78 to 8248196 Compare January 9, 2024 20:47
@pdabre12 pdabre12 marked this pull request as ready for review January 9, 2024 22:12
@pdabre12
Copy link
Contributor Author

pdabre12 commented Jan 9, 2024

@majetideepak Please review.

@pdabre12 pdabre12 changed the title Add drop table system property to native query runner [native] Add drop table system property to native query runner Jan 9, 2024
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdabre12 Is there a reason to not use the builder()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I did not use the builder() is because I wanted to make use of the putIfAbsent() method available on HashMap and as far as I know we cannot create an HashMap with a builder().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is needed because under the hood while creating a queryRunner using PrestoSparkNativeQueryRunnerUtils , the native worker properties (hiveProperties above) are added to the properties.

Not adding the putIfAbsent() check there throws an error : duplicate key values for hive.allow-drop-table .

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we simply remove .put("hive.allow-drop-table", "true") below to avoid the duplicate key issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No we cannot because the PrestoSparkNativeQueryRunnerUtils uses the PrestoSparkQueryRunner present in presto-spark-base and PrestoSparkQueryRunner is called separately too while testing presto-spark-base operations. Removing these properties from here results in an error while running those tests.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then likely getNativeWorkerHiveProperties is not the right place to add this.
Can we add inside at presto-hive/src/test/java/com/facebook/presto/hive/HiveQueryRunner.java:225?

@majetideepak majetideepak changed the title [native] Add drop table system property to native query runner [native] Enable hive.allow-drop-table in NativeQueryRunner Jan 10, 2024
@pdabre12 pdabre12 force-pushed the add-drop-table-system-property branch 2 times, most recently from 048bac0 to e72cd9b Compare January 10, 2024 03:18
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pdabre12 : Any reason we have not added the rest of the properties viz. hive.allow-rename-(table)column, hive.allow-add-column, hive.allow-drop-column to NativeQueryRunnerUtils.getNativeWorkerHiveProperties() function above ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the need was only for hive.allow-drop-table. But good point. We could add the remaining and remove them from here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup , the need was only for hive.allow-drop-table.
My plan was to add the other properties once this refactor was accepted.

@pdabre12 pdabre12 force-pushed the add-drop-table-system-property branch 5 times, most recently from 3f7cdb3 to f32d5cf Compare January 11, 2024 19:50
@majetideepak
Copy link
Collaborator

@pdabre12 thanks for iterating on this. My goal is to avoid changes to the non-native code paths since some of the APIs are reused across different contexts.
The diff below limits the changes to native only. Will this work?

--- a/presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java
+++ b/presto-native-execution/src/test/java/com/facebook/presto/nativeworker/PrestoNativeQueryRunnerUtils.java
@@ -151,6 +151,11 @@ public class PrestoNativeQueryRunnerUtils
             boolean addStorageFormatToPath)
             throws Exception
     {
+        ImmutableMap.Builder<String, String> hivePropertiesBuilder = new ImmutableMap.Builder<>();
+        hivePropertiesBuilder
+                .putAll(getNativeWorkerHiveProperties(storageFormat))
+                .put("hive.allow-drop-table", "true");
+
         // Make query runner with external workers for tests
         return HiveQueryRunner.createQueryRunner(
                 ImmutableList.of(),
@@ -162,7 +167,7 @@ public class PrestoNativeQueryRunnerUtils
                         .build(),
                 ImmutableMap.of(),
                 "legacy",
-                getNativeWorkerHiveProperties(storageFormat),
+                hivePropertiesBuilder.build(),
                 workerCount,
                 Optional.of(Paths.get(addStorageFormatToPath ? dataDirectory + "/" + storageFormat : dataDirectory)),
                 Optional.of((workerIndex, discoveryUri) -> {

@pdabre12 pdabre12 force-pushed the add-drop-table-system-property branch 2 times, most recently from 047fa06 to 23808e3 Compare January 11, 2024 22:35
Copy link
Collaborator

@majetideepak majetideepak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @pdabre12

@majetideepak
Copy link
Collaborator

@mbasmanova can you take another look? Thanks.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this not part of getNativeWorkerHiveProperties ?

Can you update PR description to explain why you are making this change?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1. Agree with @mbasmanova. I too feel this should be a part of getNativeWorkerHiveProperties(). Several people look at that function when setting up their catalogs.

Copy link
Collaborator

@majetideepak majetideepak Jan 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We tried that. getNativeWorkerHiveProperties() is used on the Native Spark side as well and it conflicted due to duplicate config.
The diff would look like 2affa78

You will see that on the Spark side, the corresponding config is set inside the PrestoSparkQueryRunner.
We followed the existing pattern and avoided risky refactoring.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova I updated the description. Thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these classes are used for both Native and Java paths, we should have native-only, java-only, and common config.
hive.allow-drop-table is a common config.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the above classification makes sense, we could refactor the existing config in a separate PR.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there is also Spark config.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we know why PrestoSparkQueryRunner sets this by default, but Hive query runner doesn't?

CC: @arhimondr

@mbasmanova
Copy link
Contributor

Allow NativeQueryRunner to drop tables. We need this for writing CTAS tests via the NativeQueryRunner.

@majetideepak How does this work for native Presto-on-Spark?

@majetideepak
Copy link
Collaborator

How does this work for native Presto-on-Spark?

@mbasmanova I see that it also invokes the PrestoSparkQueryRunner inside which the hive.allow-drop-table property is set.
https://github.com/prestodb/presto/blob/master/presto-native-execution/src/test/java/com/facebook/presto/spark/PrestoSparkNativeQueryRunnerUtils.java#L138

@majetideepak
Copy link
Collaborator

majetideepak commented Jan 12, 2024

@pdabre12
Copy link
Contributor Author

How does this work for native Presto-on-Spark?

@mbasmanova I see that it also invokes the PrestoSparkQueryRunner inside which the hive.allow-drop-table property is set. https://github.com/prestodb/presto/blob/master/presto-native-execution/src/test/java/com/facebook/presto/spark/PrestoSparkNativeQueryRunnerUtils.java#L138

The hive.allow-drop-table property is set in PrestoSparkQueryRunner here: https://github.com/prestodb/presto/blob/master/presto-spark-base/src/test/java/com/facebook/presto/spark/PrestoSparkQueryRunner.java#L382

@mbasmanova
Copy link
Contributor

The solution in createJavaQueryRunner looks the best. Let's do the same for native query runner and unify Prestissimo and Presto-on-Spark.

        if ("legacy".equals(security)) {
            hivePropertiesBuilder.put("hive.allow-drop-table", "true");
        }

@majetideepak
Copy link
Collaborator

majetideepak commented Jan 16, 2024

@mbasmanova the solution in this PR is what createJavaQueryRunner does. Except that security is not an argument to the createNativeQueryRunner and we always add hive.allow-drop-table.

@majetideepak
Copy link
Collaborator

@mbasmanova can you approve this change so that we can unblock the CTAS tests?
The current change aligns with createJavaQueryRunner.
There is some historical context here that we should understand and refactor the configs in a separate PR.

@mbasmanova
Copy link
Contributor

@majetideepak Deepak, I'm confused because I believe we already have CTAS tests. Is this not the case? What's blocking us from adding more?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe

ImmutableMap.<String, String>builder()
   .putAll(getNativeWorkerHiveProperties(storageFormat))
   .put("hive.allow-drop-table", "true")
   .build();

and let's add comment to explain that we need hive.allow-drop-table=true because we use "legacy" security.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mbasmanova Thanks for the review, added the comment .

@majetideepak
Copy link
Collaborator

I'm confused because I believe we already have CTAS tests. Is this not the case? What's blocking us from adding more?

The current native tests are set up such that the JavaQueryRunner writes the tables and the NativeQueryRunner reads the tables.
The new CTAS tests that we are planning to add would be the opposite. The NativeQueryRunner would write the tables and JavaQueryRunner would read the tables.
See issue #20848

Copy link
Contributor

@mbasmanova mbasmanova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it. Thanks.

Please, add a comment to explain why you are adding that property and use

ImmutableMap.<String, String>builder() 
   .putAll(getNativeWorkerHiveProperties(storageFormat)) 
   .put("hive.allow-drop-table", "true") 
   .build();

@majetideepak
Copy link
Collaborator

majetideepak commented Jan 18, 2024

To add more context, some existing test CTAS APIs drop tables. So we need the queryrunner to be able to drop a table.

    protected void assertCreateTableAsSelect(Session session, String table, @Language("SQL") String query, @Language("SQL") String expectedQuery, @Language("SQL") String rowCountQuery)
    {
        assertUpdate(session, "CREATE TABLE " + table + " AS " + query, rowCountQuery);
        assertQuery(session, "SELECT * FROM " + table, expectedQuery);
        assertUpdate(session, "DROP TABLE " + table);

        assertFalse(getQueryRunner().tableExists(session, table));
    }

@pdabre12 pdabre12 force-pushed the add-drop-table-system-property branch from 23808e3 to fea00c0 Compare January 18, 2024 19:37
@pdabre12 pdabre12 force-pushed the add-drop-table-system-property branch from fea00c0 to a94d775 Compare January 18, 2024 19:39
@majetideepak majetideepak merged commit 1a4e0eb into prestodb:master Jan 18, 2024
@pdabre12 pdabre12 deleted the add-drop-table-system-property branch January 18, 2024 21:53
@wanglinsong wanglinsong mentioned this pull request Feb 12, 2024
64 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[native] Drop table operation not enabled in Native Query Runner

4 participants