[HUDI-5301] Spark SQL queries support setting parameters through set #7339

dongkelun · 2022-11-30T08:01:18Z

Change Logs

[HUDI-5301] Spark SQL queries support setting parameters through set

Impact

[HUDI-5301] Spark SQL queries support setting parameters through set

Risk level (write none, low medium or high below)

none

Documentation Update

Describe any necessary documentation update if there is any new feature, config, or user-facing change

The config description must be updated if new configs are added or the default value of the configs are changed
Any new feature or user-facing change requires updating the Hudi website. Please create a Jira ticket, attach the
ticket number here and follow the instruction to make
changes to the website.

Contributor's checklist

Read through contributor's guide
Change Logs and Impact were stated clearly
Adequate tests were added if applicable
CI passed

xiarixiaoyao · 2022-11-30T09:58:59Z

hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala

+    .key("hoodie.query.use.database")
+    .defaultValue(false)
+    .withDocumentation("Whether to add database name to qualify table name when setting parameters in Spark SQL query")
+


Does this modification have something to do with the PR title?

@xiarixiaoyao The title does not mention it because setting parameters through `set` was not supported previously. This parameter is added mainly for consistency with the Hive incremental query option `HoodieHiveUtils.HOODIE_INCREMENTAL_USE_DATABASE`, and it handles the case where different databases contain tables with the same name.

I did not describe it in detail in the PR because I was not sure whether the community would approve this form of query. If necessary, I can add a detailed description. Also, the test cases currently cover only incremental queries, not other query types. If necessary, I can add more detailed test cases.

leesf · 2022-11-30T13:01:38Z

hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/DataSourceOptions.scala

      "(obtain latest view, by merging base and (if any) log files)")

+  val QUERY_USE_DATABASE: ConfigProperty[Boolean] = ConfigProperty
+    .key("hoodie.query.use.database")


Sorry, I am a little confused about this config and its use case.

@leesf This configuration is exercised in the test case. The main consideration is the case where different databases have tables with the same name, such as db1.table1 and db2.table1. If both tables are queried in the same session and I only want to set the incremental query parameters for db1.table1:

set hoodie.table1.datasource.query.type=incremental;
set hoodie.table1.datasource.read.begin.instanttime=20221130163703640;

With these settings, although I only want to query db1.table1 incrementally, queries on db2.table1 are also run incrementally. This is not the behavior I expect, so I added this parameter:

set hoodie.query.use.database = true;
set hoodie.db1.table1.datasource.query.type=incremental;
set hoodie.db1.table1.datasource.read.begin.instanttime=20221130163703640;

With these settings, only db1.table1 is queried incrementally. The configuration is false by default, which is consistent with the Hive incremental query parameters.

The PR for the Hive incremental query: #4083
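To make the scoping rule described above concrete, here is a minimal sketch in plain Scala. It is not the code from this PR: the object `ScopedReadOptions` and its methods are hypothetical names, and the session configuration is modeled as a simple `Map` rather than a real Spark session. It only assumes, as the comments above describe, that per-table settings use the prefix `hoodie.<table>.` by default and `hoodie.<db>.<table>.` when `hoodie.query.use.database` is true.

```scala
// Hypothetical model of the key scoping discussed in this thread; not the PR's implementation.
object ScopedReadOptions {
  // Config key taken from the diff in this PR.
  val UseDatabaseKey = "hoodie.query.use.database"

  /** Builds the per-table key prefix: "hoodie.<table>." or "hoodie.<db>.<table>.". */
  def prefixFor(conf: Map[String, String], db: String, table: String): String = {
    val useDb = conf.get(UseDatabaseKey).exists(_.trim.equalsIgnoreCase("true"))
    if (useDb) s"hoodie.$db.$table." else s"hoodie.$table."
  }

  /** Collects the settings scoped to one table and rewrites them to plain "hoodie.*" options. */
  def resolve(conf: Map[String, String], db: String, table: String): Map[String, String] = {
    val prefix = prefixFor(conf, db, table)
    conf.collect {
      case (key, value) if key.startsWith(prefix) =>
        ("hoodie." + key.stripPrefix(prefix)) -> value
    }
  }
}
```

In this model, the unqualified key `hoodie.table1.datasource.query.type` resolves for both db1.table1 and db2.table1, while the qualified key `hoodie.db1.table1.datasource.query.type` resolves only for db1.table1 once `hoodie.query.use.database` is set to true.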

If it only affects incremental queries, maybe hoodie.query.incremental.database is a better name? Or does it also affect other types of queries? If so, we need to add more test cases.

It also affects other types of queries. I can add test cases for the other query types.

@leesf Hello, I have added test cases for the other query types.

hudi-bot · 2022-12-03T17:53:48Z

CI report:

e033aaf Azure: SUCCESS

Bot commands

@hudi-bot supports the following commands:

@hudi-bot run azure re-run the last Azure build

YannByron · 2022-12-05T08:56:46Z

@dongkelun I'm not a big fan of querying a Hudi table in other modes by setting Spark conf. Maybe a TableValuedFunction is a better way.

yihua

Closing this PR now since a better approach should be used. @dongkelun Please reopen if needed with revisions based on the suggestion. Thank you.

xiarixiaoyao approved these changes Nov 30, 2022

xiarixiaoyao self-requested a review November 30, 2022 09:49

xiarixiaoyao reviewed Nov 30, 2022

[HUDI-5301] Spark SQL queries support setting parameters through set

063ead2

dongkelun force-pushed the HUDI-5301 branch from 90cec43 to 063ead2 Compare November 30, 2022 12:33

leesf reviewed Nov 30, 2022

Add test case for snapshot and read_optimized query

adef447

dongkelun force-pushed the HUDI-5301 branch from 467e83c to adef447 Compare December 1, 2022 15:24

Merge branch 'master' of https://github.com/apache/hudi into HUDI-5301

d63ed06

dongkelun force-pushed the HUDI-5301 branch 2 times, most recently from 48d8e1e to 47c76c2 Compare December 3, 2022 02:33

Fix the failure problem caused by parameter conflict between different test cases

8c2a468

dongkelun force-pushed the HUDI-5301 branch from 47c76c2 to 8c2a468 Compare December 3, 2022 08:56

Merge branch 'master' of https://github.com/apache/hudi into HUDI-5301

e033aaf

nsivabalan added priority:blocker Production down; release blocker release-0.12.2 Patches targetted for 0.12.2 labels Dec 5, 2022

codope added priority:critical Production degraded; pipelines stalled area:sql SQL interfaces and removed priority:blocker Production down; release blocker release-0.12.2 Patches targetted for 0.12.2 labels Dec 7, 2022

vinothchandar added the release-1.0.0 label Aug 16, 2023

github-actions bot added the size:M PR with lines of changes in (100, 300] label Feb 26, 2024

yihua reviewed Sep 16, 2024

yihua closed this Sep 16, 2024

hudi-bot mentioned this pull request Dec 9, 2025

Spark SQL queries support setting parameters through set #15599

Open


9 participants
