Skip to content

Conversation

@chenghao-intel
Copy link
Contributor

What changes were proposed in this pull request?

We should respect the --hiveconf in the spark-sql command line, otherwise, the existing applications based on the spark 1.6 and earlier will broke, as the configurations specified via --hiveconf are missing

How was this patch tested?

I've added the unit test, but still need to be verified with real application.

@SparkQA
Copy link

SparkQA commented Jun 7, 2016

Test build #60127 has finished for PR 13542 at commit bbf0ac4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jun 8, 2016

Test build #60144 has finished for PR 13542 at commit 1fc8a30.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@chenghao-intel
Copy link
Contributor Author

cc @yhuai

@chenghao-intel
Copy link
Contributor Author

Currently, the SparkSQL cli will ignore the configuration passed from commandline via --hiveconf, this will break lots of existing application, it's not by design, isn't it? @yhuai @rxin

We are still verifying if this PR by running the TPCx-BB, will remove the WIP once it passed the test.

| --hiveconf ${ConfVars.METASTOREWAREHOUSE}=$warehousePath
| --hiveconf ${ConfVars.SCRATCHDIR}=$scratchDirPath
| --hiveconf conf1=conftest
| --hiveconf conf2=1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chenghao-intel Does --hiveconf ${ConfVars.METASTORECONNECTURLKEY}=$jdbcUrl work?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it works, that's intention, right?

But seems the below code in SparkSQLCliDriver will not work as we expected.

      if (key != "javax.jdo.option.ConnectionURL") {
        conf.set(key, value)
        sessionState.getOverriddenConfigurations.put(key, value)
      }

Why do we have to ignore the connection url?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yhuai any concern for this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you know why those work but something like --hiveconf conf1=conftest does not?

@jameszhouyi
Copy link

Hi @chenghao-intel
This PR is tested and it's OK for my case.

@chenghao-intel chenghao-intel changed the title [SPARK-15730][SQL][WIP] Respect the --hiveconf in the spark-sql command line [SPARK-15730][SQL] Respect the --hiveconf in the spark-sql command line Jun 15, 2016
@chenghao-intel
Copy link
Contributor Author

Thanks @jameszhouyi , I've removed the WIP from the title.

@jameszhouyi
Copy link

Hi Spark community ,
Could you please help to review this PR to merge in 2.0.0 ? Since this bug has broken our real cases. Thanks in advance !

@yhuai
Copy link
Contributor

yhuai commented Jun 30, 2016

Can you provide more information on the root cause? Seems it is not clear why it does not work.

@yhuai
Copy link
Contributor

yhuai commented Jul 1, 2016

also, can you try --conf and see if it works?

@chenghao-intel
Copy link
Contributor Author

@yhuai I couldn't find any piece of code to copy the HiveConf(from SessionState) to SqlConf? Can you confirm this?

Probably that's the reason why --hiveconf doesn't work.

@jameszhouyi
Copy link

Hi,
Could you please help to review this PR to merge it in 2.0.0 ? this broken thing blocked our testing. Thanks!

@yhuai
Copy link
Contributor

yhuai commented Jul 5, 2016

I have opened https://github.com/apache/spark/pull/14058/files (it has one update).

asfgit pushed a commit that referenced this pull request Jul 5, 2016
## What changes were proposed in this pull request?
This PR makes spark-sql (backed by SparkSQLCLIDriver) respects confs set by hiveconf, which is what we do in previous versions. The change is that when we start SparkSQLCLIDriver, we explicitly set confs set through --hiveconf to SQLContext's conf (basically treating those confs as a SparkSQL conf).

## How was this patch tested?
A new test in CliSuite.

Closes #13542

Author: Cheng Hao <[email protected]>
Author: Yin Huai <[email protected]>

Closes #14058 from yhuai/hiveConfThriftServer.

(cherry picked from commit 920cb5f)
Signed-off-by: Reynold Xin <[email protected]>
@asfgit asfgit closed this in 920cb5f Jul 5, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants