[SPARK-14720][SPARK-13643] Move Hive-specific methods into HiveSessionState and Create a SparkSession class #12522
Closed
This requires changing every downstream call site that takes a HiveContext so that it takes (SQLContext, HiveSessionState) instead.
Now both shared state and session state are tracked in SparkSession, and we use reflection to instantiate them. After this commit, SQLContext and HiveContext are just wrappers for SparkSession.
Previously we still tried to load HiveContext even if the user explicitly specified an "in-memory" catalog implementation. Now it will load a SQLContext in this case.
It was failing because we were passing a subclass of SparkContext into SparkSession, and the reflection was using the wrong class to look up the constructor. This is now fixed with ClassTags.
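To illustrate this kind of reflection mismatch, here is a minimal sketch, not the code from this PR. `Context`, `TestContext`, `SessionState` and both helper methods are hypothetical stand-ins, and the sketch assumes the failing lookup used the argument's runtime class, which the commit message does not spell out.

```scala
import scala.reflect.ClassTag

// Hypothetical stand-ins for SparkContext, a test subclass of it,
// and a session state class whose constructor takes the base type.
class Context(val name: String)
class TestContext(n: String) extends Context(n)
class SessionState(val ctx: Context)

object ClassTagReflectionSketch {
  // Looks up the constructor using the argument's runtime class.
  // getConstructor matches parameter types exactly, so a TestContext
  // argument does not find SessionState(Context) and the call throws
  // NoSuchMethodException.
  def instantiateByRuntimeClass(className: String, arg: AnyRef): AnyRef = {
    val clazz = Class.forName(className)
    clazz.getConstructor(arg.getClass).newInstance(arg).asInstanceOf[AnyRef]
  }

  // Looks up the constructor using the static type captured by a ClassTag,
  // so the lookup no longer depends on which subclass is passed at runtime.
  def instantiate[Arg <: AnyRef : ClassTag](className: String, arg: Arg): AnyRef = {
    val clazz = Class.forName(className)
    val argClass = implicitly[ClassTag[Arg]].runtimeClass
    clazz.getConstructor(argClass).newInstance(arg).asInstanceOf[AnyRef]
  }
}
```

With this sketch, `ClassTagReflectionSketch.instantiate[Context](classOf[SessionState].getName, new TestContext("t"))` succeeds. The type argument has to be given explicitly (or the value ascribed as `Context`); if `Arg` is inferred as `TestContext`, the lookup fails the same way as the runtime-class version.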
Avoid some unnecessary casts.
The problem was that we weren't using the right QueryExecution when we called TestHive.sessionState.executePlan. We were using HiveQueryExecution instead of the custom one that we created in TestHiveContext. This turned out to be very difficult to fix due to the tight coupling of QueryExecution within TestHiveContext. I had to refactor this code significantly to extract the nested logic one by one.
The problem was that we were getting everything from executionHive's hiveconf and setting that in metadataHive, overriding the value of `hive.metastore.warehouse.dir`, which we customize in TestHive. This resulted in a bunch of "Table src does not exist" errors from Hive.
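The override can be pictured with plain maps. This is only a sketch: `copyAll`, `copyPreserving`, and any sample paths are hypothetical, and preserving selected keys is one possible guard, not necessarily the fix applied in this PR.

```scala
object ConfCopySketch {
  val WarehouseKey = "hive.metastore.warehouse.dir"

  // Copying every source entry into the target clobbers any key
  // that the target had customized, including the warehouse dir.
  def copyAll(
      source: Map[String, String],
      target: Map[String, String]): Map[String, String] =
    target ++ source

  // One possible guard (an assumption, not the PR's fix):
  // skip the keys that the target wants to keep.
  def copyPreserving(
      source: Map[String, String],
      target: Map[String, String],
      keep: Set[String]): Map[String, String] =
    target ++ source.filterNot { case (key, _) => keep.contains(key) }
}
```

Calling `copyPreserving` with `Set(ConfCopySketch.WarehouseKey)` leaves the target's customized warehouse directory in place while still copying the remaining entries.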
Test build #56340 has finished for PR 12522 at commit
Test build #2835 has finished for PR 12522 at commit
Test build #2836 has finished for PR 12522 at commit
…use SQLContext. So, let's change the catalog conf's default value to in-memory. In the constructor of HiveContext, we will set this conf to hive.
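As a rough sketch of the default-plus-override behavior described here (not the PR's code): the key name `spark.sql.catalogImplementation` and the helper names below are assumptions for illustration, since the comment only refers to "the catalog conf".

```scala
object CatalogConfSketch {
  // Assumed key name; the comment only calls it "the catalog conf".
  val CatalogImplKey = "spark.sql.catalogImplementation"
  val DefaultCatalogImpl = "in-memory"

  // A plain SQLContext that sets nothing falls back to the in-memory default.
  def catalogImpl(conf: Map[String, String]): String =
    conf.getOrElse(CatalogImplKey, DefaultCatalogImpl)

  // A HiveContext-like constructor would force the value to "hive",
  // regardless of what the incoming conf said.
  def hiveContextConf(conf: Map[String, String]): Map[String, String] =
    conf + (CatalogImplKey -> "hive")
}
```

Under these assumptions, `catalogImpl(Map.empty)` yields the in-memory default, while `catalogImpl(hiveContextConf(Map.empty))` yields `hive`.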
Test build #56376 has finished for PR 12522 at commit
test this please
Test build #56381 has finished for PR 12522 at commit
Merging this in master.
What changes were proposed in this pull request?
This PR has two main changes.
How was this patch tested?
Existing tests
This PR is trying to fix test failures of #12485.