[SPARK-20202][BUILD][SQL] Remove references to org.spark-project.hive (Hive 1.2.1)#29936
[SPARK-20202][BUILD][SQL] Remove references to org.spark-project.hive (Hive 1.2.1)#29936dongjoon-hyun wants to merge 2 commits intoapache:masterfrom dongjoon-hyun:SPARK-REMOVE-HIVE1
Conversation
|
Test build #129353 has started for PR 29936 at commit |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Retest this please |
|
Test build #129363 has finished for PR 29936 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #129362 has finished for PR 29936 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #129366 has finished for PR 29936 at commit
|
|
Retest this please |
|
Retest this please. |
|
Test build #129383 has finished for PR 29936 at commit
|
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #129382 has finished for PR 29936 at commit
|
srowen
left a comment
There was a problem hiding this comment.
I'm OK with it if Hive 1.2 is gone.
|
Thank you, @srowen . |
|
@wangyum can you review this? I think it's important to get your review here. |
| @@ -296,8 +296,7 @@ private[hive] class HiveClientImpl( | |||
| case e: NoClassDefFoundError | |||
| if HiveUtils.isHive23 && e.getMessage.contains("org/apache/hadoop/hive/serde2/SerDe") => | |||
There was a problem hiding this comment.
@dongjoon-hyun should we remove HiveUtils.isHive23 and related changes? I see it was added when we add Hive 2.3 support at 33f3c48#diff-842e3447fc453de26c706db1cac8f2c4R59. cc @wangyum FYI
There was a problem hiding this comment.
+1, we can remove HiveUtils.isHive23 and related changes.
| @@ -28,14 +28,11 @@ import org.apache.spark.sql.hive.test.TestHive | |||
| * {{{ | |||
There was a problem hiding this comment.
I think we can now remove this comment d7755cf#diff-10e604a9a9d9c4bcc9cdc01049851095R170
and enable this test d7755cf#diff-10e604a9a9d9c4bcc9cdc01049851095R201. Feel free to ignore back if it fails for whatever reason.
There was a problem hiding this comment.
Looks we can just remove this comment too:
|
Thank you so much for review, @HyukjinKwon and @wangyum . It's difficult to remove all conditional logics / comments / website in this PR. We have more instances in the test cases, too. I want to clean up them in another PR or a follow-up carefully. Is it okay for you guys? This PR focuses on removing |
dbtsai
left a comment
There was a problem hiding this comment.
LGTM. +1 on cleaning up the rest of Hive1.2 code in followup PR as it's everywhere, and takes time to carefully remove them.
| <id>hive-1.2</id> | ||
| <properties> | ||
| <hive.group>org.spark-project.hive</hive.group> | ||
| <hive.classifier></hive.classifier> | ||
| <!-- Version used in Maven Hive dependency --> | ||
| <hive.version>1.2.1.spark2</hive.version> | ||
| <!-- Version used for internal directory structure --> | ||
| <hive.version.short>1.2</hive.version.short> | ||
| <hive.parquet.scope>${hive.deps.scope}</hive.parquet.scope> | ||
| <hive.storage.version>2.6.0</hive.storage.version> | ||
| <hive.storage.scope>provided</hive.storage.scope> | ||
| <hive.common.scope>provided</hive.common.scope> | ||
| <hive.llap.scope>provided</hive.llap.scope> | ||
| <hive.serde.scope>provided</hive.serde.scope> | ||
| <hive.shims.scope>provided</hive.shims.scope> | ||
| <orc.classifier>nohive</orc.classifier> | ||
| <datanucleus-core.version>3.2.10</datanucleus-core.version> | ||
| </properties> | ||
| <!-- Exists only for backward compatibility. No-op. --> |
There was a problem hiding this comment.
So this profile is still useful after we remove all the hive-1.2 stuffs?
There was a problem hiding this comment.
No, it's just that you get an error if you specify -Phive-1.2 if it's removed entirely. Now, maybe that's a good thing? previously for Scala profiles I had left them in because, for example -Pscala-2.12 was not the default before. When it became the default, that would have caused an error even though it was already set up for 2.12. But this case is probably different: someone selecting Hive 1.x support should see an error, probably? in which case this should be removed.
|
Looks OK and the tests/Github Actions were passed. It is okay to clean up other Hive 1.2 related code in next PRs. |
|
Thank you all. Merged to master for Apache Spark 3.1.0. |
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Yeah, followup is fine. LGTM. |
|
Test build #129424 has finished for PR 29936 at commit
|
What changes were proposed in this pull request?
As of today,
masterbranch) removed Hive 1.2 related artifacts from Apache Spark binary distributions.This PR(SPARK-20202) aims to remove the following usage of unofficial Apache Hive fork completely from Apache Spark master for Apache Spark 3.1.0.
For the forked Hive 1.2.1.spark2 users, Apache Spark 2.4(LTS) and 3.0 (~ 2021.12) will provide it.
Why are the changes needed?
1.2.1.spark2exposed many unfixable bugs in Apache because the forked1.2.1.spark2is not maintained at all. Apache Hive 2.3.0 was released at 2017-07-19 and it has been used with less number of bugs compared with1.2.1.spark2. Many bugs still exist inhive-1.2profile and new Apache Spark unit tests are added withHiveUtils.isHive23condition so far.Does this PR introduce any user-facing change?
No. This is a dev-only change. PRBuilder will not accept
[test-hive1.2]on master andbranch-3.1.How was this patch tested?