Skip to content

Conversation

@yhuai
Copy link
Contributor

@yhuai yhuai commented Feb 11, 2020

What changes were proposed in this pull request?

This PR sets orc's classifier to nohive, which has shaded hive-storage-api.

Why are the changes needed?

Right now, Hive 2.3 profile pulls in regular orc, which depends on hive-storage-api. However, hive-storage-api and hive-common have the following common class files

org/apache/hadoop/hive/common/ValidReadTxnList.class
org/apache/hadoop/hive/common/ValidTxnList.class
org/apache/hadoop/hive/common/ValidTxnList$RangeResponse.class

For example, https://github.com/apache/hive/blob/rel/storage-release-2.6.0/storage-api/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java (pulled in by orc 1.5.8) and https://github.com/apache/hive/blob/rel/release-2.3.6/common/src/java/org/apache/hadoop/hive/common/ValidReadTxnList.java (from hive-common 2.3.6) both are in the classpath and they are different. Having both versions in the classpath can cause unexpected behavior due to classloading order. We should still use orc-nohive, which has hive-storage-api shaded.

Does this PR introduce any user-facing change?

How was this patch tested?

@SparkQA
Copy link

SparkQA commented Feb 11, 2020

Test build #118203 has finished for PR 27536 at commit 35d9a3f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Feb 11, 2020

Test build #118207 has finished for PR 27536 at commit 26e8bcf.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@HyukjinKwon
Copy link
Member

cc @wangyum

@dongjoon-hyun
Copy link
Member

Hi, @yhuai . Thank you for making a PR. Could you fix the UT failures?

@yhuai
Copy link
Contributor Author

yhuai commented Feb 11, 2020

oh hive-storage-api still gets pulled in. Let me check.

@SparkQA
Copy link

SparkQA commented Feb 11, 2020

Test build #118265 has finished for PR 27536 at commit 9e8791e.

  • This patch fails to build.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Copy link
Contributor Author

yhuai commented Feb 12, 2020

hmm. We need to keep hive-storage-api. But I will need to check why we hit the runtime exception. Somehow we used hive-storage-api's VectorizedRowBatch instead of orc's VectorizedRowBatch for orc code path.

@yhuai
Copy link
Contributor Author

yhuai commented Feb 12, 2020

Also, the error cause was

Caused by: sbt.ForkMain$ForkError: java.lang.NoSuchMethodError: org.apache.orc.TypeDescription.createRowBatch(I)Lorg/apache/hadoop/hive/ql/exec/vector/VectorizedRowBatch;
	at org.apache.hadoop.hive.ql.io.orc.WriterImpl.<init>(WriterImpl.java:96)
	at org.apache.hadoop.hive.ql.io.orc.OrcFile.createWriter(OrcFile.java:320)
	at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:103)
	at org.apache.spark.sql.hive.execution.HiveOutputWriter.write(HiveFileFormat.scala:156)
	at org.apache.spark.sql.execution.datasources.SingleDirectoryDataWriter.write(FileFormatDataWriter.scala:140)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:273)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)
	at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:281)
	... 9 more

Seems org.apache.hadoop.hive.ql.io.orc.WriterImpl was hive's orc.

@yhuai
Copy link
Contributor Author

yhuai commented Feb 12, 2020

@SparkQA
Copy link

SparkQA commented Feb 12, 2020

Test build #118279 has finished for PR 27536 at commit a5039ab.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@xuanyuanking
Copy link
Member

retest this please

@SparkQA
Copy link

SparkQA commented Feb 12, 2020

Test build #118307 has finished for PR 27536 at commit a5039ab.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yhuai
Copy link
Contributor Author

yhuai commented Feb 14, 2020

@dongjoon-hyun @wangyum do you happen to know what happened with #27536 (comment)? Seems in hive module, we are sending orc project created VectorizedRowBatch to hive's orc data source instead of the data source file inside orc project.

@dongjoon-hyun
Copy link
Member

Not yet, @yhuai . Let me check that tonight and during weekend. I didn't dig that deeper until now. I'll ping here if I got something.

@yhuai
Copy link
Contributor Author

yhuai commented Feb 15, 2020

thank you @dongjoon-hyun !

@wangyum
Copy link
Member

wangyum commented Feb 15, 2020

Hi @omalley. Is the nohive variant compatible with Hive 2.3? https://issues.apache.org/jira/browse/ORC-174

@wangyum
Copy link
Member

wangyum commented Feb 15, 2020

I personally think it is incompatible, I have tried it many times before.
@yhuai How about replace our hive-thriftserver with another thriftserver that does not depend on Hive. After that, we can easily upgrade the built-in Hive to Hive 3.x to workaround the conflict issue.

@dongjoon-hyun
Copy link
Member

@wangyum . hive-thriftserver is irrelevant to hive module failures, e.g., org.apache.spark.sql.hive.CompressionCodecSuite, isn't it?

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @yhuai . I took a look at the failure and the related code in Spark/Orc/Hive. The current sql/hive module failures are mainly due to the following difference.

  1. Hive 1.2 ~ 2.x use the embedded ORC like org.apache.hive.orc.TypeDescription
  2. Hive 2.3 uses Apache ORC like org.apache.orc.TypeDescription. (Note that the package name is different)
$ javap -cp orc-core-1.5.9.jar org.apache.orc.TypeDescription | grep "createRowBatch(int)"
  public org.apache.hadoop.hive.ql.exec.vector.VectorizedRowBatch createRowBatch(int);
  1. ORC nohive library has the following. (Note that the return type is different)
$ javap -cp orc-core-1.5.9-nohive.jar org.apache.orc.TypeDescription | grep "createRowBatch(int)"
  public org.apache.orc.storage.ql.exec.vector.VectorizedRowBatch createRowBatch(int);

As we know, orc:nohive is for no-hive environment like sql/core. As a result, all ORC tests in sql/core passes always. However, along with that, orc:nohive doesn't aim to support Hive code itself. Hive code needs the original orc library. I believe that this was the original technical difficulty which @wangyum and @gatorsmile tried to fix in #23788 .

@yhuai
Copy link
Contributor Author

yhuai commented Feb 18, 2020

@dongjoon-hyun thank you for looking into it. How did sql/hive work with hive 1.2 and orc nohive? Does sql/hive also use hive's orc when hive 1.2 was used?

@dongjoon-hyun
Copy link
Member

Yes. Right. Hive 1.2 exists before Apache ORC and has all ORC related stuff as a embedded manner like org.apache.hive.orc.TypeDescription.

@yhuai
Copy link
Contributor Author

yhuai commented Feb 20, 2020

@dongjoon-hyun So, hive 2.3 depends on apache orc instead of using orc embedded in hive, which means we will need to pull in regular orc instead of orc-nohive. Is my understanding correct?

@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Feb 21, 2020

The current failure analysis is due to that. Yes.

For pulling regular ORC here, cc @wangyum and @gatorsmile for the reasoning of the original PR.

@omalley
Copy link
Contributor

omalley commented Feb 24, 2020

Sorry for being late to the thread. Yes, now that Spark depends on Hive >= 2.3, we should move away from the nohive variant and share the same ORC release.

@yhuai
Copy link
Contributor Author

yhuai commented Feb 26, 2020

thank you @omalley and @dongjoon-hyun!

btw, are we concerned that hive-common shipped with hive 2.3.6 and hive-storage-api 2.6.0 used by orc 1.5.9 share duplicate classes that have different versions? I am worried that we may not consistently pick up the right version due to class loading order, which can cause confusing runtime exception.

@yhuai
Copy link
Contributor Author

yhuai commented Mar 4, 2020

Closing it as we need to use regular orc

@yhuai yhuai closed this Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants