Skip to content

Conversation

@wangyum
Copy link
Member

@wangyum wangyum commented Oct 22, 2021

What changes were proposed in this pull request?

This PR ported HIVE-21498, HIVE-25098 and upgraded libthrift to 0.16.0.

The CHANGES list for libthrift 0.16.0 is available at: https://github.com/apache/thrift/blob/v0.16.0/CHANGES.md

Why are the changes needed?

To address CVE-2020-13949.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Existing test.

@wangyum
Copy link
Member Author

wangyum commented Oct 22, 2021

cc @HyukjinKwon @juliuszsompolski We can upgrade to 0.13.0 first. If 0.16.0 is released, we can upgrade to 0.16.0 because we need this patch.

@HyukjinKwon
Copy link
Member

cc
@srowen and @dongjoon-hyun too FYI

@SparkQA
Copy link

SparkQA commented Oct 22, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48993/

@SparkQA
Copy link

SparkQA commented Oct 22, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48993/

@SparkQA
Copy link

SparkQA commented Oct 22, 2021

Test build #144522 has finished for PR 34362 at commit 8d9e375.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor but legitimate thing we should do: update LICENSE-binary to add a line about the licensing of this, which is dual licensed as CDDL 1.1 and GPL + classpath. Just add one line around...

Common Development and Distribution License (CDDL) 1.1
------------------------------------------------------

javax.el:javax.el-api	https://javaee.github.io/uel-ri/
javax.servlet.jsp:jsp-api

I suppose we need a copy of the license text in licenses-binary too, ideally
https://github.com/javaee/javax.annotation/blob/master/LICENSE

Copy link
Member Author

@wangyum wangyum Oct 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed javax.annotation-api-1.3.2.jar. It seems we do not need it:

LM-SHC-16508156:thrift yumwang$ grep -ER "javax.annotation" .
./lib/java/gradle/environment.gradle:ext.javaxAnnotationVersion = property('javax.annotation.version')
./lib/java/gradle/environment.gradle:    compile "javax.annotation:javax.annotation-api:${javaxAnnotationVersion}"
Binary file ./lib/java/.gradle/5.6.4/executionHistory/executionHistory.bin matches
Binary file ./lib/java/.gradle/5.6.4/javaCompile/classAnalysis.bin matches
Binary file ./lib/java/.gradle/5.6.4/javaCompile/taskHistory.bin matches
./lib/java/gradle.properties:javax.annotation.version=1.3.2
./lib/java/build/poms/pom-default.xml:      <groupId>javax.annotation</groupId>
./lib/java/build/poms/pom-default.xml:      <artifactId>javax.annotation-api</artifactId>
Binary file ./lib/java/build/deps/javax.annotation-api-1.3.2.jar matches
./lib/java/build/tmp/javadoc/javadoc.options:-classpath '/Users/yumwang/opensource/thrift/lib/java/build/classes/java/main:/Users/yumwang/opensource/thrift/lib/java/build/resources/main:/Users/yumwang/.gradle/caches/modules-2/files-2.1/org.slf4j/slf4j-api/1.7.28/2cd9b264f76e3d087ee21bfc99305928e1bdb443/slf4j-api-1.7.28.jar:/Users/yumwang/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpclient/4.5.10/7ca2e4276f4ef95e4db725a8cd4a1d1e7585b9e5/httpclient-4.5.10.jar:/Users/yumwang/.gradle/caches/modules-2/files-2.1/org.apache.httpcomponents/httpcore/4.4.12/21ebaf6d532bc350ba95bd81938fa5f0e511c132/httpcore-4.4.12.jar:/Users/yumwang/.gradle/caches/modules-2/files-2.1/javax.servlet/javax.servlet-api/4.0.1/a27082684a2ff0bf397666c3943496c44541d1ca/javax.servlet-api-4.0.1.jar:/Users/yumwang/.gradle/caches/modules-2/files-2.1/javax.annotation/javax.annotation-api/1.3.2/934c04d3cfef185a8008e7bf34331b79730a9d43/javax.annotation-api-1.3.2.jar:/Users/yumwang/.gradle/caches/modules-2/files-2.1/commons-logging/commons-logging/1.2/4bfc12adfe4842bf07b657f0369c4cb522955686/commons-logging-1.2.jar:/Users/yumwang/.gradle/caches/modules-2/files-2.1/commons-codec/commons-codec/1.11/3acb4705652e16236558f0f4f2192cc33c3bd189/commons-codec-1.11.jar'
./compiler/cpp/src/thrift/generate/t_java_generator.cc:  indent(out) << "@javax.annotation.Generated(value = \"" << autogen_summary() << "\"";

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I'm sure we don't use it directly, but that doesn't mean thrift doesn't need it.
That said, it's just a set of annotations, which are usually optional.
We exclude this elsewhere, so I think this is OK.

@SparkQA
Copy link

SparkQA commented Oct 22, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49000/

@SparkQA
Copy link

SparkQA commented Oct 22, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/49000/

@SparkQA
Copy link

SparkQA commented Oct 22, 2021

Test build #144529 has finished for PR 34362 at commit affe2f6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yaooqinn
Copy link
Member

Interesting, Thrift 0.13.0 does not work with JDK8 per https://issues.apache.org/jira/browse/THRIFT-5274

@srowen
Copy link
Member

srowen commented Oct 22, 2021

Hm, looks like our Java 8 tests pass though. It might not affect Spark, but that makes me uneasy.
I suppose we can wait for 0.16.0, but not sure how many months away that is.
I wouldn't oppose merging this to move forward a bit

@dongjoon-hyun
Copy link
Member

Thank you for pinging me, @HyukjinKwon .
cc @sunchao

@dongjoon-hyun dongjoon-hyun marked this pull request as draft November 4, 2021 06:34
@dongjoon-hyun
Copy link
Member

Let's wait for a while since this is not for Apache Spark 3.2.1.
To prevent accidental merging, I converted this PR to the draft, @wangyum .

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems OK if tests all pass this time

Copy link
Member

@srowen srowen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@wangyum if you're OK with this, you can mark it as ready to merge. Looks OK to me

@dongjoon-hyun dongjoon-hyun marked this pull request as ready for review January 30, 2022 04:01
@dongjoon-hyun
Copy link
Member

I converted it back to Ready for Review because I did it before before Apache Spark 3.2.1 release.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last commit failed with the following. Could you re-trigger the tests, @wangyum ?

Error:  Failed to execute goal on project spark-hive_2.12: Could not resolve dependencies for project org.apache.spark:spark-hive_2.12:jar:spark-975225:
Could not find artifact org.apache.thrift:libthrift:jar:0.16.0 in gcs-maven-central-mirror (https://maven-central.storage-download.googleapis.com/maven2/) -> [Help 1]

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pending CIs. And, please make the PR description up-to-date.

This pr backport HIVE-21498 to upgrade libthrift to 0.13.0.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-37090][BUILD] Upgrade libthrift to avoid security vulnerabilities [SPARK-37090][BUILD] Upgrade libthrift to 0.16.0 to avoid security vulnerabilities Jan 30, 2022
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-37090][BUILD] Upgrade libthrift to 0.16.0 to avoid security vulnerabilities [SPARK-37090][BUILD] Upgrade libthrift to 0.16.0 to avoid security vulnerabilities Jan 30, 2022
@dongjoon-hyun
Copy link
Member

dongjoon-hyun commented Feb 6, 2022

I updated the description manually; I think this is good to go?

We have no evidence that this PR passes the CIs, don't we, @srowen ? As I mentioned here (#34362 (review)), the latest CI failed.

@srowen
Copy link
Member

srowen commented Feb 6, 2022

Ah I see the tests passing but did not all applicable tests run? Is it because the tests need to be enabled to run in the submitter's acct?

@dongjoon-hyun
Copy link
Member

  • No, no tests ran. It was a build error due to Could not find artifact org.apache.thrift:libthrift:jar:0.16.0 as I wrote in the previous comment.
  • When you cannot check the result in Apache Spark repo, you need to visit their branch because sometimes GitHub Action notification doesn't work properly. In this PR, the following is his branch. And, you can see the actual GitHub Action result at the commit.

@wangyum
Copy link
Member Author

wangyum commented Feb 14, 2022

Sorry. I was on Chinese new year holiday last 2 weeks. I used to test Thrift 0.16-rc0, Thrift 0.16 has not been officially released.

@wangyum wangyum marked this pull request as draft February 14, 2022 02:38
@dongjoon-hyun
Copy link
Member

No problem. Please let us know when the PR is ready, @wangyum .

@wangyum wangyum marked this pull request as ready for review February 18, 2022 06:21
14:53:54.715 WARN org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "HiveServer2-Handler-Pool: Thread-164" java.lang.NoClassDefFoundError: org/apache/thrift/transport/TFramedTransport
  | => hat java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:348)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getClass(MetaStoreUtils.java:1708)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:131)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3607)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3659)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3639)
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllFunctions(Hive.java:3901)
        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:248)
        at org.apache.hadoop.hive.ql.metadata.Hive.registerAllFunctionsOnce(Hive.java:231)
        at org.apache.hadoop.hive.ql.metadata.Hive.<init>(Hive.java:395)
        at org.apache.hadoop.hive.ql.metadata.Hive.create(Hive.java:339)
        at org.apache.hadoop.hive.ql.metadata.Hive.getInternal(Hive.java:319)
        at org.apache.hadoop.hive.ql.metadata.Hive.get(Hive.java:288)
Comment on lines +25 to +30
* This is based on libthrift-0.12.0 {@link org.apache.thrift.transport.TFramedTransport}.
* To fix class of org.apache.thrift.transport.TFramedTransport not found after upgrading libthrift.
*
* TFramedTransport is a buffered TTransport that ensures a fully read message
* every time by preceding messages with a 4-byte frame size.
*/
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srowen
Copy link
Member

srowen commented Feb 18, 2022

Looks like librthrift 0.16.0 was released and this passes, so should be good to go. Thoughts about backporting this to 3.2 and 3.1? (3.0, I presume, is EOL now). I guess I'm inclined to unless there is a non-trivial risk of breaking something, like Hadoop 2 compatibility

@srowen
Copy link
Member

srowen commented Feb 20, 2022

One last check from @dongjoon-hyun maybe; this and backports looks OK

@dongjoon-hyun
Copy link
Member

Thank you for asking, @srowen . Sounds reasonable. No objection for backporting from my side.

@srowen srowen closed this in 4789e1f Feb 20, 2022
@srowen
Copy link
Member

srowen commented Feb 20, 2022

Merged to master. I'll look at the backports next

@srowen
Copy link
Member

srowen commented Feb 20, 2022

Oh @wangyum do you have pull requests for the backports? the content looks OK, just bears testing

yaooqinn pushed a commit to apache/kyuubi that referenced this pull request Feb 23, 2022
### _Why are the changes needed?_

Upgrade libthrift to 0.16.0 due to [CVE-2020-13949](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13949) and the coming upstream change of Spark apache/spark#34362

### _What changed in this PR?_

- Upgrade libthrift to 0.16.0

- Shade and relocate `thrift` and `hive-service-rpc` classes in `kyuubi-spark-engine`, it's necessary to avoide conflicting with old thrift libs bundled in Spark binary releases.

- Due to thrift change the method signature, the subclasses those interfaces in Kyuubi also need to modify to pass compile.
We rely on Hive 2.3.9 jars in some components, e.g. `kyuubi-hive-jdbc`, `LocalMetaServer` in `kyuubi-server` test classes.

Some classes in Hive jars compiled against old thrift interfaces which are not compatible with thrift 0.16.0, it causes runtime link error, we found the following classes which breaks the test and copied them with necessary modification to make it work with thrift 0.16.0.

    - `TFramedTransport`
    - `TFilterTransport`
    - `TUGIAssumingTransport`
    - `TUGIContainingTransport`

- Next Steps, I think it's worth to do in separated PRs.

    - Recover the `HiveDelegationTokenProviderSuite`, one approach is use an isolate classloader to load HMS classes and thrift 0.9.3 classes from Maven, this approach can also be used for the planed Zoopkeeper upgrading to help us verficating the compatibility of Zookeeper Server 3.4.x.
    - Rewrite `kyuubi-hive-jdbc` to make it decouple with Hive jars, because there maybe other places which may not work with thrift 0.16.0 but the UTs does not cover.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [x] [Run test](https://kyuubi.apache.org/docs/latest/develop_tools/testing.html#running-tests) locally before make a pull request

Closes #1953 from SteNicholas/KYUUBI-1948.

Closes #1948

de5d1ea [SteNicholas] [KYUUBI #1948] Upgrade thrift version to 0.16.0
898effc [SteNicholas] [KYUUBI #1948] Upgrade thrift version to 0.16.0
803e270 [SteNicholas] [KYUUBI #1948] Upgrade thrift version to 0.16.0

Authored-by: SteNicholas <[email protected]>
Signed-off-by: Kent Yao <[email protected]>
srowen pushed a commit that referenced this pull request Feb 26, 2022
…ty vulnerabilities

This is a backport of #34362 to branch 3.2.

### What changes were proposed in this pull request?

This PR ported HIVE-21498, HIVE-25098 and upgraded libthrift to 0.16.0.

The CHANGES list for libthrift 0.16.0 is available at: https://github.com/apache/thrift/blob/v0.16.0/CHANGES.md

### Why are the changes needed?

To address [CVE-2020-13949](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13949).

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing test.

Closes #35646 from wangyum/SPARK-37090-branch-3.2.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
srowen pushed a commit that referenced this pull request Feb 26, 2022
…ty vulnerabilities

This is a backport of #34362 to branch 3.1.

### What changes were proposed in this pull request?

This PR ported HIVE-21498, HIVE-25098 and upgraded libthrift to 0.16.0.

The CHANGES list for libthrift 0.16.0 is available at: https://github.com/apache/thrift/blob/v0.16.0/CHANGES.md

### Why are the changes needed?

To address [CVE-2020-13949](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13949).

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Existing test.

Closes #35647 from wangyum/SPARK-37090-branch-3.1.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
@dongjoon-hyun
Copy link
Member

Hi, All. This is reverted from all branches due to the regression.
Please see #35646

kazuyukitanimura pushed a commit to kazuyukitanimura/spark that referenced this pull request Aug 10, 2022
…ty vulnerabilities

This is a backport of apache#34362 to branch 3.2.

This PR ported HIVE-21498, HIVE-25098 and upgraded libthrift to 0.16.0.

The CHANGES list for libthrift 0.16.0 is available at: https://github.com/apache/thrift/blob/v0.16.0/CHANGES.md

To address [CVE-2020-13949](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13949).

No.

Existing test.

Closes apache#35646 from wangyum/SPARK-37090-branch-3.2.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Sean Owen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants