Conversation

@a2l007
Contributor

@a2l007 a2l007 commented Apr 12, 2023

Upgrades the parquet-mr dependency to version 1.13.0. The release notes for this version can be found here.

This PR has:

  • been self-reviewed.
  • added or updated version, license, or notice information in licenses.yaml
  • been tested in a test Druid cluster.

@abhishekagarwal87
Contributor

@a2l007 - can you look into the CI failures? LGTM once those are addressed.

@a2l007
Contributor Author

a2l007 commented Apr 12, 2023

I missed that #14005 isn't merged yet. Moving this to draft state until that PR is merged, since the latest Parquet version is built on top of Hadoop 3.2.0.

@a2l007 a2l007 marked this pull request as draft April 12, 2023 19:01
@abhishekagarwal87
Contributor

Hmm. Will this upgrade break Hadoop 2? If so, we cannot make the change until we get rid of Hadoop 2 entirely.

@a2l007
Contributor Author

a2l007 commented Apr 26, 2023

@abhishekagarwal87 I have a discussion going on with the Parquet folks here: https://issues.apache.org/jira/browse/PARQUET-2276
They unintentionally broke compatibility with Hadoop 2.8 in their latest version, and from their responses it seems the lowest Hadoop version they will support is 2.9. I'll need to test the latest Parquet dependency with Hadoop 2.9 and evaluate whether a temporary move to Hadoop 2.9 is worth it for Druid before our long-term migration to 3.x.
Another option for us would be to stay put at Parquet version 1.12.0. One drawback is that we wouldn't be able to take full advantage of Parquet Modular Encryption, but I don't know if other Druid users care about that feature at the moment.

@abhishekagarwal87
Contributor

@a2l007 - thanks for sharing the context. One way to make this work is to use a different parquet-mr version for each Hadoop profile, so parquet-mr can be 1.13.0 on Hadoop 3 and 1.12.0 on Hadoop 2. We already have pom profiles for the different Hadoop versions.
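
For illustration, a per-profile version split could look roughly like the pom fragment below. This is only a sketch: the `hadoop3` profile id and the `parquet.version` property name are assumptions made for the example and are not copied from Druid's actual pom.xml, and `parquet-avro` stands in for whichever parquet-mr artifacts the extension depends on.

```xml
<!-- Sketch only: the profile id and property name are assumed for illustration. -->
<properties>
  <!-- Default build (Hadoop 2): stay on the older parquet-mr release. -->
  <parquet.version>1.12.0</parquet.version>
</properties>

<profiles>
  <profile>
    <!-- Hypothetical profile id, selected with -Phadoop3. -->
    <id>hadoop3</id>
    <properties>
      <!-- A profile property overrides the default when the profile is active. -->
      <parquet.version>1.13.0</parquet.version>
    </properties>
  </profile>
</profiles>

<dependencies>
  <dependency>
    <groupId>org.apache.parquet</groupId>
    <artifactId>parquet-avro</artifactId>
    <!-- Resolved from whichever property value is in effect for the build. -->
    <version>${parquet.version}</version>
  </dependency>
</dependencies>
```

With a layout like this, a plain `mvn package` would resolve parquet-mr 1.12.0, while `mvn package -Phadoop3` would resolve 1.13.0.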

@a2l007 a2l007 marked this pull request as ready for review June 5, 2023 00:00
@a2l007
Contributor Author

a2l007 commented Jun 5, 2023

@abhishekagarwal87 Thanks for the review!

@a2l007 a2l007 merged commit 6a4cbab into apache:master Jun 7, 2023
@abhishekagarwal87 abhishekagarwal87 added this to the 27.0 milestone Jul 19, 2023