Docs: Add section to include instructions for Hive on Tez #3944
Since I am not a native English speaker, I would like @rdblue or @massdosage or someone to review the change.
site/docs/hive.md
| #### Hive on Tez configuration
| To use Tez engine on Hive, Tez needs to be upgraded to >= `0.10.1` to contain the fix [Tez-4248](https://issues.apache.org/jira/browse/TEZ-4248).
| To use Tez engine on Hive, Tez needs to be upgraded to >= `0.10.1` to contain the fix [Tez-4248](https://issues.apache.org/jira/browse/TEZ-4248).
| To use the Tez engine on Hive, Tez needs to be upgraded to >= `0.10.1` which contains a necessary fix [Tez-4248](https://issues.apache.org/jira/browse/TEZ-4248).
Is the above for Hive >=3? If so, probably worth mentioning it. Does this also mean you need to override the Tez jar files that come with the standard Hive installation?
According to Tez-4100, the Tez 0.10.1 release should work with Hadoop 3.1.3. For a standard Hive installation (where the default Tez version should be 0.9.2), the Tez jar file needs to be patched. I'll add a comment for that.
site/docs/hive.md
| To use Tez engine on Hive, Tez needs to be upgraded to >= `0.10.1` to contain the fix [Tez-4248](https://issues.apache.org/jira/browse/TEZ-4248).
| !!! Warning
| For Hive `2.3.x`, need to manual build from Tez `branch-0.9` for compatibility issue in Tez `0.10.1`.
| For Hive `2.3.x`, need to manual build from Tez `branch-0.9` for compatibility issue in Tez `0.10.1`.
| For Hive `2.3.x`, you will need to manually build Tez from the `branch-0.9` branch due to a backwards incompatibility issue with Tez `0.10.1`.
So if I understand correctly, Hive 2.x only works with Tez 0.9.x and not 0.10.x? And therefore one needs to compile and build a specific version of Tez oneself (from that branch?) and then override the version that comes with Hive 2.3.x?
Correct.
site/docs/hive.md
| !!! Warning
| For Hive `2.3.x`, need to manual build from Tez `branch-0.9` for compatibility issue in Tez `0.10.1`.
| And also set the hive config `tez.mrreader.config.update.properties=hive.io.file.readcolumn.names,hive.io.file.readcolumn.ids`.
| And also set the hive config `tez.mrreader.config.update.properties=hive.io.file.readcolumn.names,hive.io.file.readcolumn.ids`.
| You will also need to set the following property in the Hive configuration: `tez.mrreader.config.update.properties=hive.io.file.readcolumn.names,hive.io.file.readcolumn.ids`.
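For illustration, the property named in the suggestion above could be set roughly as follows. This is only a minimal sketch: the PR text says "in the Hive configuration" without naming a file, so placing it in `hive-site.xml` (using the standard Hadoop-style `<property>` layout) is an assumption, and a session-level setting may work equally well.

```xml
<!-- Sketch of a hive-site.xml fragment; putting the property in this file is an assumption. -->
<configuration>
  <property>
    <!-- Property name and value are taken verbatim from the PR text. -->
    <name>tez.mrreader.config.update.properties</name>
    <value>hive.io.file.readcolumn.names,hive.io.file.readcolumn.ids</value>
  </property>
</configuration>
```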
Hello @massdosage, thanks for the suggestion. I have changed the wording as suggested. Please take a look.
site/docs/hive.md
| #### Hive on Tez configuration
| To use the Tez engine on Hive(requires version >= `3.1.3`), Tez needs to be upgraded to >= `0.10.1` which contains a necessary fix [Tez-4248](https://issues.apache.org/jira/browse/TEZ-4248).
There is no 3.1.3 release ATM. Did you mean 3.1.2?
Yes. I just changed it to 3.1.2.
pvary left a comment
LGTM, one final check anyone?
site/docs/hive.md
| #### Hive on Tez configuration
| To use the Tez engine on Hive(requires version >= `3.1.2`), Tez needs to be upgraded to >= `0.10.1` which contains a necessary fix [Tez-4248](https://issues.apache.org/jira/browse/TEZ-4248).
The wording is a tiny bit confusing to me. (requires version >= 3.1.2) would indicate to me that Tez can only be used with Hive 3. Shall we reword it like this?
To use the Tez engine on Hive 3.1.2 or later, ...
followed by:
To use the Tez engine on Hive 2.3.x, ...
Yep. That would be much more clear.
Updated the PR. Please take another look.
Looks good now, thanks
Just wondering when this change will be reflected in the website documentation. :)
I am not sure TBH. The last discussion on the dev list about the docs site was this: https://lists.apache.org/thread/wq0qzbcwkjc13zp3j8mtkf8op1n5fn62 I am not sure what the status is, but if you have some time, it would be good to check it out.
Sorry, I only had a chance to look at it today but I see it's already been approved. LGTM too!
Thanks for your time @0xffmeta, @massdosage and @marton-bod!
* apache/iceberg#3723
* apache/iceberg#3732
* apache/iceberg#3749
* apache/iceberg#3766
* apache/iceberg#3787
* apache/iceberg#3796
* apache/iceberg#3809
* apache/iceberg#3820
* apache/iceberg#3878
* apache/iceberg#3890
* apache/iceberg#3892
* apache/iceberg#3944
* apache/iceberg#3976
* apache/iceberg#3993
* apache/iceberg#3996
* apache/iceberg#4008
* apache/iceberg#3758 and 3856
* apache/iceberg#3761
* apache/iceberg#2062
* apache/iceberg#3422
* remove restriction related to legacy parquet file list
Add instructions for hive on tez.