szlta (Contributor) commented Jun 21, 2021

Now that apache/iceberg#2613 is resolved, we should port it to the Hive codebase to enable vectorized ORC reads on Iceberg-backed tables.

Change-Id: I22a8ea3dfeb1340ffa53a51e76d237638e907bb8
Change-Id: I33841a1c9604159db912da8672c8d79e83fdbd22

pvary (Contributor) commented Jun 23, 2021

Are there any differences between this change and the Iceberg version? If so, could you please help the review by describing them?

Thanks,
Peter

szlta (Contributor, Author) commented Jun 23, 2021

The differences from the Iceberg version are:

  • Some Hive classes / methods (i.e. utilities) that had to be copied over to Iceberg are not part of the change on the Hive side, since they already exist there.
  • A qtest is added here.
  • Setting the in-memory data model is now done in HiveIcebergInputFormat rather than in IcebergStorageHandler, because vectorization might not happen for certain queries, and this is not yet known in storageHandler#configureJobConf.
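
To illustrate the last point, here is a minimal, self-contained sketch of the idea rather than the actual patch. The configuration keys, the `InMemoryDataModel` enum values, and the `Map`-based job configuration are all hypothetical stand-ins for the real Hive and Iceberg classes. It only shows why the decision is deferred: the storage-handler stage leaves the data model unset, and the input-format stage picks it once it is known whether the plan was vectorized.

```java
import java.util.HashMap;
import java.util.Map;

public class DataModelSketch {
    // Hypothetical keys for illustration only; not the real Hive/Iceberg property names.
    static final String DATA_MODEL_KEY = "sketch.in.memory.data.model";
    static final String VECTORIZED_PLAN_KEY = "sketch.plan.is.vectorized";

    // Hypothetical stand-in for the data model choice.
    enum InMemoryDataModel { GENERIC, HIVE }

    // Storage-handler stage: the query plan is not final yet, so whether the
    // read will be vectorized is unknown. The data model is left unset here.
    static void configureJobConf(Map<String, String> jobConf) {
        // Intentionally does not touch DATA_MODEL_KEY.
    }

    // Input-format stage: the plan is final, so the data model can be chosen
    // based on whether vectorization actually applies to this query.
    static InMemoryDataModel chooseDataModel(Map<String, String> jobConf) {
        boolean vectorized =
            Boolean.parseBoolean(jobConf.getOrDefault(VECTORIZED_PLAN_KEY, "false"));
        InMemoryDataModel model =
            vectorized ? InMemoryDataModel.HIVE : InMemoryDataModel.GENERIC;
        jobConf.put(DATA_MODEL_KEY, model.name());
        return model;
    }

    public static void main(String[] args) {
        Map<String, String> jobConf = new HashMap<>();
        configureJobConf(jobConf);

        // A query that could not be vectorized falls back to the generic model.
        System.out.println(chooseDataModel(jobConf));

        // A query whose plan was vectorized gets the vectorized-capable model.
        jobConf.put(VECTORIZED_PLAN_KEY, "true");
        System.out.println(chooseDataModel(jobConf));
    }
}
```

Had the model been fixed in `configureJobConf`, a query that later turned out not to be vectorizable would have been stuck with the wrong reader setup.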

szlta merged commit 2aafcdd into apache:master on Jun 23, 2021
szlta deleted the HIVE-25216 branch on June 23, 2021 09:11