Add checkpoint for ORC UNION type #14532

Merged
phd3 merged 1 commit into trinodb:master from baohe-zhang:fix_union_stream_checkpoint on Nov 10, 2022

Conversation

baohe-zhang (Contributor) commented Oct 9, 2022

Description

  • When an ORC stripe has more than one ROW_GROUP, reading a UNION type column fails.
  • Tracked this down to UNION type data streams not being handled as part of stream checkpointing.
  • Added checkpointing for the UNION type to fix the issue.
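
To illustrate the failure mode described above, here is a minimal sketch of a per-type checkpoint dispatcher. It is not Trino's actual code: the class, enum, and method names are hypothetical, and the `unionSupported` flag exists only to contrast the behavior before and after the UNION case is handled. The stream layout assumed for UNION (an optional PRESENT stream plus a DATA stream holding the tags) follows the ORC format specification.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, simplified model of per-type stream checkpoint dispatch.
// None of these names are Trino's actual API.
final class CheckpointDispatchSketch {
    enum ColumnType { LONG, STRUCT, UNION }

    enum StreamKind { PRESENT, DATA }

    private CheckpointDispatchSketch() {}

    // Returns the streams whose positions must be recorded at each row group
    // boundary so that a reader can seek to the start of that row group.
    static Set<StreamKind> streamsToCheckpoint(ColumnType type, boolean unionSupported) {
        switch (type) {
            case LONG:
                return EnumSet.of(StreamKind.PRESENT, StreamKind.DATA);
            case STRUCT:
                // A struct carries only a PRESENT stream; its fields are separate columns.
                return EnumSet.of(StreamKind.PRESENT);
            case UNION:
                if (!unionSupported) {
                    // Models the omission: no branch for UNION, so the lookup fails.
                    throw new IllegalArgumentException("Unsupported column type " + type);
                }
                // Assumption (ORC spec): PRESENT plus a DATA stream of tags.
                return EnumSet.of(StreamKind.PRESENT, StreamKind.DATA);
            default:
                throw new IllegalArgumentException("Unsupported column type " + type);
        }
    }
}
```

With `unionSupported` set to `false`, the UNION branch throws `IllegalArgumentException: Unsupported column type UNION`, the same root cause that appears in the stack trace later in this thread.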

Non-technical explanation

Release notes

() This is not user-visible or docs only and no release notes are required.
() Release notes are required, please propose a release note for me.
(x) Release notes are required, with the following suggested text:

# Hive
- Fix read failures for union type columns in ORC tables. ({issue}`#14532`)

cla-bot bot commented Oct 9, 2022

Thank you for your pull request and welcome to the Trino community. We require contributors to sign our Contributor License Agreement, and we don't seem to have you on file. Continue to work with us on the review and improvements in this PR, and submit the signed CLA to cla@trino.io. Photos, scans, or digitally-signed PDF files are all suitable. Processing may take a few days. The CLA needs to be on file before we merge your changes. For more information, see https://github.com/trinodb/cla

baohe-zhang (Contributor Author) commented

Hi @phd3, could you review it?

phd3 (Member) commented Oct 20, 2022

electrum (Member) left a comment

The code looks fine to me, but it would be good to have a test.

phd3 (Member) commented Oct 24, 2022

> When an ORC stripe has more than one ROW_GROUP, UNION type column will be failed to read.

@baohe-zhang, is it correct to say that we're able to reproduce this in a unit test by writing a file bigger than one row group in a stripe? Given that this is an omission from the original implementation, it may be okay to merge without a special test (i.e. we don't have such tests for other types).
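
As a rough guide to when a stripe crosses this threshold, the number of row groups is the stripe's row count divided by the row index stride, rounded up. The sketch below assumes ORC's default stride of 10,000 rows (`orc.row.index.stride`); the helper name is hypothetical, and a table may be configured with a different stride.

```java
// Hypothetical helper: estimates how many row groups a stripe contains.
final class RowGroupCountSketch {
    // ORC's default row index stride; tables may override it.
    static final int DEFAULT_ROW_INDEX_STRIDE = 10_000;

    private RowGroupCountSketch() {}

    static long rowGroupCount(long stripeRows, int rowIndexStride) {
        if (stripeRows < 0 || rowIndexStride <= 0) {
            throw new IllegalArgumentException("stripeRows must be >= 0 and rowIndexStride must be > 0");
        }
        // Ceiling division: a partially filled final row group still counts.
        return (stripeRows + rowIndexStride - 1) / rowIndexStride;
    }
}
```

Under that assumption, a stripe with exactly 10,000 rows has one row group, while a stripe with 10,001 rows has two and would therefore exercise the checkpoint path.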

baohe-zhang (Contributor Author) commented Nov 1, 2022

> > When an ORC stripe has more than one ROW_GROUP, UNION type column will be failed to read.
>
> @baohe-zhang, is it correct to say that we're able to reproduce this in a unit test by writing a file bigger than one row group in a stripe? Given that this is an omission from the original implementation, it may be okay to merge without a special test (i.e. we don't have such tests for other types).

@phd3 We can't do that in a unit test because the Trino ORC writer doesn't currently support writing the union type.

[EDITED by phd3: discussed offline about adding a product test]

@baohe-zhang baohe-zhang force-pushed the fix_union_stream_checkpoint branch from 9b07615 to 1c08ccc Compare November 4, 2022 15:35

baohe-zhang (Contributor Author) commented

Hi @phd3 @electrum,

I added a test to the Hive product tests. Without this fix, the test fails with the following exception:

presto-master       | io.trino.spi.TrinoException: Failed to read ORC file: hdfs://hadoop-master:9000/user/hive/warehouse/test_read_uniontype/000000_0
presto-master       | 	at io.trino.plugin.hive.orc.OrcPageSource.handleException(OrcPageSource.java:209)
presto-master       | 	at io.trino.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:163)
presto-master       | 	at io.trino.plugin.hive.HivePageSource.getNextPage(HivePageSource.java:197)
presto-master       | 	at io.trino.operator.TableScanOperator.getOutput(TableScanOperator.java:311)
presto-master       | 	at io.trino.operator.Driver.processInternal(Driver.java:411)
presto-master       | 	at io.trino.operator.Driver.lambda$process$10(Driver.java:314)
presto-master       | 	at io.trino.operator.Driver.tryWithLock(Driver.java:706)
presto-master       | 	at io.trino.operator.Driver.process(Driver.java:306)
presto-master       | 	at io.trino.operator.Driver.processForDuration(Driver.java:277)
presto-master       | 	at io.trino.execution.SqlTaskExecution$DriverSplitRunner.processFor(SqlTaskExecution.java:739)
presto-master       | 	at io.trino.execution.executor.PrioritizedSplitRunner.process(PrioritizedSplitRunner.java:164)
presto-master       | 	at io.trino.execution.executor.TaskExecutor$TaskRunner.run(TaskExecutor.java:515)
presto-master       | 	at io.trino.$gen.Trino_401_96_g7335d3f____20221103_184416_2.run(Unknown Source)
presto-master       | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
presto-master       | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
presto-master       | 	at java.base/java.lang.Thread.run(Thread.java:833)
presto-master       | Caused by: java.lang.IllegalArgumentException: Unsupported column type UNION
presto-master       | 	at io.trino.orc.checkpoint.Checkpoints.getStreamCheckpoints(Checkpoints.java:122)
presto-master       | 	at io.trino.orc.StripeReader.createRowGroups(StripeReader.java:350)
presto-master       | 	at io.trino.orc.StripeReader.readStripe(StripeReader.java:172)
presto-master       | 	at io.trino.orc.OrcRecordReader.advanceToNextStripe(OrcRecordReader.java:533)
presto-master       | 	at io.trino.orc.OrcRecordReader.advanceToNextRowGroup(OrcRecordReader.java:475)
presto-master       | 	at io.trino.orc.OrcRecordReader.nextPage(OrcRecordReader.java:396)
presto-master       | 	at io.trino.plugin.hive.orc.OrcPageSource.getNextPage(OrcPageSource.java:158)
presto-master       | 	... 14 more

Member

Thanks @baohe-zhang, great to see this being reproduced.

2022-11-06T15:50:09.2852993Z tests               | 2022-11-06 21:35:09 INFO: Test io.trino.tests.product.hive.TestReadUniontype.testReadOrcUniontypeWithCheckpoint took 42.67s
2022-11-06T15:50:09.9087769Z tests               | 2022-11-06 21:35:09 INFO: SUCCESS     /    io.trino.tests.product.hive.TestReadUniontype.testReadOrcUniontypeWithCheckpoint (Groups: smoke) took 43.3 seconds
2022-11-06T15:50:11.0978592Z tests               | 2022-11-06 21:35:11 INFO: [141 of 561] io.trino.tests.product.hive.TestReadUniontype.testReadUniontype [ORC] (Groups: smoke)

The test takes ~43s on CI. I'm a bit on the fence about whether it "deserves" this kind of time, given that this was an omission originally. That is ~1% of overall module time, less than many other tests, so I'm leaning towards including it. cc @findepi @electrum

phd3 (Member) commented Nov 7, 2022

@cla-bot check

cla-bot bot commented Nov 7, 2022

The cla-bot has been summoned, and re-checked this pull request!

phd3 (Member) commented Nov 7, 2022

cc @martint for processing @baohe-zhang's CLA

Co-Authored-By: Jithesh T Rajan <jirajan@linkedin.com>
@baohe-zhang baohe-zhang force-pushed the fix_union_stream_checkpoint branch from 1c08ccc to f96a015 Compare November 9, 2022 15:16

martint (Member) commented Nov 9, 2022

> cc @martint for processing @baohe-zhang's CLA

We never received the email with the signed CLA. @baohe-zhang, can you please resubmit?

baohe-zhang (Contributor Author) commented

Hi @martint, I resubmitted my CLA.

martint (Member) commented Nov 9, 2022

@cla-bot check

@cla-bot cla-bot bot added the cla-signed label Nov 9, 2022
cla-bot bot commented Nov 9, 2022

The cla-bot has been summoned, and re-checked this pull request!

@phd3 phd3 merged commit 6536c34 into trinodb:master Nov 10, 2022
@github-actions github-actions bot added this to the 403 milestone Nov 10, 2022