-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Orc : Fix inner struct field as partition (#4604) #4599
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@shardulm94 can you take a look at this? |
kbendick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @tprelle! Thanks for this patch.
Normally we only change one version per project at a time and then backport. Becase of changes in api, that's not always possible but it does make reviewing easier. Would that be possible?
| return selectNot(struct, fieldIds, false); | ||
| } | ||
|
|
||
| public static Types.StructType selectNot(Types.StructType struct, Set<Integer> fieldIds, boolean filteredStruct) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This boolean name is a bit strange to me. Is there perhaps a more descriptive name for what this corresponds to? Maybe doesHaveInnerStruct or doesHaveInnerProjectedStruct?
I was going to ask you to inline the field name with the raw boolean values in the tests, like true /* filteredStruct /*, but that still left me with some confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @kbendick,
Thanks for the review.
I had trouble to name this variable. It's may more doesNeedToFilteredInnerStruct.
I add this variable because on some cases we need to filtered inner struct (ORC) in some cases we do not need.
Is doesNeedToFilteredInnerStruct is better ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i just renamed it to doesNeedToKeepInnerStruct
Because it's more if it's a struct and if all ids need to be filtered we need keep all. So we remove all ids from filtered ids set
So @kbendick, as i keep the old signature methode on the api it's possible. |
Instead of trying to add it as a constant, read directly from the files the value of a hidden partition based on inner struct value.
|
This pull request has been marked as stale due to 30 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time and @mention a reviewer or discuss it on the [email protected] list. Thank you for your contributions. |
|
This pull request has been closed due to lack of activity. This is not a judgement on the merit of the PR in any way. It is just a way of keeping the PR queue manageable. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time. |
Fix : #4604
When you you add a inner struct field as partition field without any transformation on an Iceberg orc table, you can not select anymore only this field because the as is a partition, iceberg do not read it from the file as it's inside a struct OrcValueReaders it's not able to reconstruct a constant struct.
It's working when you select also another field from the struct because as OrcValueReaders is able to read from the file using the another field reader
So instead of trying to add it as a constant, read directly from the files
the value of a hidden partition based on inner struct value even if you only read this column