[Native] Allow '$file_size' and '$file_modified_time' for HiveSplits in SQL queries#21965
Merged
aditi-pandit merged 1 commit into master on Mar 8, 2024
facebook-github-bot pushed a commit to facebookincubator/velox that referenced this pull request Mar 5, 2024

…me' to be queried in SQL (#8800)

Summary:
$file_size and $file_modified_time are queryable synthesized columns for Hive tables in Presto. Spark also has bunch of such queryable synthesized columns (#7880).

The columns are passed by the co-ordinator to the worker in the HiveSplit.
i) Velox HiveSplit needed to be enhanced to get filesize and file_modified_time metadata in a generic map data-structure of (column name, value) from Prestissimo.
ii) These values should be populated by SplitReader into TableScanOperator output buffers.

This also needs a Prestissimo change to populate the HiveSplit with this info sent in the fragment prestodb/presto#21965

Fixes prestodb/presto#21867

gaoyangxiaozhu will have a follow up PR on the Spark integration.

Pull Request resolved: #8800
Reviewed By: mbasmanova
Differential Revision: D54512245
Pulled By: Yuhta
fbshipit-source-id: 190a97f9fcb1e869fff82e0a2264d57f9915376e
Contributor
Author
@Yuhta: This is the Prestissimo-side change for info columns. PTAL.
Yuhta reviewed Mar 7, 2024
Comment on lines 960 to 966
Contributor
Suggested change
- for (const auto& entry : hiveLayout->predicateColumns) {
-   if (toHiveColumnType(entry.second.columnType) ==
-       velox::connector::hive::HiveColumnHandle::ColumnType::kSynthesized) {
-     if (assignments.count(entry.first) == 0) {
-       assignments.emplace(
-           entry.first, toColumnHandle(&entry.second, typeParser));
-     }
+ for (const auto& [name, col] : hiveLayout->predicateColumns) {
+   if (toHiveColumnType(col.columnType) ==
+       velox::connector::hive::HiveColumnHandle::ColumnType::kSynthesized) {
+     VELOX_CHECK(assignments.emplace(name, toColumnHandle(&col, typeParser)).second, "Duplicate assignment: {}", name);
Contributor
Author
@Yuhta: We want to add these to the assignments list only if they are not already present. It's not quite the same code.
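The difference between the two versions can be sketched in isolation. This is an illustration under assumptions, not the PR code: `Assignments` is a plain `std::map` of strings standing in for the real assignments map, and `std::runtime_error` stands in for a `VELOX_CHECK` failure. Both helper names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in: column name -> description of its column handle.
using Assignments = std::map<std::string, std::string>;

// Behavior described by the author: add the column only when it is absent;
// an entry that already exists is left untouched.
void addIfAbsent(
    Assignments& assignments,
    const std::string& name,
    const std::string& handle) {
  if (assignments.count(name) == 0) {
    assignments.emplace(name, handle);
  }
  assert(assignments.count(name) == 1);
}

// Behavior of the suggested change: an existing entry is treated as an error.
// std::runtime_error is an assumption standing in for a failed VELOX_CHECK.
void addOrFail(
    Assignments& assignments,
    const std::string& name,
    const std::string& handle) {
  if (!assignments.emplace(name, handle).second) {
    throw std::runtime_error("Duplicate assignment: " + name);
  }
}
```

With `addIfAbsent`, a synthesized column that appears both in the query output and in a predicate is accepted silently; with `addOrFail`, the same query would raise an error, which is why the two versions are not interchangeable.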
presto-native-execution/presto_cpp/main/types/PrestoToVeloxQueryPlan.cpp
Outdated
Contributor
Author
@Yuhta: PTAL.
majetideepak approved these changes Mar 8, 2024
    protocol::ColumnType columnType,
    const protocol::ColumnHandle& column) {
  if (toHiveColumnType(columnType) ==
      velox::connector::hive::HiveColumnHandle::ColumnType::kSynthesized) {
Collaborator
nit: connector::hive::HiveColumnHandle::ColumnType should be sufficient.
Fixes #21867
Test Plan
e2e tests