@wForget
Member

wForget commented Sep 30, 2022

What changes were proposed in this pull request?

Support recursiveFileLookup for partitioned data sources so that partitioned tables containing subdirectories can be queried, such as the HIVE_UNION_SUBDIR directories that Hive on Tez generates when it executes a UNION statement.

Why are the changes needed?

Does this PR introduce any user-facing change?

How was this patch tested?

  1. Unit test added to FileIndexSuite.
  2. After setting the following properties, a partitioned table with subdirectories can be queried correctly:
ALTER TABLE ${partitioned_table_with_subdir} SET SERDEPROPERTIES ('recursiveFileLookup' = 'true');
ALTER TABLE ${partitioned_table_with_subdir} SET SERDEPROPERTIES ('inferRecursivePartition' = 'true');
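
For illustration, here is a minimal Scala sketch of the directory layout this change targets. It is not the code in this PR and not Spark's partition inference. The object name PartitionPathSketch, the parse helper, and the example paths (including the numbered HIVE_UNION_SUBDIR_1 style names) are assumptions made for the example. It only shows the idea of reading partition values from key=value directory segments while descending through any other subdirectories.

```scala
// Illustrative only: hypothetical helper, not Spark's implementation.
object PartitionPathSketch {

  // Split a path relative to the table root into
  // (partition key/value pairs, file name).
  // Directory segments that are not of the form key=value are skipped.
  def parse(relativePath: String): (Seq[(String, String)], String) = {
    val segments = relativePath.split("/").filter(_.nonEmpty).toSeq
    val dirs = segments.dropRight(1)
    val partitions = dirs.flatMap { dir =>
      dir.split("=", 2) match {
        case Array(k, v) if k.nonEmpty => Some(k -> v)
        case _ => None // e.g. HIVE_UNION_SUBDIR_1: descend, infer nothing
      }
    }
    (partitions, segments.last)
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical layout of one partition written by a UNION query.
    val files = Seq(
      "dt=2022-09-30/HIVE_UNION_SUBDIR_1/000000_0",
      "dt=2022-09-30/HIVE_UNION_SUBDIR_2/000000_0"
    )
    files.foreach(f => println(parse(f)))
  }
}
```

In this sketch both files resolve to the same partition, dt=2022-09-30, even though they sit one directory level below it.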

@github-actions github-actions bot added the SQL label Sep 30, 2022
@AmplabJenkins

Can one of the admins verify this patch?

@holdenk
Contributor

holdenk commented Oct 6, 2022

cc @HyukjinKwon

} else {
  if (recursiveFileLookup) {
    throw new IllegalArgumentException(
      "Datasource with partition do not allow recursive file loading.")
Member

This was added in #24830. cc @WeichenXu123 @cloud-fan @gengliangwang FYI

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jan 20, 2023
@github-actions github-actions bot closed this Jan 21, 2023
@tanvn
Contributor

tanvn commented Apr 4, 2023

@HyukjinKwon @wForget
Hi, may I know the status of this PR?
I would like to take part in this issue, as we are facing it while reading data from an ORC partitioned table and would prefer not to set spark.sql.hive.convertMetastoreOrc back to false.
