[HUDI-5684] Fix CTAS and Insert Into to avoid combine-on-insert by default #7813
Merged
codope merged 6 commits into apache:master on Feb 2, 2023
nsivabalan approved these changes Feb 1, 2023
yihua approved these changes Feb 1, 2023
added 6 commits
February 1, 2023 21:35
yihua pushed a commit that referenced this pull request Feb 2, 2023
…fault (#7813)
* Remove `COMBINE_BEFORE_INSERT` config being overridden for insert operations
* Revisited Spark SQL feature configuration to allow dichotomy of having:
  - (Feature-)specific "default" configuration (that could be overridden by the user)
  - "Overriding" configuration (that could NOT be overridden by the user)
* Restoring existing behavior for Insert Into to deduplicate by default (if pre-combine is specified)
* Fixing compilation
* Fixing compilation (one more time)
* Fixing options combination ordering
fengjian428 pushed a commit to fengjian428/hudi that referenced this pull request Apr 5, 2023
…fault (apache#7813)
Change Logs
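Currently, `InsertIntoHoodieTable` by default sets the `COMBINE_BEFORE_INSERT` config whenever a pre-combine field is specified, and it sets it in a way that doesn't allow the user to override it.
The following changes are made to address this. All Spark SQL feature-specific configs are split into a dichotomy:
- (Feature-)specific "default" configuration (that could be overridden by the user)
- "Overriding" configuration (that could NOT be overridden by the user)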
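The split between overridable defaults and non-overridable overrides can be pictured as a merge order: defaults first, then user-supplied options, then overriding options, with later entries winning. The snippet below is only a minimal sketch of that idea and is not the code from this PR. The class `SqlWriteOptions`, the method `combine`, and the example values are hypothetical. The keys `hoodie.combine.before.insert` and `hoodie.datasource.write.operation` are assumed here to be the relevant Hudi config keys.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only: not the class used in Hudi.
final class SqlWriteOptions {
    private SqlWriteOptions() {}

    // Merges the three option layers; later layers win on key conflicts.
    // defaults    -> feature-specific defaults the user may override
    // userOptions -> options supplied by the user
    // overriding  -> options the user is NOT allowed to override
    static Map<String, String> combine(Map<String, String> defaults,
                                       Map<String, String> userOptions,
                                       Map<String, String> overriding) {
        Map<String, String> combined = new LinkedHashMap<>(defaults);
        combined.putAll(userOptions);
        combined.putAll(overriding);
        return combined;
    }

    public static void main(String[] args) {
        // Illustrative values only (assumed keys, made-up settings).
        Map<String, String> defaults = Map.of("hoodie.combine.before.insert", "true");
        Map<String, String> user = Map.of("hoodie.combine.before.insert", "false");
        Map<String, String> overriding = Map.of("hoodie.datasource.write.operation", "insert");

        // The user's value replaces the default; the overriding entry is applied last.
        System.out.println(combine(defaults, user, overriding));
    }
}
```

With this ordering, a setting that is placed in the defaults layer can be changed by the user, while a setting placed in the overriding layer cannot, which is the behavior the dichotomy is meant to provide.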
Impact
Avoids combining on insertion for Insert Into and CTAS statements in Spark SQL
Risk level (write none, low, medium or high below)
Low
Documentation Update
N/A