-
Notifications
You must be signed in to change notification settings - Fork 4.8k
HIVE-28132: Iceberg: Add support for Replace Tag. #5222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
zhangbutao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was good overall. Only some comments about code organization optimization. :)
| long targetSnapshot = replaceTagRefSpec.getTargetSnapshot(); | ||
| LOG.info("Replacing tag {} with snapshot {} on iceberg table {}", sourceTag, targetSnapshot, table.name()); | ||
| ManageSnapshots manageSnapshots = table.manageSnapshots().replaceTag(sourceTag, targetSnapshot); | ||
| IcebergBranchExec.setOptionalReplaceParams(replaceTagRefSpec, manageSnapshots, sourceTag); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this line want to reuse IcebergBranchExec code to reduce code duplication. But my original thought was to keep Branch and Tag operation separate.
I am not aginst doing this as Branch and Tag operation indeed have many similar code.
- Maybe we can merge the two
IcebergBranchExec&IcebergTagExecinto one class. - Or we can extract the common codes and put it into the utility class.
This optimization can be done once branch&tag feature are all done in the future. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I got you right, I will do [1] in the next iter & merge the classes as suggested as IcebergSnapshotRefExec
| iterable = newOrcIterable(inputFile, task, readSchema); | ||
| break; | ||
| case PARQUET: | ||
| if (inputFile instanceof HadoopInputFile) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the change meaning? Is it related the replace tag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ohh shit, was something else running internally & it got added accidentally in this, I will remove in the next iter
| } | ||
|
|
||
| @Test | ||
| public void testReplaceTag() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We already have TestHiveIcebergTagOperation & TestHiveIcebergBranchOperation. Should we put the branch&tag operation into there?
| import org.apache.hadoop.hive.ql.parse.SemanticException; | ||
|
|
||
| @DDLSemanticAnalyzerFactory.DDLType(types = HiveParser.TOK_ALTERTABLE_REPLACE_TAG) | ||
| public class AlterTableReplaceTagRefAnalyzer extends AbstractAlterTableAnalyzer { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be good if we can merge AlterTableReplaceTagRefAnalyzer & AlterTableReplaceBranchRefAnalyzer into one class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have attempted to merge both in the next iter.
iceberg/patched-iceberg-core/src/main/java/org/apache/iceberg/hadoop/HadoopInputFile.java
Outdated
Show resolved
Hide resolved
| ALTERTABLE_REPLACESNAPSHOTREF, | ||
| ALTERTABLE_CREATETAG, | ||
| ALTERTABLE_DROPTAG, | ||
| ALTERTABLE_REPLACETAG, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nopes, removed
| import org.apache.commons.lang3.StringEscapeUtils; | ||
| import org.apache.hadoop.conf.Configuration; | ||
| import org.apache.hadoop.filecache.DistributedCache; | ||
| import org.apache.hadoop.fs.CommonPathCapabilities; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need this import...
| op2Priv.put(HiveOperationType.ALTERTABLE_REPLACEBRANCH, | ||
| op2Priv.put(HiveOperationType.ALTERTABLE_REPLACESNAPSHOTREF, | ||
| PrivRequirement.newIOPrivRequirement(OWNER_PRIV_AR, OWNER_PRIV_AR)); | ||
| op2Priv.put(HiveOperationType.ALTERTABLE_REPLACETAG, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here too. Do we still need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nopes, removed
|
zhangbutao
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM +1




What changes were proposed in this pull request?
Add support for replace tag for iceberg tables
Why are the changes needed?
Better usability
Does this PR introduce any user-facing change?
Yes, they can replace tags in iceberg tables
Is the change a dependency upgrade?
No
How was this patch tested?
UT