Skip to content

Conversation

@YannByron
Copy link
Contributor

This pr will remote the config on whether preserve commit metadata when compaction and do not change commit metadata any more.

#4428 (comment)

@YannByron YannByron changed the title [HUDI-3213] compaction do not change commit metadata [HUDI-3213] compaction should not change commit metadata Jan 17, 2022
@nsivabalan
Copy link
Contributor

nsivabalan commented Jan 17, 2022

@YannByron : Can we just mark the default value for config to true. Just incase anyone has requirement or if we had any bugs we have a way to revert it.
We are changing a behavior that has been working one way for the past 3 years or so. Just being cautious.

@nsivabalan nsivabalan added priority:critical Production degraded; pipelines stalled and removed priority:critical Production degraded; pipelines stalled labels Jan 17, 2022
@nsivabalan nsivabalan self-assigned this Jan 17, 2022
@YannByron
Copy link
Contributor Author

@YannByron : Can we just mark the default value for config to true. Just incase anyone has requirement or if we had any bugs we have a way to revert it.

This would make the definition of compaction unclear. If anyone has this requirement, he should expand the scope of incremental query.
But, in terms of compatibility, maybe yes. And also, indicates the config is about to be deprecated.
There is a trade-off between compatibility and correcting misbehavior. IMO, i prefer to remove this config.

@nsivabalan @vinothchandar @xushiyan need some thought to decide this.

@nsivabalan
Copy link
Contributor

synced up with Vinoth on this. This is a large enough change and we want to be cautious on how to go about this. May not pick it up for 0.10.1.

@YannByron
Copy link
Contributor Author

synced up with Vinoth on this. This is a large enough change and we want to be cautious on how to go about this. May not pick it up for 0.10.1.

yep, pick it into 0.11.0.

@nsivabalan
Copy link
Contributor

@YannByron : any updates on this.

@YannByron
Copy link
Contributor Author

YannByron commented Feb 4, 2022

@nsivabalan If the conclusion is to keep this hoodie.compaction.preserve.commit.metadata parameter and just set the default value to true, I will change pr in this way. wait for your confirmation.

@nsivabalan
Copy link
Contributor

@xushiyan : Can we jam on this sometime. lets see how we can make progress here.

@nsivabalan
Copy link
Contributor

@YannByron : yes. lets go w/ what you have suggested.

@nsivabalan
Copy link
Contributor

@YannByron : we also need to make a minor fix. When preserve commit metadata is enabled, we need to retain every meta field except filename. File name has to be overriden. guess I missed it when I added preserve commit metadata for compaction. Fix should go in here.

You can check out the fix we did for clustering for clustering preserve commit metadata.

If you are occupied, let me know. I can push changes or put up a new patch.

@YannByron
Copy link
Contributor Author

@nsivabalan i'll try to fix this asap. if have any problems, i'll let you know.

@YannByron
Copy link
Contributor Author

@hudi-bot run azure

@hudi-bot
Copy link
Collaborator

CI report:

Bot commands @hudi-bot supports the following commands:
  • @hudi-bot run azure re-run the last Azure build

@YannByron
Copy link
Contributor Author

@nsivabalan there are something i am not familiar to, so i need to your help to solve the failure of CI indeed. push changes or open a new patch.

@nsivabalan
Copy link
Contributor

closing this in favor of #4811

@nsivabalan nsivabalan closed this Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

priority:critical Production degraded; pipelines stalled

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants