[HUDI-3435] Do not throw exception when instant to rollback does not exist in metadata table active timeline #4821
Conversation
Force-pushed from 1e329eb to b4f50a9.
```java
if (config.isMetadataTableEnabled()) {
  try (HoodieTableMetadata tableMetadata = HoodieTableMetadata.create(table.getContext(), config.getMetadataConfig(),
      config.getBasePath(), FileSystemViewStorageConfig.SPILLABLE_DIR.defaultValue())) {
    Option<String> latestCompactionTime = tableMetadata.getLatestCompactionTime();
```
Hello @nsivabalan, I see that you put the code here. Is there any special reason that the dataset timeline archival should be bounded by the metadata table's latest compaction time?
Yes. CC @prashantwason

We have documented the reasoning here: https://issues.apache.org/jira/browse/HUDI-2458
With the current metadata table design, we validate which deltacommits to read for the metadata table based on which completed commits are present in the dataset. So the dataset should never archive instants before they are compacted on the metadata table.
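For readers following along, here is a minimal sketch of the guard this thread is discussing, assuming a hypothetical helper around the archival candidates (only `HoodieTableMetadata.getLatestCompactionTime()` comes from the diff above; the class, method, and variable names are illustrative, not the PR's exact code):

```java
import org.apache.hudi.common.table.timeline.HoodieInstant;
import org.apache.hudi.common.table.timeline.HoodieTimeline;
import org.apache.hudi.common.util.Option;

import java.util.stream.Stream;

// Illustrative sketch: bound dataset archival by the metadata table's latest compaction
// time, so that deltacommit validation on the metadata table can still find the
// corresponding completed dataset instants on the active timeline.
public class ArchivalGuardSketch {
  static Stream<HoodieInstant> limitByMetadataCompaction(
      Stream<HoodieInstant> candidates, Option<String> latestCompactionTime) {
    if (!latestCompactionTime.isPresent()) {
      // No compaction has run on the metadata table yet; archiving anything could break
      // deltacommit validation, so conservatively archive nothing.
      return Stream.empty();
    }
    // Only instants strictly older than the latest metadata-table compaction are archivable.
    return candidates.filter(instant -> HoodieTimeline.compareTimestamps(
        instant.getTimestamp(), HoodieTimeline.LESSER_THAN, latestCompactionTime.get()));
  }
}
```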
Thanks, I read the code, and this patch does not break the original limitation. Because the metadata table commits before the dataset, with the new patch the instants on the dataset timeline must also be in the metadata timeline, so we can still do the check for dataset instant existence.
Hello @nsivabalan, can you help confirm this?
There can be a case where the deltacommit on the metadata table succeeds but the commit on the dataset fails (the inflight -> complete transition fails).

When the job restarts, the last commit would be rolled back first, right? And the view of the metadata table would be fixed then, so this should not be a problem.

In any case, the metadata table is read with the latest filesystem view, which includes the deltacommit log files; I don't know why we address metadata compaction here, because compaction does not affect the table records.
> When the job restarts, the last commit would be rolled back first, right?

The last commit failed on the dataset but succeeded on the metadata table. So yes, it will be rolled back on the dataset eventually, depending on the settings (EAGER vs LAZY rollbacks).

Also, we need to support the readers: they need to ignore the deltacommit. There can be a delay between the failed job and the retry, and readers should read consistent data during that time.
> Also, we need to support the readers: they need to ignore the deltacommit.

That is not the case for the current metadata table reader, I guess, and personally I think this restriction is too limiting. Doesn't the failed-write rollback already fix the metadata table, once the rollback metadata is synced?
> There can be a case where the deltacommit on the metadata table succeeds but the commit on the dataset fails (the inflight -> complete transition fails).
> When the job restarts, the last commit would be rolled back first, right? And the view of the metadata table would be fixed then, so this should not be a problem.

The last commit may not be rolled back immediately in the scenarios of a single writer with async table services, and of multi-writer, since LAZY should be set for hoodie.cleaner.policy.failed.writes.
Sorry, I don't understand why the cleaner policy can affect this. If the rollback metadata was synced, the metadata file list can be trusted, right? And the corrupt files should then be ignored automatically.
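For context on the EAGER vs LAZY distinction in this exchange, here is a hedged sketch of how the failed-writes cleaning policy is typically configured (builder names as in Hudi releases around this PR; treat the exact builder location and the base path as assumptions):

```java
import org.apache.hudi.common.model.HoodieFailedWritesCleaningPolicy;
import org.apache.hudi.config.HoodieCompactionConfig;
import org.apache.hudi.config.HoodieWriteConfig;

public class CleaningPolicySketch {
  static HoodieWriteConfig lazyCleaningConfig() {
    // With LAZY, a failed write is not rolled back eagerly at the start of the next
    // commit; it is cleaned up later, after its heartbeat expires. That is why a failed
    // dataset commit can linger while its metadata-table deltacommit already succeeded.
    return HoodieWriteConfig.newBuilder()
        .withPath("/tmp/hoodie/sample_table") // hypothetical base path
        .withCompactionConfig(HoodieCompactionConfig.newBuilder()
            .withFailedWritesCleaningPolicy(HoodieFailedWritesCleaningPolicy.LAZY)
            .build())
        .build();
  }
}
```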
@hudi-bot run azure

@nsivabalan The PR is ready, please review if you have time ~

@hudi-bot run azure
nsivabalan left a comment
CC @yihua: guess this patch is addressing the issue you hit during testing.
nsivabalan left a comment
left comments
Hello @nsivabalan, I think after this patch, the compaction instant constraint can be removed. Can you double check this?
```java
.archiveCommitsWith(3, 4)
.retainCommits(1)
.build())
.withMarkersType("DIRECT")
```
What's the reason for adding this?
Mistakenly added; it can be removed.
@danny0405: sorry, I don't think we can remove the data table's reliance on metadata table compaction. Can you help me understand? @yihua and I did jam on this; here is our claim. If I am missing something, let me know.

For "partially failed commit", do you mean the commit on the dataset table? Then it should not reach the archive step, I think; we do archiving after a successful dataset commit. And when we have a commit t1, and t1 commits to the metadata table successfully but fails on the dataset table, when the job restarts, t1 would be rolled back on both the dataset table and the metadata table, so what's the problem here?
@danny0405: here is the scenario.

Can we disable the lazy cleaner for restarts/bootstrap then? Lazy cleaning makes sense for normal commits, but it just makes things complex for bootstrap/restarts, and it does not even gain much.

For the multi-writer scenario, we must have lazy cleaning, since a job cannot tell whether an inflight commit is due to a failed write or is an actual inflight commit from another writer. So the job relies on the heartbeat timeout to determine failed writes, and lazily cleans up failed commits later. This is the whole point of having the guard we are discussing.
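A hedged sketch of the multi-writer settings under which this heartbeat-based, lazy cleanup applies (the keys are documented Hudi write configs; the values, including the lock provider, are illustrative):

```java
import java.util.Properties;

public class MultiWriterConfigSketch {
  static Properties multiWriterProps() {
    Properties props = new Properties();
    // Multiple writers require optimistic concurrency control plus a lock provider,
    // and failed writes must be cleaned lazily (driven by heartbeat expiry), since a
    // writer cannot tell a failed inflight commit from another writer's live one.
    props.setProperty("hoodie.write.concurrency.mode", "optimistic_concurrency_control");
    props.setProperty("hoodie.cleaner.policy.failed.writes", "LAZY");
    props.setProperty("hoodie.write.lock.provider",
        "org.apache.hudi.client.transaction.lock.ZookeeperBasedLockProvider");
    return props;
  }
}
```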
yihua left a comment
LGTM. Thanks for the critical fix!
@hudi-bot run azure
nsivabalan left a comment
Thanks for the fix. LGTM.
@hudi-bot run azure
Force-pushed from 5f5a782 to 2f1368d.
What is the purpose of the pull request
After this change, when the compaction metadata commit succeeds but the dataset commit state switch fails, the metadata table may bookkeep compaction files that have already been rolled back (removed); the odds of this are far lower, though, than those of the general case where the metadata commit succeeds and the dataset commit never completes.

And the compaction files are actually idempotent.

There is no good solution to fix this completely; maybe we should check the archived timeline to accurately determine whether the instant to roll back has been archived.

Finally, I add a new filtering condition for metadata table archiving to address this problem (a rough sketch of its shape follows below).
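As a rough illustration of the shape of that filtering condition (all names here are hypothetical; only the skip-instead-of-throw behavior comes from the description above):

```java
import org.apache.hudi.common.table.HoodieTableMetaClient;
import org.apache.hudi.common.table.timeline.HoodieInstant;
import org.apache.hudi.common.util.Option;

public class RollbackGuardSketch {
  // Sketch: when syncing a rollback to the metadata table, skip quietly (rather than
  // throw) if the instant to roll back is no longer on the metadata table's active
  // timeline, e.g. because it was already compacted or archived there.
  static boolean shouldSyncRollback(HoodieTableMetaClient metadataMetaClient,
                                    String commitToRollbackInstantTime) {
    Option<HoodieInstant> instant = metadataMetaClient.getActiveTimeline()
        .filter(i -> i.getTimestamp().equals(commitToRollbackInstantTime))
        .firstInstant();
    // Absent means the deltacommit has already left the active timeline; nothing to do.
    return instant.isPresent();
  }
}
```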
Brief change log
Verify this pull request
(Please pick either of the following options)
This pull request is a trivial rework / code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.