[HUDI-1982] Remove unnecessary synchronization #3041
Codecov Report
@@             Coverage Diff              @@
##             master    #3041      +/-   ##
============================================
+ Coverage     49.86%   55.14%   +5.28%
- Complexity     3527     3865     +338
============================================
  Files           488      488
  Lines         23618    23616       -2
  Branches       2528     2528
============================================
+ Hits          11777    13023    +1246
+ Misses        10802     9434    -1368
- Partials       1039     1159     +120
String lastKnownInstantFromClient =
    ctx.queryParam(RemoteHoodieTableFileSystemView.LAST_INSTANT_TS, HoodieTimeline.INVALID_INSTANT_TS);
SyncableFileSystemView view = viewManager.getFileSystemView(basePath);
synchronized (view) {
Hi @chaplinthink, I think the synchronization is necessary to sync the view locally, since the handler serves requests from different clients concurrently. cc @bvaradar
@leesf I mean that the implementation of view.sync() already takes a WriteLock, which handles concurrent requests from multiple clients:
try {
  writeLock.lock();
  runSync(oldTimeline, newTimeline);
} finally {
  writeLock.unlock();
}
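To illustrate the point about the write lock, here is a minimal sketch (not Hudi code) of how a write lock held for the whole critical section makes concurrent sync() callers run one at a time. The class WriteLockedViewSketch, its counters, and runConcurrently are hypothetical names introduced only for this illustration, and the sketch says nothing about work done outside the lock.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a file system view; this is NOT Hudi's class.
class WriteLockedViewSketch {
    private final ReentrantReadWriteLock.WriteLock writeLock =
        new ReentrantReadWriteLock().writeLock();
    private final AtomicInteger inside = new AtomicInteger();    // callers currently in the critical section
    private final AtomicInteger maxInside = new AtomicInteger(); // highest value of `inside` ever observed
    private int syncCount;                                       // guarded by writeLock

    void sync() {
        writeLock.lock();
        try {
            int now = inside.incrementAndGet();
            maxInside.accumulateAndGet(now, Math::max);
            syncCount++; // plain int: only safe because the write lock is held
            inside.decrementAndGet();
        } finally {
            writeLock.unlock();
        }
    }

    int syncCount() {
        writeLock.lock();
        try {
            return syncCount;
        } finally {
            writeLock.unlock();
        }
    }

    int maxConcurrent() {
        return maxInside.get();
    }

    // Calls sync() from several threads at once and returns the view for inspection.
    static WriteLockedViewSketch runConcurrently(int threads, int callsPerThread) {
        WriteLockedViewSketch view = new WriteLockedViewSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.execute(() -> {
                for (int i = 0; i < callsPerThread; i++) {
                    view.sync();
                }
            });
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return view;
    }

    public static void main(String[] args) {
        WriteLockedViewSketch view = runConcurrently(8, 1000);
        System.out.println("syncs=" + view.syncCount() + " maxConcurrent=" + view.maxConcurrent());
    }
}
```

In this model no update is lost and at most one caller is ever inside the locked section, which is the property an outer synchronized block would otherwise provide for that section.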
@chaplinthink Thanks for the explanation, makes sense to me. @vinothchandar @bvaradar do you have any other concerns?
I was mulling over the timeline reload that happens before the write lock is taken.
@Override
public void sync() {
  HoodieTimeline oldTimeline = getTimeline();
  HoodieTimeline newTimeline = metaClient.reloadActiveTimeline().filterCompletedAndCompactionInstants();
  try {
    writeLock.lock();
    runSync(oldTimeline, newTimeline);
  } finally {
    writeLock.unlock();
  }
}
runSync() could actually init/reassign metaClient, so removing synchronized could, in theory, make the operation non-serializable.
I would suggest that we either move the timeline reload inside the write lock or leave this as-is. Whatever we change, we need to validate it with more concurrent testing, so I'm not sure this is all worth the trouble.
Are you hitting real concurrency bottlenecks around this?
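One way the concern about reloading before the write lock could play out is sketched below. This is a simplified model, not Hudi's implementation: the timeline is reduced to a single increasing instant number, ReloadRaceSketch and all of its members are hypothetical, and the interleaving is scripted on one thread (through a hook that runs between the reload and the lock acquisition) so that the outcome is deterministic.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model of a view that syncs against a timeline on storage; NOT Hudi code.
class ReloadRaceSketch {
    private final AtomicInteger latestOnStorage = new AtomicInteger(1); // stand-in for the active timeline
    private final ReentrantLock writeLock = new ReentrantLock();
    private int viewInstant = 0;                                        // guarded by writeLock

    void commit() {
        latestOnStorage.incrementAndGet();
    }

    // Variant A: the timeline is reloaded BEFORE the lock is taken.
    // `interloper` models another request running in that gap.
    void syncReloadOutsideLock(Runnable interloper) {
        int newTimeline = latestOnStorage.get();
        interloper.run();
        writeLock.lock();
        try {
            viewInstant = newTimeline; // may overwrite a newer state with a stale one
        } finally {
            writeLock.unlock();
        }
    }

    // Variant B: the timeline is reloaded INSIDE the lock.
    // Another request can only run entirely before or entirely after this one.
    void syncReloadInsideLock(Runnable interloper) {
        interloper.run();
        writeLock.lock();
        try {
            viewInstant = latestOnStorage.get();
        } finally {
            writeLock.unlock();
        }
    }

    int viewInstant() {
        writeLock.lock();
        try {
            return viewInstant;
        } finally {
            writeLock.unlock();
        }
    }

    // A second request commits and fully syncs while the first sits between reload and lock.
    static int finalInstantReloadOutside() {
        ReloadRaceSketch v = new ReloadRaceSketch();
        v.syncReloadOutsideLock(() -> {
            v.commit();
            v.syncReloadOutsideLock(() -> { });
        });
        return v.viewInstant();
    }

    // Same schedule, but with the reload moved inside the lock.
    static int finalInstantReloadInside() {
        ReloadRaceSketch v = new ReloadRaceSketch();
        v.syncReloadInsideLock(() -> {
            v.commit();
            v.syncReloadInsideLock(() -> { });
        });
        return v.viewInstant();
    }

    public static void main(String[] args) {
        System.out.println("reload outside lock: view at instant " + finalInstantReloadOutside()); // prints 1 (stale)
        System.out.println("reload inside lock:  view at instant " + finalInstantReloadInside());  // prints 2
    }
}
```

Under this scripted schedule, the variant that reloads outside the lock ends with the view at instant 1 while storage is at instant 2, whereas the variant that reloads inside the lock ends at instant 2. Whether the real code can hit such a schedule would still need the concurrent testing mentioned above.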
Thanks a lot for the reply. Do you mean that synchronized is there to guard the call HoodieTimeline newTimeline = metaClient.reloadActiveTimeline().filterCompletedAndCompactionInstants(); against concurrent access?
In that case, we could also do this, right?
@Override
public void sync() {
  HoodieTimeline oldTimeline = getTimeline();
  try {
    writeLock.lock();
    HoodieTimeline newTimeline = metaClient.reloadActiveTimeline().filterCompletedAndCompactionInstants();
    runSync(oldTimeline, newTimeline);
  } finally {
    writeLock.unlock();
  }
}
I was confused to see the code use synchronized and writeLock at the same time.
I agree that this should be validated with more concurrent testing. So far I have not encountered any concurrency bottlenecks.
@yihua: Since you have already fixed this issue in master, this PR can be closed, right?
Yes. @chaplinthink #8079 has simplified the synchronization with the fix to
What is the purpose of the pull request
The synchronized block around view.sync() is not necessary, because the sync operation already takes a WriteLock to ensure synchronization.
Brief change log
(for example:)
Verify this pull request
(Please pick either of the following options)
This pull request is a trivial rework / code cleanup without any test coverage.
(or)
This pull request is already covered by existing tests, such as (please describe tests).
(or)
This change added tests and can be verified as follows:
(example:)
Committer checklist
Has a corresponding JIRA in PR title & commit
Commit message is descriptive of the change
CI is green
Necessary doc changes done or have another open PR
For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.