-
Notifications
You must be signed in to change notification settings - Fork 588
HDDS-11900. obs granular lock for existing request [To be splitted] #7647
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
65b8547 to
c6dda89
Compare
errose28
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the prototype @sumitagrawl. Just did a high level review more focused on code organization than correctness. Can we split this PR down further?
- One PR for the general locking framework + gatekeeper
- Separating this will make it easier to follow changes when looking at commit history.
- One PR for key related requests (create, allocate block, commit, delete, etc)
- One PR for bucket requests (create, delete, update)
- One PR for volume requests (create, delete, update)
- One PR for S3 MPU requests
The current patch is already too large and once the required testing is added for each request it will only get larger.
| import org.slf4j.LoggerFactory; | ||
|
|
||
| /** | ||
| * key locking. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a lot more detail to how this class is used
| */ | ||
| public class KeyLock { | ||
| private static final Logger LOG = LoggerFactory.getLogger(KeyLock.class); | ||
| private static final long LOCK_TIMEOUT_DEFAULT = 10 * 60 * 1000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make stripe sizes and timeouts configurable. As suggested above though, this would happen in a layer above this class.
| } | ||
| } | ||
|
|
||
| public List<Lock> readLock(List<String> keyList) throws OMException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need read lock? In the new flow RocksDB should handle consistent views without a cache. Is this just for compatibility with the old flow?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Readlock is added for bucket when key operation is in progress. Some key type need read lock and some need write lock, as as Generic part, the interface is provided.
| } | ||
|
|
||
| public void start() { | ||
| ThreadFactory threadFactory = new ThreadFactoryBuilder().setDaemon(true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can extend the existing BackgroundService class to keep all background service management consistent
| /** | ||
| * key locking. | ||
| */ | ||
| public class KeyLock { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this name is confusing. It seems to imply that it for locking "keys" as they are defined in the Ozone namespace. It is actually a generic class that supports locking based on any string, regardless of its significance to the system.
I would move this to one of the common modules outside of OM, have it take any configurations as parameters, and throw general typed exceptions that upper layers can chain on to specific service type exceptions like OMException. This class seems to fit in the same family of utils as SimpleStripe.
| public OmLockOpr.OmLockInfo lock(OzoneManager ozoneManager, OmLockOpr lockOpr) throws IOException { | ||
| return null; | ||
| } | ||
|
|
||
| public void unlock(OmLockOpr lockOpr, OmLockOpr.OmLockInfo lockInfo) { | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of defining these stubs in the parent class? I don't see why anyone would call them from outside (public) or using the generic OMClientRequest type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lock key needs to be obtained from Request Object structure, and its different for different request.
So its not possible to get this directly as AOP. Two way of implementation:
- Switch case of each request type, and retrieve the lock information based on this
-- This is tried out but that class itself looks complex and un-manageble - Each request handle define how it can get and call respective lock
- Some operation do not need any lock, so provided with default implementation on no lock
| executorService.shutdown(); | ||
| } | ||
|
|
||
| public OmLockInfo volumeReadLock(String volumeName) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should try to greatly reduce the API surface required here. My understanding is that only one lock used at write time for a volume, bucket, or key is required, but the design doc unfortunately does not provide more info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add to the doc,
We have different lock group for volume, bucket and key, as they can conflict with each other.
As API surface,
For volume handling, we have, volume read and write lock
For bucket handling, we have bucket read and write lock
For OBS key handling, its obsLock
So had these set of API, will update design doc further.
|
|
||
| } | ||
|
|
||
| public OmLockOpr.OmLockInfo lock(OzoneManager ozoneManager, OmLockOpr lockOpr) throws IOException { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we touching FSO/Legacy requests in this OBS patch?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@errose28 We have, OMDirectoryCreateRequest for OBS and OMDirectoryCreateRequestWithFSO for FSO.
So above class is OBS only (not FSO)
Legacy many places is treated as OBS only, so from this code perspective, it can not distinguish if LEGACY or OBS, so precautionary added the check for OBS.
| if (!s3Auth) { | ||
| OzoneManagerRatisUtils.checkLeaderStatus(ozoneManager); | ||
| } | ||
| if (isRatisEnabled()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be necessary with the current 2.0 changes to remove non-ratis OM from master.
| } | ||
| } | ||
|
|
||
| private static void performUnlock( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The design doc mentioned some sort of "GateKeeper" entity, which I was expecting to see in this PR. Currently it looks like locking/unlocking is strewn about multiple places. Can we consolidate coordination to one new OMRequestGateKeeper class?
| long curTime = Time.monotonicNowNanos() - MONITOR_LOCK_THRESHOLD_NS; | ||
| for (OmLockInfo lockInfo : lockMonitorMap.values()) { | ||
| if ((curTime - lockInfo.getLockTakenTime()) > 0) { | ||
| LOG.warn("Lock {} is crossing threshold {}: ", lockInfo, MONITOR_LOCK_THRESHOLD_NS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will completely flood the logs in a deadlock situation, regardless of the interval the check runs. We need some in-memory tracker for which locks have already had warnings issued.
|
Being split. |
What changes were proposed in this pull request?
Granular locking for OBS for existing flow.
locking is added for external request at entry point. This provides execution of request at leader and existing flow simultaneouly without impacting for cache.
locking added for obs request(HDDS-11898. design doc leader side execution)Step-by-step integrationof existing request (interoperability)Parent Jira:
https://issues.apache.org/jira/browse/HDDS-11897
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-11900
How was this patch tested?