fix concurrent mpu duplicate objects bug #2459
base: development/8.2
Conversation
Hello sylvainsenechal, my role is to assist you with the merge of this pull request. Status report is not available.
(force-pushed: d8c874c → 739fa54 → 1548fce → 2faa742)
const putFilter = { _id: key };
if (params?.preventConcurrentCompletion) {
    putFilter['value.completeInProgress'] = { $ne: true };
}
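For context, here is a minimal sketch of how such a filter would behave against MongoDB, assuming the put is an upserting updateOne as elsewhere in MongoClientInterface (the function and variable names below are illustrative, not the actual Arsenal code):

```js
// Sketch only: assumes the MPU overview key is written with updateOne + upsert.
async function putMPUOverview(collection, key, objVal, params) {
    const putFilter = { _id: key };
    if (params?.preventConcurrentCompletion) {
        // only match the existing document if no completion is in progress yet
        putFilter['value.completeInProgress'] = { $ne: true };
    }
    try {
        await collection.updateOne(
            putFilter,
            { $set: { value: objVal } },
            { upsert: true });
    } catch (err) {
        // If another completer already set completeInProgress, the filter matches
        // nothing and the upsert attempts to insert a second document with the
        // same _id, which MongoDB rejects with a duplicate-key error (code 11000).
        if (err.code === 11000) {
            throw new Error('completeMultipartUpload already in progress');
        }
        throw err;
    }
}
```

In this sketch the prevention relies on the duplicate-key error raised by the conflicting upsert; whether the actual PR surfaces the conflict in the same way should be checked against the full diff.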
This approach does not work for Metadata (since Metadata does not go through MongoClientInterface), and it introduces business-specific logic deep down in the "metadata layer" (completeInProgress is not related to metadata management, only to MPU objects): this creates more coupling and breaks the abstraction...
This may also relate to a larger issue we have: in the metadata interface/API we are currently unable to distinguish between an update of an existing object [as in this specific case, where we only want to try to set the completeInProgress flag] and an insertion. The two are mostly indistinguishable here, so we always upsert.
→ I wonder if having that distinction would allow handling this differently, i.e. on update we must not upsert and could keep/increase the value of some other field... (a much larger endeavor, but useful debt cleanup in the long run anyway).
Alternatively, if we don't think this approach solves the issue, or if we need the fix more urgently, this needs more work:
- We may use the putObjectWithCond function instead to set the completeInProgress flag (see the sketch after this list). It allows performing such conditional operations while keeping the condition in the upper layer (cloudserver). It works only on non-versioned buckets -which is fine for the shadow bucket- and should be supported by Metadata as well; I am not sure however whether we actually use it (i.e. whether it is stable enough, and/or whether the upsert behavior is appropriate), so some testing may be needed.
- If that does not work and we need to introduce an extra flag in putObject, we will need to design the new flag with a generic semantic: one that fixes this issue but also makes sense generically and can be implemented in Metadata and for the other cases of putObject (versioned...).
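To make the putObjectWithCond alternative from the list above concrete, here is a rough sketch of what the cloudserver side could look like. The signature and the condition format are assumptions that need to be checked against Arsenal/Metadata; only the general idea (the conditional write is decided in the upper layer) is the point:

```js
// Sketch only: assumes putObjectWithCond(bucket, key, value, params, log, cb)
// accepts a Mongo-style condition object in params.conditions, and that the MPU
// overview key lives in the non-versioned shadow bucket.
function markCompleteInProgress(metadata, mpuShadowBucket, overviewKey, overviewMD, log, cb) {
    const updatedMD = Object.assign({}, overviewMD, { completeInProgress: true });
    const params = {
        // only write if no other completer has already set the flag (assumed format)
        conditions: { 'value.completeInProgress': { $ne: true } },
    };
    return metadata.putObjectWithCond(mpuShadowBucket, overviewKey, updatedMD, params,
        log, err => {
            if (err) {
                // a condition failure means another completeMultipartUpload is in flight
                return cb(err);
            }
            return cb(null, updatedMD);
        });
}
```

The benefit is that completeInProgress stays a cloudserver-level concept: the metadata layer only sees a generic conditional put.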
There is also a (kind of generic) replayId flag, which is set on the cloudserver side based on the uploadId, and implemented in Metadata but not in MongoClientInterface: it may be worth reviewing this, to understand where it is set (in the MPU shadow bucket, or in the actual bucket) and whether it would solve our problem.
Yes, Metadata uses the replayId to deduplicate incoming requests; in this specific scenario it is used to avoid duplicating versions and, in particular, ending up with duplicate sproxyd keys. It is also used for complete/abort conflicts, where we consider that the abort always wins (and the user has to create a new MPU in that case for the next PUT of the same object).
and Cloudserver passes the MPU's upload ID as its unique property
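If the replayId path turns out to be the right one, the flow would conceptually look like the sketch below. Everything here is an assumption to be confirmed by reviewing the code: where the replayId is attached (on the params or on the object metadata), and whether it applies to the shadow bucket or the destination bucket.

```js
// Conceptual sketch only: the real parameter/field carrying replayId from
// cloudserver down to the metadata backend must be confirmed in the code.
function putObjectWithReplayId(metadataWrapper, bucketName, objectKey, objectMD, uploadId, log, cb) {
    const params = {
        // derive the deduplication token from the MPU uploadId, so that two
        // concurrent completeMultipartUpload calls for the same upload are seen
        // by the backend as replays of one operation instead of two distinct puts
        replayId: uploadId,
    };
    return metadataWrapper.putObjectMD(bucketName, objectKey, objectMD, params, log, cb);
}
```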
Thanks @jonathan-gramain! Can you explain what the expected semantics of the "backend" (i.e. Metadata/MongoClientInterface) are regarding this field, and point us to the implementation in Metadata?
cc @williamlardier: maybe this also relates to the other issues we found/fixed on MPU?
@francoisferrand I don't think it is related to my recent PRs, in the sense that in those cases we really had wrong logic; but it is definitely related to this one (the current PR), yes. I saw we had logic in the metadata backend that we don't have with MongoDB.
Since we definitely reproduced the issue with a MongoDB backend (that is, ending up with two distinct versions pointing to the same data location), I was just wondering whether @jonathan-gramain knows if the metadata backend would also be affected. As you mention "to avoid duplicating versions and ending up with duplicate sproxyd keys in particular", I would say this is handled.
If that is the case, it means two things:
- We need to fix it for the MongoDB backend as well.
- We should list all the specific logic we have in Metadata but not in MongoDB, like this one, that allows better handling of edge cases. I feel like this added logic in Metadata hides the actual API behavior for the other backends...
* with updating the operation log.
* @param {boolean} params.preventConcurrentCompletion - If true, the object is only created if
* completeInProgress is not already set to true. Fix this concurrent completeMultipartUpload issue :
* https://scality.atlassian.net/jira/software/c/projects/OS/boards/268?selectedIssue=CLDSRV-687
just the ticket number is enough
Suggested change:
- * https://scality.atlassian.net/jira/software/c/projects/OS/boards/268?selectedIssue=CLDSRV-687
+ * CLDSRV-687
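For reference, on the caller side the new preventConcurrentCompletion parameter documented above would presumably be passed when re-putting the MPU overview key from completeMultipartUpload, along these lines (function and variable names are illustrative, not the actual cloudserver code):

```js
// Illustrative only: shows how a caller could opt into the new behavior.
function lockOverviewKeyForCompletion(backend, mpuShadowBucket, overviewKey, overviewMD, log, cb) {
    const params = { preventConcurrentCompletion: true };
    return backend.putObject(mpuShadowBucket, overviewKey, overviewMD, params, log, err => {
        if (err) {
            // failing here means another completion already flagged this MPU
            log.info('completeMultipartUpload already in progress', { key: overviewKey });
            return cb(err);
        }
        return cb();
    });
}
```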
Issue: ARSN-513
Related: scality/cloudserver#5879