HDDS-10527. Rewrite key atomically #6385
Conversation
adoroszlai
left a comment
Thanks @sodonnel for the patch.
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneKey.java
// TODO - if we are effectively cloning a key, probably the ACLs should
// be copied over server side. I am not too sure how this works.
.setAcls(getAclList())
Good point. I think we should omit metadata and ACLs, if possible, and let OM copy them.
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/protocol/ClientProtocol.java
…ID is unchanged, and metadata is copied over to the open key
Why not just introduce an object generation field and expose this API (in the future, for applications themselves)?
I think it would be possible to extend to this in the future, but it may need more thought about all the operations it impacts. Implementing this sort of generic approach probably needs to cover key delete and create (overwrite) at least, and perhaps move / rename too. We would need to think through what it would mean for FSO buckets (perhaps nothing different, but I have not given it deep thought). The workings of the proposed solution here are hidden behind the new overwriteKey API on the bucket object, and that is the only external API I have changed. If we were able to get the generation concept added, the overwrite key could be changed to use it and further enhance this sort of feature.
Created https://issues.apache.org/jira/browse/HDDS-10558 for the idea suggested by @kerneltime, as it is worth exploring but probably needs a bit more design work than just this PR.
It would not make sense to have generation and also use objectID. It is essentially the same capability: one uses just the objectID, the other a more developer-friendly generation ID. I would recommend introducing a generation ID here instead of using objectID. It would not make sense to expose the object ID to developers.
ObjectID is already there, as is updateID. They have not been introduced here, and they are already persisted and managed in OM. This PR only publicly exposes the new method on the bucket, with no intention of exposing it further. If we are going to expose GenerationID through the APIs in the way the Google Cloud docs indicate, then we need to decide if that is something we want to start with. E.g. Google Cloud supports it, but AWS does not. It's more test surface, more code to write and expose via all the interfaces, and most importantly more to support going forward. We would also have to worry about forward / backward compatibility if we are adding another ID to be persisted in OM. Old keys will not have a generationID, but perhaps it can be derived from the object / update ID. How does it tie in with the object version? If it is a new ID to be stored, it will add a small storage / memory overhead on OM. I don't believe we should implement something like that without giving it due consideration. What is in this PR could easily be changed to use a generationID if one was introduced later and this feature sees some adoption, as the use of object / updateID is contained and hidden from users of the API. That is why I would suggest exploring GenerationID in the other Jira I raised. I am trying to make the smallest useful change possible to allow for atomic key overwrite, without closing any paths for future improvements. Therefore I would like to move any design around GenerationID into the other Jira and move ahead with this one in its current direction.
The generationID in Google Cloud appears to be like the version in AWS and in Ozone to some extent (I believe versioning is not fully implemented in Ozone). From the docs:
From that, it is not clear whether each generationID is unique across the bucket, or whether two different keys can have the same generationID. In Ozone, if you create a key with objectID=5, and then create a new key of the same name, the objectID, I believe, remains the same, but the update_id is incremented. If you deleted the key (without versioning enabled) and recreated it, the object_ID will change. UpdateID comes from the Ratis transaction ID (if Ratis is enabled), so it is probably unique on its own without the objectID, but I am not sure that can be trusted without Ratis enabled. Also based on comments in WithObjectID.java. Therefore I think that to guarantee an object has not changed, we need both the object and update ID. For now, I don't think we should expose any of this via the S3 or Hadoop compatible filesystems. We could also tag the new bucket.overwriteKey() as experimental for now, giving us scope to remove or change it later if we decide it is not useful. However, I still think it could be adapted to a new approach easily behind the public interface. Basically, I think the whole Ozone version story and version ID, along with object and update ID, needs to be fully worked out before we expose anything more widely than is done in this PR.
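The check described above, comparing the (objectID, updateID) pair captured when the key was read against the key's current pair, can be sketched as follows. This is an illustrative toy, not Ozone's actual classes or API:

```java
// Illustrative sketch, not Ozone code: the optimistic check is a plain
// comparison of the IDs captured at read time against the current IDs.
public class KeyChangeCheck {
  record KeyIds(long objectId, long updateId) { }

  static boolean unchangedSinceRead(KeyIds atRead, KeyIds current) {
    // objectID catches delete + recreate; updateID catches in-place changes
    // such as ACL or metadata updates.
    return atRead.objectId() == current.objectId()
        && atRead.updateId() == current.updateId();
  }

  public static void main(String[] args) {
    KeyIds read = new KeyIds(5, 10);
    if (!unchangedSinceRead(read, new KeyIds(5, 10))   // nothing changed
        || unchangedSinceRead(read, new KeyIds(5, 12)) // updated in place
        || unchangedSinceRead(read, new KeyIds(7, 12))) { // deleted + recreated
      throw new AssertionError("unexpected result");
    }
    System.out.println("ok");
  }
}
```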
Versioning != generation ID. Generation is just a monotonically increasing number without preserving the older generations.
Each object has its own generation field, it is not based on the bucket.
devmadhuu
left a comment
Thanks @sodonnel for working on this quick patch. Overall LGTM +1. However, since the overwrite key write flow is the same as a normal first-time key create, I think it would be good to add some extra information to the audit logs for the overwrite flow to differentiate them.
@devmadhuu Thanks for the review. I added one more commit that adds details to the audits. @kerneltime In Google Cloud, the generationID is used to access versions of an object, so there is surely some overlap with the Ozone version concept and how AWS does things. Objects that are not versioned also have a generationID, so it's not tied to versioning, but it is used in versioning.
Object ID is derived from the Ratis transaction index of the create request. See the method doing the calculation that is called from all create requests. Note the extra space created around each object ID to account for directories that might need to be created with a file create request. Creating a new key with the same name will have a different object ID. Update ID is also set on key create, but is additionally changed when metadata like native ACLs are changed. This is just the Ratis transaction index with no additional modifications. See this method and everywhere it is used for what operations modify the update ID. I'm not sure what you mean by "update ID is incremented". The update ID of the new key will be arbitrarily larger than the old key's, since it is a Ratis transaction index, but its value is not related to the update ID of the previous version of the key.
@errose28 I think that for HSync (HBase related changes), when a key is appended or synced, the update ID should be increased while the objectID stays the same. When I first read your comment, I thought maybe I could cut this down to just checking the objectID, but I think the append / hsync case changes the picture slightly, needing the update ID to be checked too if we want to ensure there are no changes. By "update ID is incremented" I just meant it was changed, not that it is increased by 1. I was aware that both IDs are derived from the Ratis transaction ID.
Actually I'm thinking we can just check the update ID to determine if the key changed. The case where object ID alone would be useful is if only a key's native ACLs were updated and we wanted to disregard this change. However, I think we want this API to preserve native ACLs in the original key. This makes it consistent with Ranger ACLs, which will remain the same as long as the key is rewritten to the same location.
I don't think a key with an open lease being hsync'ed should be eligible for overwrite. I think this overwrite case is already blocked by existing hsync changes on the feature branch but we should double check to make sure we don't have issues later. I believe append will require an OM side update every time it is called, so yes that should increment the update ID. |
A key being hsync'ed is "open but visible" in Ozone and hence should have a lease which blocks other writers. I think the "hsync / hbase" work also allows for a key to be appended, i.e. the key is closed and committed, then a writer reopens it, appends some new data, and commits / closes it again. While the first scenario should be blocked, I need to ensure we do the right thing if an append happens to a closed key that is currently being overwritten, and whether that is even possible! I will ask around about that. I think you could be right that update_id is all we need, as it will change on key append, hsync, and object delete and recreate. It would be much nicer to only need to use one field.
Thanks for working on this @sodonnel. I left some minor comments and have a few ideas on the design I think we can look into:
- Can we just use updateID to check for changes as discussed above?
- Can we avoid persisting the overwrite ID to the open key proto on key create and have the client hold it in memory, supply it on commit, and check it then?
- I think we want native ACLs of the original key preserved when using this API. We should add tests for this.
- Since the PR description states FSO is out of scope for now, let's have the code explicitly enforce this.
- Currently overwrite will fail if the key is renamed or its immediate parent is changed. It will not fail if a directory farther up the file's parent tree is moved or renamed.
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneKey.java
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneKey.java
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
// Optimistic locking validation has passed. Now set the overwrite fields to null so they are
// not persisted in the key table.
omKeyInfo.setOverwriteUpdateID(null);
omKeyInfo.setOverwriteObjectID(null);
Do we even need these persisted to the open key table? The client could just hold the update ID in memory from the original read, and pass it as the overwrite update ID on commit for the OM to use only during the in-memory portion of the commit. If the client dies before finishing the overwrite it will lose its client ID and not be able to access the open key to resume writing anyways.
It seems like we can move all new logic to the commit phase without modifying key create requests or the protos that are written to the DB.
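The alternative suggested here can be sketched with a toy model (illustrative names, not Ozone code): the client keeps the update ID it read in memory, and the server checks it only during the commit, so nothing extra is persisted to the open key table.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a commit-time-only check; names are illustrative.
public class CommitTimeCheckSketch {
  static final Map<String, Long> keyTable = new HashMap<>(); // name -> updateID

  // Server-side commit: succeeds only if the key's current update ID still
  // matches the one the client read before it started writing.
  static boolean commitKey(String name, long expectedUpdateId, long newUpdateId) {
    Long current = keyTable.get(name);
    if (current == null || current.longValue() != expectedUpdateId) {
      return false; // key was deleted or changed since the client read it
    }
    keyTable.put(name, newUpdateId);
    return true;
  }

  public static void main(String[] args) {
    keyTable.put("key1", 10L);
    long readUpdateId = keyTable.get("key1"); // client reads the key

    if (!commitKey("key1", readUpdateId, 11L)) {
      throw new AssertionError("commit should succeed when nothing changed");
    }

    keyTable.put("key1", 12L); // another writer changes the key
    if (commitKey("key1", readUpdateId, 13L)) {
      throw new AssertionError("commit should fail after a concurrent change");
    }
    System.out.println("ok");
  }
}
```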
Adding the overwrite IDs to the commit phase is much more difficult, as the commit is performed by various sub-classes on the client, e.g. Ratis vs EC vs any new write class. I did look into adding them only to the commit, but concluded it was much easier to persist them to the open key table.
Therefore I prefer to keep it the way it is now.
You could also argue, why persist any of the key meta data on key open (created date, ACLs, etc). I think persisting the overwriteID in the open key table keeps with the existing pattern.
You could also argue, why persist any of the key meta data on key open (created date, ACLs, etc).
After looking this over a bit I think these are actually bugs that we need to fix:
- Create time: This one is more of a debugging/auditing inconvenience. We set create time in the create phase before the object is visible, and then set modification time in the commit phase when the key is visible. Ideally for a key that has just been created, I would think ctime and mtime would be expected to be the same. However, since both create and commit show up in the audit log, I guess you could argue that the current implementation is correct as well.
- ACLs: This one looks more concerning. I haven't tested this yet, but it looks like the ACLs at the time of create are what are also committed to the final key, without checking if the key being replaced had ACL updates in the meantime. For example:
  - key1 exists with acl1
  - key1' is created at the same path as key1
  - ACLs for key1 are updated to acl2 by another user/admin.
  - key1' is committed with acl1 that was read at create time.

Now the ACLs have gone back in time without the admin or user intending to make this change.
Looking at it from this angle, the existing approach looks like it should be fixed to write metadata like ACLs and create time at the time of commit. Once these bugs are fixed, persisting the overwrite update ID at the time of create does not make much sense in context either.
Any change to the key, including ACLs, should change the key's updateID, as that is what we are relying on to test if the key has changed. That would cause the optimistic commit to fail.
From the client code perspective, I think it's easier to set the ACLs etc. at the time of key open, as otherwise they potentially need to be passed down to various sub-classes. This also adds different testing paths for each type of key (Ratis, EC). That is why I am persisting the overwriteUpdateID in the openKey part. It should be passed as part of open key so we can check the key has not changed between when it was read and when the key write starts. It costs basically nothing to persist it in the open key table and saves some complexity on commit.
If you think there are bugs around the existing ACLs / created time / modification time handling then please raise Jiras for them. We cannot change that as part of this PR, as it would not make sense in this context. A different PR should take care of that.
On the created time / modification time front - I think it could be argued either way. What is the creation time of a file in Linux? You open the file, write a series of bytes over several minutes, and close the file. The ctime is probably the time the file was opened. The mtime is the time the file was last changed - they can easily be different. With Ozone it's not as clear cut, as the key traditionally has not been visible until it is committed. Then the difference between ctime and mtime is really the time it took to write the bytes. After HBase allows uncommitted keys to be visible, the behavior is more like Linux and hence is quite possibly correct as it stands.
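The approach argued for above can be sketched with a toy model (names and table shapes are illustrative, not the real OM structures): the update ID the client read is persisted with the open key entry, and commit validates it and then drops the overwrite field so it never reaches the key table.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of persisting the expected update ID in the open key entry.
public class OpenKeyRewriteSketch {
  record OpenKey(long clientId, Long overwriteUpdateId) { }

  static final Map<String, Long> keyTable = new HashMap<>();     // name -> updateID
  static final Map<String, OpenKey> openKeyTable = new HashMap<>();

  static void openForRewrite(String name, long clientId) {
    // The updateID the client read rides along in the open key entry.
    openKeyTable.put(name, new OpenKey(clientId, keyTable.get(name)));
  }

  static boolean commit(String name, long newUpdateId) {
    OpenKey open = openKeyTable.remove(name);
    Long current = keyTable.get(name);
    if (open == null || open.overwriteUpdateId() == null
        || current == null || !current.equals(open.overwriteUpdateId())) {
      return false; // key changed or deleted since the rewrite began
    }
    // Only the plain key info is written back; the overwrite field is dropped.
    keyTable.put(name, newUpdateId);
    return true;
  }

  public static void main(String[] args) {
    keyTable.put("key1", 5L);
    openForRewrite("key1", 42L);
    keyTable.put("key1", 6L); // concurrent change (e.g. an ACL update)
    if (commit("key1", 7L)) {
      throw new AssertionError("stale rewrite should be rejected");
    }
    openForRewrite("key1", 42L); // client re-reads and retries
    if (!commit("key1", 8L) || keyTable.get("key1") != 8L) {
      throw new AssertionError("retried rewrite should succeed");
    }
    System.out.println("ok");
  }
}
```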
...p-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/key/OMKeyRequest.java
Converted to draft while #6482 is outstanding.
… the client as with create key
Conflicts:
- hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
- hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneKey.java
- hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneKeyDetails.java
- hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
- hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyArgs.java
- hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/OmKeyInfo.java
- hadoop-ozone/interface-client/src/main/proto/OmClientProtocol.proto
- hadoop-ozone/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/request/RequestAuditor.java
- hadoop-ozone/s3gateway/src/test/java/org/apache/hadoop/ozone/client/OzoneBucketStub.java
@errose28 @kerneltime I have updated this PR to match what we discussed in the design doc PR. The only things in the design doc that are not included here are:
I also plan to send this PR to a branch - it will not be committed to master. I need to create the branch and then see if we can re-point the PR at the branch.
@errose28 @kerneltime, @adoroszlai has created the branch HDDS-10656-atomic-key-overwrite and pointed this PR at that branch. Please let us know if you are happy with the changes here, or if you have further comments, so we can progress toward committing this onto the branch.
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneKey.java
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
@errose28 @kerneltime @adoroszlai have you got any further comments on this PR?
adoroszlai
left a comment
Thanks @sodonnel for updating the patch.
errose28
left a comment
Thanks for updating this @sodonnel.
For the unit and integration tests, I think we need one static checker in a test util class that takes the old and new KeyInfo objects from a rewrite operation and checks that metadata is either altered or unaltered accordingly. This lets the code document and enforce exactly which fields are and are not supposed to be modified by rewrite. Currently metadata checks are done ad-hoc in unit and integration checks and I think some fields like mtime and key owner are not tested.
Also there are still some lingering references to "overwrite" in the change, but I think "rewrite" is the terminology we are using now. For clarity it would be good to just do a find/replace to make those match.
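The shared checker suggested above might look roughly like this. The class name, field list, and which fields are expected to be preserved or changed are assumptions for illustration, not the project's settled behavior:

```java
import java.util.Map;

// Sketch of a shared rewrite checker for unit and integration tests.
public class RewriteChecker {
  record KeyInfo(long objectId, long updateId, long modificationTime,
                 String owner, Map<String, String> metadata) { }

  static void assertValidRewrite(KeyInfo before, KeyInfo after) {
    // Fields assumed to be preserved by a rewrite of the same key.
    check(before.objectId() == after.objectId(), "objectID must be preserved");
    check(before.owner().equals(after.owner()), "owner must be preserved");
    check(before.metadata().equals(after.metadata()), "metadata must be preserved");
    // Fields expected to change.
    check(before.updateId() != after.updateId(), "updateID must change");
    check(after.modificationTime() >= before.modificationTime(), "mtime must advance");
  }

  static void check(boolean condition, String message) {
    if (!condition) {
      throw new AssertionError(message);
    }
  }

  public static void main(String[] args) {
    KeyInfo before = new KeyInfo(1, 5, 100, "alice", Map.of("k", "v"));
    KeyInfo after = new KeyInfo(1, 8, 200, "alice", Map.of("k", "v"));
    assertValidRewrite(before, after);
    System.out.println("ok");
  }
}
```

Centralizing the checks this way means each test only has to capture the before and after KeyInfo, and the complete list of preserved fields lives in one place.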
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneBucket.java
hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java
hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/helpers/WithObjectID.java
public static final String BUCKET_LAYOUT = "bucketLayout";
public static final String TENANT = "tenant";
public static final String USER_PREFIX = "userPrefix";
public static final String OVERWRITE_GENERATION = "overwriteGeneration";
Should we just call this generation? We may use it for something other than overwrites in the future.
+1 on this, similarly for OmKeyArgs.overwriteGeneration (edit: the OmKeyArgs.overwriteGeneration change can be ignored). Although I haven't looked deeply, I think the "generation" concept can be reused in the future. For example, we can ensure follower reads adhere to "read-your-own-writes" consistency by passing the generation that was just written to the follower, to ensure that the value has been updated there (instead of a linearizable read, which has higher latency since it requires the follower to contact the leader as well).
This constant is only used in the audit log. I think it makes sense to call it overwrite (or probably rewrite) generation to make it distinct from the current key generation. Calling the rewrite generation just "generation" could be confusing if we have a current generation called "generation" and then this atomic rewrite parameter also called "generation".
This was meant as a more general comment on the PR which probably wasn't clear. I think there are some places where rewriteGeneration makes sense, like the audit log here and the KeyArgs that are passed in by the client. Basically the places where the client is instructing the server to use the generation for rewrite purposes. However, when the value is stored or returned like in OmKeyInfo and the KeyInfo proto I think we should just call it generation. At that point the value could technically be used for anything, not just a rewrite.
...one-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyCommitRequest.java
...ration-test/src/test/java/org/apache/hadoop/ozone/client/rpc/TestOzoneRpcClientAbstract.java
}

@Test
public void testAtomicRewrite() throws Exception {
Optional suggestion:
The large multi-scenario test methods used in the three request testing classes here would be clearer if each scenario was split into its own test case, like testRewriteWhenKeyDeleted, testRewriteWhenKeyUpdated, testRewritePreservesMetadata etc. These smaller units make it easier to see what is actually being tested, and what scenarios may have been missed.
That said there's technically nothing wrong with the way it is implemented here so you can disregard this if you think there would be code duplication in test setup or other factors that make this suboptimal.
...one-manager/src/test/java/org/apache/hadoop/ozone/om/request/key/TestOMKeyCreateRequest.java
@errose28 I believe I have addressed your further comments. Please have a look and let me know what you think.
Looking good overall, I think we just need to determine when to call it
An object should have a
@kerneltime @errose28 Can you please clarify exactly what you want it to be called in this PR, keeping in mind we don't have metaGeneration at the moment? Is it to be "generation", "dataGeneration", or perhaps "expectedGeneration", which makes sense when making a call that should only succeed when the given expected generation is present? It seems strange to me to have simply "generation", as ideally that would be the current generation of a key, which is currently "updateID", as we are reusing it. If we ever added metaGeneration, I doubt we would call it metaUpdateID, but perhaps we would, to keep consistency.
I have created https://issues.apache.org/jira/browse/HDDS-10843 for the test improvements. |
@kerneltime @errose28 Following up on the naming of the "expectedGeneration" field in the protobuf: can you review my thoughts above so we can agree on what this should be called? This is the only outstanding item in this PR.
I have raised https://issues.apache.org/jira/browse/HDDS-10857 to decide on the naming of the passed generation in the API. Please comment on the Jira so we can decide on the naming. As @adoroszlai has given a +1 and @errose28 voiced approval pending this last decision, I will commit this PR onto the branch tomorrow. All the follow-up items are under the epic and will be taken care of before merging the branch to master.
What changes were proposed in this pull request?
This change introduces the ability to re-create / overwrite a key in Ozone using an optimistic locking technique.
Say there is a desire to replace a key with some additional data added somewhere in the key, or perhaps change its replication type from Ratis to EC. To do this, you can read the current key data, write a new key with the same name, and then on commitKey, the new key version will be visible.
However, there is a possibility that some other client deletes the original key, or re-writes it at the same time, resulting in potential lost updates.
To replace a key in this way, the proposal is to use the existing objectID and updateID on the key to ensure the key has not changed since it was read. The flow would be:
This technique is similar to optimistic locking used in relational databases, to avoid holding a lock on an object for a long period of time.
Notably, there are no additional locks needed on OM and no additional calls or RocksDB reads required to implement this - passing and storing the IDs in the openKey table is all that is required. The overwrite IDs don't need to be stored in the keyTable.
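The flow described above can be reduced to a single-threaded toy simulation (illustrative names, not Ozone's API): read the key's current generation, rewrite the data, and on a conflicting concurrent change re-read and retry rather than hold a lock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicLong;

// Toy simulation of optimistic key rewrite with a retry loop.
public class OptimisticRewriteLoop {
  static final AtomicLong txnIndex = new AtomicLong();      // stand-in transaction counter
  static final Map<String, Long> store = new HashMap<>();   // key name -> generation

  static boolean rewriteIfUnchanged(String name, long expectedGeneration) {
    Long current = store.get(name);
    if (current == null || current.longValue() != expectedGeneration) {
      return false; // conflicting delete or update since the read
    }
    store.put(name, txnIndex.incrementAndGet());
    return true;
  }

  static boolean rewriteWithRetry(String name, int maxAttempts) {
    for (int attempt = 0; attempt < maxAttempts; attempt++) {
      Long generation = store.get(name);
      if (generation == null) {
        return false; // key was deleted; nothing left to rewrite
      }
      // ... the client would rewrite the key data here (e.g. Ratis -> EC) ...
      if (rewriteIfUnchanged(name, generation)) {
        return true;
      }
      // Conflict: another writer changed the key since the read; retry.
    }
    return false;
  }

  public static void main(String[] args) {
    store.put("key1", txnIndex.incrementAndGet());
    if (!rewriteWithRetry("key1", 3)) {
      throw new AssertionError("rewrite should succeed with no contention");
    }
    System.out.println("ok");
  }
}
```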
This change only adds the feature for Object Store buckets for now.
Additionally, there is a question over what to do about metadata and ACLs: should they be copied from the existing key, or passed from the client?
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10527
How was this patch tested?
New integration and unit tests added.