Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@
import java.util.BitSet;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Objects;
Expand All @@ -33,7 +34,7 @@
* in the user given path and a pointer to its parent directory element in the
* path. Also, it stores directory node related metdata details.
*/
public class OmDirectoryInfo extends WithParentObjectId {
public class OmDirectoryInfo extends WithParentObjectId implements Cloneable {
private String name; // directory name

private long creationTime;
Expand Down Expand Up @@ -266,4 +267,26 @@ public OmDirectoryInfo copyObject() {

return builder.build();
}

/**
* Return a new copy of the object.
*/
@Override
public Object clone() throws CloneNotSupportedException {
OmDirectoryInfo omDirectoryInfo = (OmDirectoryInfo) super.clone();

omDirectoryInfo.metadata = new HashMap<>();
omDirectoryInfo.acls = new ArrayList<>();

acls.stream().filter(acl -> acl != null).forEach(acl ->
omDirectoryInfo.acls.add(new OzoneAcl(acl.getType(),
acl.getName(), (BitSet) acl.getAclBitSet().clone(),
acl.getAclScope())));

if (metadata != null) {
metadata.forEach((k, v) -> omDirectoryInfo.metadata.put(k, v));
}

return omDirectoryInfo;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@
* This is returned from OM to client, and client use class to talk to
* datanode. Also, this is the metadata written to om.db on server side.
*/
public final class OmKeyInfo extends WithParentObjectId {
public final class OmKeyInfo extends WithParentObjectId implements Cloneable {
private static final Logger LOG = LoggerFactory.getLogger(OmKeyInfo.class);
private final String volumeName;
private final String bucketName;
Expand Down Expand Up @@ -789,6 +789,36 @@ public OmKeyInfo copyObject() {
return builder.build();
}

/**
* Return a new copy of the object.
*/
@Override
public Object clone() throws CloneNotSupportedException {
OmKeyInfo omKeyInfo = (OmKeyInfo) super.clone();

omKeyInfo.metadata = new HashMap<>();
omKeyInfo.keyLocationVersions = new ArrayList<>();
omKeyInfo.acls = new ArrayList<>();

keyLocationVersions.stream().filter(keyLocationVersion ->
keyLocationVersion != null).forEach(keyLocationVersion ->
omKeyInfo.keyLocationVersions.add(
new OmKeyLocationInfoGroup(keyLocationVersion.getVersion(),
keyLocationVersion.getLocationList(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we deep copy the contents of this list as well? OmKeyLocationInfo extends BlockLocationInfo which is not immutable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same question but looks like it is a deep copy

   * Use this expensive method only when absolutely needed!
   * It creates a new list so it is not an O(1) operation.
   * Use getLocationLists() instead.
   * @return a list of OmKeyLocationInfo
   */
  public List<OmKeyLocationInfo> getLocationList() {
    return locationVersionMap.values().stream().flatMap(List::stream)
        .collect(Collectors.toList());
  }```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @errose28 thanks for the review.
We could have done shallow copy in HeapEntry#getStatus and modify just the two fields which is required, but still cache value having same reference is present in HeapEntry and in future if some other fields gets modified there is still possibility of corruption by not doing deep copy to the modified field.

OmKeyLocationInfo is also deep copied as pointed out by Ritesh.

Copy link
Contributor

@errose28 errose28 Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* Use this expensive method only when absolutely needed!

There is a risk that fixing this issue with deep copy causes a performance regression in list keys. Since this is a pretty critical bug I think the first version of the fix should do a shallow copy. A later implementation can decide to either leave the fix this way, use deep copy (with perf benchmarks to support no regressions), or fix the method to not require modifying the key info.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@errose28 agreed that we need something more efficient. The entirety of OzoneListStatusHelper.java and listing logic needs to be revisited. Listing in general needs to be reviewed given that 99% of the use case does not care about the key info, to begin with. But getting something off the cache and returning a mutable reference is dangerous and this fix is needed to avoid corrupting keys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The follow up jira is fine with me, as long as we are aware that list keys is probably going to be slower than before until it is resolved.

keyLocationVersion.isMultipartKey())));

acls.stream().filter(acl -> acl != null).forEach(acl ->
omKeyInfo.acls.add(new OzoneAcl(acl.getType(),
acl.getName(), (BitSet) acl.getAclBitSet().clone(),
acl.getAclScope())));

if (metadata != null) {
metadata.forEach((k, v) -> omKeyInfo.metadata.put(k, v));
}

return omKeyInfo;
}

/**
* Method to clear the fileEncryptionInfo.
* This method is called when a KeyDelete operation is performed.
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@

import com.google.common.base.Preconditions;
import com.google.common.base.Strings;
import org.apache.commons.lang3.ObjectUtils;
import org.apache.commons.lang3.StringUtils;
import org.apache.hadoop.hdds.utils.db.Table;
import org.apache.hadoop.hdds.utils.db.TableIterator;
Expand Down Expand Up @@ -444,6 +445,11 @@ private void getCacheValues() {
continue;
}

// Copy cache value to local copy and work on it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good find @ashishkumar50. Could there be similar patterns with other objects leading to corruption besides OmDirectoryInfo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @swagle , Thanks for the review. OmDirectoryInfo and OmKeyInfo are only objects used for accessing cache in OzoneListStatusHelper.

Value copyOmInfo = ObjectUtils.clone(cacheOmInfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we depend on the bucket locking for concurrency correctness here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kerneltime , Thanks for the review. In MinHeapIterator already read bucket locking exists.

if (copyOmInfo != null) {
cacheOmInfo = copyOmInfo;
}
if (StringUtils.isBlank(startKey)) {
// startKey is null or empty, then the seekKeyInDB="1024/"
if (cacheKey.startsWith(prefixKey)) {
Expand Down