Skip to content

Conversation

@ashishkumar50
Copy link
Contributor

What changes were proposed in this pull request?

ListStatus should use local cache copy. It should not use and update main cache which is used for DB updation.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8292

How was this patch tested?

Tested using below steps

  1. Create file
  2. Commit key
  3. Rename file (cache value updated in OM memory, still value to be updated in Rocksdb which in this case happened at step 5)
  4. List status is called, which reads from cache as well as updating cache value to full path(which it should not mean to do))-->At this step keyName is becoming wrong.
    If X number of times, List status is called before commit to DB, X number of times keys are appended.
  5. Write to DB(commit to DB for rename file operation) -->Since memory value is corrupted it inserts repeated keyName to DB.
    After the fix, it works in local cache and doesn't impact doublebuffer cache. RocksDB updates with correct keyName and also ListStatus returns proper result.

@arp7 arp7 requested review from duongkame and kerneltime April 4, 2023 15:52
Copy link
Contributor

@sumitagrawl sumitagrawl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ashishkumar50 Thanks for working over this, LGTM +1

continue;
}

// Copy cache value to local copy and work on it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good find @ashishkumar50. Could there be similar patterns with other objects leading to corruption besides OmDirectoryInfo?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @swagle , Thanks for the review. OmDirectoryInfo and OmKeyInfo are only objects used for accessing cache in OzoneListStatusHelper.

}

// Copy cache value to local copy and work on it
Value copyOmInfo = ObjectUtils.clone(cacheOmInfo);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we depend on the bucket locking for concurrency correctness here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kerneltime , Thanks for the review. In MinHeapIterator already read bucket locking exists.

@kerneltime
Copy link
Contributor

@swamirishi @aswinshakil @SaketaChalamchala can you please review this?

Copy link
Contributor

@errose28 errose28 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fix LGTM @ashishkumar50. I'm just wondering your thought on the approach in this PR versus some other alternatives:

  1. Make ListStatusHelper not modify the OmKeyInfo object.
    or
  2. Shallow copy in HeapEntry#getStatus and just update the string field that is needed, instead of deep copying the entire object just to update two string fields in the copy.

keyLocationVersion != null).forEach(keyLocationVersion ->
omKeyInfo.keyLocationVersions.add(
new OmKeyLocationInfoGroup(keyLocationVersion.getVersion(),
keyLocationVersion.getLocationList(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we deep copy the contents of this list as well? OmKeyLocationInfo extends BlockLocationInfo which is not immutable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same question but looks like it is a deep copy

   * Use this expensive method only when absolutely needed!
   * It creates a new list so it is not an O(1) operation.
   * Use getLocationLists() instead.
   * @return a list of OmKeyLocationInfo
   */
  public List<OmKeyLocationInfo> getLocationList() {
    return locationVersionMap.values().stream().flatMap(List::stream)
        .collect(Collectors.toList());
  }```

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @errose28 thanks for the review.
We could have done shallow copy in HeapEntry#getStatus and modify just the two fields which is required, but still cache value having same reference is present in HeapEntry and in future if some other fields gets modified there is still possibility of corruption by not doing deep copy to the modified field.

OmKeyLocationInfo is also deep copied as pointed out by Ritesh.

Copy link
Contributor

@errose28 errose28 Apr 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

* Use this expensive method only when absolutely needed!

There is a risk that fixing this issue with deep copy causes a performance regression in list keys. Since this is a pretty critical bug I think the first version of the fix should do a shallow copy. A later implementation can decide to either leave the fix this way, use deep copy (with perf benchmarks to support no regressions), or fix the method to not require modifying the key info.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@errose28 agreed that we need something more efficient. The entirety of OzoneListStatusHelper.java and listing logic needs to be revisited. Listing in general needs to be reviewed given that 99% of the use case does not care about the key info, to begin with. But getting something off the cache and returning a mutable reference is dangerous and this fix is needed to avoid corrupting keys.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The follow up jira is fine with me, as long as we are aware that list keys is probably going to be slower than before until it is resolved.

@kerneltime kerneltime merged commit d34322f into apache:master Apr 14, 2023
jojochuang pushed a commit that referenced this pull request Apr 19, 2023
k5342 pushed a commit to pfnet/ozone that referenced this pull request Apr 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants