@swamirishi
Contributor

What changes were proposed in this pull request?

Currently, in order to detect deletions, the SST files of the "from" snapshot are also read. With the changes in HDDS-8028, tombstone entries in SST files can be read, so they can be used as a hint to find the deleted key entries.
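
A minimal sketch of the idea, not the code in this patch: if each SST entry exposes its key and whether it is a tombstone, the deleted keys can be collected from the tombstones alone. `SstEntry` and `deletedKeyHints` are hypothetical names used only for illustration; the real reader goes through RocksDB.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TombstoneHintSketch {
  /** Hypothetical stand-in for one SST file entry: a key plus a tombstone flag. */
  static final class SstEntry {
    final String key;
    final boolean tombstone;

    SstEntry(String key, boolean tombstone) {
      this.key = key;
      this.tombstone = tombstone;
    }
  }

  /** Collects keys whose entry is a tombstone; these are only hints that the key was deleted. */
  static List<String> deletedKeyHints(List<SstEntry> entries) {
    List<String> hints = new ArrayList<>();
    for (SstEntry entry : entries) {
      if (entry.tombstone) {
        hints.add(entry.key);
      }
    }
    return hints;
  }

  public static void main(String[] args) {
    List<SstEntry> entries = Arrays.asList(
        new SstEntry("/vol/bucket/key1", false),
        new SstEntry("/vol/bucket/key2", true));
    System.out.println(deletedKeyHints(entries));
  }
}
```

Since the tombstones are only hints, a caller would presumably still confirm each candidate key against the snapshots before reporting it as deleted.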

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8137

How was this patch tested?

Unit Tests

@adoroszlai adoroszlai added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Mar 10, 2023
@adoroszlai
Contributor

This is based on top of commits that belong to another open PR, so this should be marked as draft:

  1. to avoid someone accidentally merging this one before the other
  2. to save CI resources, since this one will need to be updated after the other PR gets merged

@adoroszlai adoroszlai marked this pull request as draft March 10, 2023 20:06
@swamirishi swamirishi marked this pull request as ready for review March 13, 2023 15:34
@swamirishi swamirishi changed the title HDDS-8137. Snapdiff to use tombstone entries in SST file HDDS-8137. Snapdiff to use tombstone entries in SST file Mar 13, 2023
@smengcl
Contributor

smengcl commented Mar 15, 2023

@hemantk-12 Would you take a look at this?

Contributor

@hemantk-12 hemantk-12 left a comment

Thanks for the patch @swamirishi.

Making things generic is a good idea, but it is not always needed (YAGNI). In this patch, you tried to make ManagedSstFileIterator and MultipleSstFileIterator generic, which would not be a bad idea if they were generic enough. Both of them are internal to ManagedSstFileReader and have a defined type, String. Making them generic just makes the code more complex instead of simplifying it. It would be great if you could simplify it.

@smengcl smengcl changed the title HDDS-8137. Snapdiff to use tombstone entries in SST file HDDS-8137. [Snapshot] SnapDiff to use tombstone entries in SST files Mar 17, 2023
@smengcl
Contributor

smengcl commented Mar 22, 2023

@swamirishi Can you resolve the comments that are addressed?

Contributor

@hemantk-12 hemantk-12 left a comment

I don't know if you missed my general comment on "Making things generic". It would be great if you could reply to that.


IMO, the whole comment is unnecessary. @Override will pick up the Javadoc comment from the Iterator interface, so there is no need to specify the same thing. As for the RTE, anything can cause that, and stating it doesn't make much difference.


  1. I meant that if something gets added between the definition and the close, and it throws an exception, there will be a memory leak. It is ideal to define items close to where they are used.

  2. Use ManagedReadOptions as the declared variable type as well.

ManagedReadOptions readOptions = new ManagedReadOptions();
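
The leak concern in point 1 can be sketched with try-with-resources. This is an illustration only, not the patch's code: `FakeReadOptions` is a hypothetical stand-in for a native-backed options object, and it assumes the real object is `AutoCloseable` (as RocksDB's native-backed objects are).

```java
public class CloseSketch {
  /** Hypothetical stand-in for a native-backed options object such as ManagedReadOptions. */
  static final class FakeReadOptions implements AutoCloseable {
    static int open = 0; // number of instances not yet closed

    FakeReadOptions() {
      open++;
    }

    @Override
    public void close() {
      open--;
    }
  }

  static void readWithOptions(boolean fail) {
    // Declared in the try header, right where it is used,
    // so close() runs even if the body throws.
    try (FakeReadOptions readOptions = new FakeReadOptions()) {
      if (fail) {
        throw new IllegalStateException("simulated failure");
      }
    }
  }

  public static void main(String[] args) {
    try {
      readWithOptions(true);
    } catch (IllegalStateException expected) {
      // The failure is simulated; only the handle count matters here.
    }
    System.out.println("open handles: " + FakeReadOptions.open);
  }
}
```

Scoping the variable to the try header also means code added later between the definition and the close cannot skip the cleanup.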

@prashantpogde prashantpogde self-requested a review March 23, 2023 20:38
@swamirishi
Contributor Author

Thanks for the patch @swamirishi.

Making things generic is a good idea, but it is not always needed (YAGNI). In this patch, you tried to make ManagedSstFileIterator and MultipleSstFileIterator generic, which would not be a bad idea if they were generic enough. Both of them are internal to ManagedSstFileReader and have a defined type, String. Making them generic just makes the code more complex instead of simplifying it. It would be great if you could simplify it.

I made it generic on purpose. Currently we use the SST files and the tombstone entries as mere hints. But when we implement the optimal snapdiff calculation using only the delta files, we will need more information: the key alone won't be sufficient, and we will also require the objectId and the seqId from the SST file. That is why I made it generic.
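
A rough sketch of what the generic design buys, under stated assumptions: `RawEntry`, `TransformingIterator`, and the `sequenceNumber` field are hypothetical names for illustration, not the classes in this patch. The same iterator can yield plain keys today and a richer type later by swapping the transform.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Function;

public class GenericIteratorSketch {
  /** Hypothetical raw entry that carries more than just the key. */
  static final class RawEntry {
    final String key;
    final long sequenceNumber;

    RawEntry(String key, long sequenceNumber) {
      this.key = key;
      this.sequenceNumber = sequenceNumber;
    }
  }

  /** Wraps a raw iterator and maps each entry to whatever type T the caller needs. */
  static final class TransformingIterator<T> implements Iterator<T> {
    private final Iterator<RawEntry> source;
    private final Function<RawEntry, T> transform;

    TransformingIterator(Iterator<RawEntry> source, Function<RawEntry, T> transform) {
      this.source = source;
      this.transform = transform;
    }

    @Override
    public boolean hasNext() {
      return source.hasNext();
    }

    @Override
    public T next() {
      return transform.apply(source.next());
    }
  }

  public static void main(String[] args) {
    Iterator<RawEntry> raw = Arrays.asList(new RawEntry("key1", 7L)).iterator();
    // Today: only the key is needed, so T is String.
    Iterator<String> keys = new TransformingIterator<>(raw, entry -> entry.key);
    System.out.println(keys.next());
  }
}
```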

@prashantpogde
Contributor

@hemantk-12 @swamirishi Can we mark all the comments that are resolved here ?

Contributor

@prashantpogde prashantpogde left a comment

LGTM. We need to add some tests to validate these code paths. The tests can go in a separate PR.

@smengcl
Contributor

smengcl commented Apr 4, 2023

Spotted a SnapDiff robot test failure in CI:

Snapshot Diff                                                         | FAIL |
key-rnpgd: 'Snapshot diff job is IN_PROGRESS
Please retry after 1000 ms.' does not contain '+'
------------------------------------------------------------------------------
Read Snapshot                                                         | PASS |
------------------------------------------------------------------------------
Snapshot.Snapshot-Sh :: Test for using sh commands with snapshots.    | FAIL |
4 tests, 3 passed, 1 failed
==============================================================================
Snapshot                                                              | FAIL |
12 tests, 6 passed, 1 failed, 5 skipped

Could you double check to make sure this is unrelated? File a jira if it is just a flaky test. @swamirishi

Contributor

@hemantk-12 hemantk-12 left a comment

Approved with some minor comments.

@smengcl
Contributor

smengcl commented Apr 14, 2023

@swamirishi
Contributor Author

@smengcl Can this be merged?

@smengcl
Contributor

smengcl commented Apr 18, 2023

@smengcl Can this be merged?

CI passed. nice. Will merge this.

@smengcl smengcl merged commit 8f745fc into apache:master Apr 18, 2023
errose28 added a commit to errose28/ozone that referenced this pull request Apr 20, 2023
* master: (440 commits)
  HDDS-8445. Move PlacementPolicy back to SCM (apache#4588)
  HDDS-8335. ReplicationManager: EC Mis and Under replication handlers should handle overloaded exceptions (apache#4593)
  HDDS-8355. Intermittent failure in TestOMRatisSnapshots#testInstallSnapshot (apache#4592)
  HDDS-8444. Increase timeout of CI build (apache#4586)
  HDDS-8446. Selective checks: handle change in ci.yaml (apache#4587)
  HDDS-8440. Ozone Manager crashed with ClassCastException when deleting FSO bucket. (apache#4582)
  HDDS-7309. Enable by default GRPC between S3G and OM (apache#3820)
  HDDS-8458. Mark TestBlockDeletion#testBlockDeletion as flaky
  HDDS-8385. Ozone can't process snapshot when service UID > 2097151 (apache#4580)
  HDDS-8424: Preserve legacy bucket getKeyInfo behavior (apache#4576)
  HDDS-8453. Mark TestDirectoryDeletingServiceWithFSO#testDirDeletedTableCleanUpForSnapshot as flaky
  HDDS-8137. [Snapshot] SnapDiff to use tombstone entries in SST files (apache#4376)
  HDDS-8270. Measure checkAccess latency for Ozone objects (apache#4467)
  HDDS-8109. Seperate Ratis and EC MisReplication Handling (apache#4577)
  HDDS-8429. Checkpoint is not closed properly in OMDBCheckpointServlet (apache#4575)
  HDDS-8253. Set ozone.metadata.dirs to temporary dir if not defined in S3 Gateway (apache#4455)
  HDDS-8400. Expose rocksdb last sequence number through metrics (apache#4557)
  HDDS-8333. ReplicationManager: Allow partial EC reconstruction if insufficient nodes available (apache#4579)
  HDDS-8147. Introduce latency metrics for S3 Gateway operations (apache#4383)
  HDDS-7908. Support OM Metadata operation Generator in `Ozone freon` (apache#4251)
  ...
