Skip to content

Conversation

@swamirishi
Copy link
Contributor

What changes were proposed in this pull request?

SstFiltering service directly updates the snapshotInfo and doesn't go through DoubleBufferFlush to update the DB. This interferes with SnapshotInfo updated during SnapshotPurgeRequest.

Consider the case:

S1 <- S2(SstFiltered) <- S3 (Sst unfiltered)

At T1 if S2 is purged and submitted to OM ratis but it hasn't been flushed to the rocksDb but just present in the cache which makes the chain.

S1 <- S3 (Sst Unfiltered in cache)

S2(SstFiltered, deleted in Cache but present in table)

Suppose at T2 SstFiltering service runs on the chain and marks S3 to be SstFiltered which would be directly flushed to DB

S1 <- S3 (Sst filtered in DB)

S1 <- S2 (Present in DB but not present in cache)

Now if the OM is restarted, snapshot chain would have gotten corrupted since both S2 & S3 are pointing S1 in the DB.

Changes made in the patch:

  • Create an sstFiltered file in the same snapshot directory which will denote the snapshot is filtered and take a snapshot lock while creating this file.
  • Take a lock while deleting the Snapshot directory since that operation is not atomic.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11068

How was this patch tested?

Unit tests

Change-Id: I9e16fe4b117c232423c1d2de3f48d99acb71827f
@swamirishi swamirishi requested review from hemantk-12 and smengcl July 19, 2024 03:41
Change-Id: I36cee730a30496264b32c74a5f25ae2e92936eba
@adoroszlai
Copy link
Contributor

Please wait for clean CI run in fork before opening PR.

@adoroszlai adoroszlai added the snapshot https://issues.apache.org/jira/browse/HDDS-6517 label Jul 19, 2024
Change-Id: If23d180a31439cb2338fc852f8515dd6422aa1a6
Change-Id: Id5e3ae98555b3102e24bdfb3fb38e2e7e24c26a8
@swamirishi swamirishi marked this pull request as draft July 19, 2024 19:52
@swamirishi
Copy link
Contributor Author

Please wait for clean CI run in fork before opening PR.

Yeah I forgot to switch it to draft.

Copy link
Contributor

@hemantk-12 hemantk-12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the patch @swamirishi
Overall patch looks good to me. Left some minor comments and suggestions.

snapshotLimit--;
snapshotFilteredCount.getAndIncrement();
} catch (RocksDBException | IOException e) {
LOG.error("Exception encountered while filtering a snapshot", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we log it at an info level, if SstFileringService fails because the snapshot is purged? Otherwise, it will be unnecessary noisy logs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot be sure why the issue occured. RocksDbException can thrown for many reasons. One of them being the db being closed by another thread.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's why I asked to log it at the info level if it fails because snapshot dir doesn't exist which will be the case when it is purged. RocksDB throws exception like org.rocksdb.RocksDBException: While mkdir if missing: path_to_db_dir: No such file or directory..

Change-Id: I438ede5cd999cb991064cf9f4bf142568b03f98f
private static final int SST_FILTERING_CORE_POOL_SIZE = 1;

public static final String SST_FILTERED_FILE = "sstFiltered";
private static final byte[] SST_FILTERED_FILE_CONTENT = StringUtils.string2Bytes("This directory holds information " +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static final byte[] SST_FILTERED_FILE_CONTENT = StringUtils.string2Bytes("This directory holds information " +
private static final byte[] SST_FILTERED_FILE_CONTENT = StringUtils.string2Bytes("This file holds information " +
"if a particular snapshot has filtered out the irrelevant sst files or not.\nDO NOT add, change or delete " +
"this file unless you know what you are doing.\n");

// multiple times.
private static final int SST_FILTERING_CORE_POOL_SIZE = 1;

public static final String SST_FILTERED_FILE = "sstFiltered";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we have .txt or any other text file extension?

snapshotLimit--;
snapshotFilteredCount.getAndIncrement();
} catch (RocksDBException | IOException e) {
LOG.error("Exception encountered while filtering a snapshot", e);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's why I asked to log it at the info level if it fails because snapshot dir doesn't exist which will be the case when it is purged. RocksDB throws exception like org.rocksdb.RocksDBException: While mkdir if missing: path_to_db_dir: No such file or directory..

Change-Id: I373acfefbf65812665794dfeb292a928900b6c73
public static final String SST_FILTERED_FILE = "sstFiltered";
private static final byte[] SST_FILTERED_FILE_CONTENT = StringUtils.string2Bytes("This file holds information " +
"if a particular snapshot has filtered out the relevant sst files or not.\nDO NOT add, change or delete " +
"any files in this directory unless you know what you are doing.\n");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"any files in this directory unless you know what you are doing.\n");
"this file unless you know what you are doing.\n");

Copy link
Contributor Author

@swamirishi swamirishi Jul 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is inside the rocksdb directory. The statement is correct right isn't? Basically asking people not to change any file in this directory.

@swamirishi swamirishi marked this pull request as ready for review July 31, 2024 22:09
Copy link
Contributor

@hemantk-12 hemantk-12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Change-Id: I73029fede1d667b70c68c11527398216e458f8ae
@swamirishi swamirishi merged commit 1ae2701 into apache:master Jul 31, 2024
@swamirishi
Copy link
Contributor Author

Thank you for reviewing the patch @hemantk-12

xichen01 pushed a commit to xichen01/ozone that referenced this pull request Oct 1, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Oct 1, 2024
xichen01 pushed a commit to xichen01/ozone that referenced this pull request Oct 1, 2024
ptlrs pushed a commit to ptlrs/ozone that referenced this pull request Mar 8, 2025
…ot directory (apache#6965)

(cherry picked from commit 1ae2701)
Change-Id: I176724d052cbca8f345d471683320f131222982f
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

snapshot https://issues.apache.org/jira/browse/HDDS-6517

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants