
Conversation

@ArafatKhan2198 (Contributor) commented Feb 10, 2023

What changes were proposed in this pull request?

The Recon server encounters an issue while processing a key in the file system, leading to an unexpected exception. The error is a ClassCastException: the Recon server attempts to cast an object of type org.apache.hadoop.ozone.om.helpers.RepeatedOmKeyInfo to org.apache.hadoop.ozone.om.helpers.OmKeyInfo. As a result, the server cannot properly process the key or accurately display the number of files/keys.

I have made changes to FileSizeCountTask so that it handles both RepeatedOmKeyInfo and OmKeyInfo objects.
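For illustration, the shape of that handling might look roughly like this (a sketch only, not the exact patch; the helper name extractKeyInfos and the java.util imports are assumed here):

// Sketch: normalize the event value. keyTable/fileTable events carry an
// OmKeyInfo, while deletedTable-style values wrap a list in a RepeatedOmKeyInfo.
private List<OmKeyInfo> extractKeyInfos(Object value) {
  if (value instanceof OmKeyInfo) {
    return Collections.singletonList((OmKeyInfo) value);
  } else if (value instanceof RepeatedOmKeyInfo) {
    return ((RepeatedOmKeyInfo) value).getOmKeyInfoList();
  }
  return Collections.emptyList(); // unknown value type: skip instead of casting blindly
}

The per-event handling can then iterate over whatever list comes back instead of casting the value directly.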

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7926

How was this patch tested?

UTs ran successfully.

// Handle RepeatedOmKeyInfo object
RepeatedOmKeyInfo repeatedKeyInfo = (RepeatedOmKeyInfo) omKeyInfo;
keyInfo = repeatedKeyInfo.getOmKeyInfoList().get(0);
oldKeyInfo = repeatedKeyInfo.getOmKeyInfoList().get(0);
Contributor:

Does it guarantee repeatedKeyInfo has only one key? Also, why are keyInfo and oldKeyInfo the same? Typo?

Contributor Author:

Thanks for the comment @jojochuang.
I was mistaken in believing that all OmKeyInfo objects stored in the RepeatedOmKeyInfo class are similar in terms of size and differ only in their paths. In fact, it is the complete opposite. Once a key is deleted, it is moved to the OM metadata deletedTable. Having a mapping of {label: List<OmKeyInfo>} ensures that if users repeatedly create and delete keys with the exact same URI, all instances of deletion are grouped together under the same key name.
Hence I believe we must iterate through the list and call handleDeleteKeyEvent() on each OmKeyInfo object in the list. Do let me know if I am mistaken in my understanding of the RepeatedOmKeyInfo class and the RocksDB deletedTable.
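Roughly, the iteration described above would look like this (a sketch; the exact parameters of handleDeleteKeyEvent() in FileSizeCountTask may differ):

// Sketch: account for every deleted instance of the key, not just the first
// element of the RepeatedOmKeyInfo list.
RepeatedOmKeyInfo repeatedKeyInfo = (RepeatedOmKeyInfo) omdbUpdateEvent.getValue();
for (OmKeyInfo deletedKeyInfo : repeatedKeyInfo.getOmKeyInfoList()) {
  handleDeleteKeyEvent(deletedKeyInfo, fileSizeCountMap); // parameter list assumed
}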

Contributor:

Something is confusing me. The FileSizeCountTask should only be processing events from the file table and the key table. Both have a value type of OmKeyInfo. The only table that has RepeatedOmKeyInfo values is the deletedTable. But those should be skipped here:

What am I missing?

Contributor:

@GeorgeJahad Just saw your comment here. I agree with you (to be double checked with UT/experiments). I have a short writeup below.

Contributor Author:

@smengcl Don't you think this check would prevent the deleted table from being referenced, so it shouldn't be possible for a RepeatedOmKeyInfo ever to be thrown up?

if (!taskTables.contains(omdbUpdateEvent.getTable())) {

Contributor:

@ArafatKhan2198 Looks like it is correctly limiting the events to keyTable.

Then we need to find out how those deletedTable events could have gotten through this check in the first place.

Any chance for a quick repro UT? Generate an event from deletedTable and see if the filter logic is working?
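For example, a repro along these lines might work (a rough sketch using Mockito-style mocks; the constant, the batch accessor, and the process() call are from memory of the Recon task interface and may need adjusting):

// Sketch of a repro UT: feed FileSizeCountTask an event tagged with the
// deletedTable and check that the table filter drops it, i.e. no
// ClassCastException and no counts written for the fake key.
OMDBUpdateEvent deletedKeyEvent = mock(OMDBUpdateEvent.class);
when(deletedKeyEvent.getTable()).thenReturn(OmMetadataManagerImpl.DELETED_TABLE);
when(deletedKeyEvent.getValue()).thenReturn(mock(RepeatedOmKeyInfo.class));

OMUpdateEventBatch batch = mock(OMUpdateEventBatch.class);
when(batch.getEvents()).thenReturn(Collections.singletonList(deletedKeyEvent));

fileSizeCountTask.process(batch);
// Expectation: the event is filtered out before any cast to OmKeyInfo happens.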

Contributor:

I wonder if it could have set the wrong table from the source:

Contributor Author:

@smengcl I made modifications to the UTs for the process() method in the corresponding test class for FileSizeCountTask, and the filter check successfully prevented deletedTable events from going through.

Contributor:

@ArafatKhan2198 So the possibility of OMDBUpdatesHandler setting the wrong table above still exists. We need to double check its logic.

@jojochuang Do you have an idea how to repro this? Would doing hsync alone trigger the exception on Recon?

ArafatKhan2198 marked this pull request as ready for review February 13, 2023 07:47
@smengcl (Contributor) commented Feb 14, 2023

The OMDBUpdateEvent being processed by FileSizeCountTask should have come from processEvent here.

And because deletedTable is the only table that carries those RepeatedOmKeyInfo values, the correct action to take in this case could simply be ignoring the event when its value is of class RepeatedOmKeyInfo -- because those same keys are very likely processed once already when they are "moved" from keyTable to deletedTable during the key delete request, where the former effectively deletes the entries from keyTable. Please double-check if this is the case. CMIIW.
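For illustration, such a defensive skip inside the event loop could look roughly like this (a sketch, not code from the PR; the logger is assumed):

// Sketch: ignore deletedTable-style values instead of casting them.
Object value = omdbUpdateEvent.getValue();
if (value instanceof RepeatedOmKeyInfo) {
  // The key was presumably already counted when it was removed from keyTable/fileTable.
  LOG.debug("Skipping RepeatedOmKeyInfo value for key {}", omdbUpdateEvent.getKey());
  continue;
}
OmKeyInfo omKeyInfo = (OmKeyInfo) value;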

@jojochuang (Contributor):

Thanks @smengcl
This event is caused by the hsync implementation (still under code review), which is implemented by modifying OMKeyCommit. Essentially, this means it could happen during key commit already, if the key to be committed overwrites an existing key (possible today if the key is versioned).

I am not familiar with the Recon implementation. If Recon does respond correctly when a key is removed from the namespace, then I agree the best course of action for this issue is to ignore it.

@jojochuang (Contributor):

Oh, it would still be a good idea to track deletedTable (and deletedDirectoryTable) in Recon too. Probably separately.

We encountered an issue recently where keys were removed from the namespace, but because the number was high, the actual deletion got backlogged, and it took a huge amount of time to track down where it got backlogged (multiple entities are involved: OM, SCM and DN).

We did manage to track down the deletion by looking at RocksDB entries, tailing logs and metrics, though.

@ArafatKhan2198 (Contributor Author):

> The OMDBUpdateEvent being processed by FileSizeCountTask should have come from processEvent here.
>
> And because deletedTable is the only table that carries those RepeatedOmKeyInfo values, the correct action to take in this case could simply be ignoring the event when its value is of class RepeatedOmKeyInfo -- because those same keys are very likely processed once already when they are "moved" from keyTable to deletedTable during the key delete request, where the former effectively deletes the entries from keyTable. Please double-check if this is the case. CMIIW.

  • The deletedTable stores information about deleted keys in Ozone Manager's metadata. Whenever an object is deleted, the OmMetadataManagerImpl class creates an entry in the deletedTable in the form of a RepeatedOmKeyInfo object that contains multiple OmKeyInfo objects with the same URI. Periodic scans by Ozone's background services identify objects eligible for permanent deletion, which are then removed from the deletedTable, and their corresponding data is freed from storage.

  • Since this information is hidden from the user, there is no need for FileSizeCountTask to worry about the contents of the deletedTable. Hence, as you suggested, we should ignore the RepeatedOmKeyInfo objects and only deal with OmKeyInfo objects from the key and file tables.

@ArafatKhan2198 (Contributor Author):

> Oh, it would still be a good idea to track deletedTable (and deletedDirectoryTable) in Recon too. Probably separately.
>
> We encountered an issue recently where keys were removed from the namespace, but because the number was high, the actual deletion got backlogged, and it took a huge amount of time to track down where it got backlogged (multiple entities are involved: OM, SCM and DN).
>
> We did manage to track down the deletion by looking at RocksDB entries, tailing logs and metrics, though.

I think that's a great suggestion. We could start by creating an endpoint that exposes the data from the deletedTable and later work on implementing a user interface for it. I'll go ahead and create a JIRA for this task.

@ArafatKhan2198 (Contributor Author):

@smengcl @jojochuang can you please take a look!

@ArafatKhan2198 (Contributor Author) commented Feb 19, 2023

> I am not familiar with the Recon implementation. If Recon does respond correctly when a key is removed from the namespace, then I agree the best course of action for this issue is to ignore it.

When a key is deleted in an Ozone cluster, the Ozone Manager updates its OM-DB to reflect the deletion. This change is then detected by Recon during its next incremental update or full snapshot scan, depending on the mode in which Recon is operating.

Once Recon detects the deletion, it updates its metadata store to reflect the change. Specifically, Recon removes the deleted key from its index so that the key is not included in any future searches or queries against the metadata store.

Recon also retains a record of the deleted key in its metadata store inside the DeletedTable. This allows administrators to query the metadata store for deleted keys and to see when they were deleted. This information can be useful for auditing, compliance, and troubleshooting purposes.

Overall, Recon does respond to key deletions in an Ozone cluster by detecting the change and updating its metadata store to reflect the new state of the cluster.

@ChenSammi (Contributor):

@ArafatKhan2198, NSSummaryTaskWithLegacy also throws this ClassCastException in the Recon log. It needs mitigation too.

ArafatKhan2198 requested review from ChenSammi and removed request for jojochuang February 27, 2023 07:29
@smengcl (Contributor) commented Feb 27, 2023

> Thanks @smengcl This event is caused by the hsync implementation (still under code review), which is implemented by modifying OMKeyCommit. Essentially, this means it could happen during key commit already, if the key to be committed overwrites an existing key (possible today if the key is versioned).
>
> I am not familiar with the Recon implementation. If Recon does respond correctly when a key is removed from the namespace, then I agree the best course of action for this issue is to ignore it.

Got it. I was not aware of this new usage of RepeatedOmKeyInfo before this. In this case I agree that it should be properly handled like in this PR.

@GeorgeJahad (Contributor):

> Got it. I was not aware of this new usage of RepeatedOmKeyInfo before this. In this case I agree that it should be properly handled like in this PR.

Just want to be sure I understand. Does that mean that the value in the KeyTable here:

KEY_TABLE =
    new DBColumnFamilyDefinition<>(
        OmMetadataManagerImpl.KEY_TABLE,
        String.class,
        new StringCodec(),
        OmKeyInfo.class,
        new OmKeyInfoCodec(true));

is changing from OmKeyInfo to RepeatedOmKeyInfo?

@smengcl (Contributor) left a comment:

+1.

I'm fine with ignoring RepeatedOmKeyInfo for now to solve the issue at hand as (IIUC) the new usage of it in OMKeyCommitRequest for hsync seems to be related to deletion.

But would that cause inaccurate counting of the file size in FileSizeCountTask? In that case, shall we add back the for loop to process each KeyInfo in RepeatedOmKeyInfo? 84a1e60#diff-b149c501820a6046d4e78940b783982729ffd53a54ebdcf8d44412d477d271c5L137-L151 We can open another JIRA for this discussion.

@smengcl (Contributor) commented Feb 27, 2023

> Got it. I was not aware of this new usage of RepeatedOmKeyInfo before this. In this case I agree that it should be properly handled like in this PR.
>
> Just want to be sure I understand. Does that mean that the value in the KeyTable here:
>
> KEY_TABLE =
>     new DBColumnFamilyDefinition<>(
>         OmMetadataManagerImpl.KEY_TABLE,
>         String.class,
>         new StringCodec(),
>         OmKeyInfo.class,
>         new OmKeyInfoCodec(true));
>
> is changing from OmKeyInfo to RepeatedOmKeyInfo?

No. None of the table schemas have changed.

The deletion in OmKeyCommit is done to the deletedTable. However, that still doesn't explain why the table filter didn't work. I still suspect the event source is the culprit. If we have a repro, it will be easy to dig into that.

@ChenSammi (Contributor):

Agreed, we should find the root cause if this issue is reproducible.
I double-checked the code change in OmKeyCommit; although there is new key delete logic, it goes to the deleted table. This exception should not happen at all.

If it's not easily reproducible, I'm fine with this intermediate approach.

ChenSammi merged commit f0c4d41 into apache:master Mar 1, 2023
@ChenSammi (Contributor):

LGTM + 1.
