HDDS-8175. getFileChecksum() throws exception in debug mode. #7611

Merged on Jan 2, 2025 (4 commits)

@chiacyu chiacyu commented Dec 23, 2024

What changes were proposed in this pull request?

Building blockChecksumForDebug throws an exception because some of the checksum data is not a single CRC. Using toMultiCrcString() instead avoids the exception when producing the debug information.
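To illustrate the failure mode, here is a minimal standalone sketch. The two methods are simplified stand-ins written for this example, not the project's CrcUtil; they assume that the single-CRC formatter accepts exactly 4 bytes while the multi-CRC formatter accepts any multiple of 4. The class name, the unchecked exception type, and the second CRC value are hypothetical.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-ins for illustration only; not the real CrcUtil.
final class CrcFormatSketch {

    // Formats exactly one 4-byte CRC and rejects any other length.
    // (An unchecked exception is used here for brevity.)
    static String toSingleCrcString(byte[] bytes) {
        if (bytes.length != 4) {
            throw new IllegalArgumentException(
                "Unexpected length " + bytes.length + " for a single CRC");
        }
        return String.format("0x%08x", ByteBuffer.wrap(bytes).getInt());
    }

    // Formats any whole number of 4-byte CRCs as a bracketed list.
    static String toMultiCrcString(byte[] bytes) {
        if (bytes.length % 4 != 0) {
            throw new IllegalArgumentException(
                "Length " + bytes.length + " is not a multiple of 4");
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        StringBuilder sb = new StringBuilder("[");
        while (buf.hasRemaining()) {
            sb.append(String.format("0x%08x", buf.getInt()));
            if (buf.hasRemaining()) {
                sb.append(", ");
            }
        }
        return sb.append(']').toString();
    }

    public static void main(String[] args) {
        byte[] one = {63, -24, -96, 28};              // composite checksum bytes from the debug log in this PR
        byte[] two = {63, -24, -96, 28, 0, 0, 0, 1};  // second CRC is made up for illustration

        System.out.println(toSingleCrcString(one));
        System.out.println(toMultiCrcString(one));
        System.out.println(toMultiCrcString(two));
        try {
            toSingleCrcString(two);
        } catch (IllegalArgumentException e) {
            System.out.println("single-CRC formatter rejected 8 bytes: " + e.getMessage());
        }
    }
}
```

Under these assumptions the multi-CRC formatter handles both inputs, so it is the safer choice for a log line that only exists for debugging.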

What is the link to the Apache JIRA

Please check HDDS-8175 for more details, thanks.

How was this patch tested?

ozone sh key put /vol1/bucket1/README.md README.md
  • Then verify that no exception is thrown in DEBUG mode when running the checksum command:
ozone sh key checksum /vol1/bucket1/README.md

The result is shown below, please take a look.

2025-01-01 12:08:15,364 [main] DEBUG checksum.CrcComposer: crcPolynomial=0x-12477ce0, precomputedMonomialForHint=0x27b1b63d, bytesPerCrcHint=4068, stripeLength=9223372036854775807
2025-01-01 12:08:15,364 [main] DEBUG checksum.CrcComposer: crcPolynomial=0x-12477ce0, precomputedMonomialForHint=0x30362f1a, bytesPerCrcHint=16384, stripeLength=9223372036854775807
2025-01-01 12:08:15,364 [main] DEBUG checksum.ReplicatedBlockChecksumComputer: number of chunks = 1, chunk checksum type is CRC32, composite checksum = [63, -24, -96, 28]
2025-01-01 12:08:15,365 [main] DEBUG checksum.BaseFileChecksumHelper: Got reply from pipeline Pipeline[ Id: 9943f91f-7c16-4820-adbd-1d8903fe9dad, Nodes: 6738a3ee-bf8c-4238-b17a-0c9935d518bb(ozone-datanode-1.ozone_default/172.19.0.6) ReplicaIndex: 0, ReplicationConfig: RATIS/ONE, State:OPEN, leaderId:6738a3ee-bf8c-4238-b17a-0c9935d518bb, CreationTimestamp2025-01-01T12:03:41.688Z[UTC]] for block conID: 1 locID: 115816896921600001 bcsId: 2 replicaIndex: null: blockChecksum=[0x3fe8a01c], blockChecksumType=CRC32
2025-01-01 12:08:15,365 [main] DEBUG checksum.CrcComposer: crcPolynomial=0x-12477ce0, precomputedMonomialForHint=0x27b1b63d, bytesPerCrcHint=4068, stripeLength=9223372036854775807
2025-01-01 12:08:15,365 [main] DEBUG checksum.BaseFileChecksumHelper: Added blockCrc 0x3fe8a01c for block index 0 of size 4068
{
  "volumeName" : "vol1",
  "bucketName" : "bucket1",
  "name" : "README.md",
  "dataSize" : 4068,
  "algorithm" : "COMPOSITE-CRC32",
  "checksum" : "3FE8A01C"
}
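As a cross-check of the values in this output, the following sketch (class and method names are hypothetical) shows how the signed bytes logged as the composite checksum correspond to the hex checksum in the JSON result. It also shows that the oddly printed crcPolynomial=0x-12477ce0 is the signed rendering of 0xedb88320, the reflected CRC-32 polynomial, and that the logged stripeLength equals Long.MAX_VALUE.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for reading the values in the log output above.
final class LogValueCheck {

    // Interprets 4 bytes as a big-endian int and prints it as 8 uppercase hex digits.
    static String checksumHex(byte[] crcBytes) {
        return String.format("%08X", ByteBuffer.wrap(crcBytes).getInt());
    }

    // Renders a 32-bit value as unsigned hex (two's complement view).
    static String unsignedHex(int value) {
        return "0x" + Integer.toHexString(value);
    }

    public static void main(String[] args) {
        // "composite checksum = [63, -24, -96, 28]" from the log
        System.out.println(checksumHex(new byte[] {63, -24, -96, 28}));

        // "crcPolynomial=0x-12477ce0" from the log, viewed as unsigned
        System.out.println(unsignedHex(-0x12477ce0));

        // "stripeLength=9223372036854775807" from the log
        System.out.println(9223372036854775807L == Long.MAX_VALUE);
    }
}
```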

@adoroszlai adoroszlai left a comment

Thanks @chiacyu for the patch.

It can be tested by CI.

I don't think it is tested by CI, since it is passing without the fix. Referring to CI is enough for refactoring, when no functional change is expected. For functional changes (like fixing an error) we need to reproduce the problem and verify the fix with the same steps. Of course CI is also useful in this case, as it may catch regressions in other areas.

Please try to reproduce the exception with log level for BaseFileChecksumHelper set to DEBUG. I think you can use ozone sh key checksum for checksum calculation. Then verify that the exception no longer happens with the fix.

It seems to me ECFileChecksumHelper may need the same fix. Please test with both RATIS and EC keys.

Also, the two implementations of populateBlockChecksumBuf look identical. Can we move it up to BaseFileChecksumHelper? (Please leave this as last step, just in case I'm missing some detail in these classes.)
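The suggested refactoring could look roughly like the sketch below. It is a simplified illustration, not the actual Ozone classes: the class names are hypothetical, only the method name and its ByteBuffer parameter are taken from this PR, and the body is an assumed stand-in that merely formats the CRCs for debug output.

```java
import java.nio.ByteBuffer;

// Hypothetical demo: both helpers share one inherited implementation.
final class PullUpDemo {
    public static void main(String[] args) {
        ByteBuffer crc = ByteBuffer.wrap(new byte[] {63, -24, -96, 28});
        System.out.println(new ReplicatedHelperSketch().populateBlockChecksumBuf(crc));
        System.out.println(new EcHelperSketch().populateBlockChecksumBuf(crc));
    }
}

// Stand-in for the shared base class.
abstract class BaseChecksumHelperSketch {

    // Defined once here instead of being duplicated in each subclass.
    // Assumed behavior: return a debug string listing every 4-byte CRC.
    protected String populateBlockChecksumBuf(ByteBuffer checksumData) {
        ByteBuffer view = checksumData.duplicate(); // leave the caller's position untouched
        StringBuilder sb = new StringBuilder("[");
        while (view.remaining() >= 4) {
            sb.append(String.format("0x%08x", view.getInt()));
            if (view.remaining() >= 4) {
                sb.append(", ");
            }
        }
        return sb.append(']').toString();
    }
}

// Stand-ins for the replicated and EC helpers; neither overrides the method.
final class ReplicatedHelperSketch extends BaseChecksumHelperSketch { }

final class EcHelperSketch extends BaseChecksumHelperSketch { }
```

With a single implementation, a fix such as switching the debug formatter only has to be applied in one place.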

@chungen0126 chungen0126 left a comment

Thanks @chiacyu for the patch.

@@ -140,7 +140,7 @@ protected String populateBlockChecksumBuf(ByteBuffer checksumData)
}*/

I think the comments could be removed.

@jojochuang

Also, the two implementations of populateBlockChecksumBuf look identical. Can we move it up to BaseFileChecksumHelper? (Please leave this as last step, just in case I'm missing some detail in these classes.)

Ah, good catch. We missed it in #7264.

case COMPOSITE_CRC:
  byte[] crcBytes = blockChecksumByteBuffer.array();
  if (LOG.isDebugEnabled()) {
    blockChecksumForDebug = CrcUtil.toSingleCrcString(crcBytes);

I thought you wanted to switch to CrcUtil.toMultiCrcString(crcBytes)?

Applied. Thanks for the reminder, I almost missed that one.

@jojochuang jojochuang merged commit c282d91 into apache:master Jan 2, 2025
42 checks passed
@jojochuang

Merged. Thanks all!
