
Conversation

@beliefer (Contributor)

What changes were proposed in this pull request?

This PR proposes to avoid the risk of long overflow in getAggregatedChecksumValue.
This PR is a follow-up of #50230.

Why are the changes needed?

I guess the rowBasedChecksums array can be very large and each row-based checksum can be a big value, so there is a risk of long overflow when aggregating them.
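
A minimal sketch of the concern (not code from this PR; the checksum values below are made up) showing how folding large values into a long with acc * 31L + value silently wraps around and can turn negative, and how the proposed mask clears the sign bit:

  // Hypothetical per-partition checksum values; JVM long arithmetic wraps on overflow.
  val checksums: Array[Long] = Array.fill(8)(Long.MaxValue / 3)
  val unmasked = checksums.foldLeft(0L)((acc, v) => acc * 31L + v)
  val masked   = checksums.foldLeft(0L)((acc, v) => (acc * 31L + v) & Long.MaxValue)
  println(unmasked) // wrap-around may yield a negative value
  println(masked)   // always non-negative, as proposed in this PR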

Does this PR introduce any user-facing change?

'No'.
New feature.

How was this patch tested?

GA tests.

Was this patch authored or co-authored using generative AI tooling?

'No'.

github-actions bot added the CORE label Oct 29, 2025
@sarutak (Member) commented Oct 29, 2025

@beliefer Aggregated checksum might overflow and become a negative value, but is it really a problem? If a negative checksum value causes a problem, should we have a test for the problematic case?

@peter-toth (Contributor)

I don't think that a negative checksum is a problem; we would just lose a bit from the checksum range with that & 0111...1 (Long.MaxValue) mask.

@beliefer (Contributor, Author)

> I don't think that a negative checksum is a problem; we would just lose a bit from the checksum range with that & 0111...1 (Long.MaxValue) mask.

Does the lost bit cause any unexpected issues?

@peter-toth (Contributor)

> > I don't think that a negative checksum is a problem; we would just lose a bit from the checksum range with that & 0111...1 (Long.MaxValue) mask.
>
> Does the lost bit cause any unexpected issues?

It makes the quality of the checksum worse.

@beliefer (Contributor, Author)

> It makes the quality of the checksum worse.

I'm worried that some bits may be lost, which could actually affect the reliability of the checksum comparison.

@peter-toth (Contributor) commented Oct 30, 2025

> > It makes the quality of the checksum worse.
>
> I'm worried that some bits may be lost, which could actually affect the reliability of the checksum comparison.

Checksum computation is always about losing bits 😄; the less we lose, the better the quality of the checksum we can get.
Here we actually combine the order-independent RowBasedChecksums computed by the partitions into one final checksum that represents the whole data. Losing 1 more bit doesn't seem like a good idea, but if you have a problematic test case then let's investigate it.
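
A minimal sketch (not Spark code) of the bit-loss point: masking with Long.MaxValue clears the sign bit, so two aggregated values that differ only in that bit collapse to the same result, halving the effective checksum range:

  val a = Long.MinValue + 1L // sign bit set, lower 63 bits are ...0001
  val b = 1L                 // sign bit clear, same lower 63 bits
  assert(a != b)
  assert((a & Long.MaxValue) == (b & Long.MaxValue)) // both become 1 after masking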

  def getAggregatedChecksumValue(rowBasedChecksums: Array[RowBasedChecksum]): Long = {
    Option(rowBasedChecksums)
-     .map(_.foldLeft(0L)((acc, c) => acc * 31L + c.getValue))
+     .map(_.foldLeft(0L)((acc, c) => (acc * 31L + c.getValue) & Long.MaxValue))
Contributor (review comment on the change above)

How does & Long.MaxValue help here? The current code is a very common pattern and many hash codes are also calculated like this.
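
For reference, a minimal sketch of the common pattern being referred to; the standard JVM hash-code recipe (e.g. java.util.Arrays.hashCode) also folds with acc * 31 + element and simply lets the arithmetic wrap, treating a negative result as perfectly valid:

  // Same accumulation style as the unmasked Spark code: relies on two's-complement
  // wrap-around and never masks out the sign bit.
  def aggregate(values: Array[Long]): Long =
    values.foldLeft(0L)((acc, v) => acc * 31L + v)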

@beliefer (Contributor, Author)

I'm not sure. Let me close this PR and reopen it if an actual problem shows up in the future.

@beliefer beliefer closed this Oct 31, 2025
