
Conversation

@mgaido91
Contributor

@mgaido91 mgaido91 commented Oct 1, 2018

What changes were proposed in this pull request?

In #20850, when writing non-null decimals, instead of zeroing all 16 allocated bytes, we zero out only the padding bytes. Since we always allocate 16 bytes, if the number of bytes needed for a decimal is lower than 9, the bytes between 8 and 16 are not zeroed.

I see 2 solutions here:

 - we can zero out all the bytes in advance, as was done before #20850 (the safer solution IMHO);
 - we can allocate only the needed bytes (maybe a bit more efficient in terms of memory used, but I have not investigated the feasibility of this option).

Hence I propose the first solution here, in order to fix the correctness issue. We can eventually switch to the second one later if we think it is more efficient.
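For illustration, the first option amounts to something like the sketch below (a hypothetical helper, not the actual patch; UnsafeRowWriter's real cursor handling may differ). The idea is to unconditionally clear both 8-byte words of the 16-byte slot before copying the decimal's unscaled bytes, so leftovers from a previously written, larger decimal can never leak:

import org.apache.spark.unsafe.Platform

object ZeroOutSketch {
  // `cursor` is an absolute offset into `buffer`, i.e. it already includes
  // Platform.BYTE_ARRAY_OFFSET, as the row writer's cursor does.
  def zeroDecimalSlot(buffer: Array[Byte], cursor: Long): Unit = {
    Platform.putLong(buffer, cursor, 0L)      // bytes 0-7 of the slot
    Platform.putLong(buffer, cursor + 8L, 0L) // bytes 8-15, the ones at risk
  }
}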

How was this patch tested?

Running the test attached in the JIRA + added UT

@mgaido91
Contributor Author

mgaido91 commented Oct 1, 2018

cc @cloud-fan @kiszk

@mgaido91
Contributor Author

mgaido91 commented Oct 1, 2018

Sorry, I only realized now that I linked the wrong JIRA. This is for SPARK-25538. Unfortunately I am not in front of my laptop right now, so I cannot update the title. I'll do it ASAP. Sorry for the mistake. Thanks for understanding.

Member

@dongjoon-hyun dongjoon-hyun left a comment


+1, LGTM for SPARK-25538 correctness issue.

@dongjoon-hyun
Member

cc @gatorsmile

@mgaido91
Contributor Author

mgaido91 commented Oct 1, 2018

Thanks for the review @dongjoon-hyun

@mgaido91 mgaido91 changed the title from [SPARK-25582][SQL] Zero-out all bytes when writing decimal to [SPARK-25538][SQL] Zero-out all bytes when writing decimal Oct 1, 2018
@gatorsmile
Member

@mgaido91 Could you change the title to [WIP] before you add the test case?

Also cc @hvanhovell @kiszk, who are the best people to review this code.

@SparkQA

SparkQA commented Oct 1, 2018

Test build #96822 has finished for PR 22602 at commit 851d723.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Member

kiszk commented Oct 1, 2018

Thank you. The first option looks good. Let me think about a good UT, too.

// grow the global buffer before writing data.
holder.grow(16);

// zero-out the bytes
Member


nit: Can we refine the comment as follows, to avoid this problem in the future?

// always zero-out the 16-byte buffer

Contributor


+1

@cloud-fan
Contributor

good catch! LGTM, waiting for the UT.

@cloud-fan
Contributor

I think we can create an UnsafeWriterSuite to do some low-level checking. We can leave out the end-to-end test if it's too hard to write.

Contributor

@HeartSaVioR HeartSaVioR left a comment


+1 Nice finding.

@mgaido91
Contributor Author

mgaido91 commented Oct 2, 2018

Thank you all for the reviews! I added the UT according to @cloud-fan's suggestion, as I was unable to set up a reasonable end-to-end UT. Thanks.

val res = unsafeRowWriter.getRow
assert(res.getDecimal(0, decimal1.precision, decimal1.scale) == decimal1)
// Check that the bytes which are not used by decimal1 (but are allocated) are zero-ed out
assert(res.getBytes()(25) == 0x00)
Contributor


can we add a comment about how we get the 25?
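For context, the 25 falls out of the row layout. A sketch of the arithmetic, assuming the standard UnsafeRow layout for a single-column row (bytes 0-7: null-tracking bit set; bytes 8-15: the fixed-length slot with the decimal's offset and size; bytes 16-31: the 16-byte buffer holding the decimal):

val decimalBufferStart = 8 + 8 // null bit set word + fixed-length slot
val decimal1Bytes = 8          // decimal1's unscaled value fits in one word
// Byte 25 lies inside the 16-byte decimal buffer but past decimal1's 8 data
// bytes, i.e. in the padding that the fix must leave zeroed.
assert(25 >= decimalBufferStart + decimal1Bytes) // 25 >= 24
assert(25 < decimalBufferStart + 16)             // 25 < 32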

unsafeRowWriter.reset()
unsafeRowWriter.write(0, decimal1, decimal1.precision, decimal1.scale)
val res = unsafeRowWriter.getRow
assert(res.getDecimal(0, decimal1.precision, decimal1.scale) == decimal1)
Contributor


Actually I think this assert is strong enough, we don't need to check the zero bytes below.

Contributor Author


This would also pass before the fix, as only the first 8 bytes are read. I added it as an additional safety check.

Contributor


Then this doesn't demonstrate how the bug could affect correctness.

A better idea for this test: we create 2 row writers, one writing the decimal as you did here, and the other one writing decimal 0.431 directly. Then we compare the 2 result rows and make sure they are equal.

Contributor Author


thanks for the suggestion @cloud-fan, I like this approach.
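Pieced together from the snippets in this thread, the two-writer test could look roughly like the sketch below (the suite and test names are hypothetical, and the initial resetRowWriter() calls are an assumption):

import org.apache.spark.SparkFunSuite
import org.apache.spark.sql.catalyst.expressions.codegen.UnsafeRowWriter
import org.apache.spark.sql.types.Decimal

class UnsafeWriterSketchSuite extends SparkFunSuite {
  test("writing a small decimal after a large one leaves no stale bytes") {
    // decimal1 needs at most 8 bytes; decimal2 needs more than 8.
    val decimal1 = Decimal(0.431)
    decimal1.changePrecision(38, 18)
    val decimal2 = Decimal(123456789.1232456789)
    decimal2.changePrecision(38, 18)

    // Writer 1 writes the large decimal, then resets and writes the small
    // one: without the fix, the slot's upper bytes keep decimal2's leftovers.
    val writer1 = new UnsafeRowWriter(1)
    writer1.resetRowWriter()
    writer1.write(0, decimal2, decimal2.precision, decimal2.scale)
    writer1.reset()
    writer1.write(0, decimal1, decimal1.precision, decimal1.scale)

    // Writer 2 writes the small decimal directly.
    val writer2 = new UnsafeRowWriter(1)
    writer2.resetRowWriter()
    writer2.write(0, decimal1, decimal1.precision, decimal1.scale)

    // The two rows must be byte-for-byte equal.
    assert(writer1.getRow == writer2.getRow)
  }
}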

@SparkQA

SparkQA commented Oct 2, 2018

Test build #96852 has finished for PR 22602 at commit 6b84b41.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

LGTM, pending jenkins

@kiszk
Member

kiszk commented Oct 2, 2018

LGTM, pending jenkins

@SparkQA

SparkQA commented Oct 2, 2018

Test build #96857 has finished for PR 22602 at commit 64f4ed0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

import org.apache.spark.SparkFunSuite
import org.apache.spark.sql.types.Decimal

class UnsafeWriterSuite extends SparkFunSuite {
Member


UnsafeWriterSuite -> UnsafeRowWriterSuite? Also, renaming the file?

Contributor Author


I don't think it is necessary, as we may also want to include tests for other UnsafeWriters here in the future.

Member

@dongjoon-hyun dongjoon-hyun Oct 2, 2018


I don't think so. We had better have both UnsafeRowWriterSuite and UnsafeWriterSuite in the future if needed.

@SparkQA

SparkQA commented Oct 2, 2018

Test build #96859 has finished for PR 22602 at commit 72b7c5c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

unsafeRowWriter1.reset()
unsafeRowWriter1.write(0, decimal1, decimal1.precision, decimal1.scale)
val res1 = unsafeRowWriter1.getRow
// On a second UnsafeRowWriter we write directly decimal2
Member


... we write directly decimal1.

val decimal1 = Decimal(0.431)
decimal1.changePrecision(38, 18)
// This decimal holds 11 bytes
val decimal2 = Decimal(123456789.1232456789)
Member


Shall we verify the number of bytes?

Contributor Author


thanks, I added a check for that.
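The check can be as simple as measuring the two's-complement representation of the unscaled value (a sketch; the helper name is hypothetical):

import org.apache.spark.sql.types.Decimal

// A decimal's variable-length storage is the byte array of its unscaled
// BigInteger value, so its length is the number of bytes the decimal needs.
def checkDecimalSizeInBytes(decimal: Decimal, numBytes: Int): Unit = {
  assert(decimal.toJavaBigDecimal.unscaledValue().toByteArray.length == numBytes)
}

val decimal1 = Decimal(0.431)
decimal1.changePrecision(38, 18)
checkDecimalSizeInBytes(decimal1, 8)  // fits in a single 8-byte word

val decimal2 = Decimal(123456789.1232456789)
decimal2.changePrecision(38, 18)
checkDecimalSizeInBytes(decimal2, 11) // spills past the first word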

@viirya
Member

viirya commented Oct 3, 2018

two minor comments. LGTM and good catch!

@SparkQA

SparkQA commented Oct 3, 2018

Test build #96892 has finished for PR 22602 at commit d7d17d8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class UnsafeRowWriterSuite extends SparkFunSuite

@SparkQA

SparkQA commented Oct 3, 2018

Test build #96893 has finished for PR 22602 at commit be38c4c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun
Member

Thank you, @mgaido91 and all.

According to all the review comments, I'll merge this to master and branch-2.4.

asfgit pushed a commit that referenced this pull request Oct 3, 2018

Closes #22602 from mgaido91/SPARK-25582.

Authored-by: Marco Gaido <[email protected]>
Signed-off-by: Dongjoon Hyun <[email protected]>
(cherry picked from commit d7ae36a)
Signed-off-by: Dongjoon Hyun <[email protected]>
@asfgit asfgit closed this in d7ae36a Oct 3, 2018
jackylee-ch pushed a commit to jackylee-ch/spark that referenced this pull request Feb 18, 2019
