Skip to content

[SPARK-25069][CORE]Using UnsafeAlignedOffset to make the entire record of 8 byte Items aligned like which is used in UnsafeExternalSorter #22053

Closed
eatoncys wants to merge 2 commits intoapache:masterfrom
eatoncys:UnsafeAlignedOffset
Closed

[SPARK-25069][CORE]Using UnsafeAlignedOffset to make the entire record of 8 byte Items aligned like which is used in UnsafeExternalSorter #22053
eatoncys wants to merge 2 commits intoapache:masterfrom
eatoncys:UnsafeAlignedOffset

Conversation

@eatoncys
Copy link
Copy Markdown
Contributor

@eatoncys eatoncys commented Aug 9, 2018

What changes were proposed in this pull request?

The class of UnsafeExternalSorter used UnsafeAlignedOffset to make the entire record of 8 byte Items aligned, but ShuffleExternalSorter not.
The SPARC platform requires this because using a 4 byte Int for record lengths causes the entire record of 8 byte Items to become misaligned by 4 bytes. Using a 8 byte long for record length keeps things 8 byte aligned.

How was this patch tested?

Existing Test.

@eatoncys eatoncys changed the title [SPARK-25069][Core]Using UnsafeAlignedOffset to make the entire record of 8 byte Items aligned like which is used in UnsafeExternalSorter [SPARK-25069][CORE]Using UnsafeAlignedOffset to make the entire record of 8 byte Items aligned like which is used in UnsafeExternalSorter Aug 9, 2018
@SparkQA
Copy link
Copy Markdown

SparkQA commented Aug 9, 2018

Test build #94486 has finished for PR 22053 at commit 8559c45.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 9, 2018

Good catch
LGTM with a minor comment: Would it be better to update comments regarding 4 bytes with 4 or 8 bytes in UnsafeExternalSorter.java.

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 9, 2018

cc @cloud-fan

@eatoncys
Copy link
Copy Markdown
Contributor Author

@kiszk The comments updated , Thanks for review.

@SparkQA
Copy link
Copy Markdown

SparkQA commented Aug 10, 2018

Test build #94532 has finished for PR 22053 at commit d95d357.

  • This patch fails MiMa tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 10, 2018

retest this please

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 10, 2018

cc @hvanhovell

@cloud-fan
Copy link
Copy Markdown
Contributor

LGTM

is this a data correctness issue? how far shall we backport it?

cc @tgravescs

@eatoncys
Copy link
Copy Markdown
Contributor Author

@cloud-fan Unaligned accesses are not supported on SPARC architecture, which is discussed on the issure:
https://issues.apache.org/jira/browse/SPARK-16962.

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 10, 2018

I think that this is not a data correctness issue. This may cause unexpected program abort due to hardware memory access error.
BTW, it would be good to backport it to increase stability.

@SparkQA
Copy link
Copy Markdown

SparkQA commented Aug 10, 2018

Test build #94534 has finished for PR 22053 at commit d95d357.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 10, 2018

retest this please

@SparkQA
Copy link
Copy Markdown

SparkQA commented Aug 10, 2018

Test build #94545 has finished for PR 22053 at commit d95d357.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 10, 2018

retest this please

@SparkQA
Copy link
Copy Markdown

SparkQA commented Aug 10, 2018

Test build #94548 has finished for PR 22053 at commit d95d357.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 10, 2018

retest this please

@tgravescs
Copy link
Copy Markdown
Contributor

Sounds like its not a correctness issue. I wasn't aware we were support sparc, although looking at our docs I don't see that we list anything explicitly. This is listed as an improvement and generally we don't backport those, it does sound more like a defect to me if https://issues.apache.org/jira/browse/SPARK-16962 was done in an effort to support it.

@SparkQA
Copy link
Copy Markdown

SparkQA commented Aug 10, 2018

Test build #94560 has finished for PR 22053 at commit d95d357.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@kiszk
Copy link
Copy Markdown
Member

kiszk commented Aug 12, 2018

retest this please

@SparkQA
Copy link
Copy Markdown

SparkQA commented Aug 12, 2018

Test build #94645 has finished for PR 22053 at commit d95d357.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Copy Markdown
Contributor

thanks, merging to master!

@asfgit asfgit closed this in 02d0a1f Aug 13, 2018
cloud-fan pushed a commit that referenced this pull request Apr 17, 2020
…dKeyValueBatch should also respect UnsafeAlignedOffset

### What changes were proposed in this pull request?

Make `UnsafeKVExternalSorter` / `VariableLengthRowBasedKeyValueBatch ` also respect `UnsafeAlignedOffset` when reading the record and update some out of date comemnts.

### Why are the changes needed?

Since `BytesToBytesMap` respects `UnsafeAlignedOffset` when writing the record, `UnsafeKVExternalSorter` should also respect `UnsafeAlignedOffset` when reading the record from `BytesToBytesMap` otherwise it will causes data correctness issue.

Unlike `UnsafeKVExternalSorter` may reading records from `BytesToBytesMap`, `VariableLengthRowBasedKeyValueBatch` writes and reads records by itself. Thus, similar to #22053 and [comment](#22053 (comment)) there, fix for `VariableLengthRowBasedKeyValueBatch` more likely an improvement for the support of SPARC platform.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually tested `HashAggregationQueryWithControlledFallbackSuite` with `UAO_SIZE=8`  to simulate SPARC platform. And tests only pass with this fix.

Closes #28195 from Ngone51/fix_uao.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
cloud-fan pushed a commit that referenced this pull request Apr 17, 2020
…dKeyValueBatch should also respect UnsafeAlignedOffset

### What changes were proposed in this pull request?

Make `UnsafeKVExternalSorter` / `VariableLengthRowBasedKeyValueBatch ` also respect `UnsafeAlignedOffset` when reading the record and update some out of date comemnts.

### Why are the changes needed?

Since `BytesToBytesMap` respects `UnsafeAlignedOffset` when writing the record, `UnsafeKVExternalSorter` should also respect `UnsafeAlignedOffset` when reading the record from `BytesToBytesMap` otherwise it will causes data correctness issue.

Unlike `UnsafeKVExternalSorter` may reading records from `BytesToBytesMap`, `VariableLengthRowBasedKeyValueBatch` writes and reads records by itself. Thus, similar to #22053 and [comment](#22053 (comment)) there, fix for `VariableLengthRowBasedKeyValueBatch` more likely an improvement for the support of SPARC platform.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually tested `HashAggregationQueryWithControlledFallbackSuite` with `UAO_SIZE=8`  to simulate SPARC platform. And tests only pass with this fix.

Closes #28195 from Ngone51/fix_uao.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 40f9dbb)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants