[SPARK-25069][CORE]Using UnsafeAlignedOffset to make the entire record of 8 byte Items aligned like which is used in UnsafeExternalSorter #22053
Test build #94486 has finished for PR 22053 at commit
Good catch
cc @cloud-fan
@kiszk The comments are updated. Thanks for the review.
Test build #94532 has finished for PR 22053 at commit
retest this please
cc @hvanhovell
LGTM. Is this a data correctness issue? How far shall we backport it? cc @tgravescs
@cloud-fan Unaligned accesses are not supported on the SPARC architecture, as discussed in the issue:
I think that this is not a data correctness issue. It may cause the program to abort unexpectedly due to a hardware memory access error.
Test build #94534 has finished for PR 22053 at commit
retest this please
Test build #94545 has finished for PR 22053 at commit
retest this please
Test build #94548 has finished for PR 22053 at commit
retest this please
Sounds like it's not a correctness issue. I wasn't aware we supported SPARC, although looking at our docs I don't see that we list anything explicitly. This is listed as an improvement, and generally we don't backport those; it does sound more like a defect to me if https://issues.apache.org/jira/browse/SPARK-16962 was done in an effort to support it.
Test build #94560 has finished for PR 22053 at commit
retest this please
Test build #94645 has finished for PR 22053 at commit
thanks, merging to master!
…dKeyValueBatch should also respect UnsafeAlignedOffset

### What changes were proposed in this pull request?

Make `UnsafeKVExternalSorter` / `VariableLengthRowBasedKeyValueBatch` also respect `UnsafeAlignedOffset` when reading the record, and update some out-of-date comments.

### Why are the changes needed?

Since `BytesToBytesMap` respects `UnsafeAlignedOffset` when writing the record, `UnsafeKVExternalSorter` should also respect `UnsafeAlignedOffset` when reading the record from `BytesToBytesMap`; otherwise it will cause a data correctness issue.

Unlike `UnsafeKVExternalSorter`, which may read records from `BytesToBytesMap`, `VariableLengthRowBasedKeyValueBatch` writes and reads records by itself. Thus, similar to #22053 and the [comment](#22053 (comment)) there, the fix for `VariableLengthRowBasedKeyValueBatch` is more likely an improvement for the support of the SPARC platform.

### Does this PR introduce any user-facing change?

No.

### How was this patch tested?

Manually tested `HashAggregationQueryWithControlledFallbackSuite` with `UAO_SIZE=8` to simulate the SPARC platform. The tests only pass with this fix.

Closes #28195 from Ngone51/fix_uao.

Authored-by: yi.wu <yi.wu@databricks.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
…dKeyValueBatch should also respect UnsafeAlignedOffset (cherry picked from commit 40f9dbb) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
What changes were proposed in this pull request?
`UnsafeExternalSorter` uses `UnsafeAlignedOffset` to keep each record of 8-byte items aligned, but `ShuffleExternalSorter` does not.
The SPARC platform requires this alignment: storing the record length as a 4-byte int shifts the 8-byte items that follow it by 4 bytes, so the entire record becomes misaligned. Storing the record length as an 8-byte long keeps everything 8-byte aligned.
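The offset arithmetic behind this can be illustrated with a small standalone sketch. It does not use Spark's `UnsafeAlignedOffset` or `Platform` classes; the class `RecordAlignmentSketch` and its methods are hypothetical names, and a heap `ByteBuffer` stands in for a memory page. It assumes each record is laid out as a length prefix followed by 8-byte items, and that the record itself starts at an 8-byte-aligned offset.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical illustration only; this is not Spark's UnsafeAlignedOffset. */
final class RecordAlignmentSketch {

    /** Offset of the first payload byte when a prefix of prefixSize bytes precedes it. */
    static long payloadOffset(long recordOffset, int prefixSize) {
        return recordOffset + prefixSize;
    }

    /** True when the offset is a multiple of 8. */
    static boolean isEightByteAligned(long offset) {
        return (offset & 7L) == 0;
    }

    /** Writes [length prefix][8-byte items] at recordOffset and returns the payload offset. */
    static int writeRecord(ByteBuffer page, int recordOffset, int prefixSize, long[] items) {
        int lengthInBytes = items.length * Long.BYTES;
        if (prefixSize == 8) {
            page.putLong(recordOffset, lengthInBytes);   // 8-byte length keeps the payload aligned
        } else if (prefixSize == 4) {
            page.putInt(recordOffset, lengthInBytes);    // 4-byte length shifts the payload by 4
        } else {
            throw new IllegalArgumentException("prefixSize must be 4 or 8");
        }
        int payload = recordOffset + prefixSize;
        for (int i = 0; i < items.length; i++) {
            page.putLong(payload + i * Long.BYTES, items[i]);
        }
        return payload;
    }

    /** Reads the length back; the reader must use the same prefix size as the writer. */
    static int readLength(ByteBuffer page, int recordOffset, int prefixSize) {
        return prefixSize == 8 ? (int) page.getLong(recordOffset) : page.getInt(recordOffset);
    }

    public static void main(String[] args) {
        ByteBuffer page = ByteBuffer.allocate(64).order(ByteOrder.nativeOrder());
        long[] items = {1L, 2L};
        int p4 = writeRecord(page, 0, 4, items);
        int p8 = writeRecord(page, 32, 8, items);
        System.out.println("4-byte prefix: payload at " + p4 + ", aligned=" + isEightByteAligned(p4));
        System.out.println("8-byte prefix: payload at " + p8 + ", aligned=" + isEightByteAligned(p8));
    }
}
```

A heap `ByteBuffer` tolerates the misaligned writes in the 4-byte case, so the sketch only demonstrates the offsets; on hardware that rejects unaligned 8-byte accesses, raw memory access at such an offset is what would fail.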
How was this patch tested?
Existing tests.