Add 3.5.1-SNAPSHOT Shim #9962
Conversation
Signed-off-by: Raza Jafri <rjafri@nvidia.com>
I'm assuming your decimal multiply change is related to #9859... If so, please make sure it fixes it all the way, or we should comment on that issue. The shim is very hard to read: one calls mul128, the other calls multiply128. I haven't gone and looked at those, but it's hard to even see that diff, so you should at the very least add a comment or point to the issue and explain.
I will go ahead and put in some comments to highlight the change.
Discussed this offline. I missed the division bit of the puzzle. Will verify division and post an update here.
sql-plugin/src/main/spark311/scala/com/nvidia/spark/rapids/shims/DecimalUtilShims.scala
I have verified the Decimal division and we match the Spark 3.5.1 output. It turns out that we were always doing the right thing on the GPU for decimal division, so to match Spark bug for bug we should "fix" Databricks 330+ and Spark 340+ by returning the bad answer. I have created an issue for that here.
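For readers following the thread, below is a rough sketch of the kind of per-version dispatch the new shim introduces. The spark-rapids-jni signatures of `multiply128`/`mul128` and the compile-time switch shown here are assumptions made for illustration, not code copied from this PR:

```scala
// Hypothetical sketch, not the actual shim code. It assumes both JNI entry
// points take (lhs, rhs, outputScale) and return a cudf Table holding an
// overflow column plus the product column; check spark-rapids-jni for the
// real signatures.
import ai.rapids.cudf.{ColumnView, Table}
import com.nvidia.spark.rapids.jni.DecimalUtils

object DecimalMultiplyShimSketch {
  // In the real plugin this choice is baked into each shim at build time via
  // Shimplify; the flag here only keeps the sketch self-contained.
  private val castInterimResult: Boolean = false // false for Spark 3.5.1-style semantics

  def multiply(lhs: ColumnView, rhs: ColumnView, outputScale: Int): Table = {
    if (castInterimResult) {
      // Older Spark behavior: multiplication with the interim cast.
      DecimalUtils.multiply128(lhs, rhs, outputScale)
    } else {
      // Spark 3.5.1 (and the other versions noted above) drops the interim
      // cast, which the newly added mul128 entry point mirrors.
      DecimalUtils.mul128(lhs, rhs, outputScale)
    }
  }
}
```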
build

premerge failing due to an unrelated change

build

build

build

build

build
This reverts commit 533504f.
I have reverted the tests for versions that we don't support yet. They will be added in other shims.
build
@andygrove can you PTAL?
    throw RapidsErrorUtils.
      arithmeticOverflowError("One or more rows overflow for Add operation.")
Let us leave formatting-only changes to dedicated PRs.
  withResource(actualSize) { _ =>
    val mergedEquals = withResource(start.equalTo(stop)) { equals =>
      if (step.hasNulls) {
        // Also set the row to null where step is null.
        equals.mergeAndSetValidity(BinaryOp.BITWISE_AND, equals, step)
      } else {
        equals.incRefCount()
      }
    }
    withResource(mergedEquals) { _ =>
      mergedEquals.ifElse(one, actualSize)
    }
  }
}
  withResource(sizeAsLong) { _ =>
    // check max size
    withResource(Scalar.fromInt(MAX_ROUNDED_ARRAY_LENGTH)) { maxLen =>
      withResource(sizeAsLong.lessOrEqualTo(maxLen)) { allValid =>
        require(isAllValidTrue(allValid),
          s"Too long sequence found. Should be <= $MAX_ROUNDED_ARRAY_LENGTH")
      }
    }
    // cast to int and return
    sizeAsLong.castTo(DType.INT32)
  }
}
The bottom portion (L85-L111 in 311 and L98-L126 in 351) differs only in the require message; let us refactor to minimize shimming.
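One possible shape for that refactor, sketched under assumptions (the helper object/method names below are hypothetical, and `withResource`/`isAllValidTrue` are the same helpers already used in the hunks above): keep the check and the cast in common code and have each shim supply only its version-specific message.

```scala
// Sketch only: SequenceSizeCommon and castCheckedSize are made-up names, not
// from this PR. withResource and isAllValidTrue come from the surrounding
// spark-rapids code, as in the diff hunks quoted above.
import ai.rapids.cudf.{ColumnVector, DType, Scalar}

object SequenceSizeCommon {
  // The shared "bottom portion": bounds check plus cast. Each shim passes in
  // just the require message, which is the only part that differs.
  def castCheckedSize(sizeAsLong: ColumnVector, maxLen: Int)
      (tooLongMsg: Int => String): ColumnVector = {
    withResource(sizeAsLong) { _ =>
      withResource(Scalar.fromInt(maxLen)) { maxScalar =>
        withResource(sizeAsLong.lessOrEqualTo(maxScalar)) { allValid =>
          require(isAllValidTrue(allValid), tooLongMsg(maxLen))
        }
      }
      // cast to int and return
      sizeAsLong.castTo(DType.INT32)
    }
  }
}
```

A 311 call site would then read roughly `castCheckedSize(sizeAsLong, MAX_ROUNDED_ARRAY_LENGTH)(max => s"Too long sequence found. Should be <= $max")`, with 351 differing only in the message string.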
Thanks for taking a look @gerashegalov, PTAL.
build
Do we plan to run nightly integration tests against spark-3.5.1-SNAPSHOT?
Yes, we do.
This PR adds shims for Spark 3.5.1-SNAPSHOT.
Changes Made:
- The `Shimplify` command was run. The only files that were manually changed were `pom.xml` and `ShimServiceProvider.scala`, to add the SNAPSHOT version to the `VERSIONNAMES` (a sketch of the resulting service provider follows this list). Some empty lines were also removed as a result of the above `Shimplify` command.
- `DecimalUtilShims.scala`, which calls the respective multiplication method depending on the Spark version. In Spark 3.5.1 and other versions the multiplication doesn't perform an interim cast, and as part of a spark-rapids-jni PR another method called `mul128` was added which skips the interim cast.
- `ComputeSequenceSize.scala`, to provide a shim for the new method that calculates the sequence size and makes sure it's within the limit.
- `GpuBatchScanExec`, to match the changes in Spark.
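As an illustration of the manually edited part, here is a minimal sketch of roughly what a shim's `SparkShimServiceProvider` with the extra SNAPSHOT alias looks like; the package name, member names, and exact wiring are assumptions here, not code copied from this PR:

```scala
// Sketch only: approximate shape of a per-version service provider; the real
// file in the 351 shim may differ in package, members, and details.
package com.nvidia.spark.rapids.shims.spark351

import com.nvidia.spark.rapids.SparkShimVersion

object SparkShimServiceProvider {
  val VERSION = SparkShimVersion(3, 5, 1)
  // The manual change described above: include the -SNAPSHOT alias so the
  // plugin recognizes builds of the not-yet-released Spark 3.5.1.
  val VERSIONNAMES: Seq[String] = Seq(s"$VERSION", s"$VERSION-SNAPSHOT")
}

class SparkShimServiceProvider extends com.nvidia.spark.rapids.SparkShimServiceProvider {
  def getShimVersion: SparkShimVersion = SparkShimServiceProvider.VERSION

  def matchesVersion(version: String): Boolean =
    SparkShimServiceProvider.VERSIONNAMES.contains(version)
}
```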
Tests:
All integration tests were run locally
fixes #9258
fixes #9859
fixes #9875
fixes #9743