Conversation

@mythrocks
Collaborator

@mythrocks mythrocks commented Nov 19, 2020

Fixes #1039. Depends on rapidsai/cudf#6811.

This change makes unbounded time-range window function queries return the right results. It relies on explicit support for specifying unbounded window boundaries in CUDF, so it cannot be checked in until rapidsai/cudf#6811 is merged.
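
For readers unfamiliar with the term, here is a minimal sketch of what an unbounded time-range window computes. It is plain Python over a single partition with non-null day values, not the plugin's or CUDF's implementation; the function name `range_window_sums` and the convention that `None` stands for UNBOUNDED are assumptions made only for this illustration.

```python
from typing import List, Optional, Sequence


def range_window_sums(days: Sequence[int],
                      values: Sequence[int],
                      preceding: Optional[int],
                      following: Optional[int]) -> List[int]:
    """Sum `values` over a RANGE window keyed on `days`.

    For the row with day d, the window holds every row whose day lies in
    [d - preceding, d + following]. Passing None for a bound (a hypothetical
    convention for this sketch) means that side of the window is UNBOUNDED.
    """
    sums = []
    for d in days:
        lower = None if preceding is None else d - preceding
        upper = None if following is None else d + following
        total = sum(
            v for day, v in zip(days, values)
            if (lower is None or day >= lower)
            and (upper is None or day <= upper)
        )
        sums.append(total)
    return sums


days = [1, 2, 4, 7]
values = [10, 20, 30, 40]

# RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: a running total.
print(range_window_sums(days, values, None, 0))     # [10, 30, 60, 100]
# RANGE BETWEEN 1 DAY PRECEDING AND UNBOUNDED FOLLOWING.
print(range_window_sums(days, values, 1, None))     # [100, 100, 70, 40]
# Both sides unbounded: every row sees the whole partition.
print(range_window_sums(days, values, None, None))  # [100, 100, 100, 100]
```

The point of an explicit UNBOUNDED marker is that the bound is skipped entirely rather than approximated with a very large interval.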

@mythrocks mythrocks self-assigned this Nov 19, 2020
@mythrocks mythrocks changed the title Fixed UNBOUNDED time-range query. [WIP] Fixed UNBOUNDED time-range query. Nov 19, 2020
@mythrocks mythrocks requested a review from revans2 November 20, 2020 00:18
@mythrocks mythrocks changed the title [WIP] Fixed UNBOUNDED time-range query. [WIP] Use CUDF's "UNBOUNDED" window boundaries for time-range queries. Nov 20, 2020
@revans2
Collaborator

revans2 commented Nov 20, 2020

@mythrocks you need to update the commit with a signoff

revans2
revans2 previously approved these changes Nov 20, 2020
@sameerz sameerz added the bug Something isn't working label Nov 23, 2020
@mythrocks
Collaborator Author

> Depends on rapidsai/cudf#6811.

This was merged a couple of hours ago. I have rebased the change and signed the commit.

@mythrocks
Collaborator Author

> mythrocks dismissed revans2’s stale review via 1d6d130 26 minutes ago

With apologies to @revans2, this is a little harsh; I did no such thing. Signing the commit invalidated the previous review. The new commit has no material difference from the old one.

@mythrocks mythrocks changed the title [WIP] Use CUDF's "UNBOUNDED" window boundaries for time-range queries. Use CUDF's "UNBOUNDED" window boundaries for time-range queries. Nov 24, 2020
@revans2
Collaborator

revans2 commented Nov 25, 2020

build

@mythrocks
Collaborator Author

An odd test failure:

11:59:31  - a block fits entirely, but a subsequent block doesn't
11:59:31  HashAggregatesSuite:
11:59:46  *** RUN ABORTED ***
11:59:46    java.lang.NoSuchMethodError: org.apache.spark.sql.execution.exchange.ShuffleExchangeExec.canChangeNumPartitions()Z
11:59:46    at org.apache.spark.sql.rapids.execution.GpuShuffleMeta.convertToGpu(GpuShuffleExchangeExec.scala:68)
11:59:46    at org.apache.spark.sql.rapids.execution.GpuShuffleMeta.convertToGpu(GpuShuffleExchangeExec.scala:42)
11:59:46    at com.nvidia.spark.rapids.SparkPlanMeta.convertIfNeeded(RapidsMeta.scala:615)
11:59:46    at com.nvidia.spark.rapids.GpuOverrides$$anon$138.convertToGpu(GpuOverrides.scala:2046)
11:59:46    at com.nvidia.spark.rapids.GpuOverrides$$anon$138.convertToGpu(GpuOverrides.scala:2040)
11:59:46    at com.nvidia.spark.rapids.SparkPlanMeta.convertIfNeeded(RapidsMeta.scala:615)
11:59:46    at com.nvidia.spark.rapids.GpuSortMeta.convertToGpu(GpuSortExec.scala:42)
11:59:46    at com.nvidia.spark.rapids.GpuSortMeta.convertToGpu(GpuSortExec.scala:33)
11:59:46    at com.nvidia.spark.rapids.SparkPlanMeta.convertIfNeeded(RapidsMeta.scala:615)
11:59:46    at com.nvidia.spark.rapids.GpuSortAggregateMeta.convertToGpu(aggregate.scala:285)
11:59:46    ...
11:59:46  [INFO] ------------------------------------------------------------------------
11:59:46  [INFO] Reactor Summary:
11:59:46  [INFO] 
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark Root Project ... SUCCESS [  1.368 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin ..... SUCCESS [  7.770 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark Shuffle Plugin . SUCCESS [  0.460 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin Shims SUCCESS [  0.068 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin Spark 3.0.0 Shim SUCCESS [  0.592 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin Spark 3.0.0 EMR Shim SUCCESS [  0.432 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin Spark 3.0.1 Shim SUCCESS [  0.528 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin Spark 3.0.1 EMR Shim SUCCESS [  0.445 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin Spark 3.1.0 Shim SUCCESS [ 11.990 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin Spark 3.0.2 Shim SUCCESS [ 11.075 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark SQL Plugin Shim Aggregator SUCCESS [  0.069 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark Scala UDF Plugin SUCCESS [ 35.900 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark Distribution ... SUCCESS [  0.124 s]
11:59:46  [INFO] RAPIDS Accelerator for Apache Spark Tests .......... FAILURE [ 44.202 s]
11:59:46  [INFO] rapids-4-spark-integration-tests_2.12 .............. SKIPPED
11:59:46  [INFO] rapids-4-spark-api-validation ...................... SKIPPED
11:59:46  [INFO] ------------------------------------------------------------------------
11:59:46  [INFO] BUILD FAILURE
11:59:46  [INFO] ------------------------------------------------------------------------
11:59:46  [INFO] Total time: 01:55 min
11:59:46  [INFO] Finished at: 2020-11-25T19:59:48+00:00
11:59:46  [INFO] Final Memory: 116M/1315M
11:59:46  [INFO] ------------------------------------------------------------------------

@abellina
Collaborator

Looks related to: apache/spark#30432. So ShuffleExchangeLike changed and canChangeNumPartitions became shuffleOrigin

@mythrocks
Collaborator Author

> Looks related to: apache/spark#30432. So ShuffleExchangeLike changed and canChangeNumPartitions became shuffleOrigin

#1206 has been committed. (Thanks, @jlowe.) I'll try a rebuild.

@mythrocks
Collaborator Author

build

@jlowe
Contributor

jlowe commented Nov 30, 2020

> #1206 has been committed. (Thanks, @jlowe.) I'll try a rebuild.

Note that you'll need to upmerge to branch-0.3 to pick up the fix in CI.

@mythrocks
Collaborator Author

build

@sameerz
Collaborator

sameerz commented Dec 3, 2020

build

@mythrocks mythrocks merged commit 5853fc1 into NVIDIA:branch-0.3 Dec 3, 2020
@mythrocks mythrocks deleted the unbounded-range-window-query branch December 3, 2020 06:17
@mythrocks
Collaborator Author

> build

Thanks, @sameerz. The builds launched when I rebased seemed to hang as well. It seems to have worked that time. w007!

nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
Development

Successfully merging this pull request may close these issues.

[BUG] UNBOUNDED window ranges on null timestamp columns produces incorrect results.

5 participants