Skip to content

JNI bindings for NTH_ELEMENT window aggregation#11201

Merged
rapids-bot[bot] merged 15 commits intorapidsai:branch-22.08from
mythrocks:window-first-last-jni
Jul 8, 2022
Merged

JNI bindings for NTH_ELEMENT window aggregation#11201
rapids-bot[bot] merged 15 commits intorapidsai:branch-22.08from
mythrocks:window-first-last-jni

Conversation

@mythrocks
Copy link
Contributor

Depends on #11158.

This commit adds the JNI bindings required to invoke NTH_ELEMENT aggregations as a window aggregation. Java tests are included.

This is to address spark-rapids/issues/4005 and
spark-rapids/issues/5061.

This change adds support for `NTH_ELEMENT` window aggregations.
This should allow for the implementation of `FIRST()`, `LAST()`,
and `NTH_VALUE()` window functions in Spark SQL.

`NTH_ELEMENT` in window function returns the Nth element from the
specified window for each row in a column. `N` is deemed to be
zero based, so `NTH_ELEMENT(0)` translates to the first element
in a window. Similarly, `NTH_ELEMENT(-1)` translates to the last.

If for any window of size `W`, if the specified `N` falls outside
the range `[ -W, W-1 ]`, a null element is returned for that row.
This commit adds the JNI bindings required to invoke `NTH_ELEMENT`
aggregations as a window aggregation. Java tests are included.

Depends on rapidsai#11158.
@mythrocks mythrocks requested review from a team as code owners July 5, 2022 23:31
@mythrocks mythrocks self-assigned this Jul 5, 2022
@mythrocks mythrocks requested a review from a team as a code owner July 5, 2022 23:31
@mythrocks mythrocks requested review from rgsl888prabhu and vyasr July 5, 2022 23:31
@mythrocks mythrocks marked this pull request as draft July 5, 2022 23:31
@github-actions github-actions bot added CMake CMake build issue Java Affects Java cuDF API. libcudf Affects libcudf (C++/CUDA) code. labels Jul 5, 2022
@mythrocks mythrocks added feature request New feature or request 4 - Needs cuDF (Java) Reviewer non-breaking Non-breaking change and removed libcudf Affects libcudf (C++/CUDA) code. CMake CMake build issue labels Jul 5, 2022
@mythrocks
Copy link
Contributor Author

mythrocks commented Jul 5, 2022

Note: While the changes listed include the changes from #11158, only the changes under the java/ directory (specifically commmit#7d02895) are material to this review.

@codecov
Copy link

codecov bot commented Jul 6, 2022

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.08@cdd4e03). Click here to learn what that means.
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.08   #11201   +/-   ##
===============================================
  Coverage                ?   86.30%           
===============================================
  Files                   ?      144           
  Lines                   ?    22698           
  Branches                ?        0           
===============================================
  Hits                    ?    19589           
  Misses                  ?     3109           
  Partials                ?        0           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update cdd4e03...3827a30. Read the comment docs.

@mythrocks mythrocks marked this pull request as ready for review July 6, 2022 05:21
1. Using #pragma once.
2. Doxygen format.
3. Renamed gather map functor.
4. Materialized gather indices.
@github-actions github-actions bot added CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. labels Jul 7, 2022
@mythrocks mythrocks added DO NOT MERGE Hold off on merging; see PR for details and removed libcudf Affects libcudf (C++/CUDA) code. labels Jul 7, 2022
@mythrocks
Copy link
Contributor Author

Setting DO NOT MERGE until the dependency is merged.

Copy link
Contributor

@rwlee rwlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM pending dependency being merged. 1 formatting/whitespace nit

@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jul 7, 2022
@mythrocks mythrocks requested review from razajafri and rwlee July 7, 2022 21:19
@mythrocks mythrocks removed the DO NOT MERGE Hold off on merging; see PR for details label Jul 7, 2022
@github-actions github-actions bot removed CMake CMake build issue libcudf Affects libcudf (C++/CUDA) code. labels Jul 7, 2022
@mythrocks
Copy link
Contributor Author

@razajafri, @rwlee, I have addressed your concerns. I'd appreciate it if you could take another gander.
Since #11158 has been merged, rebasing this change has reduced the diff to its core.

@vyasr vyasr removed the request for review from a team July 8, 2022 00:54
@razajafri
Copy link
Contributor

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 9ecb672 into rapidsai:branch-22.08 Jul 8, 2022
@mythrocks
Copy link
Contributor Author

Thank you for the reviews, @rwlee, @razajafri.

@vyasr vyasr added 4 - Needs Review Waiting for reviewer to review or respond and removed 4 - Needs cuDF (Java) Reviewer labels Feb 23, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

4 - Needs Review Waiting for reviewer to review or respond feature request New feature or request Java Affects Java cuDF API. non-breaking Non-breaking change

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants