Struct binary comparison op functionality for spark rapids #11153
Merged
rapids-bot merged 140 commits into rapidsai:branch-22.08 from rwlee:rwlee/spark_structbinops on Jul 8, 2022
Commits
262c2a2
struct binop first pass
rwlee a865fe0
vector-vector nested struct comparison
rwlee ce4440c
cleanup and simplify core code
rwlee 1472c5f
remove type dispatch and other code cleanup
rwlee 4a31fb6
move struct comparison to compiled binops code
rwlee ce2d727
improved testing, type checks, and skipped null value calculations
rwlee 10c95f9
cleanup
rwlee 12cd09e
Merge branch 'branch-21.12' into rwlee/struct_col_compare
rwlee b6fa590
fix upmerge issues
rwlee d64c1f9
fix logic and improve documentation
rwlee 57900da
clean up logic for nulls
rwlee f170149
remove unecessary call to superimpose parent nulls
rwlee de129a1
PR fixes
rwlee 5e84e89
Merge branch 'branch-21.12' into rwlee/struct_col_compare
rwlee b632192
pr fixes
rwlee 4266f8c
Merge branch 'branch-21.12' into rwlee/struct_col_compare
rwlee 367ec07
restructure struct binop code and other pr fixes
rwlee 5a1f016
Merge branch 'branch-21.12' into rwlee/struct_col_compare
rwlee 1f29168
Merge branch 'branch-21.12' into rwlee/struct_col_compare
rwlee 1d6263e
full paths for includes
rwlee 97bd5e1
Merge branch 'branch-21.12' into rwlee/struct_col_compare
rwlee 48d0355
move to new TU and remove common code
rwlee 6cf0e16
fix logic errors and push down struct branching
rwlee 9ec2acf
remove deleted file from CMakeLists
rwlee 3016abf
Naming and comment fixes
rwlee b2a7973
naming
rwlee 191da69
style formatting
rwlee 2cf2b28
merge apply_binary_op and _impl implementation
rwlee 2b634c4
all apply_binary_op calls call apply_binary_op_impl
rwlee 19f1afb
common code path
rwlee f316a0a
explicit instantiation of struct_compare
rwlee c684ef1
Merge branch 'branch-22.02' into rwlee/struct_col_compare
rwlee 7f36241
streamline explicit instantiation
rwlee 8cf0660
Merge remote-tracking branch 'pub/branch-22.02' into rwlee/struct_col…
rwlee 2abefd5
remove op argument
rwlee 8cc05e2
documentation
rwlee 1bde152
Merge branch 'branch-22.02' into rwlee/struct_col_compare
rwlee 470acfe
Fix upmerge errors
rwlee ce21d90
Merge remote-tracking branch 'pub/branch-22.02' into rwlee/struct_col…
rwlee 83fa370
Merge remote-tracking branch 'pub/branch-22.04' into rwlee/struct_col…
rwlee 2b77739
fix new ops from upmerge
rwlee de09cec
Fix floating point nan handling in struct comparison binops
rwlee 8ad9545
Merge remote-tracking branch 'pub/branch-22.04' into rwlee/struct_col…
rwlee 251d607
fix formatting
rwlee 703aaf8
fix copyright
rwlee 43e451b
fix accidently deletd function
rwlee 9ec4a41
style fix
rwlee 201a89b
copyright fix
rwlee cc164d6
Merge remote-tracking branch 'pub/branch-22.04' into rwlee/struct_col…
rwlee 1bb1534
fix cmake style
rwlee 6c6c8ab
re-add missing function name
rwlee 42e58ae
style fix
rwlee 475c896
Fix struct equality binop comparisons
rwlee 1dae04a
PR reviews
rwlee 62224cf
Merge remote-tracking branch 'pub/branch-22.04' into rwlee/struct_col…
rwlee a35600d
refactor row comparison operators into common spaceship operator
rwlee b6f0397
Merge remote-tracking branch 'pub/branch-22.04' into rwlee/struct_col…
rwlee fcc1dd2
first pass, test failures
rwlee 5abf2a8
Merge remote-tracking branch 'pub/branch-22.06' into rwlee/struct_col…
rwlee 9d50ac0
Refactor struct binop comparison to use experimental ops
rwlee 8628c24
Merge remote-tracking branch 'pub/branch-22.06' into rwlee/struct_col…
rwlee a836a96
Merge remote-tracking branch 'pub/branch-22.06' into rwlee/struct_col…
rwlee a537805
fix performance regression and code cleanup
rwlee f7af41f
Merge remote-tracking branch 'pub/branch-22.06' into rwlee/struct_col…
rwlee 1af4643
fix merge errors
rwlee 4d929d9
Merge remote-tracking branch 'upstream/branch-22.06' into rwlee/struc…
bdice 2298988
Revert include changes.
bdice bf1c6ee
split off weak ordering row operator changes
rwlee 5d87db2
device_row_comparator private with friend class
rwlee fd716b9
Merge remote-tracking branch 'pub/branch-22.06' into rwlee/row_op_split
rwlee 2dd2045
device_less conversion to templated struct
rwlee 7ba960e
fold parameter pack
rwlee 84833e7
Apply suggestions from code review
rwlee a944b4f
Merge remote-tracking branch 'pub/branch-22.06' into rwlee/struct_col…
rwlee 4d197ea
fix code style
rwlee 08092fe
Merge branch 'rwlee/row_op_split' of github.com:rwlee/cudf into rwlee…
rwlee d8986c5
Merge branch 'rwlee/row_op_split' into rwlee/struct_col_compare
rwlee 548dcf1
fix code format
rwlee 1dd1159
Merge remote-tracking branch 'upstream/branch-22.06' into rwlee/struc…
bdice eaffdea
NaN handling in device_row_comparators
rwlee a657b14
Merge remote-tracking branch 'pub/branch-22.06' into rwlee/struct_col…
rwlee 3d2a475
Merge remote-tracking branch 'pub/branch-22.06' into rwlee/nanconfig
rwlee e518668
template the comparator
rwlee c99e3c5
partial fix to performance regression
rwlee bb00193
Merge remote-tracking branch 'pub/branch-22.08' into rwlee/nanconfig
rwlee 798d6c5
Template NaN config lexicographic and equality
rwlee c5f9961
Add experimental row operator tests
rwlee 0d4e798
switch to CUDF_ENABLE_IF
rwlee f35d9e3
Naming and add equality tests
rwlee d7fca8c
pr fixes, split off struct op
rwlee e4cee95
reorder cmake test file
rwlee 58d2663
fix cmake formatting
rwlee 3a6cd68
Merge remote-tracking branch 'pub/branch-22.08' into rwlee/nanconfig
rwlee 6d57678
Merge branch 'rwlee/nanconfig' into rwlee/struct_col_compare
rwlee 49be087
Apply suggestions from code review
rwlee d0f64f2
ctad refactor + split device_comaparator call
rwlee 0576e6c
Merge branch 'rwlee/nanconfig' of github.com:rwlee/cudf into rwlee/na…
rwlee 1bd5405
rename experimental op test file
rwlee 70da3b4
comment cleanup and pr fixes
rwlee e4a7029
physical comparator clarification docs
rwlee 1240c85
fix whitespace
rwlee 3304615
fix formatting
rwlee 97c116e
update binary op functionality for nanconfig comparators
rwlee b1bc702
Merge branch 'rwlee/nanconfig' into rwlee/struct_col_compare
rwlee 4f80b1c
Merge remote-tracking branch 'pub/branch-22.08' into rwlee/nanconfig
rwlee 9ca2d29
fix copyright
rwlee 933b25d
Merge remote-tracking branch 'origin/rwlee/nanconfig' into rwlee/stru…
rwlee 51938fd
Functors as arguments, enabling CTAD
rwlee 3f545fe
Docs, cleanup, and renaming
rwlee fe4fc60
update struct ops for new nan configurations
rwlee d0c58a6
Merge remote-tracking branch 'origin/rwlee/nanconfig' into rwlee/stru…
rwlee 2eb8103
device_comparator --> equal_to
rwlee 2e1c9a3
Merge remote-tracking branch 'origin/rwlee/nanconfig' into rwlee/stru…
rwlee 576fab0
Merge remote-tracking branch 'pub/branch-22.08' into rwlee/struct_col…
rwlee 44ac4e0
fix nullate logic in binop
rwlee 1138c5a
JNI work and restructuring for comparison ops
rwlee fc5f339
Merge remote-tracking branch 'pub/branch-22.08' into rwlee/spark_stru…
rwlee 944e05c
restore files that should match main
rwlee feede52
code cleanup, strip out struct binop utilities
rwlee b9337df
JNI tests, checks, fixes, and cleanup for struct binop compare
rwlee 7239675
fix C++ code style
rwlee 20d76e7
Merge remote-tracking branch 'pub/branch-22.08' into rwlee/spark_stru…
rwlee 8bdac0d
Update copyright.
bdice 6e6a476
Blank lines around namespaces.
bdice f32aba5
Make fallthrough explicit, enable clang-format.
bdice 7fe1fb5
Merge branch 'rwlee/spark_structbinops' of github.com:rwlee/cudf into…
rwlee aec885f
code cleanup and PR fixes
rwlee 1ada1be
Revert chnages to scalar_to_column_view
rwlee b828d37
cleanup jni code and fix error
rwlee 5424aa1
Merge remote-tracking branch 'pub/branch-22.08' into rwlee/spark_stru…
rwlee fd3ae69
cleanup java tests
rwlee 589abe4
pr fixes and cleanup
rwlee 8116a23
Merge remote-tracking branch 'pub/branch-22.08' into rwlee/spark_stru…
rwlee f9c2c5c
naming and sfinae formatting
rwlee d7cd266
change argument ordering
rwlee 9767653
Revert "change argument ordering"
rwlee f8a2910
review fixes
rwlee b18a277
Apply suggestions from code review
rwlee a27732c
change templating and cleanup
rwlee 5f9e565
comment and remove extra ;
rwlee
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,139 @@ | ||
| /* | ||
| * Copyright (c) 2022, NVIDIA CORPORATION. | ||
| * | ||
| * Licensed under the Apache License, Version 2.0 (the "License"); | ||
| * you may not use this file except in compliance with the License. | ||
| * You may obtain a copy of the License at | ||
| * | ||
| * http://www.apache.org/licenses/LICENSE-2.0 | ||
| * | ||
| * Unless required by applicable law or agreed to in writing, software | ||
| * distributed under the License is distributed on an "AS IS" BASIS, | ||
| * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| * See the License for the specific language governing permissions and | ||
| * limitations under the License. | ||
| */ | ||
| | ||
| #pragma once | ||
| | ||
| #include "binary_ops.hpp" | ||
| #include "operation.cuh" | ||
| | ||
| #include <cudf/binaryop.hpp> | ||
| #include <cudf/column/column_device_view.cuh> | ||
| #include <cudf/column/column_view.hpp> | ||
| #include <cudf/detail/iterator.cuh> | ||
| #include <cudf/table/experimental/row_operators.cuh> | ||
| | ||
| #include <rmm/cuda_stream_view.hpp> | ||
| #include <rmm/exec_policy.hpp> | ||
| | ||
| namespace cudf::binops::compiled::detail { | ||
| template <class T, class... Ts> | ||
| inline constexpr bool is_any_v = std::disjunction<std::is_same<T, Ts>...>::value; | ||
| | ||
| template <typename OptionalIterator, typename DeviceComparator> | ||
| struct device_comparison_functor { | ||
| // Explicit constructor definition required to avoid a "no instance of constructor" compilation | ||
| // error | ||
| device_comparison_functor(OptionalIterator const optional_iter, | ||
| bool const is_lhs_scalar, | ||
| bool const is_rhs_scalar, | ||
| DeviceComparator const& comparator) | ||
| : _optional_iter(optional_iter), | ||
| _is_lhs_scalar(is_lhs_scalar), | ||
| _is_rhs_scalar(is_rhs_scalar), | ||
| _comparator(comparator) | ||
| { | ||
| } | ||
| | ||
| bool __device__ operator()(size_type i) | ||
| { | ||
| return _optional_iter[i].has_value() && | ||
| _comparator(cudf::experimental::row::lhs_index_type{_is_lhs_scalar ? 0 : i}, | ||
| cudf::experimental::row::rhs_index_type{_is_rhs_scalar ? 0 : i}); | ||
| } | ||
| | ||
| OptionalIterator const _optional_iter; | ||
| bool const _is_lhs_scalar; | ||
| bool const _is_rhs_scalar; | ||
| DeviceComparator const _comparator; | ||
| }; | ||
| | ||
| template <class BinaryOperator, | ||
| typename PhysicalElementComparator = | ||
| cudf::experimental::row::lexicographic::sorting_physical_element_comparator> | ||
| void apply_struct_binary_op(mutable_column_view& out, | ||
| column_view const& lhs, | ||
| column_view const& rhs, | ||
| bool is_lhs_scalar, | ||
| bool is_rhs_scalar, | ||
| PhysicalElementComparator comparator = {}, | ||
| rmm::cuda_stream_view stream = cudf::default_stream_value) | ||
| { | ||
| auto const compare_orders = std::vector<order>( | ||
| lhs.size(), | ||
| is_any_v<BinaryOperator, ops::Greater, ops::GreaterEqual> ? order::DESCENDING | ||
| : order::ASCENDING); | ||
| auto const tlhs = table_view{{lhs}}; | ||
| auto const trhs = table_view{{rhs}}; | ||
| auto const table_comparator = cudf::experimental::row::lexicographic::two_table_comparator{ | ||
| tlhs, trhs, compare_orders, {}, stream}; | ||
| auto outd = column_device_view::create(out, stream); | ||
| auto optional_iter = | ||
| cudf::detail::make_optional_iterator<bool>(*outd, nullate::DYNAMIC{out.has_nulls()}); | ||
| auto const comparator_nulls = nullate::DYNAMIC{has_nested_nulls(tlhs) || has_nested_nulls(trhs)}; | ||
| | ||
| auto tabulate_device_operator = [&](auto device_comparator) { | ||
| thrust::tabulate( | ||
| rmm::exec_policy(stream), | ||
| out.begin<bool>(), | ||
| out.end<bool>(), | ||
| device_comparison_functor{optional_iter, is_lhs_scalar, is_rhs_scalar, device_comparator}); | ||
| }; | ||
| is_any_v<BinaryOperator, ops::LessEqual, ops::GreaterEqual> | ||
| ? tabulate_device_operator(table_comparator.less_equivalent(comparator_nulls, comparator)) | ||
| : tabulate_device_operator(table_comparator.less(comparator_nulls, comparator)); | ||
| } | ||
| | ||
| template <typename PhysicalEqualityComparator = | ||
| cudf::experimental::row::equality::physical_equality_comparator> | ||
| void apply_struct_equality_op(mutable_column_view& out, | ||
| column_view const& lhs, | ||
| column_view const& rhs, | ||
| bool is_lhs_scalar, | ||
| bool is_rhs_scalar, | ||
| binary_operator op, | ||
| PhysicalEqualityComparator comparator = {}, | ||
| rmm::cuda_stream_view stream = cudf::default_stream_value) | ||
| { | ||
| CUDF_EXPECTS(op == binary_operator::EQUAL || op == binary_operator::NOT_EQUAL, | ||
| "Unsupported operator for these types"); | ||
| | ||
| auto tlhs = table_view{{lhs}}; | ||
| auto trhs = table_view{{rhs}}; | ||
| auto table_comparator = | ||
| cudf::experimental::row::equality::two_table_comparator{tlhs, trhs, stream}; | ||
| auto device_comparator = | ||
| table_comparator.equal_to(nullate::DYNAMIC{has_nested_nulls(tlhs) || has_nested_nulls(trhs)}, | ||
| null_equality::EQUAL, | ||
| comparator); | ||
| | ||
| auto outd = column_device_view::create(out, stream); | ||
| auto optional_iter = | ||
| cudf::detail::make_optional_iterator<bool>(*outd, nullate::DYNAMIC{out.has_nulls()}); | ||
| thrust::tabulate(rmm::exec_policy(stream), | ||
| out.begin<bool>(), | ||
| out.end<bool>(), | ||
| [optional_iter, | ||
| is_lhs_scalar, | ||
| is_rhs_scalar, | ||
| preserve_output = (op != binary_operator::NOT_EQUAL), | ||
| device_comparator] __device__(size_type i) { | ||
| auto lhs = cudf::experimental::row::lhs_index_type{is_lhs_scalar ? 0 : i}; | ||
| auto rhs = cudf::experimental::row::rhs_index_type{is_rhs_scalar ? 0 : i}; | ||
| return optional_iter[i].has_value() and | ||
| (device_comparator(lhs, rhs) == preserve_output); | ||
| }); | ||
| } | ||
| } // namespace cudf::binops::compiled::detail | ||
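
The header above derives all four ordering operators from two primitives (`less` and `less_equivalent`) by flipping the sort order for `Greater`/`GreaterEqual`, reads row 0 on a scalar side, and forces the result to `false` wherever the output row is null. The following is a minimal host-side sketch of that logic only. It does not use libcudf, Thrust, or a GPU, and it ignores nulls inside the input rows. Every name in it (`cmp_op`, `row`, `compare_rows`, `struct_compare`, `struct_equal`) is a hypothetical stand-in chosen for illustration, not part of the PR.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for the ops::Less / ops::Greater / ... tag types.
enum class cmp_op { LESS, LESS_EQUAL, GREATER, GREATER_EQUAL };

// Hypothetical stand-in for one struct row with two integer fields.
using row = std::pair<int, int>;

// Lexicographic three-way comparison; a descending order flips the sign,
// which is how "greater" can be expressed as "less under DESCENDING".
inline int compare_rows(row const& a, row const& b, bool descending)
{
  int const c = (a < b) ? -1 : ((b < a) ? 1 : 0);
  return descending ? -c : c;
}

// Models apply_struct_binary_op: one output element per entry of out_valid.
inline std::vector<bool> struct_compare(std::vector<row> const& lhs,
                                        std::vector<row> const& rhs,
                                        bool is_lhs_scalar,
                                        bool is_rhs_scalar,
                                        std::vector<bool> const& out_valid,
                                        cmp_op op)
{
  bool const descending = (op == cmp_op::GREATER || op == cmp_op::GREATER_EQUAL);
  bool const or_equal   = (op == cmp_op::LESS_EQUAL || op == cmp_op::GREATER_EQUAL);
  std::vector<bool> out(out_valid.size());
  for (std::size_t i = 0; i < out.size(); ++i) {
    // A scalar operand is stored as a single row, so always read index 0.
    row const& l = lhs[is_lhs_scalar ? 0 : i];
    row const& r = rhs[is_rhs_scalar ? 0 : i];
    int const c  = compare_rows(l, r, descending);
    // Null output rows always yield false; otherwise "less" or "less or equivalent".
    out[i] = out_valid[i] && (or_equal ? c <= 0 : c < 0);
  }
  return out;
}

// Models apply_struct_equality_op: NOT_EQUAL negates the equality result
// by comparing it against the preserve_output flag.
inline std::vector<bool> struct_equal(std::vector<row> const& lhs,
                                      std::vector<row> const& rhs,
                                      bool is_lhs_scalar,
                                      bool is_rhs_scalar,
                                      std::vector<bool> const& out_valid,
                                      bool not_equal)
{
  bool const preserve_output = !not_equal;
  std::vector<bool> out(out_valid.size());
  for (std::size_t i = 0; i < out.size(); ++i) {
    row const& l = lhs[is_lhs_scalar ? 0 : i];
    row const& r = rhs[is_rhs_scalar ? 0 : i];
    out[i] = out_valid[i] && ((l == r) == preserve_output);
  }
  return out;
}
```

For example, comparing the column `{(1,2), (3,4), (5,6)}` against the scalar `(3,4)` with `GREATER_EQUAL` gives `{false, true, true}` when every output row is valid. Marking the last output row as null turns its result into `false`, mirroring the `has_value()` check in the functor.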