Skip to content

Refactor cudf::contains#10997

Merged
rapids-bot[bot] merged 346 commits intorapidsai:branch-22.08from
ttnghia:refactor_cudfcontains
May 31, 2022
Merged

Refactor cudf::contains#10997
rapids-bot[bot] merged 346 commits intorapidsai:branch-22.08from
ttnghia:refactor_cudfcontains

Conversation

@ttnghia
Copy link
Contributor

@ttnghia ttnghia commented May 27, 2022

This is just a simple refactor to cudf::contains:

  • Remove cudf/structs/detail/contains.hpp and its corrresponding source file src/structs/search/contains.cu, moving its (modified) implementation into src/search/contains_nested.cu.
  • Adopt experimental::row::equality::two_table_comparator for struct equality comparison.
  • Add const qualifier for the operator() functions.
  • Rename some variables, and reorganize code to make it cleaner.

No new feature is added in this PR, just modifying the existing functions and moving things around.

This PR is extracted from the bigger PR for easier review. The original PR is #10656 for fully supporting nested type in cudf::contains. As such, this blocks it.

devavret added 30 commits March 23, 2022 15:14
Improve doc of decompose structs to account for lists
- avoid copying shared_ptrs for linked_col
Fixes the problem of having multiple copies of children
@ttnghia ttnghia requested a review from a team as a code owner May 27, 2022 16:45
@github-actions github-actions bot added CMake CMake build issue conda labels May 27, 2022
@ttnghia ttnghia removed the improvement Improvement / enhancement to an existing function label May 27, 2022
Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
@ttnghia ttnghia added improvement Improvement / enhancement to an existing function and removed feature request New feature or request labels May 27, 2022
@codecov

This comment was marked as off-topic.

ttnghia added 5 commits May 27, 2022 13:29
Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
@ttnghia
Copy link
Contributor Author

ttnghia commented May 29, 2022

Rerun tests.

ttnghia added 2 commits May 30, 2022 18:42
Signed-off-by: Nghia Truong <nghiatruong.vn@gmail.com>
Co-authored-by: David Wendt <45795991+davidwendt@users.noreply.github.com>
Copy link
Contributor

@robertmaynard robertmaynard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CMake changes LGTM

@ttnghia ttnghia requested a review from PointKernel May 31, 2022 15:35
@ttnghia
Copy link
Contributor Author

ttnghia commented May 31, 2022

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 164e28e into rapidsai:branch-22.08 May 31, 2022
@ttnghia ttnghia deleted the refactor_cudfcontains branch May 31, 2022 15:54
rapids-bot bot pushed a commit that referenced this pull request Aug 17, 2022
This extends the `cudf::contains` API to support nested types (lists + structs) with arbitrarily nested levels. As such, `cudf::contains` will work with literally any type of input data.

In addition, this fixes null handling of `cudf::contains` with structs column + struct scalar input when the structs column contains null rows at the top level while the scalar key is valid but all nulls at children levels.

Closes: #8965
Depends on:
 * #10730
 * #10883
 * #10802
 * #10997
 * NVIDIA/cuCollections#172
 * NVIDIA/cuCollections#173
 * #11037
 * #11356

Authors:
  - Nghia Truong (https://github.com/ttnghia)
  - Devavret Makkar (https://github.com/devavret)
  - Bradley Dice (https://github.com/bdice)
  - Karthikeyan (https://github.com/karthikeyann)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Bradley Dice (https://github.com/bdice)
  - Yunsong Wang (https://github.com/PointKernel)

URL: #10656
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

3 - Ready for Review Ready for review by team CMake CMake build issue improvement Improvement / enhancement to an existing function libcudf Affects libcudf (C++/CUDA) code. non-breaking Non-breaking change Spark Functionality that helps Spark RAPIDS

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants