build(cudf): Pin cuDF dependencies and add script for updates #15992

Closed
bdice wants to merge 9 commits into facebookincubator:main from bdice:build/update-cudf-deps

@bdice bdice commented Jan 12, 2026

This PR pins cuDF's dependencies: rapids-cmake, rmm, and kvikio.

This is a follow-up to #15937, which updates the cuDF dependency to 26.02. It pulls a recent cuDF commit from main. However, cuDF's core RAPIDS dependencies (rapids-cmake, RMM, KvikIO) are currently unpinned, meaning that they will continue to pull the latest from main while cuDF remains pinned. We would like to ensure that builds are stable and reproducible, which means we should pin these core cuDF dependencies.

To make future updates simpler, I added a script scripts/update-cudf-deps.sh. To update to a recent branch, developers can run:

./scripts/update-cudf-deps.sh --branch main
./scripts/update-cudf-deps.sh --branch release/26.02

A pain point that cuDF developers have faced is testing Velox with cuDF pull requests in progress. To support that use case, developers can run:

./scripts/update-cudf-deps.sh --pr <pr-number>

To use a specific cuDF commit with compatible dependency versions, developers can run:

./scripts/update-cudf-deps.sh --commit <sha>

This automatically finds compatible rapids-cmake, rmm, and kvikio versions by selecting the most recent main branch commit before the specified cuDF commit date, ensuring all dependencies are temporally compatible.
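
The selection rule described above can be sketched as follows. This is a minimal Python illustration, not the script's actual implementation: the `Commit` record, the function name `latest_commit_not_after`, and the inclusive comparison against the cutoff are all assumptions, and the commit history is supplied in memory rather than fetched from GitHub.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical record for one commit on a dependency's main branch."""
    sha: str
    committed_at: datetime


def latest_commit_not_after(
    commits: Iterable[Commit], cutoff: datetime
) -> Optional[Commit]:
    """Return the newest commit dated at or before `cutoff`, or None.

    Assumption: a commit made at exactly the cutoff counts as compatible.
    """
    eligible = [c for c in commits if c.committed_at <= cutoff]
    return max(eligible, key=lambda c: c.committed_at, default=None)


if __name__ == "__main__":
    utc = timezone.utc
    # Made-up history for one dependency (e.g. rmm); the SHAs are placeholders.
    history = [
        Commit("aaa", datetime(2026, 1, 5, tzinfo=utc)),
        Commit("bbb", datetime(2026, 1, 10, tzinfo=utc)),
        Commit("ccc", datetime(2026, 1, 20, tzinfo=utc)),
    ]
    cudf_commit_date = datetime(2026, 1, 12, tzinfo=utc)
    print(latest_commit_not_after(history, cudf_commit_date))
```

Running the same selection once per dependency (rapids-cmake, rmm, kvikio) with the cuDF commit date as the cutoff would give one pinned commit for each, under the assumptions stated above.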

@bdice bdice self-assigned this Jan 12, 2026

netlify bot commented Jan 12, 2026

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 7ed1f9c
🔍 Latest deploy log https://app.netlify.com/projects/meta-velox/deploys/69791d1d8aaf320008c91c32

@meta-cla meta-cla bot added the CLA Signed label Jan 12, 2026
Update cudf.cmake to pin rapids-cmake, rmm, and kvikio to specific commits with SHA256 checksums and refactor commit hashes into variables. Add --commit mode to update-cudf-deps.sh that automatically finds compatible dependency versions by selecting the most recent main branch commit before the specified cuDF commit date, ensuring all dependencies are temporally compatible.
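
As a rough illustration of what a SHA256 pin protects against, the sketch below hashes an archive's bytes and compares the digest to a recorded value. The function names `sha256_hex` and `matches_pin` are hypothetical, and this is not how cudf.cmake performs the check; the build system is expected to do that comparison itself when it downloads each pinned archive.

```python
import hashlib
import io
from typing import BinaryIO


def sha256_hex(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Hash a binary stream in chunks and return the hex digest."""
    hasher = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        hasher.update(chunk)
    return hasher.hexdigest()


def matches_pin(stream: BinaryIO, expected_sha256: str) -> bool:
    """Return True if the stream's digest equals the pinned checksum.

    The comparison ignores letter case and surrounding whitespace in the pin.
    """
    return sha256_hex(stream) == expected_sha256.strip().lower()


if __name__ == "__main__":
    # Stand-in for a downloaded tarball; real use would open the file in "rb" mode.
    archive = io.BytesIO(b"example archive contents")
    pinned = sha256_hex(io.BytesIO(b"example archive contents"))
    print(matches_pin(archive, pinned))
```

If the upstream archive for a pinned commit ever changed, the digest would no longer match and the mismatch would surface at download time instead of as a silent behavior change.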
@bdice bdice marked this pull request as ready for review January 27, 2026 00:45

@bdice bdice Jan 27, 2026

This is optional -- if Velox maintainers have reservations about including this script, we don't have to include it here. We can move it somewhere else.

Its purpose is to pin a compatible cudf dependency tree (rapids-cmake, rmm, kvikio) based on the desired cuDF branch/PR/commit hash.

@karthikeyann karthikeyann added the cudf and ready-to-merge labels Jan 27, 2026
bdice added 3 commits January 27, 2026 10:48
cuDF PR #20937 changed partitioning APIs to return num_partitions + 1
offsets where offsets[num_partitions] is the total row count. Update
the offset handling to expect the new format and exclude the trailing
total row count when calling cudf::split.
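
The offset change described in this commit message can be sketched in Python as below. This is an illustration only, not the Velox or libcudf code: `split_rows` is a hypothetical stand-in that mimics the splitting behavior (N split points yield N + 1 pieces), and the sketch assumes that `offsets[0]` is 0, so the leading offset is dropped along with the trailing total.

```python
from typing import List, Sequence


def split_rows(rows: Sequence, split_points: Sequence[int]) -> List[list]:
    """Split `rows` at the given indices, yielding len(split_points) + 1 pieces."""
    bounds = [0, *split_points, len(rows)]
    return [list(rows[start:end]) for start, end in zip(bounds, bounds[1:])]


def partitions_from_offsets(
    rows: Sequence, offsets: Sequence[int], num_partitions: int
) -> List[list]:
    """Turn new-format offsets (num_partitions + 1 entries) into partitions."""
    if len(offsets) != num_partitions + 1:
        raise ValueError("expected num_partitions + 1 offsets")
    if offsets[0] != 0 or offsets[-1] != len(rows):
        raise ValueError("offsets must start at 0 and end at the total row count")
    # Drop the leading 0 (assumed) and the trailing total row count, so the
    # split yields exactly num_partitions pieces.
    return split_rows(rows, offsets[1:-1])


if __name__ == "__main__":
    rows = list(range(10))
    offsets = [0, 3, 3, 7, 10]  # 4 partitions; the second one is empty
    print(partitions_from_offsets(rows, offsets, num_partitions=4))
```

Passing the trailing total through as a split point would produce one extra, empty piece at the end, which is the kind of mismatch the commit guards against.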

meta-codesync bot commented Jan 28, 2026

@kagamiori has imported this pull request. If you are a Meta employee, you can view this in D91639452.

@meta-codesync meta-codesync bot closed this in 6280634 Jan 28, 2026

meta-codesync bot commented Jan 28, 2026

@kagamiori merged this pull request in 6280634.

paul-aiyedun added a commit to paul-aiyedun/velox that referenced this pull request Jan 30, 2026
Fix a regression that is caused by commit 6280634
("build(cudf): Pin cuDF dependencies and add script for updates (facebookincubator#15992)"),
which updates cudf to version 26.04. This cudf version changed the
partition offset API to return numPartitions + 1 elements instead of
numPartitions, but the CudfPartitionedOutput.cpp file in the
cudf-exchange directory was not updated to handle this change.
