WIP: split wheel into libcuml and cuml #6006

Closed
msarahan wants to merge 1 commit into rapidsai:branch-24.08 from msarahan:libcuml
@msarahan (Contributor) commented Aug 1, 2024

This is part of rapidsai/build-planning#33

This PR has symbol visibility considerations, as detailed in https://nvidia.slack.com/archives/C07E95NRXJM/p1722468281754129?thread_ts=1722462671.639599&cid=C07E95NRXJM

The end goal will be to have this libcuml wheel depend on the libraft wheel, but we don't strictly need that to have useful libcuml wheels.

@jameslamb (Member) commented

closing this as stale... I've linked to it from the issue tracking this work, so it can be used as a reference. But when this is picked up, I think it'll be easier to start from a new PR.

@jameslamb closed this Dec 17, 2024

rapids-bot pushed a commit that referenced this pull request on Jan 24, 2025
Replaces #6006, contributes to rapidsai/build-planning#33.

Proposes packaging `libcuml` as a wheel, which is then re-used by `cuml-cu{11,12}` wheels.

## Notes for Reviewers

### Benefits of these changes

* smaller wheels (see "Size Changes" below)
* faster compile times
  - no more re-compiling RAFT, thanks to rapidsai/raft#2531
* less use of CI resources (compiling once per CPU architecture and CUDA version, instead of once per CPU architecture, CUDA version, and Python minor version)
* other benefits mentioned in rapidsai/build-planning#33
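
To make the CI-resource point concrete, here is a rough sketch that counts the jobs compiling the C++/CUDA library under both layouts. The matrix values are illustrative assumptions, not the actual RAPIDS CI matrix, and `cpp_compile_jobs` is a hypothetical helper.

```python
from itertools import product

# Illustrative build matrix (assumed values, not the real RAPIDS CI matrix).
CPU_ARCHES = ["x86_64", "aarch64"]
CUDA_VERSIONS = ["11", "12"]
PYTHON_VERSIONS = ["3.10", "3.11", "3.12"]


def cpp_compile_jobs(split_wheels: bool) -> int:
    """Count the jobs that compile the C++/CUDA library."""
    if split_wheels:
        # libcuml is built once per (arch, CUDA) pair and reused by the cuml wheels.
        return len(list(product(CPU_ARCHES, CUDA_VERSIONS)))
    # Single-wheel layout: the library is recompiled for every Python minor version.
    return len(list(product(CPU_ARCHES, CUDA_VERSIONS, PYTHON_VERSIONS)))


print(cpp_compile_jobs(split_wheels=False), cpp_compile_jobs(split_wheels=True))
```

Under these assumed values the count drops from 12 to 4. The `cuml` wheels are presumably still built per Python minor version, but those builds only need to compile the Cython extensions.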

### Wheel contents

`libcuml`:

* `libcuml++.so` (shared library) and its headers
* `libcumlprims_mg.so` (shared library) and its headers
* other vendored dependencies (CCCL, `fmt`)

`cuml`:

* `cuml` Python / Cython code and compiled Cython extensions
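
One way to check the split is to list the shared libraries inside each wheel. A wheel is a zip archive, so a minimal sketch using only the standard library could look like the following. The helper name `list_shared_libraries` is hypothetical.

```python
import zipfile
from pathlib import PurePosixPath


def list_shared_libraries(wheel_path: str) -> list[str]:
    """Return the file names of shared libraries (*.so, *.so.N) inside a wheel."""
    with zipfile.ZipFile(wheel_path) as whl:
        names = [PurePosixPath(name).name for name in whl.namelist()]
    return sorted(n for n in names if n.endswith(".so") or ".so." in n)
```

Called on a `libcuml` wheel, this should report `libcuml++.so` and `libcumlprims_mg.so`. Called on a `cuml` wheel, it should report only the compiled Cython extension modules.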

### Dependency Flows

In short, `libcuml` contains the `libcuml++.so` and `libcumlprims_mg.so` dynamic libraries and the headers needed to link against them.

* Anything that needs to link against cuML at build time pulls in `libcuml` wheels as a build dependency.
* Anything that needs cuML's symbols at runtime pulls in `libcuml` wheels as a runtime dependency, and calls `libcuml.load_library()`.

For more details and some flowcharts, see rapidsai/build-planning#33 (comment)
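
A minimal sketch of the runtime side, assuming `libcuml` is importable as a Python module exposing `load_library()` as described above. The wrapper name `load_libcuml_if_available` is hypothetical, and the fallback when the wheel is missing is an assumption about how a consumer might behave (for example, when the library comes from a conda package instead).

```python
def load_libcuml_if_available() -> bool:
    """Try to load the shared libraries shipped in the libcuml wheel.

    Returns True if the wheel was found and load_library() was called.
    """
    try:
        import libcuml  # provided by the libcuml wheel
    except ModuleNotFoundError:
        # No libcuml wheel installed (assumed fallback): leave it to the
        # system loader to find the library some other way.
        return False
    libcuml.load_library()
    return True


loaded = load_libcuml_if_available()
```

Running this before importing any compiled extension means the extension's dependency on the cuML shared libraries can be resolved from the wheel.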

### Size changes (CUDA 12, Python 3.12, x86_64)

| wheel     | num files (before) | num files (this PR) | size (before) | size (this PR) |
|:---------:|-------------------:|--------------------:|--------------:|---------------:|
| `libcuml` | ---                | 1766                | ---           | 289M           |
| `cuml`    | 442                | 441                 | 527M          | 9M             |
| **TOTAL** | **442**            | **2207**            | **527M**      | **298M**       |

*NOTES: size = compressed, "before" = 2025-01-22 nightlies*
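
As a quick check of the totals, the arithmetic can be reproduced directly from the per-wheel sizes reported in the table:

```python
# Compressed sizes in M, as reported in the table.
before = {"cuml": 527}
after = {"libcuml": 289, "cuml": 9}

total_before = sum(before.values())
total_after = sum(after.values())
reduction = 1 - total_after / total_before

print(f"{total_before}M -> {total_after}M ({reduction:.0%} smaller)")
# prints: 527M -> 298M (43% smaller)
```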

<details><summary>how I calculated those (click me)</summary>

```shell
docker run \
    --rm \
    --network host \
    --env RAPIDS_NIGHTLY_DATE=2025-01-22 \
    --env CUML_NIGHTLY_SHA=01e19bba9821954b062a04fbf31d3522afa4b0b1 \
    --env CUML_PR="pull-request/6199" \
    --env CUML_PR_SHA="9d5100ec4589e20230a31817518427efa1e49c6d" \
    --env RAPIDS_PY_CUDA_SUFFIX=cu12 \
    --env WHEEL_DIR_BEFORE=/tmp/wheels-before \
    --env WHEEL_DIR_AFTER=/tmp/wheels-after \
    -it rapidsai/ci-wheel:cuda12.5.1-rockylinux8-py3.12 \
    bash

# NOTE: the commands below run inside the container started above

# --- nightly wheels --- #
mkdir -p ./wheels-before

export RAPIDS_BUILD_TYPE=branch
export RAPIDS_REF_NAME="branch-25.02"

# cuml
RAPIDS_PY_WHEEL_NAME="cuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_SHA=${CUML_NIGHTLY_SHA} \
    rapids-download-wheels-from-s3 python ./wheels-before

# --- wheels from CI --- #
mkdir -p ./wheels-after

export RAPIDS_BUILD_TYPE="pull-request"

# libcuml
RAPIDS_PY_WHEEL_NAME="libcuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_REF_NAME="${CUML_PR}" \
RAPIDS_SHA="${CUML_PR_SHA}" \
    rapids-download-wheels-from-s3 cpp ./wheels-after

# cuml
RAPIDS_PY_WHEEL_NAME="cuml_${RAPIDS_PY_CUDA_SUFFIX}" \
RAPIDS_REPOSITORY=rapidsai/cuml \
RAPIDS_REF_NAME="${CUML_PR}" \
RAPIDS_SHA="${CUML_PR_SHA}" \
    rapids-download-wheels-from-s3 python ./wheels-after

pip install pydistcheck
pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    ./wheels-before/*.whl \
| grep -E '^checking|files: | compressed' \
> ./before.txt

# get more exact sizes
du -sh ./wheels-before/*

pydistcheck \
    --inspect \
    --select 'distro-too-large-compressed' \
    ./wheels-after/*.whl \
| grep -E '^checking|files: | compressed' \
> ./after.txt

# get more exact sizes
du -sh ./wheels-after/*
```

</details>

### How I tested this

These other PRs:

* rapidsai/devcontainers#442

Authors:
  - James Lamb (https://github.com/jameslamb)
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Divye Gala (https://github.com/divyegala)

URL: #6199
Labels: ci, CMake, Cython / Python