
port to rattler-build #1796

Open · wants to merge 38 commits into base: branch-25.04

Conversation

@gforsyth (Contributor) commented Jan 27, 2025

Some notes on progress and changes needed:

  • GIT_DESCRIBE_HASH and GIT_DESCRIBE_NUMBER aren't supported, so those
    need to be set in the environment.

    • This is now handled by rapids-configure-rattler in gha-tools
  • For most of our recipes, we'll want to use the cache key that I have set
    up in librmm, otherwise each output is built as a separate recipe so you
    end up compiling everything N times

  • I suspect it's faster to port recipes over by hand rather than with conda-recipe-manager convert. There's a fair bit of preparation required to make the meta.yaml files compatible in the first place, and failures can be hard to diagnose.

    • Need to remove jinja conditionals in favor of minijinja syntax (see the conversion sketch just after the xref below)
    • Only partial support for ternary operators (no !=)
    • conda-recipe-manager doesn't support generating multi-output recipes at this point (although it can parse them)

xref: rapidsai/build-planning#47
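As a concrete illustration of the conditional-conversion bullet above, this is roughly the shape of the hand-conversion, using a dependency conditional discussed later in this thread (a sketch, not the literal librmm recipe):

```yaml
# Old meta.yaml (jinja2) form:
#
#   {% if cuda_major != "11" %}
#   - cuda-cudart-dev
#   {% endif %}
#
# Equivalent rattler-build recipe.yaml selector, written to avoid `!=`
# since the conversion tooling chokes on it (see the threads below):
- if: not (cuda_major == "11")
  then:
    - cuda-cudart-dev
```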

copy-pr-bot bot commented Jan 27, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

{% if cuda_major != "11" %}
- cuda-cudart-dev
{% endif %}
- {{ "cuda-cudart-dev" if cuda_major == "12" else "cuda-version" }}
Contributor Author:

conda-recipe-manager doesn't like != as a comparison operator

Member:

Maybe we could use `not` and `==` instead?

Suggested change
- {{ "cuda-cudart-dev" if cuda_major == "12" else "cuda-version" }}
- {{ "cudatoolkit" if not (cuda_major == "12") else "cuda-version" }}

Contributor Author:

Yep, that works!

Contributor:

We want to write this in a way that only mentions CUDA 11, so that the condition can be trivially deleted when adding future major version support.

Member:

Yeah that was the idea. I just goofed on the syntax. Took another go below

#1796 (comment)

Comment on lines +5 to +13
version: ${{ env.get("RAPIDS_PACKAGE_VERSION") }}
cuda_version: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[:2] | join(".") }}
cuda_major: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[0] }}
date_string: ${{ env.get("RAPIDS_DATE_STRING") }}
Contributor Author:

I grabbed these from rapidsai/cugraph#4551, since conda-recipe-manager currently doesn't support converting from the extended jinja2 syntax to the subset that rattler supports.

github-actions bot added the ci label Jan 29, 2025
gforsyth marked this pull request as ready for review January 29, 2025 19:48
gforsyth requested review from a team as code owners January 29, 2025 19:48
looks like adding `$CPP_CHANNEL` to the mix overrides the existing channels
@gforsyth (Contributor Author):
Ok, now sccache is working with rattler-build, and compared to the latest run on the 25.02 branch, rattler-build seems to be a nice improvement over mambabuild (this is only a point-in-time comparison -- I'll put together some averages of previous CI runs):

| recipe | lang | cuda | arch | mamba build time | rattler build time | rattler / mamba (%) |
| --- | --- | --- | --- | --- | --- | --- |
| librmm | c++ | 11.8 | amd64 | 3m 47s | 2m 49s | 74.45 |
| librmm | c++ | 12.8 | amd64 | 4m 20s | 2m 28s | 56.92 |
| librmm | c++ | 11.8 | arm64 | 3m 43s | 2m 50s | 76.23 |
| librmm | c++ | 12.8 | arm64 | 3m 17s | 2m 27s | 74.62 |
| rmm | py310 | 11.8 | amd64 | 2m 42s | 2m 4s | 76.54 |
| rmm | py311 | 11.8 | amd64 | 2m 51s | 2m 19s | 81.29 |
| rmm | py312 | 11.8 | amd64 | 3m 59s | 2m 3s | 51.46 |
| rmm | py310 | 12.8 | amd64 | 2m 36s | 1m 33s | 59.62 |
| rmm | py311 | 12.8 | amd64 | 2m 24s | 1m 32s | 63.89 |
| rmm | py312 | 12.8 | amd64 | 2m 17s | 1m 32s | 67.15 |
| rmm | py310 | 11.8 | arm64 | 2m 45s | 2m 28s | 89.70 |
| rmm | py311 | 11.8 | arm64 | 2m 27s | 2m 10s | 88.44 |
| rmm | py312 | 11.8 | arm64 | 2m 19s | 2m 5s | 89.93 |
| rmm | py310 | 12.8 | arm64 | 2m 25s | 1m 39s | 68.28 |
| rmm | py311 | 12.8 | arm64 | 2m 04s | 1m 44s | 83.87 |
| rmm | py312 | 12.8 | arm64 | 2m 06s | 1m 38s | 77.78 |

github-actions bot added the Python (Related to RMM Python API) label Jan 31, 2025
github-actions bot removed the Python (Related to RMM Python API) label Jan 31, 2025
@jakirkham (Member):
Thanks Gil! 🙏

How is channel priority handled in those cases?

@gforsyth (Contributor Author):
> How is channel priority handled in those cases?

It's disabled for both mambabuild and rattler-build at the moment. Ostensibly we should see marginally faster solves once we can support strict channel priority, but the existing solves are already quite speedy

Comment on lines +25 to +26
python/rmm/rmm/librmm/*.cpp
!python/rmm/rmm/librmm/_torch_allocator.cpp
Contributor Author:

It turns out that rattler-build does respect gitignore negations, but the unignored location was incorrect, which is why the file wasn't included in the builds.

ci/build_cpp.sh Outdated
Comment on lines 37 to 40
# These are probably set via `rapids-configure-conda-channels`
# -c rapidsai \
# -c conda-forge \
# -c nvidia
Contributor Author:

These are set via rapids-configure-conda-channels, BUT for the Python builds, where we also specify ${CPP_CHANNEL}, that adds an implicit --override-channels, so we need to list them explicitly.

Comment on lines +41 to +45
-c "${CPP_CHANNEL}" \
-c rapidsai \
-c rapidsai-nightly \
-c conda-forge \
-c nvidia
Contributor Author:

As mentioned above, adding ${CPP_CHANNEL} adds an implicit --override-channels, so we need to list the channels inline.

Contributor:

This feels unexpected. Should we report this upstream as a bug, or at least as something that should be documented?

Contributor Author:

I'll report it upstream -- it feels like a bug, though it might be a conscious break from some implicit conda behavior; in that case, documentation is certainly merited.

gforsyth added the non-breaking (Non-breaking change) and improvement (Improvement / enhancement to an existing function) labels Feb 5, 2025
Comment on lines +9 to +10
GIT_DESCRIBE_HASH: ${{ env.get("GIT_DESCRIBE_HASH") }}
GIT_DESCRIBE_NUMBER: ${{ env.get("GIT_DESCRIBE_NUMBER") }}
Contributor Author:

These are set by rapids-configure-rattler

Comment on lines 46 to 50
- if: cuda_major == "11"
then:
- ${{ compiler('cuda') }} =${{ cuda_version }}
else:
- ${{ compiler('cuda') }}
Contributor Author:

@jakirkham -- do you think we can collapse this down into just

Suggested change
- if: cuda_major == "11"
then:
- ${{ compiler('cuda') }} =${{ cuda_version }}
else:
- ${{ compiler('cuda') }}
- ${{ compiler('cuda') }}

given the constraint on the line below?

Member:

Yes we can. Let's do it 👍

Thanks for continuing to whittle this down Gil! 🙏

@vyasr (Contributor) left a comment:

Looks great overall! While we are rewriting our recipes already, I think we should take this opportunity to consider some additional changes and simplifications. I've left some notes inline.

> For most of our recipes, we'll want to use the cache key that I have set up in librmm, otherwise each output is built as a separate recipe so you end up compiling everything N times

By this, I assume you're not referring to the sccache ccache key but rather the rattler-build multi-output cache? If so, yes that is a must. It is the only way that I found to ensure a single compilation. Fortunately that feature has been released as fully-featured at this point.
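For readers following along, here is a rough sketch of how the multi-output cache and per-output file filtering fit together in a recipe.yaml. The structure follows my reading of the rattler-build multi-output cache docs; the package names, requirements, and globs are illustrative, not the exact librmm recipe:

```yaml
# Illustrative sketch only -- not the actual librmm recipe.
# (Still requires --experimental per the thread below.)
cache:
  requirements:
    build:
      - ${{ compiler("cxx") }}
      - cmake
  build:
    # Compile and install once; everything placed into $PREFIX here is
    # restored for each output below, which then filters what it packages.
    script: build.sh

outputs:
  - package:
      name: librmm
    build:
      files:
        - include/rmm/**
        - lib/**
  - package:
      name: librmm-tests
    build:
      files:
        - bin/gtests/librmm/**
```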

One debate that we might as well have now on the first conversion: do we like sticking with conda_build_config.yaml, or should we move that file to variants.yaml? Personally I find the latter name much clearer, whereas the relationship of the former to variants was not clear until I read the conda-build documentation. Also it's a clear sign of the changes needed for rattler-build. That being said, it's not strictly necessary since rattler-build [does respect conda_build_config.yaml by default](https://rattler.build/latest/variants/#automatic-discovery) (unlike meta.yaml, which you would have to specify explicitly). If we've already discussed this elsewhere and decided not to make the change, feel free to ignore this.
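If we did rename it, the contents would carry over unchanged; something like the following (keys borrowed from the pins referenced elsewhere in this PR, version values are placeholders):

```yaml
# variants.yaml -- same schema rattler-build already reads from
# conda_build_config.yaml; the pins below are placeholders.
fmt_version:
  - ">=11.0.2,<12"
spdlog_version:
  - ">=1.14.1,<1.15"
```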

source rapids-configure-rattler

rattler-build build --recipe conda/recipes/librmm \
--experimental \
Contributor:

Do we still need this? I don't think we are using any experimental features any more, but please correct me if I am wrong.

Contributor Author:

The multi-output cache is still gated behind the --experimental flag (on 0.35.9, which I believe is the latest release).

Contributor:

Ah OK, got it. I was basing my statement on the fact that their docs no longer list the cache on the experimental features page and instead document it on its own page. Either that's an oversight in their docs, or maybe it'll be moved out of experimental in the next release.


rattler-build build --recipe conda/recipes/librmm \
--experimental \
--no-build-id \
Contributor:

Can we leave a comment pointing to https://rattler.build/latest/tips_and_tricks/#using-sccache-or-ccache-with-rattler-build for why we need this?


version: ${{ version }}
build:
number: ${{ GIT_DESCRIBE_NUMBER }}
string: cuda${{ cuda_major }}_${{ date_string }}_${{ GIT_DESCRIBE_HASH }}_${{ GIT_DESCRIBE_NUMBER }}
Contributor:

Probably a good time to reconsider the build string and build number.

  • Do we really need to include the hash or the number? The number is now part of the package version itself via rapids-generate-version, while the commit hash is written by the build backend. Admittedly, the latter does not apply to C++ libraries, so I suppose some information is lost, but in theory that information is largely redundant to the git describe number (which is in the version). I created both the versioning strategy and the build backend well after these build strings were put in place and didn't question them then, but perhaps now is a good time to do so?
  • We don't need to specify a build number above anymore because the versioning scheme guarantees a unique version for each new commit, so we could drop that as well.

If either of these changes caused problems in any repos, that would almost certainly be an indication of bugs like rapidsai/kvikio#616 where the versioning scheme was not being correctly applied, so that would be a win too.

Dropping these would also obviate the need for rapids-configure-rattler altogether.

Contributor Author:

I'll remove them and test

Contributor:

I'm ok with removing the GIT_DESCRIBE_NUMBER from the build string as long as it stays in the alpha version number; having it in both places is redundant.

I do not want to remove the GIT_DESCRIBE_HASH. The GIT_DESCRIBE_HASH is very useful for debugging other developers' conda environments because we can see what commit their packages were built from, and we can determine if their build is expected to have a certain feature or not. Otherwise we are stuck trying to use GIT_DESCRIBE_NUMBER to count how many commits their build was from the last tag...

I think the hash also provides a nice confirmation that releases were built correctly from the tagged commit.

Contributor:

@bdice I think that's reasonable. Are you good with removing the build number? Given that we basically never rebuild a particular build, the current behavior feels very wrong to me: for a release YY.MM.00aN, it will always be released with build number N. We'll never see builds 0-(N-1).

As long as the only piece of information that we need is the git commit hash, and given that we are keeping the experimental flag on anyway based on the above thread, I wonder if we could make use of rattler's new git functions to get the hash. At surface level they seem to have a problem, which (if I'm understanding the docs right) is that they will always pull the latest commit hash for the head of the repo, so when you're building during a PR the hash will be wrong. More concerning is that the hash could be wrong during burndown when we're simultaneously building two different release branches; the commit hash rattler chooses might be the wrong one half the time. However, most git operations will accept local filesystem paths, so if we add

head_rev: ${{ git.head_rev( "." ) }}

to our context section we may be able to get the git hash from the local checkout and be all set.

Contributor:

Just tested this in the rapids-logger recipe and it works.
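Putting the pieces of this thread together, here is a sketch of what the context and build sections might look like if we keep only the commit hash (via git.head_rev from the local checkout) and drop GIT_DESCRIBE_NUMBER and the explicit build number. This is untested beyond the rapids-logger experiment mentioned above, and whether head_rev needs shortening is left open:

```yaml
# Sketch based on the discussion above -- not a final recipe.
context:
  version: ${{ env.get("RAPIDS_PACKAGE_VERSION") }}
  cuda_major: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] }}
  date_string: ${{ env.get("RAPIDS_DATE_STRING") }}
  head_rev: ${{ git.head_rev(".") }}

build:
  string: cuda${{ cuda_major }}_${{ date_string }}_${{ head_rev }}
```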


build:
script:
file: build.sh
Contributor:

Can we double-check the contents of each of the different output packages and make sure that we're putting what we expect in each package? IIRC when I experimented with rattler-build the way the cache worked was that when you restored files from the cache it dropped everything into the PREFIX and you had to do more careful filtering of each output to ensure that the files were present. I don't remember why I couldn't just leverage a cmake --install command for each output, but I think the issue was that the build directory would no longer be available between outputs and only what was restored to the PREFIX from the cache was available. Is that no longer the case?

Looking at some of the job build outputs, it does seem like the right things are getting packaged into each output, which is encouraging, but it would be good to download the packages and inspect them to be doubly sure.

While looking at the outputs, I also noticed that the entire build cache is being uploaded to s3 as well by the rapids-upload-to-s3 command. If possible it would be good to filter that out somehow (or simply delete that directory entirely before the upload) to reduce network traffic.

Contributor Author:

Answered the contents bit below in a separate comment -- in re:

> While looking at the outputs, I also noticed that the entire build cache is being uploaded to s3 as well by the rapids-upload-to-s3 command. If possible it would be good to filter that out somehow (or simply delete that directory entirely before the upload) to reduce network traffic.

I will do that

- python:
imports:
- rmm
pip_check: false
Contributor:

Does this just fail outright for us?

Contributor Author:

Currently, yes -- the listed dependency bound in rmm/pyproject.toml is

dependencies = [
    "cuda-python>=11.8.5,<12.0a0",
    "numpy>=1.23,<3.0a0",
] # This list was generated by `rapids-dependency-file-generator`. To make changes, edit ../../dependencies.yaml and run `rapids-dependency-file-generator`.

which fails when run against the cuda12 build

Contributor:

We could probably invoke rapids-build-backend or rapids-dependency-file-generator to write out the appropriate dependencies before we build the package. We should try to pass pip check, because it gives us a (strong? weak?) cross-check that dependencies.yaml is compatible with recipe.yaml.

Contributor:

Hmm I'm not sure that I really understand what pip check is doing here. Is it actually looking at the contents of pyproject.toml? I would have expected that it is only looking at installed packages, i.e. the Requires-Dist metadata keys in the installed package.

Contributor:

Pretty much all of the comments from the other recipe also apply here, so let's resolve those discussions and then apply the same changes (or not) to this one.

@gforsyth (Contributor Author) commented Feb 7, 2025:

Thanks for the detailed review @vyasr ! Still working through some of it, but to this point:

> Can we double-check the contents of each of the different output packages and make sure that we're putting what we expect in each package? IIRC when I experimented with rattler-build the way the cache worked was that when you restored files from the cache it dropped everything into the PREFIX and you had to do more careful filtering of each output to ensure that the files were present. I don't remember why I couldn't just leverage a cmake --install command for each output, but I think the issue was that the build directory would no longer be available between outputs and only what was restored to the PREFIX from the cache was available. Is that no longer the case?

I believe that the packages contain what's expected, but I'm still pretty new to these packages, so I'll post the contents for others to inspect at their leisure. librmm is in a gist because it's quite big, but the librmm-test contents are just displayed inline below.

librmm pkg contents: https://gist.github.com/gforsyth/e467b61bacbc37d47ee027b690e35ebf
librmm-test pkg contents:

-rwxr-xr-x 0/0           36112 2025-02-07 11:28 bin/benchmarks/librmm/CUDA_STREAM_POOL_BENCH
-rwxr-xr-x 0/0          955712 2025-02-07 11:28 bin/benchmarks/librmm/MULTI_STREAM_ALLOCATIONS_BENCH
-rwxr-xr-x 0/0          865488 2025-02-07 11:28 bin/benchmarks/librmm/RANDOM_ALLOCATIONS_BENCH
-rwxr-xr-x 0/0          904408 2025-02-07 11:28 bin/benchmarks/librmm/REPLAY_BENCH
-rwxr-xr-x 0/0          697096 2025-02-07 11:28 bin/benchmarks/librmm/UVECTOR_BENCH
-rwxr-xr-x 0/0         1076216 2025-02-07 11:28 bin/gtests/librmm/ADAPTOR_TEST
-rwxr-xr-x 0/0          642544 2025-02-07 11:28 bin/gtests/librmm/ALIGNED_TEST
-rwxr-xr-x 0/0         1232192 2025-02-07 11:28 bin/gtests/librmm/ARENA_MR_TEST
-rwxr-xr-x 0/0          470312 2025-02-07 11:28 bin/gtests/librmm/BINNING_MR_TEST
-rwxr-xr-x 0/0          582536 2025-02-07 11:28 bin/gtests/librmm/CALLBACK_MR_TEST
-rwxr-xr-x 0/0          514512 2025-02-07 11:28 bin/gtests/librmm/CONTAINER_MULTIDEVICE_TEST
-rw-r--r-- 0/0           13735 2025-02-07 11:28 bin/gtests/librmm/CTestTestfile.cmake
-rwxr-xr-x 0/0          430696 2025-02-07 11:28 bin/gtests/librmm/CUDA_ASYNC_MR_SHARED_CUDART_TEST
-rwxr-xr-x 0/0         1406040 2025-02-07 11:28 bin/gtests/librmm/CUDA_ASYNC_MR_STATIC_CUDART_TEST
-rwxr-xr-x 0/0          554064 2025-02-07 11:28 bin/gtests/librmm/CUDA_STREAM_TEST
-rwxr-xr-x 0/0         3136264 2025-02-07 11:28 bin/gtests/librmm/DEVICE_BUFFER_TEST
-rwxr-xr-x 0/0         1502120 2025-02-07 11:28 bin/gtests/librmm/DEVICE_MR_REF_TEST
-rwxr-xr-x 0/0          828568 2025-02-07 11:28 bin/gtests/librmm/DEVICE_SCALAR_TEST
-rwxr-xr-x 0/0         1081960 2025-02-07 11:28 bin/gtests/librmm/DEVICE_UVECTOR_TEST
-rwxr-xr-x 0/0          452400 2025-02-07 11:28 bin/gtests/librmm/FAILURE_CALLBACK_TEST
-rwxr-xr-x 0/0          591088 2025-02-07 11:28 bin/gtests/librmm/HOST_MR_REF_TEST
-rwxr-xr-x 0/0          461832 2025-02-07 11:28 bin/gtests/librmm/LIMITING_TEST
-rwxr-xr-x 0/0          864000 2025-02-07 11:28 bin/gtests/librmm/LOGGER_TEST
-rwxr-xr-x 0/0          830424 2025-02-07 11:28 bin/gtests/librmm/PINNED_POOL_MR_TEST
-rwxr-xr-x 0/0          493152 2025-02-07 11:28 bin/gtests/librmm/POLYMORPHIC_ALLOCATOR_TEST
-rwxr-xr-x 0/0          914576 2025-02-07 11:28 bin/gtests/librmm/POOL_MR_TEST
-rwxr-xr-x 0/0          493216 2025-02-07 11:28 bin/gtests/librmm/PREFETCH_ADAPTOR_TEST
-rwxr-xr-x 0/0          495064 2025-02-07 11:28 bin/gtests/librmm/PREFETCH_TEST
-rwxr-xr-x 0/0          496408 2025-02-07 11:28 bin/gtests/librmm/STATISTICS_TEST
-rwxr-xr-x 0/0          483744 2025-02-07 11:28 bin/gtests/librmm/STREAM_ADAPTOR_TEST
-rwxr-xr-x 0/0          504744 2025-02-07 11:28 bin/gtests/librmm/SYSTEM_MR_TEST
-rwxr-xr-x 0/0         2027656 2025-02-07 11:28 bin/gtests/librmm/THRUST_ALLOCATOR_TEST
-rwxr-xr-x 0/0          873144 2025-02-07 11:28 bin/gtests/librmm/TRACKING_TEST
-rwxr-xr-x 0/0         1038312 2025-02-07 11:28 bin/gtests/librmm/generate_ctest_json
-rw-r--r-- 0/0            1691 2025-02-07 11:28 bin/gtests/librmm/run_gpu_test.cmake

@bdice (Contributor) commented Feb 7, 2025:

@gforsyth The package contents look right to me. I'm not aware of anything missing but it could be helpful to do a direct comparison against the existing nightlies and see if any paths are different.

@bdice (Contributor) left a comment:

Another round of comments -- great work so far!

Comment on lines +6 to +7
cuda_version: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[:2] | join(".") }}
cuda_major: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[0] }}
Contributor:

Nit: let's normalize everything to use double-quotes.

Suggested change
cuda_version: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[:2] | join(".") }}
cuda_major: ${{ (env.get('RAPIDS_CUDA_VERSION') | split('.'))[0] }}
cuda_version: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[:2] | join(".") }}
cuda_major: ${{ (env.get("RAPIDS_CUDA_VERSION") | split("."))[0] }}

SCCACHE_REGION: ${{ env.get("SCCACHE_REGION", default="") }}
SCCACHE_S3_USE_SSL: ${{ env.get("SCCACHE_S3_USE_SSL", default="") }}
SCCACHE_S3_NO_CREDENTIALS: ${{ env.get("SCCACHE_S3_NO_CREDENTIALS", default="") }}
SCCACHE_S3_KEY_PREFIX: librmm-${{ env.get("RUNNER_ARCH", default="X64") | replace("X64", "linux64") | replace("ARM64", "aarch64") | lower }}
Contributor:

We should add a RAPIDS_ARCH (or RAPIDS_CONDA_ARCH) variable into our CI containers so we aren't tying this to GitHub Actions implementation details (GHA specifies RUNNER_ARCH but local reproductions of CI do not have it defined!). I ran into problems when reproducing CI locally for cugraph-gnn because RUNNER_ARCH wasn't defined.

If we added RAPIDS_ARCH (or RAPIDS_CONDA_ARCH) then we can cleanly solve this problem and minimize the logic here to just librmm-${{ env.get("RAPIDS_ARCH") }}.

I think there should not be a default -- we require RAPIDS_CUDA_VERSION so we can make this a hard requirement, too.


- fmt ${{ fmt_version }}
- spdlog ${{ spdlog_version }}
run:
- ${{ pin_compatible('cuda-version', upper_bound='x', lower_bound='x') }}
Contributor:

We're inconsistent with double/single quotes in lots of places. Let's use double quotes in the general case -- unless we are wrapping single quotes (or if there are any other behavioral oddities).

- if: cuda_major == "11"
then: cudatoolkit
else: cuda-cudart-dev
- fmt ${{ fmt_version }}
Contributor:

Be aware that #1808 will require changes here.

- cython >=3.0.0
- rapids-build-backend >=0.3.0,<0.4.0.dev0
- librmm =${{ version }}
- python >=3.7,<3.12
Contributor:

Whoa! RAPIDS requires >=3.10,<3.13 right now.

But shouldn't this be constrained by something else? I think python is special and knows to use the current Python version in host, which should also generate appropriate run-exports iirc. Can we just write python as we did in the previous recipe?

Suggested change
- python >=3.7,<3.12
- python

- ${{ pin_compatible('cuda-version', upper_bound='x', lower_bound='x') }}
- numba >=0.59.1,<0.61.0a0
- numpy >=1.23,<3.0a0
- python >=3.7,<3.12
Contributor:

Remove this bound. See above.

Suggested change
- python >=3.7,<3.12
- python

- cuda-python
- if: not (cuda_major == "11")
then: "cuda-cudart-dev"
else: "cuda-version"
Contributor:

No "else" needed here.

Suggested change
else: "cuda-version"


Comment on lines +92 to +93
license: Apache-2.0
license_family: Apache
@bdice (Contributor) commented Feb 7, 2025:

This has license_family, which Rattler's docs say has been dropped in favor of license being an SPDX identifier.

Similarly, license_file might be okay to drop?

Let's pick a standard set of fields and make them match in all recipes/packages.

Also let's make sure the order of these fields is consistent (alphabetical? following the order in the docs?) -- homepage is first and summary is last here, but it's different in librmm.
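For reference, one possible standardized block (field names from the rattler-build about schema; the homepage URL and summary below are placeholders, and license_file is kept here even though, per the note above, it may be droppable):

```yaml
about:
  homepage: https://rapids.ai/          # placeholder
  license: Apache-2.0                   # SPDX identifier
  license_file: LICENSE                 # possibly droppable per the note above
  summary: <one-line package summary>   # placeholder
```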

@vyasr (Contributor) commented Feb 8, 2025:

> @gforsyth The package contents look right to me. I'm not aware of anything missing but it could be helpful to do a direct comparison against the existing nightlies and see if any paths are different.

I was going to suggest exactly the same thing.

tests:
- script:
- "test -d \"${PREFIX}/include/rmm\""
about:
Contributor:

It's somewhat out of scope for this PR -- but also not really, since one reason I never considered this sooner was that I wasn't sure what rattler would support by the time we got here: should we try using load_from_file to get this information from the pyproject.toml file now? rattler-build has this functionality now. It's pretty low-priority, but given all of the other cleanup that we're doing in this PR, it might be nice to really single-source this.
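A rough sketch of what that could look like, assuming load_from_file can parse the TOML and that the standard PEP 621 fields are what we want to surface (the path and accessor syntax are illustrative and should be checked against the rattler-build docs):

```yaml
# Hypothetical single-sourcing sketch; path and accessors are illustrative.
context:
  pyproject: ${{ load_from_file("python/rmm/pyproject.toml") }}

about:
  summary: ${{ pyproject.project.description }}
```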

Labels: ci, conda, improvement, non-breaking