Rebuild for CUDA 12 #148

Open
regro-cf-autotick-bot wants to merge 22 commits into conda-forge:main from regro-cf-autotick-bot:rebuild-cuda120-0-3_h86dae8
@regro-cf-autotick-bot

Contributor

This PR has been triggered in an effort to update cuda120.

Notes and instructions for merging this PR:

  1. Please merge the PR only after the tests have passed.
  2. Feel free to push to the bot's branch to update this PR if needed.

Please note that if you close this PR we presume that the feedstock has been rebuilt, so if you are going to perform the rebuild yourself don't close this PR until your rebuild has been merged.


Here are some more details about this specific migrator:

The transition to CUDA 12 SDK includes new packages for all CUDA libraries and
build tools. Notably, the cudatoolkit package no longer exists, and packages
should depend directly on the specific CUDA libraries (libcublas, libcusolver,
etc) as needed. For an in-depth overview of the changes and to report problems
see this issue (conda-forge/conda-forge.github.io#1963).
Please feel free to raise any issues encountered there. Thank you! 🙏


If this PR was opened in error or needs to be updated please add the bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase @conda-forge-admin, please rerun bot in a PR comment to have the conda-forge-admin add it for you.

This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. Feel free to drop us a line if there are any issues! This PR was generated by - please use this URL for debugging.

@conda-forge-webservices


Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@h-vetinari

Member

@conda-forge-admin, please rerender

@conda-forge-webservices

conda-forge-webservices Bot commented Aug 9, 2024


Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

@h-vetinari

Member

@conda-forge-admin, please rerender

@h-vetinari h-vetinari force-pushed the rebuild-cuda120-0-3_h86dae8 branch from cd7f4fc to 46c3957 Compare August 9, 2024 06:38
@h-vetinari

Member

Haven't seen such a failure before:

[ 48%] Building NVCC (Device) object AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/__/__/cuda/xc_cuda_generated_gpu_getxc.cu.o
sh: cicc: command not found
CMake Error at xc_cuda_generated_gpu_getxc.cu.o.Release.cmake:278 (message):
  Error generating file
  /home/conda/feedstock_root/build_artifacts/ambertools_1723188819308/work/build/AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/__/__/cuda/./xc_cuda_generated_gpu_getxc.cu.o

Looks like this is a known issue and we need to point to $PREFIX/nvvm/bin/cicc
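
A minimal sketch of one way to make `cicc` resolvable, assuming it really does live under `$PREFIX/nvvm/bin` as noted above. This is a hypothetical `build.sh` fragment, not the change that was pushed to this branch; the `NVVM_BIN` variable name and the fallback value for `PREFIX` are illustrative only.

```shell
# Hypothetical build.sh fragment (not taken from the feedstock).
# conda-build sets PREFIX; the fallback only keeps the snippet runnable elsewhere.
PREFIX="${PREFIX:-/tmp/example-prefix}"
NVVM_BIN="${PREFIX}/nvvm/bin"

# Prepend the directory only if it is not already on PATH.
case ":${PATH}:" in
  *":${NVVM_BIN}:"*) ;;
  *) PATH="${NVVM_BIN}:${PATH}"; export PATH ;;
esac

# Report whether nvcc's helper binary can now be resolved.
if command -v cicc >/dev/null 2>&1; then
  echo "cicc found at: $(command -v cicc)"
else
  echo "cicc still not found; check that ${NVVM_BIN}/cicc exists"
fi
```

Extending `PATH` keeps the recipe free of hard-coded compiler paths, at the cost of relying on the package layout staying the same.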

@mattwthompson

Member

I gather from the diff that the changes here cover Windows but Windows support isn't added?

@h-vetinari

Member

> I gather from the diff that the changes here cover Windows but Windows support isn't added?

The migrator would be adding CUDA 12.0 builds on windows, if windows weren't skipped completely here. That's okay though, it's just the default title of PRs opened by this migrator. Actual windows enablement should be done independently from this PR.

@h-vetinari h-vetinari force-pushed the rebuild-cuda120-0-3_h86dae8 branch from ad43279 to 7f37f2f Compare August 14, 2024 08:42
@h-vetinari

Member

OK, moved past the cicc issue, now getting:

[ 58%] Building NVCC intermediate link file AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/xc_cuda_intermediate_link.o
cc1plus: fatal error: $BUILD_PREFIX/targets/x86_64-linux/bin/crt/link.stub: No such file or directory
compilation terminated.

I cannot tell from the recipe where things would refer to link.stub; I'm presuming this should be part of the cuda-nvcc setup, but there, that stub is under $BUILD_PREFIX/bin/crt/link.stub. Could something still be configured incorrectly @conda-forge/cuda?

@jakirkham

jakirkham commented Aug 27, 2024

Member

Sorry for the slow reply here Axel

Discussed this with my colleagues today

When we have seen similar issues before, they have tended to trace back to using the legacy CMake find_package(CUDA), which is deprecated. In these cases, libraries are recommended to move to adding the CUDA language where appropriate and start using find_package(CUDAToolkit REQUIRED) to pick up any CTK contents for linking into relevant artifacts. It's possible that other steps may need to be taken as well (scopetools/cudadecon#29 (comment))

Am not entirely sure the right place to look at the source code for ambertools, but was able to find the Amber-MD GitHub org, which references the webpage ( https://ambermd.org/ ) used in downloads here

url: https://ambermd.org/downloads/AmberTools23_rc6.tar.bz2

Looking in that org do see usage of find_package(CUDA). So think the first step would be for ambertools to complete this upgrade

As an interesting note did see this comment in that ambertools code:

# With CMake 3.7, FindCUDA.cmake crashes when crosscompiling.

if(CROSSCOMPILE)
	message(STATUS "CUDA disabled when crosscompiling.")
	set(CUDA FALSE)
else()

One of the things the CMake team solved by adding the CUDA language and find_package(CUDAToolkit) was cross-compilation support with CUDA. In fact this was done with an eye toward working in Conda with the Conda compilers

Think to move this forward, would recommend working with upstream to adopt these changes. Possibly the build here can be patched to use those upstream changes (though it may be simpler to update to a new release with the build fixes)

cc @robertmaynard @bdice (for awareness & in case revisions to the above are needed)
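
The recommended direction (CUDA as a first-class language plus `find_package(CUDAToolkit REQUIRED)`) can be tried in isolation before touching the AmberTools build. The sketch below writes a tiny probe project to a temporary directory. The names `cuda_probe`, `probe` and `probe.cu` are hypothetical, and the CMake content is a generic illustration of the pattern, not a patch for AmberTools.

```shell
# Write a minimal probe project that uses the non-legacy CUDA support in CMake.
workdir="$(mktemp -d)"

cat > "${workdir}/CMakeLists.txt" <<'EOF'
cmake_minimum_required(VERSION 3.18)
# Declaring CUDA as a language replaces the legacy find_package(CUDA) macros.
project(cuda_probe LANGUAGES CXX CUDA)
# CUDAToolkit provides imported targets such as CUDA::cudart for linking.
find_package(CUDAToolkit REQUIRED)
add_library(probe STATIC probe.cu)
target_link_libraries(probe PRIVATE CUDA::cudart)
EOF

cat > "${workdir}/probe.cu" <<'EOF'
__global__ void probe_kernel() {}
EOF

echo "probe project written to ${workdir}"
echo "configure with: cmake -S ${workdir} -B ${workdir}/build"
```

If this probe configures and builds inside the same conda environment, the toolchain setup is probably fine and the remaining problems are more likely in the project's own CMake logic.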

@mattwthompson

Member

As far as I recall, the canonical source code is non-public and on GitLab. But it's several projects stapled together, including cpptraj which is hosted here, so what you found there occurs at least once (probably several times).

cc: @dacase who is likely the best person to coordinate making any needed changes away from deprecated calls

@h-vetinari

Member

Thanks for the analysis John!

@h-vetinari

Member

So I downloaded the tarball (man there's a lot of stuff in there; a cool 3GB when unpacked, and a mass of vendored bits), and searched for the occurrences of find_package(CUDA:

>findstr /L /S /N /C:"find_package(CUDA" *.*
AmberTools\src\cpptraj\cmake-cpptraj\CudaConfig.cmake:11:       find_package(CUDA)
AmberTools\src\quick\cmake\CudaConfig.cmake:11: find_package(CUDA)
AmberTools\src\quick\quick-cmake\QUICKCudaConfig.cmake:11:    find_package(CUDA REQUIRED)
cmake\CudaConfig.cmake:11:      find_package(CUDA)

Given that there's only 4, this sounds quite patchable.
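
For anyone repeating the search on Linux or macOS, a rough POSIX equivalent of the `findstr` call above might look like this. The function name `find_legacy_cuda` is made up for illustration. Like the original search, the fixed string also matches `find_package(CUDAToolkit`, so hits still need a quick manual check.

```shell
# Hypothetical helper: list every file and line that calls find_package(CUDA...
find_legacy_cuda() {
  # -r: recurse, -n: print line numbers, -F: fixed string (no regex),
  # roughly matching findstr /S /N /L. Defaults to the current directory.
  grep -rnF 'find_package(CUDA' "${1:-.}"
}

# Usage (path is a placeholder for wherever the tarball was unpacked):
#   find_legacy_cuda /path/to/unpacked/sources
```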

@h-vetinari h-vetinari force-pushed the rebuild-cuda120-0-3_h86dae8 branch from 0f5e3cb to 504b0a8 Compare August 28, 2024 03:43
@jakirkham jakirkham changed the title from "Rebuild for CUDA 12 w/arch + Windows support" to "Rebuild for CUDA 12" Aug 28, 2024
@jakirkham

Member

Am renaming the PR to avoid further confusion. Hope that is ok

@h-vetinari h-vetinari force-pushed the rebuild-cuda120-0-3_h86dae8 branch from 95560e0 to 3b4686d Compare August 28, 2024 04:31
@h-vetinari h-vetinari force-pushed the rebuild-cuda120-0-3_h86dae8 branch from 1e0c112 to c01fe85 Compare August 28, 2024 07:06
@h-vetinari

Member

Well, I'm several patches deep into trying to make this work, and I think I'm hitting a CMake bug.

Surely it would be better to do less hacky changes in AmberTools upstream; I was mainly trying to see what would be necessary to unblock the build and tried to keep patching ~minimal, at least conceptually (feel free to pick up anything, though these were not really written with being upstreamed in mind - not least because there's no public repo to contribute to - but rather as the most immediately necessary fixes to overcome the failures here).

@mattwthompson

mattwthompson commented Aug 28, 2024

Member

Just to avoid any confusion - this PR would have to be for AmberTools 23 until #141 or a similar build is complete, so using the AmberTools23_rc6.tar.bz2 blob is the only option. Since building AmberTools 24 is stalled, updating it for these CUDA changes can't happen with that version.

@jakirkham

Member

Am deeply impressed by the amount of effort you spent patching here Axel! 🙏

Subscribed to that issue. Though it looks like my colleague Rob already replied to you over there. Agree with him we likely need enable_language

That all being said, agree this is work probably best taken on upstream. Think the other pieces you included here are a good starting point for anyone wanting to push this forward


Agreed Matt. Was trying to capture that in my comments above. Apologies if that was too muddled with other details

@h-vetinari

Member

> Just to avoid any confusion - this PR would have to be for AmberTools 23 until #141 or a similar build is complete, so using the AmberTools23_rc6.tar.bz2 blob is the only option.

This is what I've been doing, the sources are unchanged in this PR.

@mattwthompson

Member

👍 yep just wanted to be sure we were all on the same page, that comment was mostly to explain to David why this is being applied to 23, not 24

while languages usually get defined around where the (sub)project has its
own CMakeLists.txt, this still doesn't work, so move it to the very top
@h-vetinari

Member

Well, I got things to build, but then ran into:

$SRC_DIR/AmberTools/src/quick/src/cuda/gpu_getxc.h: error: no instance of overloaded function "atomicAdd" matches the argument list

@jakirkham

jakirkham commented Aug 30, 2024

Member

Does the header in question have #include <cooperative_groups.h>?

That seems like the kind of thing we would need. It is also covered in this blogpost

Also worth noting this header lives in cuda-cudart-dev_{{ target_platform }}, which {{ compiler("cuda") }} pulls in as a dependency. So it should be available

@h-vetinari h-vetinari left a comment

Member

> Does the header in question have #include <cooperative_groups.h>?

Actually, looking at the source code, it does something like this:

#ifdef USE_LEGACY_ATOMICS
      QUICKULL val1 = (QUICKULL) (fabs( _tmp * OSCALE) + (QUICKDouble)0.5);
      if ( _tmp * weight < (QUICKDouble)0.0)
          val1 = 0ull - val1;
      QUICKADD(devSim_dft.DFT_calculated[0].Eelxc, val1);
#else
      atomicAdd(&devSim_dft.DFT_calculated[0].Eelxc, _tmp);
#endif

The header is missing though, so realistically only the USE_LEGACY_ATOMICS branch has a chance of working.

Comment thread recipe/patches/0002-rely-on-DCMAKE_CUDA_ARCHITECTURES.patch
@h-vetinari

Member

@conda-forge-admin, please rerender

@mikemhenry

Contributor

[ 57%] Building CUDA object AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/gga_c_am05.cu.o
/bin/sh: -c: line 0: syntax error near unexpected token `;'
/bin/sh: -c: line 0: `cd /home/conda/feedstock_root/build_artifacts/ambertools_1728550220697/work/build/AmberTools/src/quick/src/libxc/maple2c_device && /home/conda/feedstock_root/build_artifacts/ambertools_1728550220697/_build_env/bin/nvcc -forward-unknown-to-host-compiler -DCEW -DCEW_USE_DLL -DCUDA -DGNU --options-file CMakeFiles/xc_cuda.dir/includes_CUDA.rsp -Wno-deprecated-gpu-targets;-Wno-deprecated-declarations;-DUSE_LEGACY_ATOMICS;-O2;AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/compiler_depend.tsAmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/compiler_depend.tsCONFIG:Debug>:-g>;-use_fast_math;--compiler-options;-fPIC -O3 -DNDEBUG -std=c++11 "--generate-code=arch=compute_35,code=[sm_35]" "--generate-code=arch=compute_53,code=[sm_53]" "--generate-code=arch=compute_62,code=[sm_62]" "--generate-code=arch=compute_72,code=[sm_72]" "--generate-code=arch=compute_75,code=[sm_75]" "--generate-code=arch=compute_80,code=[sm_80]" "--generate-code=arch=compute_86,code=[sm_86]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" -Xptxas --disable-optimizer-constants -I/home/conda/feedstock_root/build_artifacts/ambertools_1728550220697/work/AmberTools/src/quick/src/libxc/maple2c_device/.. -MD -MT AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/gga_c_am05.cu.o -MF CMakeFiles/xc_cuda.dir/gga_c_am05.cu.o.d -x cu -c /home/conda/feedstock_root/build_artifacts/ambertools_1728550220697/work/AmberTools/src/quick/src/libxc/maple2c_device/gga_c_am05.cu -o CMakeFiles/xc_cuda.dir/gga_c_am05.cu.o'
make[2]: *** [AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/build.make:77: AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/gga_c_am05.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:7267: AmberTools/src/quick/src/libxc/maple2c_device/CMakeFiles/xc_cuda.dir/all] Error 2
make: *** [Makefile:156: all] Error 2

looks like there is an extra or missing ; somewhere
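
In the failing command the nvcc options are joined with `;`, which is how CMake stores lists. A plausible reading, not confirmed in this thread, is that a CMake list ended up in a plain flags string, so `/bin/sh` treats each `;` as a command separator. The sketch below only illustrates the difference between the two forms. The helper name `cmake_list_to_args` is hypothetical, and the real fix would more likely belong in the CMake files than in a shell script.

```shell
# Hypothetical helper: turn a CMake-style ';'-separated list into the
# space-separated form a shell command line expects.
cmake_list_to_args() {
  printf '%s\n' "$1" | tr ';' ' '
}

# A few of the flags as they appear in the failing command.
flags='-Wno-deprecated-gpu-targets;-Wno-deprecated-declarations;-DUSE_LEGACY_ATOMICS;-O2'

# Print the flags as separate shell words.
cmake_list_to_args "${flags}"
```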

@h-vetinari h-vetinari mentioned this pull request Feb 7, 2025
5 tasks
mikemhenry added a commit to mikemhenry/ambertools-feedstock that referenced this pull request Feb 10, 2025
@mikemhenry mikemhenry mentioned this pull request Feb 11, 2025
5 tasks
mikemhenry added a commit that referenced this pull request Feb 21, 2025
* test if we can build 23 still

* MNT: Re-rendered with conda-build 24.3.0, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.07.19.05.24

* MNT: Re-rendered with conda-build 25.1.1, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.07.19.05.24

* see if things work without installing csh into the env

* MNT: Re-rendered with conda-build 25.1.1, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.07.19.05.24

* turn off gui build

* MNT: Re-rendered with conda-build 25.1.2, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.07.19.05.24

* don't skip older cuda builds, lets see if we can patch it

* MNT: Re-rendered with conda-build 25.1.2, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.10.18.05.55

* build build number

* pull in changes from #148 (thanks @h-vetinari )

* MNT: Re-rendered with conda-build 25.1.2, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.10.18.05.55

* add cuda 12.6 support

* lets see what happens if we allow newer clang

* that didn't fix the cxx detection issue

* pin cmake to older version that worked in the past

* don't pin cmake and use newest clang

* MNT: Re-rendered with conda-build 25.1.2, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.12.12.48.03

* see if the patches are doing more harm then good

* still need to remove nab2c

* naive find and replace mtune with march

* didn't end up needed to use FC in host tools, but added for completness

* replace mtune with mcpu

* crazy hack to check PoC

* see if -march=armv8.3-a works for clang and gfortran

* do a verbose build to try and figure out where we are missing an arch flag

* guard if cd fails, but also cd into correct dir before we build the package

* try setting LDFLAGS -arch arm64

* fix LDFLAGS export (not sure why { was treated as literal

* see if any hacks are hurting us here

* try adding -arch arm64 to help ld understand what arch we are trying to build

* set correct arch

* TODO patch cmake/TargetArch.cmake

* try and patch cmake/TargetArch.cmake

* fix patch

* forgot we need to fix with our hack arm64-apple-darwin20.0: error: unsupported argument 'native' to option '-mtune='

* see if telling cmake we are crosscompling helps -- it this doesn't work I have another idea

* see if setting TARGET_TRIPLE is enough

* if these cmake flags don't work we will do plan b

* see if it likes armv8.3-a more than arm64

* it this doesn't work we will add the linker flags by hand

* switch around flags to correct ones

* see if this gets us back to linker error

* yolo

* bump ci

* see if adding the arch flags to setup.py for sander and cpptraj fixes it

* don't reset CC and CXX in setup.py files

* skip cuda builds

* MNT: Re-rendered with conda-build 25.1.2, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.19.18.36.33

* see if adding mkl to run will fix issues with linux openmpi builds

* check python 3.13 builds

* MNT: Re-rendered with conda-build 25.1.2, conda-smithy 3.45.4, and conda-forge-pinning 2025.02.19.18.36.33

* skip python 3.13 builds and add mkl as a run time dep

---------

Co-authored-by: conda-forge-webservices[bot] <91080706+conda-forge-webservices[bot]@users.noreply.github.com>