This repository was archived by the owner on Jan 9, 2026. It is now read-only.

Rebuild for CUDA 12 #13

Merged
h-vetinari merged 23 commits into conda-forge:main from h-vetinari:cuda_12 on Aug 29, 2024
Conversation

@h-vetinari

Member

The bot couldn't open the PR for some reason (so the restart from #2 never worked); let's try doing it manually. Based on #12.

@conda-forge-webservices

Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

@h-vetinari h-vetinari force-pushed the cuda_12 branch 2 times, most recently from 37ce8f7 to c8b02c6 Compare August 9, 2024 03:57

@jakirkham jakirkham left a comment

Member

Thanks Axel! 🙏

I had a couple of minor suggestions below that may help this clear.

Comment thread recipe/meta.yaml
Comment on lines +104 to +91
imports:
- awq

Member

If this import loads libcuda.so (which requires a working GPU), and it may well be doing so, we would need to skip this test.

Suggested change
- imports:
-   - awq
+ imports:  # [cuda_compiler_version != "None"]
+   - awq  # [cuda_compiler_version != "None"]

Alternatively we could switch to a Quansight GPU runner
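
As a rough illustration of the concern above (not part of this PR), one way to check from Python whether the driver library can be opened at all is to try loading it with `ctypes`. The soname `libcuda.so.1` and both helper names are assumptions made for this sketch, and a successful load does not by itself prove that a usable GPU is present.

```python
import ctypes


def can_load_shared_library(name: str) -> bool:
    """Return True if the dynamic loader can open the library `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True


def should_run_gpu_import_test(driver_lib: str = "libcuda.so.1") -> bool:
    """Hypothetical gate: only attempt a GPU-dependent import test
    when the driver library (assumed soname) is loadable."""
    return can_load_shared_library(driver_lib)


if __name__ == "__main__":
    print("driver library loadable:", should_run_gpu_import_test())
```

On a CI machine without the NVIDIA driver installed, a check like this would be expected to return `False`, which is the situation the selector-based skip in the suggestion is meant to avoid.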

Member Author

It fails to find libtorch_python.so:

import: 'awq_ext'
Traceback (most recent call last):
  File "/home/conda/feedstock_root/build_artifacts/autoawq_1723203836924/test_tmp/run_test.py", line 2, in <module>
    import awq_ext
ImportError: libtorch_python.so: cannot open shared object file: No such file or directory

I don't understand how the build manages to link at buildtime, but not at runtime. Well, -Wl,--allow-shlib-undefined might explain some of it (didn't succeed in stripping this it seems), but -L$SP_DIR/torch/lib is in LDFLAGS, so I have no idea where this fails. I doubt it's the cuda drivers though.
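
For background, `-L` only affects where the linker searches at build time; at run time the dynamic loader consults the RPATH/RUNPATH of the loaded object, `LD_LIBRARY_PATH`, and the system cache instead. A minimal debugging sketch, assuming the loader's message has the same shape as in the traceback above: it extracts the missing library name and looks for it in candidate directories. The function names and the example directory are hypothetical and not taken from the recipe.

```python
import re
from pathlib import Path

# Matches loader errors such as:
#   "libtorch_python.so: cannot open shared object file: No such file or directory"
_MISSING_LIB = re.compile(r"^(?P<lib>\S+): cannot open shared object file")


def missing_library(message: str):
    """Return the library name from a loader error message, or None."""
    match = _MISSING_LIB.match(message)
    return match.group("lib") if match else None


def locate_library(lib_name: str, search_dirs):
    """Return every path under `search_dirs` where `lib_name` exists as a file."""
    found = []
    for directory in search_dirs:
        candidate = Path(directory) / lib_name
        if candidate.is_file():
            found.append(candidate)
    return found


if __name__ == "__main__":
    msg = ("libtorch_python.so: cannot open shared object file: "
           "No such file or directory")
    lib = missing_library(msg)
    # Hypothetical search location; substitute the real site-packages path.
    print(lib, locate_library(lib, ["/path/to/site-packages/torch/lib"]))
```

If the library is found on disk but the import still fails, that would point at the extension's run-time search path rather than at a missing file.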

Member

IIUC this was an issue in PyTorch packaging that was being worked on in PR: conda-forge/pytorch-cpu-feedstock#246

AIUI this was added to PyTorch 2.4.0: conda-forge/pytorch-cpu-feedstock#250

So likely we need to update PyTorch here too

@jakirkham

Member

Also worth noting the migrator and version updates got a bit clogged up due to a duplicate key in conda-forge.yml: #14

That is now resolved. So a bunch of bot PRs just got added to this feedstock

upstream calculates CUDA arches from torch, does not respect
TORCH_CUDA_ARCH_LIST anymore
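
For context on the variable named in this commit message: `TORCH_CUDA_ARCH_LIST` is conventionally a list of compute capabilities separated by semicolons or spaces, with an optional `+PTX` suffix per entry. The sketch below parses a value of that shape; it is an illustration only, not upstream's code, and `parse_arch_list` is a hypothetical helper.

```python
import os


def parse_arch_list(value: str) -> list:
    """Parse a TORCH_CUDA_ARCH_LIST-style string into (arch, has_ptx) pairs."""
    suffix = "+PTX"
    result = []
    # Entries may be separated by semicolons or by spaces.
    for entry in value.replace(" ", ";").split(";"):
        entry = entry.strip()
        if not entry:
            continue
        has_ptx = entry.endswith(suffix)
        arch = entry[: -len(suffix)] if has_ptx else entry
        result.append((arch, has_ptx))
    return result


if __name__ == "__main__":
    # An unset variable yields an empty list.
    print(parse_arch_list(os.environ.get("TORCH_CUDA_ARCH_LIST", "")))
```
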
@conda-forge-webservices

Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe/meta.yaml:

  • No valid build backend found for Python recipe for package autoawq_kernels using pip. Python recipes using pip need to explicitly specify a build backend in the host section. If your recipe has built with only pip in the host section in the past, you likely should add setuptools to the host section of your recipe.

@conda-forge-webservices

Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe/meta.yaml) and found it was in an excellent condition.

@h-vetinari

Member Author

Given conda/conda-build#5467, I'm going to merge this for now and then see if I can layer the rest on top.

@h-vetinari h-vetinari merged commit 91e57ce into conda-forge:main Aug 29, 2024
@h-vetinari h-vetinari deleted the cuda_12 branch August 29, 2024 06:50

3 participants