Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix rpaths #18

Merged
merged 3 commits into from
Jun 7, 2022
Merged

Fix rpaths #18

merged 3 commits into from
Jun 7, 2022

Conversation

raimis
Copy link
Contributor

@raimis raimis commented Jun 3, 2022

Revert baf0463

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@raimis raimis self-assigned this Jun 3, 2022
@raimis
Copy link
Contributor Author

raimis commented Jun 3, 2022

@conda-forge-admin, please rerender

@raimis raimis mentioned this pull request Jun 3, 2022
2 tasks
@carterbox
Copy link
Member

Why? This package doesn't install anything into $SP/torch/lib and that namespace belongs to pytorch.

@raimis
Copy link
Contributor Author

raimis commented Jun 3, 2022

It needs libtorch.so and friends, which are at {{ SP_DIR }}/torch/lib.

@carterbox
Copy link
Member

Makes sense. Seems like something @conda-forge/pytorch-cpu should do upstream if pytorch extension packages need to link against libtorch.so and friends.

@raimis raimis marked this pull request as ready for review June 3, 2022 16:30
@mikemhenry
Copy link
Contributor

@hmaarrfk do we need to make a change upstream?

@peastman
Copy link
Contributor

peastman commented Jun 3, 2022

Seems like something https://github.com/orgs/conda-forge/teams/pytorch-cpu should do upstream if pytorch extension packages need to link against libtorch.so and friends.

That would be great, if it would save downstream packages from needing to add it.

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Jun 3, 2022

We might need to fix the feedstock,

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Jun 3, 2022

Why are the tests not catching errors?

@carterbox
Copy link
Member

Why are the tests not catching errors?

Possibly because you can't test load/run any binaries that link against libcuda.so on the CI because there are no GPUs?

string: cuda{{ cuda_compiler_version | replace('.', '') }}py{{ CONDA_PY }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }} # [cuda_compiler_version != "None"]
string: cpu_py{{ CONDA_PY }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }} # [cuda_compiler_version == "None"]
skip: true # [win or (linux and cuda_compiler_version in (undefined, 'None', '10.2'))]
rpaths:
- lib/
- {{ SP_DIR }}/torch/lib
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we try adding:

requirements:
   run:
        - cudnn                           # [cuda_compiler_version != "None"]
        - nccl                            # [cuda_compiler_version != "None"]

similar to https://github.com/conda-forge/pytorch-cpu-feedstock/blob/main/recipe/meta.yaml#L58 instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess these two lines won't help.

2022-06-01T04:06:06.6901147Z WARNING (openmm-torch,lib/libOpenMMTorch.so): $RPATH/libc10.so not found in packages, sysroot(s) nor the missing_dso_whitelist.
2022-06-01T04:06:06.6901798Z .. is this binary repackaging?
2022-06-01T04:06:06.6916172Z WARNING (openmm-torch,lib/libOpenMMTorch.so): $RPATH/libtorch_cpu.so not found in packages, sysroot(s) nor the missing_dso_whitelist.
2022-06-01T04:06:06.6917017Z .. is this binary repackaging?
2022-06-01T04:06:06.6930804Z WARNING (openmm-torch,lib/libOpenMMTorch.so): $RPATH/libtorch_cuda.so not found in packages, sysroot(s) nor the missing_dso_whitelist.
2022-06-01T04:06:06.6931457Z .. is this binary repackaging?
2022-06-01T04:06:06.6945306Z WARNING (openmm-torch,lib/libOpenMMTorch.so): $RPATH/libtorch.so not found in packages, sysroot(s) nor the missing_dso_whitelist.
2022-06-01T04:06:06.6945926Z .. is this binary repackaging?

Is a really strange error to me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Truthfully, this is where my knowledge of the build systems ends.

It is unclear to me why we need the rpaths here, and not in torchvision.

https://github.com/conda-forge/torchvision-feedstock/blob/main/recipe/meta.yaml#L27

@conda-forge/cudatoolkit-dev if you could chime in here it owuld be helpful.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One reason why I think this is happening is that you are doing alot of custom configuration to find the file libtoch.so within the python sitepackages directory. This is somewhat frowned upon and is not OK. this is likely what all these RPATHs are trying to warn you about. I think, that for now, you should do what you need to do to link to this package. to "work". I think adding these "IGNORE RPATHS" is ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we try adding:

requirements:
   run:
        - cudnn                           # [cuda_compiler_version != "None"]
        - nccl                            # [cuda_compiler_version != "None"]

similar to https://github.com/conda-forge/pytorch-cpu-feedstock/blob/main/recipe/meta.yaml#L58 instead?

Why?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think adding the RPATH is correct. My longer term goal is to package libtorch to enable this kind of stuff to happen naturally.

The hackery i'm talking about is:
https://github.com/conda-forge/openmm-torch-feedstock/blob/main/recipe/build.sh#L12

image

Setting TORCH_DIR to this strange directory is the hack.

Though for now, I don't have a better solution. My latest attempts at refactoring the pytorch package are documented in conda-forge/pytorch-cpu-feedstock#108 but there is still a long way to go (especially with GPU) and with the python side of the package too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My longer term goal is to package libtorch to enable this kind of stuff to happen naturally.

A straightforward way of doing that would be to have it put its libraries in the main lib directory instead of a private lib directory inside the Python module.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @peastman. Given the constraints of time, I don't have time to do this at the moment. But yes, it is in the longer term goal i outlined in conda-forge/pytorch-cpu-feedstock#108

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So given the current state of things, I think we should be good to merge this in, I see how it is ideal but it seems there isn't really anything else to do until pytorch drops the libs in the correct place (or a separate package is created)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the only one i'm still unconvinced of is the lib RPATH. Not sure where that is coming from. But if you all want to merge, then I don't see "harm" that can be made from this.

string: cuda{{ cuda_compiler_version | replace('.', '') }}py{{ CONDA_PY }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }} # [cuda_compiler_version != "None"]
string: cpu_py{{ CONDA_PY }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }} # [cuda_compiler_version == "None"]
skip: true # [win or (linux and cuda_compiler_version in (undefined, 'None', '10.2'))]
rpaths:
- lib/
- {{ SP_DIR }}/torch/lib
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So given the current state of things, I think we should be good to merge this in, I see how it is ideal but it seems there isn't really anything else to do until pytorch drops the libs in the correct place (or a separate package is created)

@mikemhenry
Copy link
Contributor

@conda-forge-admin, please rerender

@github-actions
Copy link
Contributor

github-actions bot commented Jun 7, 2022

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/openmm-torch-feedstock/actions/runs/2456955267.

@mikemhenry mikemhenry merged commit 804c6da into conda-forge:main Jun 7, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants