Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Retry OSX support #80

Closed
wants to merge 8 commits into from
Closed

Retry OSX support #80

wants to merge 8 commits into from

Conversation

h-vetinari
Copy link
Member

Follow-up from #77 (which already did most of the groundwork), as well as #21

Fixes #2

Also cleans up some dependency cruft.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@h-vetinari
Copy link
Member Author

@mattip
Did you ever have a passing OSX build in #77. This fails with

Execution platform: @local_config_platform//:host
make[3]: external/local_config_cc/cc_wrapper.sh: No such file or directory

presumably because our patched rule for building redis is not yet compatible with the OSX compiler stack (i.e. will probably need to use the LLVM versions and make sure the cc_wrapper is found):

  /bin/bash -c 'source external/bazel_tools/tools/genrule/genrule-setup.sh; 
        export CC=external/local_config_cc/cc_wrapper.sh
        export CFLAGS=
        export AR=${CC/gnu-cc/gnu-ar}
        export NM=${CC/gnu-cc/gnu-nm}
        export RANLIB=${CC/gnu-cc/gnu-ranlib}
        tmpdir="redis.tmp"
        p=external/com_github_antirez_redis/Makefile
        cp -p -L -R -- "${p%/*}" "${tmpdir}"
        chmod +x "${tmpdir}"/deps/jemalloc/configure
        parallel="$(getconf _NPROCESSORS_ONLN || echo 1)"
        make -s -C "${tmpdir}" -j"${parallel}" V=0 CFLAGS="${CFLAGS-} -DLUA_USE_MKSTEMP -Wno-pragmas -Wno-empty-body" AR="${AR}" RANLIB="${RANLIB}"
        mv "${tmpdir}"/src/redis-server bazel-out/darwin-opt/bin/external/com_github_antirez_redis/redis-server
        chmod +x bazel-out/darwin-opt/bin/external/com_github_antirez_redis/redis-server
        mv "${tmpdir}"/src/redis-cli bazel-out/darwin-opt/bin/external/com_github_antirez_redis/redis-cli
        chmod +x bazel-out/darwin-opt/bin/external/com_github_antirez_redis/redis-cli
        rm -r -f -- "${tmpdir}"

@mattip
Copy link
Contributor

mattip commented Nov 9, 2022

Did you ever have a passing OSX build ...

No. I would prefer to move forward with #78 first, then return to this

@h-vetinari
Copy link
Member Author

Did you ever have a passing OSX build ...

No. I would prefer to move forward with #78 first, then return to this

I picked the independently valuable changes from this PR into #81. Fine for me to do 2.1.0 first. I can rebase this PR once #78 lands.

@conda-forge-webservices
Copy link
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@h-vetinari
Copy link
Member Author

Not sure if this is relevant, but could be that we need to fill in our abseil

INFO: From Linking external/com_google_absl/absl/types/libbad_optional_access.a:
warning: /Applications/Xcode_13.2.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/libtool: archive library: bazel-out/darwin-opt/bin/external/com_google_absl/absl/types/libbad_optional_access.a the table of contents is empty (no object file members in the library define global symbols)

Immediately afterwards there's loads of:

INFO: From Linking python/ray/_raylet.so:
ld: warning: cannot export hidden symbol std::__1::[etc]

@ngam
Copy link

ngam commented Feb 9, 2023

Would sourcing the bazel-toolchain be helpful here? The build setup is pretty opaque...

https://github.com/conda-forge/bazel-toolchain-feedstock

@ngam
Copy link

ngam commented Feb 9, 2023

Then likely patch bazelrc or something to tell bazel to pick up the custom toolchain like we do in jaxlib and tensorflow

@ngam
Copy link

ngam commented Feb 9, 2023

    requirements:
      build:
        - {{ compiler('c') }}
        - {{ compiler('cxx') }}
        - bazel =4.2.2
        - sysroot_linux-64 2.17  # [linux64]
        - patchelf  # [linux]
        - curl
        - cython >=0.29.26
        - make
        - m2-bash  # [win]
        - python
      host:
        - python
        - pip
        - packaging
        - openjdk =11

The bazel version is pretty old (upstream seems to be on 5.x) and I thought the openjdk issue was resolved. Also, it could prove useful to add bazel in both build and host --- I remember having all sorts of problems with such things in jaxlib and tensorflow but perhaps due to other dependencies and here the dependency list is a bit more forgiving.

@h-vetinari
Copy link
Member Author

@ngam, thanks for taking a look - would you like to collaborate on this PR?

@ngam
Copy link

ngam commented Feb 9, 2023

I am going to try to help, but this is obviously going to be difficult (at least from the looks of it). I was planning to fork your PR and start tinkering, but I was frightened by the 1+ hours of build (which would likely translate to 2+ hours for osx). So I think I should perhaps consider building this locally, in which case I need to reset my local setup and decide which machine to use...

@h-vetinari
Copy link
Member Author

I am going to try to help, but this is obviously going to be difficult (at least from the looks of it).

Yeah, not saying this will be easy, haha. Sent you an invite in any case. 🙃

@ngam
Copy link

ngam commented Feb 9, 2023

Thanks! I will obviously accept the invite 😄

I also want to spend a bit more time on the upstream source code. I think we can get away with tinkering with bazelrc alone (which actually may be the way to go in general) but there might be some hardcoded parts we need to be aware of. They obviously manage to build it in a somewhat straightforward way (download and install bazel; tell it to run) so hopefully not a complete hopeless endeavor!

@ngam
Copy link

ngam commented Feb 16, 2023

Things to try:

  • get stuff up to date
  • use our conda-forge custom bazel toolchain

@ngam
Copy link

ngam commented Feb 17, 2023

Stuck on this https://github.com/ray-project/ray/blob/7afc8d67f4c525f76e2288e04bf52861ac9eb108/cpp/BUILD.bazel#L6-L33 with an error of:

Analyzing: 2 targets (106 packages loaded, 4070 targets configured)
Analyzing: 2 targets (106 packages loaded, 4070 targets configured)
Analyzing: 2 targets (107 packages loaded, 7697 targets configured)
ERROR: file 'cpp/libray_api.so' is generated by these conflicting actions:
Label: //cpp:ray_api, //cpp:libray_api.so
RuleClass: cc_library rule, cc_binary rule
Configuration: 08401d45df28f14149fc8673d59cd04414242f0750a2b9199c7ecb01302b6b5c
Mnemonic: Fail, CppLink
Action key: f6f07b3447480df855b2dc4f96704e38d1c30ed8101a22baa3516e7363d1ae71, 4111fc6131d8bf51f9fa45d039b763b4d7b3afd3e9eb1476ada396e6cf31ed93
Progress message: Reporting failed target //cpp:ray_api located at /Users/runner/miniforge3/conda-bld/ray-packages_1676608690235/work/cpp/BUILD.bazel:30:11, Linking cpp/libray_api.so
PrimaryInput: (null), File:[[<execution_root>]bazel-out/darwin-opt/bin]cpp/libray_api.lo
PrimaryOutput: File:[[<execution_root>]bazel-out/darwin-opt/bin]cpp/libray_api.so
ERROR: com.google.devtools.build.lib.skyframe.ArtifactConflictFinder$ConflictException: com.google.devtools.build.lib.actions.MutableActionGraph$ActionConflictException: for cpp/libray_api.so, previous action: action 'Linking cpp/libray_api.so', attempted action: action 'Reporting failed target //cpp:ray_api located at /Users/runner/miniforge3/conda-bld/ray-packages_1676608690235/work/cpp/BUILD.bazel:30:11'
INFO: Elapsed time: 190.356s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (173 packages loaded, 16684 targets configured)
FAILED: Build did NOT complete successfully (173 packages loaded, 16684 targets configured)
Traceback (most recent call last):

@h-vetinari
Copy link
Member Author

Currently based on #105, because why not

@h-vetinari
Copy link
Member Author

So it looks like the main compilation succeed now, but we're running into

make[3]: external/local_config_cc/cc_wrapper.sh: No such file or directory

That sounds like a bazel thing? @ngam?

@ngam

This comment was marked as outdated.

@ngam
Copy link

ngam commented Jul 24, 2023

But this is weird. Essentially, this is only happening on osx, right? Hmm...

@conda-forge-admin, please rerender.

@ngam
Copy link

ngam commented Jul 26, 2023

This

Let's add https://github.com/conda-forge/bazel-toolchain-feedstock to the build section and source it in the build.sh script

didn't help. It stopped the build at the very start by complaining about zlib. Let me think what we can do... we may need to patch something...

@mattip
Copy link
Contributor

mattip commented Jul 26, 2023

There are lots of these lines in the build log

Could not remove or rename \
   /Users/runner/miniforge3/conda-bld/ray-packages_1690346634210/_build_env/share/bazel/2966d1d2105c37b8a8f4040713b497b0/external/python3_9_x86_64-apple-darwin/lib/python3.9/site-packages/pip/_vendor/rich/tree.py.  
   Please remove this file manually (you may need to reboot to free file handles)

@ngam
Copy link

ngam commented Jul 26, 2023

@h-vetinari @mattip any history on the redis patch? I think the problem may have been in it (specifically the part about darwin and (un/re)setting the env variables). Obviously completely skipping it is just the brute-force attempt; we can likely edit it to suit our needs here.

I will look into the issue with the filesystem complaints (we may need some extra bazel cleaning/expunging)

@ngam
Copy link

ngam commented Jul 26, 2023

Also, please assess before I make more tweaks. I can try some tricks on the filesystem front, but it would be best to get more context on the redis patch.

@h-vetinari
Copy link
Member Author

any history on the redis patch?

That was mainly @vnlitvinov's work. We documented how to generate it, but that doesn't cover the content of the redis patch itself.

@ngam
Copy link

ngam commented Jul 27, 2023

Thanks. I think we need to change this part:

 
 genrule_cmd = select({
     "@bazel_tools//src/conditions:darwin": """
-        unset CC LDFLAGS CXX CXXFLAGS
+        export CC=$(CC)
+        export CFLAGS=$(CC_FLAGS)
+        export AR=$${CC/gnu-cc/gnu-ar}
+        export NM=$${CC/gnu-cc/gnu-nm}
+        export RANLIB=$${CC/gnu-cc/gnu-ranlib}
         tmpdir="redis.tmp"
         p=$(location Makefile)
         cp -p -L -R -- "$${p%/*}" "$${tmpdir}"
         chmod +x "$${tmpdir}"/deps/jemalloc/configure
         parallel="$$(getconf _NPROCESSORS_ONLN || echo 1)"
-        make -s -C "$${tmpdir}" -j"$${parallel}" V=0 CFLAGS="$${CFLAGS-} -DLUA_USE_MKSTEMP -Wno-pragmas -Wno-empty-body"
+        make -s -C "$${tmpdir}" -j"$${parallel}" V=0 CFLAGS="$${CFLAGS-} -DLUA_USE_MKSTEMP -Wno-pragmas -Wno-empty-body" AR="$${AR}" RANLIB="$${RANLIB}"
         mv "$${tmpdir}"/src/redis-server $(location redis-server)
         chmod +x $(location redis-server)
         mv "$${tmpdir}"/src/redis-cli $(location redis-cli)

or simply undo it. I will have another look soon. The latter parts of the patch are likely safe.

@mattip
Copy link
Contributor

mattip commented Aug 21, 2023

I wonder if rebasing this off 2.6.2 would help.

@vnlitvinov
Copy link
Contributor

any history on the redis patch?

The point of those patches (IIRC there were multiple related to this topic, but maybe some were already upstreamed) is to make sure we're using the "correct" compiler. Building with conda is different from what Ray does at their own CI in the way that compilers/tools aren't named cc/ar etc, but are specified by a full name like gcc-linux-whatever, and are pointed to using environment variables.

So, when a build isn't expecting to use the variables like $CC -c foo.c -o foo.bin but instead runs cc, it fails in conda environment.

Looking at the quoted patch, I would guess that it was just a copy of the Linux one, and (probably) names of compiler and tools might be different on Darwin (i.e. maybe they aren't gnu-whatever).

@mattip
Copy link
Contributor

mattip commented Sep 12, 2023

Still getting this error, and lot of other build warnings.

external/local_config_cc/cc_wrapper.sh: No such file or directory

@h-vetinari
Copy link
Member Author

Long-term, I'd like to avoid rebuilding redis completely (we have in in conda-forge), but to try that, we'll (well at least I) need to wait for ray to become compatible with bazel 6.

@vnlitvinov
Copy link
Contributor

Long-term, I'd like to avoid rebuilding redis completely

Ray "ideology" of vendoring most of the dependencies internally (which is kind of safe when you're running stuff in a random but production environment) conflicts with conda-forge "do not vendor stuff" pretty badly. Redis is just one of the cases (and I think initially Ray vendored Redis because they patched something in it).

@mattip
Copy link
Contributor

mattip commented Nov 11, 2023

Closing, #119 which replaces this was merged

@mattip mattip closed this Nov 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Build for OSX
5 participants