Conversation

@corleyma
Contributor

@corleyma corleyma commented Aug 18, 2020

This PR updates the osx-build.sh script to enable S3 when building the C++ dependencies and the Python wheels, and updates the corresponding Travis job definition to install the requisite aws-sdk-cpp dependency via Homebrew.

@kou
Member

kou commented Aug 19, 2020

@github-actions crossbow submit wheel-osx-*

@kou kou changed the title from "ARROW-9266: [Python/Packaging] enable C++ S3FS in macOS wheels" to "ARROW-9266: [Python][Packaging] enable C++ S3FS in macOS wheels" on Aug 19, 2020
@github-actions

Revision: ad77b8f

Submitted crossbow builds: ursa-labs/crossbow @ actions-493

Task Status
wheel-osx-cp35m TravisCI
wheel-osx-cp36m TravisCI
wheel-osx-cp37m TravisCI
wheel-osx-cp38 TravisCI

@pitrou
Member

pitrou commented Aug 19, 2020

@nealrichardson

@nealrichardson
Member

So this is failing because it's trying to build the bundled aws-sdk-cpp external project, but that's not implemented: https://travis-ci.org/github/ursa-labs/crossbow/builds/719137336#L1231-L1232

You could try adding -DAWSSDK_SOURCE=SYSTEM to try to pick up the Homebrew package. I suspect that will also fail due to aws/aws-sdk-cpp#1309 but we'll see.
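A minimal sketch of what passing that flag could look like when composing the CMake configure command. `AWSSDK_SOURCE=SYSTEM` comes from the comment above and `ARROW_S3` is Arrow's switch for the S3 filesystem; the helper name `arrow_cmake_command` and the `arrow/cpp` path are assumptions made for illustration, not the contents of osx-build.sh.

```python
import shlex


def arrow_cmake_command(source_dir, awssdk_source="SYSTEM"):
    """Compose a CMake configure command that enables S3 (hypothetical helper)."""
    return [
        "cmake",
        "-DARROW_S3=ON",                     # build the C++ S3 filesystem
        f"-DAWSSDK_SOURCE={awssdk_source}",  # SYSTEM: look for an already installed SDK
        source_dir,                          # assumed path to the Arrow C++ sources
    ]


command = arrow_cmake_command("arrow/cpp")
print(shlex.join(command))
```

Keeping the source mode as a parameter would make it easy to switch between the Homebrew package and a from-source build while experimenting.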

@corleyma corleyma requested a review from kou August 24, 2020 19:06
@corleyma
Contributor Author

@pitrou It seems like it might be worth digging into how to fix/enable builds of AWS SDK in the ThirdpartyToolchain.cmake. Looks like you disabled it here.

Is there any additional context you could provide re: the libcrypto linking problem so I can try to dig into this?

@pitrou
Member

pitrou commented Aug 24, 2020

I don't remember the specifics, but IIRC the AWS SDK build procedure always picks up the system version of OpenSSL, while in many cases (especially packaging-related ones) we want to use a dedicated OpenSSL version (such as the conda-provided version when building a conda package, etc.).

@pitrou
Member

pitrou commented Aug 24, 2020

That was with an older AWS SDK version some time ago, though, so you could try to enable it and see what happens now.

@kou
Member

kou commented Aug 24, 2020

@github-actions crossbow submit wheel-osx-*

@github-actions

Revision: e2359f5

Submitted crossbow builds: ursa-labs/crossbow @ actions-496

Task Status
wheel-osx-cp35m TravisCI
wheel-osx-cp36m TravisCI
wheel-osx-cp37m TravisCI
wheel-osx-cp38 TravisCI

@nealrichardson
Member

FWIW the latest Travis failures do look like aws/aws-sdk-cpp#1309. I don't think Homebrew is a viable source for aws-sdk-cpp for us.

@corleyma
Contributor Author

@kou I pushed some changes which optimistically re-enable building the AWS C++ SDK from source.

@pitrou As said before, I'm not super familiar with conda but a cursory glance at the docs implies it might be possible to build wheels from conda? I wonder if that might be a good approach for the macOS wheels.

@pitrou
Member

pitrou commented Aug 24, 2020

The approach we use for manylinux wheels is to build the AWS SDK from source separately, and it works. Why would it be different on macOS?

@corleyma
Contributor Author

> The approach we use for manylinux wheels is to build the AWS SDK from source separately, and it works. Why would it be different on macOS?

I'm not sure if it would be possible to use conda to build manylinux-compatible wheels, but in general it seems that the conda builds of pyarrow are more supported than the wheels. I wonder if using conda-build to generate wheels would reduce the lag time for new functionality to make it into the wheels and reduce overhead for maintainers.

@pitrou
Member

pitrou commented Aug 24, 2020

I can't answer definitively, but I'm extremely skeptical that conda-build will be able to produce proper, usable Python wheels in anything other than the simplest cases, and Arrow is really at the other end of the complexity spectrum when it comes to dependencies and packaging.

To understand why, you have to understand that Python wheels are not able to express non-Python dependencies, while conda packages are. So any C++ dependency that's expressed naturally in conda (as a dependency on just another package) has to become statically bundled in a Python wheel.
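To illustrate the point, here is a rough sketch using only the standard library. Wheels carry a METADATA file in the core metadata format, and the `Requires-Dist` entries that installers resolve name other Python distributions. The sample metadata text below is made up for this example and is not pyarrow's actual metadata.

```python
from email.parser import Parser

# Made-up metadata in the core metadata format used by wheels (METADATA file).
SAMPLE_METADATA = """\
Metadata-Version: 2.1
Name: example-package
Version: 0.1.0
Requires-Dist: numpy (>=1.14)
"""

message = Parser().parsestr(SAMPLE_METADATA)

# Installers such as pip resolve Requires-Dist entries, and each entry names a
# Python distribution. pip will not install a C++ library such as aws-sdk-cpp,
# so a wheel that needs one has to ship that code inside the wheel itself.
requirements = message.get_all("Requires-Dist")
print(requirements)
```

A conda package, by contrast, can simply list the C++ library as another package it depends on.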

@pitrou
Member

pitrou commented Aug 24, 2020

So I maintain that building AWS SDK from source during the wheel build process is probably the most reasonable way forward here.

@corleyma
Contributor Author

> So any C++ dependency that's expressed naturally in conda (as a dependency on just another package) has to become statically bundled in a Python wheel.

Bundling those dependencies is exactly what I was hoping conda would do when building a wheel. I guess that's probably being a bit optimistic.

@kou
Member

kou commented Aug 25, 2020

@github-actions crossbow submit wheel-osx-*

FYI: Anybody can run jobs to build wheels for macOS by writing the above comment.

@github-actions

Revision: 0b3365d

Submitted crossbow builds: ursa-labs/crossbow @ actions-497

Task Status
wheel-osx-cp35m TravisCI
wheel-osx-cp36m TravisCI
wheel-osx-cp37m TravisCI
wheel-osx-cp38 TravisCI

@corleyma
Contributor Author

@kou Ah, interesting, good to know! I could not seem to get @github-actions crossbow submit -g conda to work as expected in a different PR, but perhaps it's just the -g option that isn't supported?

@pitrou It looks like there is something called conda-press that may be closer to what I was imagining, but it doesn't seem very mature yet.

@corleyma
Contributor Author

@github-actions crossbow submit wheel-osx-*

@corleyma
Contributor Author

@kou seems crossbow builds won't run for me, alas. 😿

@kou
Member

kou commented Aug 25, 2020

@github-actions crossbow submit wheel-osx-*

Oh, sorry. I didn't know that we have the https://github.com/apache/arrow/blob/master/dev/archery/archery/bot.py#L180-L183 check.

@wesm
Member

wesm commented Aug 25, 2020

We experimented a bit with conda-press a while back, but it yielded poor results for us (wheels that were much larger than our current ones). I expect we are going to be fighting to keep our wheels at an acceptable size for the foreseeable future (at some point we need to try to break up the Python project into multiple interdependent wheels to enable modular installations).

@wesm
Member

wesm commented Aug 25, 2020

@github-actions crossbow submit wheel-osx-*

@github-actions

Revision: e7948be

Submitted crossbow builds: ursa-labs/crossbow @ actions-498

Task Status
wheel-osx-cp35m TravisCI
wheel-osx-cp36m TravisCI
wheel-osx-cp37m TravisCI
wheel-osx-cp38 TravisCI

@nealrichardson
Member

If you're going to try to get the bundled build_awssdk macro to work, a couple of notes:

  1. The features enabled in the bundled external project (https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake#L2635) don't match what is actually required (https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake#L2681-L2685), so you'll need to fix that.
  2. If it doesn't build, perhaps look to https://github.com/TileDB-Inc/TileDB/blob/dev/cmake/Modules/FindAWSSDK_EP.cmake for inspiration, since they manage to build it.
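
One quick way to spot the kind of mismatch described in the first note is to compare the two component lists as sets. The component names below are placeholders chosen for illustration and are not copied from ThirdpartyToolchain.cmake; the real lists are at the two locations linked above.

```python
# Placeholder values, NOT copied from ThirdpartyToolchain.cmake.
built_components = {"s3", "core"}                 # what the bundled build enables
required_components = {"s3", "core", "transfer"}  # what the rest of the build needs

# Set differences show which side of the mismatch each component falls on.
missing = sorted(required_components - built_components)
unused = sorted(built_components - required_components)

print("required but not built:", missing)
print("built but not required:", unused)
```

Anything reported as required but not built would need to be added to the bundled build's feature list.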

@kou
Member

kou commented Oct 5, 2020

We'll complete this in #8315.

5 participants