Skip to content

Conversation

@pearu
Copy link
Contributor

@pearu pearu commented May 20, 2020

Fixes conda-forge/pyarrow-feedstock#107
Fixes #93

Checklist

  • Used a fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-linter
Copy link
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@pearu pearu marked this pull request as draft May 20, 2020 20:18
@pearu pearu changed the title Include CUDA enabled arrow Include CUDA enabled pyarrow May 20, 2020
@pearu pearu self-assigned this May 20, 2020
@jakirkham
Copy link
Member

Thanks for picking this up @pearu! 😄

cc @h-vetinari (who also seemed interested in this)

@h-vetinari
Copy link
Member

Thanks for the ping @jakirkham; looks good @pearu!

Also closes #93 (see discussion there). Finally, merging this means the pyarrow-feedstock must be disabled/archived somehow - don't know how that works.

@pearu
Copy link
Contributor Author

pearu commented May 20, 2020

pyarrow fails to build on aarch64:

CMake Error at cmake_modules/SetupCxxFlags.cmake:338 (message):
  Unsupported arch flag: -march=.
Call Stack (most recent call first):
  CMakeLists.txt:100 (include)

I have disabled aarch64 for pyarrow but this does not feel right.
Does anyone have an idea of how to fix the pyarrow build for aarch64?

@h-vetinari
Copy link
Member

-  - name: pyarrow
+  - name: arrow-py

I don't think that's the right approach. We want this feedstock to serve both packages, and have the packages not interfere with each other (which is automatic if they have the same name, otherwise we need another mutex etc.).

@conda-forge/core How can we absorb one feedstock into another correctly?

@pearu
Copy link
Contributor Author

pearu commented May 26, 2020

I was about to propose https://github.com/conda-forge/parquet-cpp-feedstock approach where the future pyarrow packages would just point to arrow-py.
But I second the question of how to absorb feedstocks correctly.

@xhochy
Copy link
Member

xhochy commented May 27, 2020

We should be building pyarrow here. I have the superpowers to enable that but I'll have to consult @beckermr on how to use it ;)

Copy link
Member

@xhochy xhochy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, some minor suggestions.

We shouldn't rename the package though!

…f arrow-cpp in pyarrow. Restore pyarrow package name.
@pearu
Copy link
Contributor Author

pearu commented May 27, 2020

@xhochy I have addressed all your comments. Surprisingly, there are no validation failures now.
Can you land this PR?

@pearu pearu requested a review from xhochy May 27, 2020 16:39
@xhochy
Copy link
Member

xhochy commented May 27, 2020

Requires conda-forge/feedstock-outputs#3

There should have been validation failures 🤔

I'll test this locally and merge then.

@pearu
Copy link
Contributor Author

pearu commented May 27, 2020

The message

Cloning into '/tmp/tmpeilv4_v4_recipe/feedstock-outputs'...
validation results:
{
  "linux-64/arrow-cpp-0.17.1-py37h1234567_3_cuda.tar.bz2": true,
  "linux-64/pyarrow-0.17.1-py37h1234567_3_cuda.tar.bz2": false,
  "linux-64/arrow-cpp-proc-1.0.0-cuda.tar.bz2": true
}
NOTE: Any outputs marked as False are not allowed for this feedstock.

persists. I guess your superpowers will be needed after all, @xhochy .

@xhochy
Copy link
Member

xhochy commented May 27, 2020

@conda-forge-admin please restart ci

@pearu
Copy link
Contributor Author

pearu commented May 27, 2020

Validate seems to be ok with the feedstock:

+ validate_recipe_outputs arrow-cpp-feedstock
Cloning into '/tmp/tmp95n9c6e__recipe/feedstock-outputs'...
validation results:
{
  "linux-64/arrow-cpp-0.17.1-py37h1234567_3_cuda.tar.bz2": true,
  "linux-64/pyarrow-0.17.1-py37h1234567_3_cuda.tar.bz2": true,
  "linux-64/arrow-cpp-proc-1.0.0-cuda.tar.bz2": true
}
NOTE: Any outputs marked as False are not allowed for this feedstock.

Copy link
Member

@xhochy xhochy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll merge tomorrow morning. If something goes wrong, I then have the time to clean up.

@h-vetinari
Copy link
Member

@pearu, can you put Fixes #93 in the OP?

@pearu
Copy link
Contributor Author

pearu commented May 27, 2020

@pearu, can you put Fixes #93 in the OP?

Done in the description of the issue.

@h-vetinari
Copy link
Member

Regarding the naming conventions, I think it might be worth renaming arrow-cpp to libarrow. This is in line with a lot of other feedstocks, but obviously a matter of taste. In this case, we'd have to have a compatibility output of arrow-cpp that depends on libarrow, which could be dropped after a few versions.

@isuruf made me aware of this in the context of faiss. Other examples I can think of off the top of my head are blas & lapack, openblas, opencv, postgresql, gdal, plus a whole bunch more (non-exhaustive).

Of course, this recipe has some counter examples to this: aws-sdk-cpp, boost-cpp, grpc-cpp, thrift-cpp, but I think they are far in the minority and the same argument could be made for renaming those (maybe worth noting that at least the last two seem to have been started by the arrow team).

Happy to open an issue for this if this needs more discussion.

@isuruf
Copy link
Member

isuruf commented May 28, 2020

-cpp was a trend that I started with boost-cpp. That was a mistake. I'm in favour of changing it, but this is not the correct place to raise the issue.

Copy link
Member

@jakirkham jakirkham left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally seems good to me. Thanks @pearu! 😄

Had a couple minor comments

Comment on lines +38 to +41
$PYTHON setup.py \
build_ext $BUILD_EXT_FLAGS \
install --single-version-externally-managed \
--record=record.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we able to use pip here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check that but in a separate PR.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that makes sense. Was having a little trouble seeing how this was built before. If this is no different from what we were already doing, agree we don't need to change it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using pip does not work out of the box:

$PYTHON -m pip install . --no-deps -vv build_ext $BUILD_EXT_FLAGS install

fails with

pip._internal.exceptions.DistributionNotFound: No matching distribution found for build_ext

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need something like,

$PYTHON -m pip install . --no-deps -vv --global-option=build_ext --global-option=--with-cuda 

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. The following diff works locally:

diff --git a/recipe/build-pyarrow.sh b/recipe/build-pyarrow.sh
index 99396d8..4b01871 100644
--- a/recipe/build-pyarrow.sh
+++ b/recipe/build-pyarrow.sh
@@ -22,7 +22,7 @@ BUILD_EXT_FLAGS=""
 if [[ ! -z "${cuda_compiler_version+x}" && "${cuda_compiler_version}" != "None" ]]
 then
     export PYARROW_WITH_CUDA=1
-    BUILD_EXT_FLAGS="${BUILD_EXT_FLAGS} --with-cuda"
+    BUILD_EXT_FLAGS="${BUILD_EXT_FLAGS} --global-option=build_ext --global-option=--with-cuda"
 else
     export PYARROW_WITH_CUDA=0
 fi
@@ -35,7 +35,4 @@ fi
 
 cd python
 
-$PYTHON setup.py \
-        build_ext $BUILD_EXT_FLAGS \
-        install --single-version-externally-managed \
-                --record=record.txt
+$PYTHON -m pip install . --no-deps -vv $BUILD_EXT_FLAGS
diff --git a/recipe/meta.yaml b/recipe/meta.yaml
index 8a0edf7..20523e2 100644
--- a/recipe/meta.yaml
+++ b/recipe/meta.yaml
@@ -181,6 +181,7 @@ outputs:
         - boost-cpp
         - cython
         - numpy 1.16.*
+        - pip
         - python
         - setuptools
         - setuptools_scm

Just FYI.

Comment on lines +26 to +29
%PYTHON% setup.py ^
build_ext ^
install --single-version-externally-managed ^
--record=record.txt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question about pip here

@xhochy
Copy link
Member

xhochy commented May 28, 2020

Happy to open an issue for this if this needs more discussion.

Please do so at https://github.com/conda-forge/conda-forge.github.io

@xhochy xhochy merged commit 99a1e8d into conda-forge:master May 28, 2020
@xhochy
Copy link
Member

xhochy commented May 28, 2020

Thanks @pearu , let's 🙏 this works!

@h-vetinari
Copy link
Member

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pyarrow should be an output of arrow-cpp Split package?

6 participants