Support CPython 3.11, 3.12, and aarch64 processors #2331

Merged: 91 commits into vaexio:master on Sep 11, 2024

Contributor

@ddelange ddelange commented Jan 20, 2023

Hoi 👋

linux-aarch64 accounts for almost 10% of all platforms (ref giampaolo/psutil#2103)

aarch64 has already surpassed windows in terms of downloads for this package. Oracle, Amazon, Google, and Microsoft are all offering aarch64 cloud instances at an undeniable price point compared to amd/intel, so the demand will undoubtedly only grow

  • this PR is adapted from Add arm64 mac and linux wheels MagicStack/asyncpg#954
  • uses QEMU emulation for linux arm64 wheels: manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs 😅
  • manylinux2014 wheels are built with GCC 10, which I think does not guarantee proper functioning of pybind11 (docs).
    • so with this PR, linux wheels are built with GCC 12 (manylinux_2_28).
    • pip will only install these wheels on linux operating systems with glibc >= 2.28 (mostly all 2020+ linux distributions like debian 10 buster, ubuntu 20.04 focal, almalinux/rhel 8, ...).
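As a rough illustration of the glibc constraint above, the sketch below checks whether the running system reports glibc 2.28 or newer. It relies only on `platform.libc_ver()` from the standard library; the helper name `supports_manylinux_2_28` is made up for this example, and pip's own compatibility detection is more involved than this.

```python
import platform


def supports_manylinux_2_28(libc_name: str, libc_version: str) -> bool:
    """Return True when the C library is glibc 2.28 or newer."""
    if libc_name != "glibc" or not libc_version:
        return False
    parts = libc_version.split(".")
    try:
        major = int(parts[0])
        minor = int(parts[1]) if len(parts) > 1 else 0
    except ValueError:
        return False
    return (major, minor) >= (2, 28)


if __name__ == "__main__":
    # platform.libc_ver() returns ("glibc", "<version>") on glibc systems and
    # empty strings when it cannot identify the C library (e.g. on macOS).
    name, version = platform.libc_ver()
    print(f"libc={name or 'unknown'} version={version or 'unknown'}")
    print("manylinux_2_28 wheels usable:", supports_manylinux_2_28(name, version))
```

On a system where this prints `False`, pip would be expected to skip the manylinux_2_28 wheels and fall back to another distribution, if one is available.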

the wheels from this PR can be installed with:

# PIP_FIND_LINKS takes a space separated list of --find-links locations
export PIP_FIND_LINKS=https://github.com/ddelange/vaex/releases/expanded_assets/core-v4.17.1.post4
pip install --force-reinstall vaex

fixes #2366, fixes #2368, fixes #2397, closes #2427, fixes #2384

@maartenbreddels
Member

Hoi 👋

exciting, will take a look early next week!

  • manylinux takes around 2.5hrs per wheel and alpine arm64 up to 4 hrs

that worries me a bit.. :)

groeten,

Maarten

@ddelange
Contributor Author

ddelange commented Jan 21, 2023

here are all timings: https://github.com/ddelange/vaex/actions/runs/3965720337/usage

depending on how many times a month you release vaex, this could eat into the 2k free minutes of GH...

as the parallelization is maximised and the wheels are pushed to PyPI as soon as they're built, most of them will be available soon after release regardless

here are all the wheels: distributions.zip

@ddelange
Contributor Author

interestingly, that was 8260 minutes ^

apparently that's OK? then I don't understand their explanation 🤔 https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#included-storage-and-minutes

@ddelange
Contributor Author

ddelange commented Jan 21, 2023

ah there is a fair amount of duplication in that usage table for whatever reason 🤯

@ddelange
Contributor Author

a diff of current PyPI vs the zip above:

 vaex_core-4.16.1-cp310-cp310-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp310-cp310-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp310-cp310-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp310-cp310-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp310-cp310-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp310-cp310-win_amd64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_10_9_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-macosx_11_0_arm64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_aarch64.whl
+vaex_core-4.16.1-cp311-cp311-musllinux_1_1_x86_64.whl
+vaex_core-4.16.1-cp311-cp311-win_amd64.whl
 vaex_core-4.16.1-cp36-cp36m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp36-cp36m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp36-cp36m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp36-cp36m-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp36-cp36m-win_amd64.whl
 vaex_core-4.16.1-cp37-cp37m-macosx_10_9_x86_64.whl
-vaex_core-4.16.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp37-cp37m-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp37-cp37m-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp37-cp37m-win_amd64.whl
 vaex_core-4.16.1-cp38-cp38-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp38-cp38-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp38-cp38-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp38-cp38-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp38-cp38-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp38-cp38-win_amd64.whl
 vaex_core-4.16.1-cp39-cp39-macosx_10_9_x86_64.whl
 vaex_core-4.16.1-cp39-cp39-macosx_11_0_arm64.whl
-vaex_core-4.16.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_aarch64.whl
+vaex_core-4.16.1-cp39-cp39-manylinux_2_28_x86_64.whl
+vaex_core-4.16.1-cp39-cp39-musllinux_1_1_aarch64.whl
 vaex_core-4.16.1-cp39-cp39-musllinux_1_1_x86_64.whl
 vaex_core-4.16.1-cp39-cp39-win_amd64.whl
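A listing like the one above can be produced from two lists of filenames. This is a minimal sketch, assuming wheel names without a build tag (`{name}-{version}-{python}-{abi}-{platform}.whl`); `parse_wheel` and `diff_wheels` are hypothetical helpers, not part of any packaging tool, and the sample data is a small excerpt of the cp310 entries.

```python
from typing import Iterable, List, NamedTuple, Tuple


class WheelTags(NamedTuple):
    name: str
    version: str
    python: str
    abi: str
    platform: str


def parse_wheel(filename: str) -> WheelTags:
    """Split a wheel filename into its five dash-separated fields."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    fields = filename[: -len(".whl")].split("-")
    if len(fields) != 5:
        raise ValueError(f"unexpected number of fields in {filename}")
    return WheelTags(*fields)


def diff_wheels(old: Iterable[str], new: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (removed, added) filenames, each sorted alphabetically."""
    old_set, new_set = set(old), set(new)
    return sorted(old_set - new_set), sorted(new_set - old_set)


if __name__ == "__main__":
    on_pypi = [
        "vaex_core-4.16.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl",
        "vaex_core-4.16.1-cp310-cp310-musllinux_1_1_x86_64.whl",
    ]
    in_zip = [
        "vaex_core-4.16.1-cp310-cp310-manylinux_2_28_aarch64.whl",
        "vaex_core-4.16.1-cp310-cp310-manylinux_2_28_x86_64.whl",
        "vaex_core-4.16.1-cp310-cp310-musllinux_1_1_x86_64.whl",
    ]
    removed, added = diff_wheels(on_pypi, in_zip)
    for filename in removed:
        print("-" + filename)
    for filename in added:
        print("+" + filename, parse_wheel(filename).platform)
```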

Comment on lines -16 to -23
namespace std {
    template<>
    struct hash<PyObject*> {
        size_t operator()(const PyObject *const &o) const {
            return PyObject_Hash((PyObject*)o);
        }
    };
}
Contributor Author

@ddelange ddelange Jan 23, 2023

@maartenbreddels any thoughts on this (incl me updating the pybind11 submodule)?

@@ -183,12 +183,14 @@ def __str__(self):
include_package_data=True,
ext_modules=([extension_vaexfast] if on_rtd else [extension_vaexfast, extension_strings, extension_superutils, extension_superagg]) if not use_skbuild else [],
zip_safe=False,
python_requires=">=3.6",
Contributor Author

cibuildwheel parses this to determine which wheels to build
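To illustrate the idea (this is not cibuildwheel's actual implementation), the sketch below filters a list of candidate CPython versions against a simple `python_requires` lower bound. The helper names and the candidate list are assumptions for this example, and only the `>=X.Y` form is handled.

```python
from typing import List, Tuple

Version = Tuple[int, int]


def minimum_python(python_requires: str) -> Version:
    """Extract (major, minor) from a simple '>=X.Y' specifier."""
    spec = python_requires.strip()
    if not spec.startswith(">="):
        raise ValueError("this sketch only understands '>=X.Y'")
    parts = spec[2:].strip().split(".")
    major = int(parts[0])
    minor = int(parts[1]) if len(parts) > 1 else 0
    return major, minor


def select_versions(python_requires: str, candidates: List[Version]) -> List[Version]:
    """Keep only the candidate versions at or above the lower bound."""
    floor = minimum_python(python_requires)
    return [version for version in candidates if version >= floor]


if __name__ == "__main__":
    # Hypothetical set of interpreters a build tool might know about.
    candidates = [(3, 5), (3, 6), (3, 7), (3, 8), (3, 9), (3, 10), (3, 11), (3, 12)]
    for major, minor in select_versions(">=3.6", candidates):
        print(f"cp{major}{minor}")
```

Raising the lower bound in `python_requires` would therefore shrink the set of wheels that get built.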

Contributor Author

cc @franz101

see also the diff above

@ddelange
Contributor Author

I'm guessing this is blocked by #2339

@maartenbreddels
Member

Just letting you know I'm very busy and had a vacation.
Yes, I'll try to get #2339 green first!

@ddelange
Contributor Author

fwiw there are now third party free minutes on native arm64 machines, to get rid of the slow qemu builds

@ddelange ddelange changed the title from "Build aarch64 wheels" to "Build aarch64 wheels and support python 3.11" on Jul 10, 2023
@maartenbreddels
Member

Could you try rebasing this?

@ddelange
Contributor Author

@maartenbreddels already merged in master 👍

@ddelange
Contributor Author

    ERROR: Could not find a version that satisfies the requirement vaex-core<4.17,>=4.17.0 (from vaex)
    ERROR: No matching distribution found for vaex-core<4.17,>=4.17.0
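The requirement in the error above combines `<4.17` with `>=4.17.0`, and no version can satisfy both. A minimal sketch of why, using a simplified numeric comparison rather than full PEP 440 rules; `parse_version`, `satisfies`, and the list of releases are illustrative assumptions, not pip internals.

```python
import operator

# Two-character operators come first so ">=" is not mistaken for ">".
_OPERATORS = (
    (">=", operator.ge),
    ("<=", operator.le),
    ("==", operator.eq),
    (">", operator.gt),
    ("<", operator.lt),
)


def parse_version(text: str, width: int = 3) -> tuple:
    """Turn '4.17' into (4, 17, 0) so that 4.17 and 4.17.0 compare equal."""
    numbers = [int(part) for part in text.strip().split(".")]
    numbers += [0] * (width - len(numbers))
    return tuple(numbers)


def satisfies(version: str, specifier: str) -> bool:
    """Check a version against comma-separated clauses such as '<4.17,>=4.17.0'."""
    candidate = parse_version(version)
    for clause in specifier.split(","):
        clause = clause.strip()
        for symbol, compare in _OPERATORS:
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                if not compare(candidate, bound):
                    return False
                break
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
    return True


if __name__ == "__main__":
    # Hypothetical release list, for illustration only.
    releases = ["4.16.1", "4.16.2", "4.17.0", "4.17.1"]
    print([v for v in releases if satisfies(v, "<4.17,>=4.17.0")])
    print([v for v in releases if satisfies(v, ">=4.17.0,<4.18")])
```

The first list comes out empty regardless of which releases exist, which matches the "No matching distribution" error.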

@maartenbreddels
Member

Yeah, a bug/artifact of our release script. Should be good now.

@ddelange
Contributor Author

ddelange commented Aug 3, 2023

hoi @maartenbreddels 👋

I pulled master and fixed merge conflicts, but it looks like CI is still not very happy. Seeing errors like hdf file missing on disk, and TypeError: train() got an unexpected keyword argument 'early_stopping_rounds'.

Do you think it might be related to this PR?

ddelange referenced this pull request in rapidfuzz/RapidFuzz Aug 10, 2023
@franz101
Contributor

Just wondering here about the Python packaging: Python 3.6 and 3.7 are now deprecated. On the other hand, can we bump to 3.10 and 3.11?

@to-bee

to-bee commented Aug 28, 2023

Do we have any updates on this MR?

@ddelange
Contributor Author

ddelange commented Sep 1, 2023

Hi @maartenbreddels 👋

Was your s3 account deleted by any chance?

vaex.open('s3://vaex/taxi/yellow_taxi_2009_2015_f32.hdf5?anon=true')

raises

FileNotFoundError: [Errno 2] Path does not exist 'vaex/taxi/yellow_taxi_2009_2015_f32.hdf5'. Detail: [errno 2] No such file or directory

@ddelange ddelange force-pushed the build-matrix branch 3 times, most recently from 5680eb9 to 2136629 on September 4, 2023 08:28
@ddelange
Contributor Author

looks like that's not allowed. I have a feeling it might be a memory leak in the threading implementation on windows 3.8

@ddelange
Contributor Author

ddelange commented Aug 30, 2024

you could simply skip publish of that vaex-core wheel, and don't upload tar.gz for vaex-core. it's the tensorflow approach

done in the commits below

.github/workflows/wheel.yml (outdated)
Comment on lines +81 to +86
      # https://github.com/pypa/gh-action-pypi-publish#trusted-publishing
      - name: Publish package distributions to PyPI
        uses: pypa/gh-action-pypi-publish@<version elided>
        if: startsWith(github.ref, 'refs/tags')
        with:
          skip-existing: true
Contributor Author

@ddelange ddelange Aug 30, 2024

all you need to do for this to work is add vaexio/vaex as trusted publisher here: https://pypi.org/manage/project/vaex/settings/publishing (and to the other vaex pypi projects)

Member

This now fails, because it will try to release all packages (even though I only pushed the vaex-core tag).

I guess if we only do the cp commands when the specific tag is used, we can make this work. What do you think?
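One way to picture the tag-specific selection suggested above is a small lookup from the pushed git ref to the package that should be published. This is only a sketch: the tag scheme (`<short-name>-v<version>`, as in the `core-v4.17.1.post4` release mentioned earlier) and the mapping table are assumptions for illustration, not the workflow's actual logic.

```python
from typing import Optional

# Hypothetical mapping from tag prefix to PyPI project name.
TAG_PREFIX_TO_PACKAGE = {
    "core": "vaex-core",
    "ml": "vaex-ml",
    "meta": "vaex",
}


def package_for_ref(ref: str) -> Optional[str]:
    """Return the package a tag ref refers to, or None if nothing should be published."""
    prefix = "refs/tags/"
    if not ref.startswith(prefix):
        return None  # branch pushes and pull requests publish nothing
    tag = ref[len(prefix):]
    name, separator, _version = tag.partition("-v")
    if not separator:
        return None  # tag does not follow the assumed scheme
    return TAG_PREFIX_TO_PACKAGE.get(name)


if __name__ == "__main__":
    for ref in ("refs/tags/core-v4.17.1", "refs/heads/master", "refs/tags/ml-v0.18.0"):
        print(ref, "->", package_for_ref(ref))
```

With a lookup like this, a publish step would only copy the distributions of the package returned for the current ref and skip the rest.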

@ddelange
Contributor Author

that did it. we're green 🏁

fwiw, pandas dropped python 3.8 support in August 2023 (v2.1.0)

@maartenbreddels
Member

Wow, amazing. You did most of the work @ddelange, thanks a lot for the work and support!

I'll set up the trusted publisher (probably next week) and see if we can make a release 🥳

@maartenbreddels maartenbreddels merged commit e3e1842 into vaexio:master Sep 11, 2024
51 checks passed
@EwoutH
Contributor

EwoutH commented Sep 11, 2024

Wow it got merged! Congratulations!!

@erwanp

erwanp commented Sep 13, 2024

Well done @ddelange @maartenbreddels @EwoutH; that's great news for our codes!

@HajimeKawahara

Great news! Thanks a lot. @ddelange @maartenbreddels @EwoutH

@franz101
Contributor

@Ben-Epstein wake up it's christmas

@ddelange
Contributor Author

fyi followup PR #2434 should land soon, so release is imminent:)
landed

@@ -1,19 +1,19 @@
import os
import imp
from importlib.machinery import SourceFileLoader
Contributor Author

@ddelange ddelange Sep 27, 2024

@maartenbreddels imp was removed in Python 3.12 and will fail to import on cp312, so this PR touches all the universal packages to achieve cp312 support.

it might make sense to bump the minimum required versions accordingly in vaex: so cp312 users don't end up with older (broken) versions of the universal packages in their env
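For reference, a common stand-in for the old `imp.load_source` call builds the module through `importlib` instead. The sketch below is one way to do it with `SourceFileLoader`, which the diff above imports; the function name `load_source` and the `_version.py` demo file are chosen for illustration and are not taken from the vaex code.

```python
import importlib.util
import os
import tempfile
from importlib.machinery import SourceFileLoader
from types import ModuleType


def load_source(name: str, path: str) -> ModuleType:
    """Load a Python source file as a module without using the removed imp module."""
    loader = SourceFileLoader(name, path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)  # runs the file's top-level code inside the module
    return module


if __name__ == "__main__":
    # Demo: write a tiny version file and read its attribute back.
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "_version.py")
        with open(path, "w") as handle:
            handle.write("__version__ = '0.0.0'\n")
        print(load_source("demo_version", path).__version__)
```

Note that this sketch does not register the module in `sys.modules`; code that relies on later `import` statements finding it would need to add that step.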

Member

I'm not sure if I should take action here. Can you elaborate?

Contributor Author

here it is: #2438

@EwoutH
Contributor

EwoutH commented Oct 1, 2024

Now that this upgrade process is still fresh in everybody’s mind, would it make sense to directly push to Python 3.13?

@maartenbreddels
Member

I doubt the dependencies (e.g. arrow) are already shipping, but I think we should indeed do this soon!

@erwanp

erwanp commented Oct 25, 2024

Hello Vaex team, Python 3.12 compatibility is merged (congrats), but it does not seem to be released on Conda yet. Can you confirm?
(@minouHub for information)

@ddelange
Contributor Author

Hi @erwanp 👋

  • you can fetch vaex-core 4.18.1 from PyPI
  • it was yanked because windows support is still an open issue ([BUG-REPORT] vaex causes a segmentation fault on windows #2442), but it's fine to use on linux and mac
  • the other vaex packages (vaex, vaex-ml etc) still need to be released to PyPI with the right version bumps introduced in this PR for Python 3.12 support, I think @maartenbreddels will want to wait with that until the windows bug is fixed and release all packages together.
  • I don't know if conda release is planned or automated -- the wheels are self contained so you can just fetch them from PyPI
