Release version 0.7.0 #7

Closed
7 tasks done
andygrove opened this issue Jul 22, 2022 · 25 comments · Fixed by #74
Labels
help wanted Extra attention is needed

Comments

@andygrove
Member

andygrove commented Jul 22, 2022

I would like to propose releasing version 0.7.0.

Checklist:

@andygrove andygrove added the help wanted Extra attention is needed label Sep 4, 2022
@francis-du
Contributor

I would like to participate in the release process to familiarize myself with the ASF specification.

@andygrove
Member Author

I would like to see us start releasing this project again, but I don't have sufficient knowledge about Python. I would love to pair with someone to get a release out.

@isidentical I was curious - do you have any interest in the DataFusion Python bindings?

@jimexist
Member

jimexist commented Nov 8, 2022

I used to do the releases while the project was hosted in the datafusion-contrib org; if help is needed, I can share more.

@andygrove
Member Author

> I used to do the releases while the project was hosted in the datafusion-contrib org; if help is needed, I can share more.

Thanks @jimexist. Is there any documentation on how to do the release? I guess the first step is getting this kind of documentation into the repo.

@isidentical
Contributor

@andygrove @jimexist also let me know if there is anything I can help with.

@isidentical
Contributor

I am quite unfamiliar with the release procedure of this repo, but here are my observations on what the basic flow of a release might look like:

  • It seems like the last release was in July with 0.6.0, so we probably need to bump the version tag for this one?

  • There is already some automation in place for building the wheels when an rc version tag is pushed. So we can trigger it and use the artifacts for the release candidate process.

  • Once the vote passes, we probably need to repeat these two steps but with the final version (that'd be 0.7.0). For building wheels for all platforms, I think it makes sense to change the build action so that it runs on every tag push.

  • Download the artifacts from the build of 0.7.0 and upload them to PyPI using twine.
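
The tag-driven flow described above can be sketched in Python. This is only an illustration: the tag format (`0.7.0-rc1` for a candidate, `0.7.0` for a final release) is an assumption based on the rc tag mentioned later in this thread, and `parse_release_tag` is a hypothetical helper, not part of the repository's tooling.

```python
import re
from typing import NamedTuple, Optional


class ReleaseTag(NamedTuple):
    version: str        # e.g. "0.7.0"
    rc: Optional[int]   # release candidate number, or None for a final release


# Assumed tag format: MAJOR.MINOR.PATCH with an optional "-rcN" suffix.
_TAG_RE = re.compile(r"^(\d+\.\d+\.\d+)(?:-rc(\d+))?$")


def parse_release_tag(tag: str) -> ReleaseTag:
    """Split a tag into its version and optional release candidate number."""
    match = _TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a recognised release tag: {tag!r}")
    version, rc = match.groups()
    return ReleaseTag(version, int(rc) if rc is not None else None)


if __name__ == "__main__":
    for tag in ("0.7.0-rc1", "0.7.0"):
        parsed = parse_release_tag(tag)
        kind = "release candidate" if parsed.rc is not None else "final release"
        print(f"{tag}: {kind} of {parsed.version}")
```

A build workflow that should run for both kinds of tag could use a check like this to decide whether the resulting artifacts are candidates for a vote or the final files to upload.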

@francis-du
Contributor

I'm interested in doing some work on this; if anyone needs my help, please assign me.

@francis-du
Contributor

francis-du commented Nov 9, 2022

> I am quite unfamiliar with the release procedure of this repo, but here are my observations on what the basic flow of a release might look like:
>
>   • It seems like the last release was in July with 0.6.0, so we probably need to bump the version tag for this one?
>   • There is already some automation in place for building the wheels when an rc version tag is pushed. So we can trigger it and use the artifacts for the release candidate process.
>   • Once the vote passes, we probably need to repeat these two steps but with the final version (that'd be 0.7.0). For building wheels for all platforms, I think it makes sense to change the build action so that it runs on every tag push.
>   • Download the artifacts from the build of 0.7.0 and upload them to PyPI using twine.

  • We don't seem to need to publish to PyPI with twine; we can publish directly with maturin publish.

  • For building wheels, here is a GitHub Action example; I don't know if it can be used directly.

@andygrove
Member Author

maturin publish sounds good. Is publishing to PyPI enough for this release? How important is it to publish wheels?

@andygrove
Member Author

If we are ok with just running maturin publish for now, I think we could go ahead and create a release candidate, vote on it, then run that command to publish?

@francis-du
Contributor

> If we are ok with just running maturin publish for now, I think we could go ahead and create a release candidate, vote on it, then run that command to publish?

I think these are enough. Does @isidentical have anything else to add?

@francis-du
Contributor

> maturin publish sounds good. Is publishing to PyPI enough for this release? How important is it to publish wheels?

I think maturin publish should upload the wheel directly. If we use twine + GitHub Action, we may need to build the wheel ourselves first; maturin publish should not need a separate build step.

@isidentical
Contributor

> maturin publish sounds good. Is publishing to PyPI enough for this release? How important is it to publish wheels?

I am not sure whether maturin publish has an internal mechanism for this, but in general wheels save users a lot of trouble: otherwise, each installation would essentially need to recompile and link DataFusion, which in turn would require the installing system to have a working Rust toolchain.

Another idea is checking out cross compilation from maturin. I haven't used it and maybe it won't help with such a complex dependency chain as ours but it might be worth a shot. https://www.maturin.rs/distribution.html#build-wheels & https://www.maturin.rs/distribution.html#cross-compile-to-linuxmacos

@andygrove
Member Author

Maybe we can learn from how Polars does this. It looks like they use maturin publish to publish the wheels?

https://github.com/pola-rs/polars/blob/master/.github/workflows/create-py-release-manylinux.yaml

@isidentical
Contributor

maturin publish publishes (with the right set of arguments) wheels only for the system you are running it on. So we still have to do something like Polars, where we would call maturin publish repeatedly in multiple different environments to package wheels for everyone. I think the most important targets would be macOS + manylinux2010/manylinux2014, but if Windows is easy we can include it too.
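
To check which systems a set of built wheels covers, the platform tag can be read from each wheel filename. A minimal sketch, assuming the standard wheel naming scheme (`name-version-python-abi-platform.whl`, optionally with a build tag after the version); `wheel_platform_tags` and `covered_systems` are hypothetical helpers, and grouping tags into operating systems by prefix is a simplification:

```python
from typing import Iterable, List, Set


def wheel_platform_tags(filename: str) -> List[str]:
    """Return the platform tags encoded in a wheel filename."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    # name, version, [build tag], python tag, abi tag, platform tag
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename!r}")
    # Several platform tags can be joined with "." in one filename.
    return parts[-1].split(".")


def covered_systems(filenames: Iterable[str]) -> Set[str]:
    """Map the platform tags of the given wheels to coarse OS names."""
    systems = set()
    for filename in filenames:
        for tag in wheel_platform_tags(filename):
            if tag.startswith("macosx"):
                systems.add("macOS")
            elif tag.startswith(("manylinux", "musllinux", "linux")):
                systems.add("Linux")
            elif tag.startswith("win"):
                systems.add("Windows")
    return systems


if __name__ == "__main__":
    # Filenames as they appear in the upload log later in this thread.
    wheels = [
        "datafusion-0.7.0-cp37-abi3-macosx_10_7_x86_64.whl",
        "datafusion-0.7.0-cp37-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.whl",
        "datafusion-0.7.0-cp37-abi3-win_amd64.whl",
    ]
    missing = {"macOS", "Linux", "Windows"} - covered_systems(wheels)
    print("missing systems:", sorted(missing))
```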

@andygrove
Member Author

To make sure I understand this correctly... we already have a GitHub workflow to build the wheels when we tag the repo, so if we go ahead with a release now, we can always run maturin publish manually on Mac & Linux after the release is tagged, and with appropriate votes on releasing? I would rather release soon and learn what we can do to improve the process next time.

@andygrove
Member Author

@isidentical @francis-du @jimexist Let me know what you think of the previous statement/question ☝️. Any reason I should not go ahead and cut a release candidate now? It has been more than 6 months since the last release so I am keen to get some momentum going again.

@isidentical
Contributor

isidentical commented Nov 16, 2022

> To make sure I understand this correctly... we already have a GitHub workflow to build the wheels when we tag the repo, so if we go ahead with a release now, we can always run maturin publish manually on Mac & Linux after the release is tagged, and with appropriate votes on releasing?

That is my understanding as well. Wheels are basically pre-packaged versions of source distributions, so if there are no formal guidelines on binary publishing in the ASF, I think you can start the voting process directly on the source distributions, and once we cut the release we should be able to build wheels for it separately.

@isidentical
Contributor

By the way, the source distribution hash can also be used to verify it on PyPI directly (so I assume that satisfies the voting guidelines in the sense that the actual release is now shipped, and the wheels are just auxiliary artefacts).
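
A local copy of the source distribution can be hashed and compared with the SHA-256 digest that PyPI lists for the file. A minimal sketch using only the standard library; the file name in the usage comment is a placeholder, not a confirmed artifact name:

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare a local file's digest with a published hex digest."""
    return sha256_of_file(path) == published_hex.strip().lower()


# Usage (placeholder file name and digest):
#   matches_published_hash("datafusion-0.7.0.tar.gz", "<sha256 shown on PyPI>")
```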

@jimexist
Member

jimexist commented Nov 18, 2022

> release candidate

FYI, one way to give people something to test, instead of publishing a release candidate, is to publish to TestPyPI.

@jimexist
Member

Last time I published, I used twine and uploaded it like any other Python package. maturin was used in the prior steps to build the artifacts.

@andygrove
Member Author

I pushed a 0.7.0-rc1 tag, and it built the wheels. The artifacts are available for download at the bottom of this page: https://github.com/apache/arrow-datafusion-python/actions/runs/3549837792

@andygrove
Member Author

I started a vote on the mailing list: https://lists.apache.org/thread/1nh72rdvywbxyyjhgqs3jd5xrnhx6n5f

@andygrove
Member Author

I am now trying to upload to testpypi using twine (based on the instructions at https://packaging.python.org/en/latest/tutorials/packaging-projects/) but am running into an error.

$ python3 -m twine upload --repository testpypi --verbose datafusion-0.7.0-cp37-abi3*.whl
Uploading distributions to https://test.pypi.org/legacy/
INFO     datafusion-0.7.0-cp37-abi3-macosx_10_7_x86_64.whl (9.4 MB)                                                                              
INFO     datafusion-0.7.0-cp37-abi3-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (11.9 MB)                                                     
INFO     datafusion-0.7.0-cp37-abi3-win_amd64.whl (10.0 MB)                                                                                      
INFO     Querying keyring for username                                                                                                           
Enter your username: __token__
INFO     Querying keyring for password                                                                                                           
Enter your password: 
INFO     username: __token__                                                                                                                     
INFO     password: <hidden>                                                                                                                      
Uploading datafusion-0.7.0-cp37-abi3-macosx_10_7_x86_64.whl
100% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.9/9.9 MB • 00:06 • 1.7 MB/s
INFO     Response from https://test.pypi.org/legacy/:                                                                                            
         400 Invalid value for project_urls. Error: Use valid URL.                                                                               
INFO     <html>                                                                                                                                  
          <head>                                                                                                                                 
           <title>400 Invalid value for project_urls. Error: Use valid URL.</title>                                                              
          </head>                                                                                                                                
          <body>                                                                                                                                 
           <h1>400 Invalid value for project_urls. Error: Use valid URL.</h1>                                                                    
           The server could not comply with the request since it is either malformed or otherwise incorrect.<br/><br/>                           
         Invalid value for project_urls. Error: Use valid URL.                                                                                   
                                                                                                                                                 
                                                                                                                                                 
          </body>                                                                                                                                
         </html>                                                              

Any idea what I am doing wrong @jimexist @isidentical?

@andygrove
Member Author

I think the issue may be that in f0d5659#diff-50c86b7ed8ac2cf95bd48334961bf0530cdc77b5a56f852c5c61b89d735fd711 we removed the https:// prefix from the URLs in project.urls?
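
A quick local check can flag project URLs that lack a scheme before an upload is attempted. This is a rough sketch, not PyPI's actual validation logic: `invalid_project_urls` is a hypothetical helper, and the labels and URLs below are made-up examples rather than the project's real metadata.

```python
from typing import Dict
from urllib.parse import urlparse


def invalid_project_urls(urls: Dict[str, str]) -> Dict[str, str]:
    """Return the entries whose URL has no http(s) scheme or no host."""
    bad = {}
    for label, url in urls.items():
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            bad[label] = url
    return bad


if __name__ == "__main__":
    # Illustrative values only.
    project_urls = {
        "documentation": "https://example.org/docs",
        "repository": "example.org/some/repo",  # missing "https://"
    }
    for label, url in invalid_project_urls(project_urls).items():
        print(f"{label}: {url!r} does not look like an absolute URL")
```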

This was referenced Nov 26, 2022