Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[arrow] Update to 10.0.0 #27687

Merged
merged 6 commits into from
Nov 11, 2022
Merged

[arrow] Update to 10.0.0 #27687

merged 6 commits into from
Nov 11, 2022

Conversation

kou
Copy link
Contributor

@kou kou commented Nov 7, 2022

Describe the pull request

  • What does your PR fix?

    None

  • Which triplets are supported/not supported? Have you updated the CI baseline?

    Not changed, No

  • Does your PR follow the maintainer guide?

    Yes

  • If you have added/updated a port: Have you run ./vcpkg x-add-version --all and committed the result?

    Yes

If you are still working on the PR, open it as a Draft: https://github.blog/2019-02-14-introducing-draft-pull-requests/

github-actions[bot]
github-actions bot previously approved these changes Nov 7, 2022
@kou kou marked this pull request as ready for review November 7, 2022 04:01
@kou
Copy link
Contributor Author

kou commented Nov 7, 2022

NOTE: I'm one of Apache Arrow developers. I'll upstream the updated patches as much as possible after this is merged. We'll be able to remove most patches with Apache Arrow 11.0.0.

@JonLiu1993 JonLiu1993 added the category:port-update The issue is with a library, which is requesting update new revision label Nov 7, 2022
JonLiu1993
JonLiu1993 previously approved these changes Nov 7, 2022
@@ -42,48 +34,44 @@ if(VCPKG_TARGET_IS_WINDOWS AND NOT VCPKG_TARGET_IS_MINGW)
list(APPEND FEATURE_OPTIONS "-DARROW_USE_NATIVE_INT128=OFF")
endif()

set(MALLOC_OPTIONS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this commands unsets MALLOC_OPTIONS in the current scope.
To actually start with an emtpy value, this needs to be set(MALLOC_OPTIONS "").

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I didn't know it.

Comment on lines 38 to 49

if("jemalloc" IN_LIST FEATURES)
set(MALLOC_OPTIONS -DARROW_JEMALLOC=ON)
list(APPEND MALLOC_OPTIONS -DARROW_JEMALLOC=ON)
else()
set(MALLOC_OPTIONS -DARROW_JEMALLOC=OFF)
list(APPEND MALLOC_OPTIONS -DARROW_JEMALLOC=OFF)
endif()

if("mimalloc" IN_LIST FEATURES)
set(MALLOC_OPTIONS ${MALLOC_OPTIONS} -DARROW_MIMALLOC=ON)
list(APPEND MALLOC_OPTIONS -DARROW_MIMALLOC=ON)
else()
set(MALLOC_OPTIONS ${MALLOC_OPTIONS} -DARROW_MIMALLOC=OFF)
list(APPEND MALLOC_OPTIONS -DARROW_MIMALLOC=OFF)
endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shoud be integrated into vcpgk_check_features(...) if possible, or get its own invocation of that command.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm can arrow actually be build with jemalloc and mimalloc? I would assume those are exclusive options and thus forbidden.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apache Arrow can build with jemalloc and mimalloc. Users can choose their allocation backend at runtime.

I've removed them and used vcpkg_check_features() instead.

vcpkg_cmake_config_fixup(CONFIG_PATH lib/cmake/arrow)
if("parquet" IN_LIST FEATURES)
vcpkg_cmake_config_fixup(
PACKAGE_NAME Parquet
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can change this to lower-case parquet. It is only the directory name in /share.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. vcpkg prefers to lowercase for /share/PACKAGE name, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and there is even a PR to add a check for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see!

DO_NOT_DELETE_PARENT_CONFIG_PATH
)
endif()
vcpkg_cmake_config_fixup(PACKAGE_NAME Arrow CONFIG_PATH lib/cmake/Arrow)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can change this to lower-case arrow. It is only the directory name in /share, and it will integrate this into the port's existing folder.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines 17 to 22
+ find_package(unofficial-utf8proc ${find_package_args} REQUIRED)
+ if(unofficial-utf8proc_FOUND)
+ set(utf8proc_FOUND TRUE)
+ add_library(utf8proc::utf8proc ALIAS utf8proc)
+ return()
+ endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can trim this to

find_package(utf8proc NAMES unofficial-utf8proc REQUIRED)
add_library(utf8proc::utf8proc ALIAS utf8proc)
return()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've removed REQUIRED and added NAMES.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(same comment as the brotli changes #27687 (comment))

return()
endif()

+if(ARROW_PACAKGE_KIND STREQUAL vcpkg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without quotes, this will break when a variable named vcpkg is defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

return()
endif()

+if(ARROW_PACAKGE_KIND STREQUAL vcpkg)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without quotes, this will break when a variable named vcpkg is defined.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

Comment on lines 17 to 24
+ find_package(unofficial-brotli ${find_package_args} REQUIRED)
+ if(unofficial-brotli_FOUND)
+ set(Brotli_FOUND TRUE)
+ add_library(Brotli::brotlicommon ALIAS unofficial::brotli::brotlicommon)
+ add_library(Brotli::brotlienc ALIAS unofficial::brotli::brotlienc)
+ add_library(Brotli::brotlidec ALIAS unofficial::brotli::brotlidec)
+ return()
+ endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can trim this to

find_package(Brotli NAMES unofficial-brotli REQUIRED)
add_library(Brotli::brotlicommon ALIAS unofficial::brotli::brotlicommon)
add_library(Brotli::brotlienc ALIAS unofficial::brotli::brotlienc)
add_library(Brotli::brotlidec ALIAS unofficial::brotli::brotlidec)
return()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I should have removed REQUIRED.
Wow, I have never used NAMES for find_package()!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use INTERFACE instead of ALIAS.
Also return() in Find<X>.cmake?
also you need to define everything from find_package_handle_standard_args:

find_package_handle_standard_args(
  Brotli REQUIRED_VARS BROTLI_COMMON_LIBRARY BROTLI_ENC_LIBRARY BROTLI_DEC_LIBRARY
                       BROTLI_INCLUDE_DIR)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In fact, REQUIRED and return() need extra care.

  • In Find modules, one must not use return(). It will exit the scope where find_package is called, not just the Find module. (That's the reason why we cannot use find_dependency in Find modules.)
  • In Find modules, one should pass on REQUIRED only if it was passed in. (In Config files, it can be delegated to find_dependency.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use INTERFACE instead of ALIAS.

Why? Is it for old CMake?

Use INTERFACE instead of ALIAS. Also return() in Find<X>.cmake? also you need to define everything from find_package_handle_standard_args:

Why do we need to use find_package_handle_standard_args() here? It seems that find_package(Brotli NAMES unofficial-brotli) does needed things because this file is called by find_package(Brotli). (Same package name is used.)

  • In Find modules, one must not use return(). It will exit the scope where find_package is called, not just the Find module.

Oh, I didn't know it.

I tried the return() behavior with Apache Arrow but it seems that it works as expected for Apache Arrow (exited only from a Find module).
In Apache Arrow, this file (FileBrotli.cmake) is called from resolve_dependency() macro https://github.com/apache/arrow/blob/apache-arrow-10.0.0/cpp/cmake_modules/ThirdpartyToolchain.cmake#L261 .
If I call return() from FindBrotli.cmake, the next line https://github.com/apache/arrow/blob/apache-arrow-10.0.0/cpp/cmake_modules/ThirdpartyToolchain.cmake#L262 is also evaluated instead of returning from resolve_dependency().

So it seems that we can use return() here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use INTERFACE instead of ALIAS.

Why? Is it for old CMake?

It may not be a problem here, but...
It does behave differently at least on CMake Config export: It will export the original target name (here unofficial::...), not the alias name. This may cause problems when the exported CMake config is used, depending on the completeness of find_dependency stuff.

(Here, both target names are namespaced, so CMake knows they are targets. But we have seen this with internal names like libfoo, and then CMake has no choice but must pass this blindly to the linker, which may succeed on Windows (libfoo.lib) but fail on linux (libfoo.so).)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I understand.
I confirmed that we can use ALIAS In Apache Arrow use case. Can we use ALIAS here? (Can we merge this pull request with the current changes?) Or should we use INTERFACE for better manner?

@kou kou dismissed stale reviews from JonLiu1993 and github-actions[bot] via d9f7db8 November 7, 2022 08:43
Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a new experimental fast check for PR issues. Please let us know if this bot is helpful!

PRs must add only one version and must not modify any published versions

When making any changes to a library, the version or port-version in vcpkg.json or CONTROL must be modified.

error: checked-in files for arrow have changed but the version was not updated
version: 10.0.0
old SHA: 11ae7347b3d708803a9562ef48f1734796bb56f6
new SHA: ab325846893f2dc3382b7b403bf195645ab93a39
Did you remember to update the version or port version?
Use --overwrite-version to bypass this check
***No files were updated***

@JonLiu1993
Copy link
Member

Feature plasma,s3,parquet,orc,json,jemalloc,flight,filesystem,example,dataset,csv tested successfully in the x64-linux triplet

@kou
Copy link
Contributor Author

kou commented Nov 8, 2022

Thanks for confirming these features!

kou added a commit to kou/arrow that referenced this pull request Nov 8, 2022
… vcpkg

vcpkg provides CMake packages for Brotli and utf8proc with
"unofficial-" prefix. So we need to care about them separately.

See also: microsoft/vcpkg#27687
@spectre-ns
Copy link

I'm awaiting this for #26501

@JonLiu1993 JonLiu1993 added the info:reviewed Pull Request changes follow basic guidelines label Nov 11, 2022
Comment on lines +14 to +16
+ if(Brotli_FIND_QUIETLY)
+ list(APPEND find_package_args QUIET)
+ endif()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next time, would you mind making this REQUIRED? :)
Other than that, this looks good to me!
Thank you!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This spot is in a find module forwarding to vcpkg's unofficial config. IMO REQUIRED is only possible if the original call was REQUIRED and the absence of vcpkg config shall raise an immediate error despite other search options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I'll add the following:

  if(Brotli_FIND_REQUIRED)
    list(APPEND find_package_args REQUIRED)
  endif()

@JavierMatosD JavierMatosD merged commit 59b58fc into microsoft:master Nov 11, 2022
@kou kou deleted the arrow-10.0.0 branch November 12, 2022 20:23
kou added a commit to apache/arrow that referenced this pull request Nov 16, 2022
… vcpkg (#14609)

vcpkg provides CMake packages for Brotli and utf8proc with "unofficial-" prefix. So we need to care about them separately.

See also: microsoft/vcpkg#27687

Authored-by: Sutou Kouhei <[email protected]>
Signed-off-by: Sutou Kouhei <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category:port-update The issue is with a library, which is requesting update new revision info:reviewed Pull Request changes follow basic guidelines
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants