Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
18 changes: 16 additions & 2 deletions scripts/cmake/vcpkg_from_github.cmake
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
function(vcpkg_from_github)
cmake_parse_arguments(PARSE_ARGV 0 "arg"
""
"USE_TARBALL_API"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem correct for every port to need to be explicitly told to do this.

  1. Is the tarball we get substantially different or should we apply this change all the time?
  2. Alternately, should this be based on something else being set, like there being a token in one of the other values?

Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • We could detect the fact that we have one of these new tokens by parsing the token string.
  • The problem is that the hash code of those tarballs is different. So doing that would break backward compatibility.

I am 100% open for alterantive solutions.

The name USE_TARBALL_API relates to the implementation and not to the purpose.

Do you have better ideas?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, USE_FINE_GRAINED_ACCESS_TOKENS, the feature is documented here:

The fact is that is the tarball API that produces tarballs whith a slightly different content, but their content are 100% equivalent after unpacking them with VCPKG's CMake code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did some digging on this and it looks like the token goes in a header rather than the URI? https://docs.github.com/en/rest/repos/contents?apiVersion=2022-11-28#download-a-repository-archive-tar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Yes, the token goes in the HTTP header, wasn't the case also before?
  • If not we could maybe fix the older GitHub-API call and keep the backward compatibility.

Copy link
Contributor Author

@rressi-at-globus rressi-at-globus Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm also digging a little bit more on the API we use currently here:

Binary stability by the current API is guaranteed only on the tarball package after the decompression (maybe you are already expecting that, not sure):

The exact compression settings used to generate a zipball or tarball may change over time. The extracted contents won't change if the branch or tag doesn't change, but the outer compressed archive may have a different byte layout. GitHub will give at least six months' notice before changing compression settings.

And late in the same page, the owners of this API are suggesting to rather use the other endpoint I'm introducing with this API:

If you rely on stability of source code archives for reproducibility (ensuring you always get identical files inside the archive), we recommend using the archives REST API with a commit ID for :ref. Using the commit ID ensures you'll always get the same file contents inside the archive and you’ll be immune to repositories rewriting tags or moving branch heads.

With these 2 things in mind, I would suggest you to define a deprecation strategy for the older endpoint.

And we can introduce with this PR the first step of this strategy where we add the new feature without breaking all of the existing looking forward for the migration.

But it is, of course, up to you.

Whatever solution you decide, what we need is just supporting fine-grainded tokens and keep our fork of this really nice project as close as possible with upstream.

So far it is almost identical.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that on the REST API, there are rate limits. Unauthenticated users might quickly hit the limit of currently 60 requests per hour.
https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api?apiVersion=2022-11-28#primary-rate-limit-for-unauthenticated-users

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • This probably means we could use this new API only when we have a token.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And late in the same page, the owners of this API are suggesting to rather use the other endpoint I'm introducing with this API:

I think they mean using a commit rather than a head or tag.

"OUT_SOURCE_PATH;REPO;REF;SHA512;HEAD_REF;GITHUB_HOST;AUTHORIZATION_TOKEN;FILE_DISAMBIGUATOR"
"PATCHES")

Expand Down Expand Up @@ -102,9 +102,23 @@ Error was: ${head_version_err}
else()
set(downloaded_file_name "${org_name}-${repo_name}-${sanitized_ref}.tar.gz")
endif()

if(arg_USE_TARBALL_API)
# This alternative endpoint has a better support for GitHub's personal
# access tokens (for instance when there is SSO enabled within the
# organization).
set(download_url
"${github_api_url}/repos/${org_name}/${repo_name}/tarball/${ref_to_use}"
)
else()
set(download_url
"${github_host}/${org_name}/${repo_name}/archive/${ref_to_use}.tar.gz"
)
endif()

# Try to download the file information from github
vcpkg_download_distfile(archive
URLS "${github_host}/${org_name}/${repo_name}/archive/${ref_to_use}.tar.gz"
URLS "${download_url}"
FILENAME "${downloaded_file_name}"
${headers_param}
${sha512_param}
Expand Down
Loading