uv pip install returning 403 from private pypi cloud instance backed by s3 #2025

Closed
philiplinden opened this issue Feb 27, 2024 · 26 comments · Fixed by #3070 or #3460
Labels
question Asking for clarification or support

Comments

@philiplinden

philiplinden commented Feb 27, 2024

I am using a private pypi cloud instance backed by s3 (with no auth on my end). Public packages are resolved normally, but uv pip cannot resolve packages hosted on the private cloud instance.

  • pip3 install my-private-package --index-url https://my-pip-instance.example.com/ succeeded with no problems.
  • Running uv pip install with the --no-cache option did not change the result.
  • I can retrieve the artifact from the URL that uv pip was trying to fetch. That URL returns a 302 redirect to the S3 URL that gives me the 403.
$ uv pip install my-private-package --index-url https://my-pip-instance.example.com/

error: Failed to build editables
  Caused by: Failed to build editable: file:///Users/phil/repos/philiplinden/scratch
  Caused by: Failed to install requirements from build-system.requires (resolve)
  Caused by: No solution found when resolving: setuptools
  Caused by: Failed to download: setuptools==69.1.1
  Caused by: HTTP status client error (403 Forbidden) for url (https://s3-url.amazonaws.com/example/setuptools-69.1.1-py3-none-any.whl?AWSAccessKeyId=xxx&Signature=xxx%3D&Expires=1709153725)

Relates to #1709 and #1902

Version: 0.1.6, 0.1.11

Verbose output (anonymized)

 uv_client::flat_index::from_entries 
 uv_installer::downloader::build_editables 
      0.355282s   0ms DEBUG uv_distribution::source Building (editable) file:///Users/phil/repos/philiplinden/scratch
   uv_dispatch::setup_build package_id="file:///Users/phil/repos/philiplinden/scratch", subdirectory=None
     uv_resolver::resolver::solve 
          0.361346s   0ms DEBUG uv_resolver::resolver Solving with target Python version 3.11.6
       uv_resolver::resolver::choose_version package=root
       uv_resolver::resolver::get_dependencies package=root, version=0a0.dev0
            0.361500s   0ms DEBUG uv_resolver::resolver Adding direct dependency: setuptools*
       uv_resolver::resolver::choose_version package=setuptools
         uv_resolver::resolver::package_wait package_name=setuptools
     uv_resolver::resolver::process_request request=Versions setuptools
       uv_client::registry_client::simple_api package=setuptools
         uv_client::cached_client::get_cacheable 
           uv_client::cached_client::read_and_parse_cache file=/Users/phil/Library/Caches/uv/simple-v1/8aba338bd0495f93/setuptools.rkyv
     uv_resolver::resolver::process_request request=Prefetch setuptools *
              0.366930s   5ms DEBUG uv_client::cached_client Found stale response for: https://my-pip-instance.example.com/simple/setuptools/
              0.366959s   5ms DEBUG uv_client::cached_client Sending revalidation request for: https://my-pip-instance.example.com/setuptools/
           uv_client::cached_client::revalidation_request url="https://my-pip-instance.example.com/setuptools/"
              1.016439s 654ms DEBUG uv_client::cached_client Found modified response for: https://my-pip-instance.example.com/simple/setuptools/
           uv_client::cached_client::new_cache file=/Users/phil/Library/Caches/uv/simple-v1/8aba338bd0495f93/setuptools.rkyv
           uv_client::registry_client::parse_simple_api package=setuptools
             uv_client::html::parse url=https://my-pip-instance.example.com/setuptools/
 uv_resolver::version_map::from_metadata 
       uv_distribution::distribution_database::get_or_build_wheel_metadata dist=setuptools==69.1.1
         uv_client::registry_client::wheel_metadata built_dist=setuptools==69.1.1
           uv_client::cached_client::get_serde 
             uv_client::cached_client::get_cacheable 
               uv_client::cached_client::read_and_parse_cache file=/Users/phil/Library/Caches/uv/wheels-v0/index/8aba338bd0495f93/setuptools/setuptools-69.1.1-py3-none-any.msgpack
            1.322675s 961ms DEBUG uv_resolver::resolver Searching for a compatible version of setuptools (*)
            1.322694s 961ms DEBUG uv_resolver::resolver Selecting: setuptools==69.1.1 (setuptools-69.1.1-py3-none-any.whl)
       uv_resolver::resolver::get_dependencies package=setuptools, version=69.1.1
         uv_resolver::resolver::distributions_wait package_id=setuptools-69.1.1
                  1.322754s   0ms DEBUG uv_client::cached_client No cache entry for: https://my-pip-instance.example.com/api/package/setuptools/setuptools-69.1.1-py3-none-any.whl#sha256=02fa291a0471b3a18b2b2481ed902af520c69e8ae0919c13da936542754b4c56
               uv_client::cached_client::fresh_request url="https://my-pip-instance.example.com/api/package/setuptools/setuptools-69.1.1-py3-none-any.whl#sha256=02fa291a0471b3a18b2b2481ed902af520c69e8ae0919c13da936542754b4c56"
error: Failed to build editables
  Caused by: Failed to build editable: file:///Users/phil/repos/philiplinden/scratch
  Caused by: Failed to install requirements from build-system.requires (resolve)
  Caused by: No solution found when resolving: setuptools
  Caused by: Failed to download: setuptools==69.1.1
  Caused by: HTTP status client error (403 Forbidden) for url (https://my-pip-instance.amazonaws.com/ffdf/setuptools/setuptools-69.1.1-py3-none-any.whl?AWSAccessKeyId=xxx&Signature=xxx&Expires=1709157923)
@charliermarsh charliermarsh added the bug Something isn't working label Feb 27, 2024
@charliermarsh
Member

Do you mind updating to v0.1.11? v0.1.6 is a few versions out-of-date.

@charliermarsh charliermarsh added question Asking for clarification or support and removed bug Something isn't working labels Feb 27, 2024
@philiplinden
Author

Thanks, I just updated to v0.1.11 and now the installer hangs at the same spot for a few seconds before throwing the same 403 error.

@charliermarsh
Member

Can you say a bit more about how the auth is intended to work? The URL is publicly available, and redirects you to S3 URLs with credentials embedded?

@philiplinden
Author

Can you say a bit more about how the auth is intended to work? The URL is publicly available, and redirects you to S3 URLs with credentials embedded?

Yeah, that's correct. It uses one-time presigned S3 URLs. I am using pypicloud with redirect_urls enabled.

@amarckal

This also occurs when pip compiling from Gemfury links: the package lookup on Gemfury works, but downloading packages fails since they are backed by S3. This is the response from S3 when opening one of those presigned links:

Code: SignatureDoesNotMatch
Message: The request signature we calculated does not match the signature you provided. Check your key and signing method.

@torarvid

torarvid commented Apr 4, 2024

Hi. I probably don't understand half of the code in this repo, but after experimenting with uv pip install -vv locally, I think I might have found a clue:

let mut headers = HeaderMap::default();
if let Some(authorization) = req.headers().get("authorization") {
    headers.append("authorization", authorization.clone());
}

Here, I believe req never has auth headers attached, because the AuthMiddleware only runs and attaches the auth header to req when the request is executed. In other words, I believe the headers are extracted too early in the linked code. Not sure how to fix it though 🤷🏻‍♂️

@charliermarsh
Member

That should be okay though, since the subsequent requests will also go through the auth middleware and get the appropriate headers attached.

Were you able to reproduce this issue? What does your setup look like?

@torarvid

torarvid commented Apr 4, 2024

Yep, you're right, @charliermarsh. I just tried hard-coding in my credentials at that point in the code, and then I got a "Request already has an authorization header" error instead. My clue was not a clue after all 😢

But yes, I can reproduce. I believe I have the same issue as @amarckal, which is that when I use curl against pypi.fury.io to fetch my private package, I get a 302 redirect to a url like this:

https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?x-acct=<redacted-acct>&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T180859Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>

(I put <redacted-[whatever]> in places that could be sensitive).

When I run ./target/debug/uv pip install -vv --no-cache --extra-index-url https://{$GEMFURY_READ_TOKEN}@pypi.fury.io/oda/ sdxp==0.7.3, the last part of the output is (with some <redacted-[whatevers]> here as well):

           uv_client::cached_client::read_and_parse_cache file=/private/var/folders/2k/by15c_cs40l9swcck47g6lpm0000gn/T/.tmplRVtUe/wheels-v0/index/5f5b51aad86993d4/sdxp/sdxp-0.7.3-py3-none-any.msgpack
 uv_client::cached_client::from_path_sync path="/private/var/folders/2k/by15c_cs40l9swcck47g6lpm0000gn/T/.tmplRVtUe/wheels-v0/index/5f5b51aad86993d4/sdxp/sdxp-0.7.3-py3-none-any.msgpack"
                1.665639s   0ms TRACE uv_client::cached_client No cache entry exists for /private/var/folders/2k/by15c_cs40l9swcck47g6lpm0000gn/T/.tmplRVtUe/wheels-v0/index/5f5b51aad86993d4/sdxp/sdxp-0.7.3-py3-none-any.msgpack
              1.665827s   1ms DEBUG uv_client::cached_client No cache entry for: https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad
           uv_client::cached_client::fresh_request url="https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad"
                1.666044s   0ms TRACE uv_client::cached_client Sending fresh HEAD request for https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad
                1.666269s   0ms DEBUG uv_auth::middleware Adding authentication to already-seen URL: https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad
                1.895293s 229ms TRACE uv_client::httpcache cached request https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl#sha256=869326637eef5de7d4312b82ecf9ba85fcd9e038273d54c0bbe7602d3b8529ad is storable because its response has a heuristically cacheable status code 200
           uv_client::cached_client::new_cache file=/private/var/folders/2k/by15c_cs40l9swcck47g6lpm0000gn/T/.tmplRVtUe/wheels-v0/index/5f5b51aad86993d4/sdxp/sdxp-0.7.3-py3-none-any.msgpack
           uv_client::registry_client::read_metadata_range_request wheel=sdxp-0.7.3-py3-none-any.whl
                1.896366s   0ms TRACE uv_client::registry_client Getting metadata for sdxp-0.7.3-py3-none-any.whl by range request
    1.897265s DEBUG uv_auth::middleware No credentials found for: https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T181433Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>
error: Failed to download: sdxp==0.7.3
  Caused by: Failed to unzip wheel: sdxp-0.7.3-py3-none-any.whl
  Caused by: an upstream reader returned an error: io error occurred: Request error: HTTP status client error (403 Forbidden) for url (https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T181433Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>)
  Caused by: io error occurred: Request error: HTTP status client error (403 Forbidden) for url (<the-same-url-again>)
  Caused by: Request error: HTTP status client error (403 Forbidden) for url (<the-same-url-again>)
  Caused by: HTTP status client error (403 Forbidden) for url (<the-same-url-again>)

If I isolate just the url in this output, there's another possible clue:

https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T181433Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig> (<-- this is the uv one)

compared with the curl one from above:

https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?x-acct=<redacted-acct>&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240404%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240404T180859Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig> (<-- this is the curl one)

The curl url has a x-acct=<stuff> query param, while the uv one doesn't. I have no idea why...
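
To spot this kind of difference without eyeballing two long URLs, the query parameter names can be compared mechanically. The following is a minimal sketch using only the Rust standard library. `query_keys` and `missing_keys` are hypothetical helper names (not part of uv), and the parsing is deliberately naive: it does no percent-decoding and ignores values.

```rust
use std::collections::BTreeSet;

/// Collect the names of the query parameters in `url` (the part after `?`,
/// up to an optional `#fragment`). Values are ignored.
fn query_keys(url: &str) -> BTreeSet<&str> {
    let without_fragment = url.split('#').next().unwrap_or(url);
    match without_fragment.split_once('?') {
        Some((_, query)) => query
            .split('&')
            .filter(|pair| !pair.is_empty())
            .map(|pair| pair.split('=').next().unwrap_or(pair))
            .collect(),
        None => BTreeSet::new(),
    }
}

/// Parameter names that appear in `a` but are missing from `b`.
fn missing_keys<'a>(a: &'a str, b: &str) -> Vec<&'a str> {
    let b_keys = query_keys(b);
    query_keys(a)
        .into_iter()
        .filter(|key| !b_keys.contains(*key))
        .collect()
}
```

Calling `missing_keys(curl_url, uv_url)` with the two URLs above would list the parameter names that only the curl URL carries.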

@zanieb
Member

zanieb commented Apr 4, 2024

Wow thanks for the sleuthing. I have no idea why that would be dropped either.

@charliermarsh
Member

Do you know if you did anything special to get Gemfury to run against S3? My Gemfury URLs don't look like that so I've had trouble reproducing.

@zanieb
Member

zanieb commented Apr 4, 2024

I pushed a branch with an extra log if you want to give it a try: #2823

I'm trying to narrow down when that part is dropped from the URL.

@benwebber

benwebber commented Apr 4, 2024

We are experiencing the same issue. I wonder if it's because we have an older Gemfury account. This option is enabled under our organization settings:

[Screenshot: Screen Shot 2024-04-04 at 14 43 23]

I don't remember opting into that explicitly. Is that disabled in your account?

EDIT: Disabling this option changed the package source to a https://gemfury.s3-accelerate.dualstack.amazonaws.com/gems/... CDN URL instead of S3, but I still get the same error.

@charliermarsh
Member

I can try enabling that and then uploading a new package.

@charliermarsh
Member

Sadly it's still giving me URLs like https://pypi.fury.io/charliermarsh/-/ver_mt7Ge/gemfury-test-0.0.1.tar.gz.

@torarvid

torarvid commented Apr 4, 2024

I don't know the cause, but I am sure that our Gemfury account has at least one package that works fine (it does not redirect to S3) and at least this one that fails (because it does redirect to S3).

I'm not sure what causes this difference in behavior. The one that fails for me was created by running poetry publish -r oda -u <secret> -p NOPASS, so it's possible poetry makes it "do S3 magic"? (The package that works fine is made by another team. At this point I have no idea how it was created/published)

@charliermarsh
Member

I emailed Gemfury.

@charliermarsh
Member

Perhaps I can get set up with one of these S3-backed indexes, or they can tell me what I'm doing wrong.

@torarvid

torarvid commented Apr 5, 2024

Ok, I've said "I think I found a clue!" before and been wrong, but I persist: I think I might have found a clue! 😆

I made this test program:

use std::error::Error;

use reqwest::redirect::Policy;

use reqwest::Client;

#[tokio::main]
pub async fn main() -> Result<(), Box<dyn Error>> {
    let url = "https://pypi.fury.io/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl";
    let url: reqwest::Url = url.parse().unwrap();
    let client = Client::builder()
        // Ensure that we *don't* follow redirects for this example
        .redirect(Policy::none())
        // fake being curl in case it matters (i don't think so)
        .user_agent("curl/8.4.0")
        .build()?;
    let req = client
        .head(url)
        .header("authorization", "Basic <redacted>")
        .header("accept", "*/*")
        .build()?;
    println!("111 {:?}", req);
    let head_response = client.execute(req).await?;
    println!("222 {:?}", head_response);
    let location = head_response.headers().get("location").unwrap();
    println!("333 {:?}", location);
    Ok(())
}

And I get this output (with redactions):

111 Request { method: HEAD, url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("pypi.fury.io")), port: None, path: "/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl", query: None, fragment: None }, headers: {"authorization": "Basic <redacted>", "accept": "*/*"} }
222 Response { url: Url { scheme: "https", cannot_be_a_base: false, username: "", password: None, host: Some(Domain("pypi.fury.io")), port: None, path: "/oda/-/ver_Fz9wq/sdxp-0.7.3-py3-none-any.whl", query: None, fragment: None }, status: 302, headers: {"server": "Cowboy", "report-to": "{\"group\":\"heroku-nel\",\"max_age\":3600,\"endpoints\":[{\"url\":\"https://nel.heroku.com/reports?ts=1712306018&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&s=KrA%2BOTdymL0CqPAad5loOyUNMC4BcBEE%2BWCTsxIYvkA%3D\"}]}", "reporting-endpoints": "heroku-nel=https://nel.heroku.com/reports?ts=1712306018&sid=929419e7-33ea-4e2f-85f0-7d8b7cd5cbd6&s=KrA%2BOTdymL0CqPAad5loOyUNMC4BcBEE%2BWCTsxIYvkA%3D", "nel": "{\"report_to\":\"heroku-nel\",\"max_age\":3600,\"success_fraction\":0.005,\"failure_fraction\":0.05,\"response_headers\":[\"Via\"]}", "connection": "keep-alive", "content-type": "text/html; charset=utf-8", "location": "https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240405%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240405T083338Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>", "vary": "Accept-Encoding", "date": "Fri, 05 Apr 2024 08:33:38 GMT", "via": "1.1 vegur"} }
333 "https://s3.amazonaws.com/gemfury/gems/<redacted-path>/sdxp_0_7_3_py3_none_any_whl?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=<redacted-cred>%2F20240405%2Fus-east-1%2Fs3%2Faws4_request&X-Amz-Date=20240405T083338Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=<redacted-sig>"

So: if I curl -i <the-url-from-the-333-line> I again get the error from AWS: The request signature we calculated does not match the signature you provided. Check your key and signing method.

But if I change the curl-line to curl -i --head <the-url-from-the-333-line> it works.

Next, I changed my Rust test program from .head(url) to .get(url) and took the URL that it printed. That URL works with curl -i and fails with curl -i --head.

So finally my clue becomes: The HTTP method is encoded in the signature for the S3 urls, so when you pass that S3 url to AsyncHttpRangeReader::from_head_response it won't work. The S3 url would only work with a HEAD request, but AsyncHttpRangeReader::from_head_response uses GET internally. 🤯

Am I right? I need all of your 🧠s to sanity check my logic 😛
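
This matches how AWS Signature Version 4 is defined: the canonical request that gets hashed and signed starts with the HTTP method, so a URL presigned for HEAD carries a different signature than one presigned for GET. The following is a simplified sketch of that first step only. `canonical_request` is a hypothetical helper, and it assumes the query string is already in canonical form (sorted, URI-encoded, without X-Amz-Signature) and that `host` is the only signed header, as in the X-Amz-SignedHeaders=host parameter above. The later hashing and HMAC steps are omitted because they need a crypto library.

```rust
/// Build a simplified SigV4 canonical request for a presigned S3 URL.
///
/// Assumptions (not taken from uv or Gemfury): `canonical_query` is already
/// sorted and URI-encoded, `host` is the only signed header, and the payload
/// is the literal `UNSIGNED-PAYLOAD` used for presigned URLs.
fn canonical_request(method: &str, path: &str, canonical_query: &str, host: &str) -> String {
    format!(
        "{}\n{}\n{}\nhost:{}\n\nhost\nUNSIGNED-PAYLOAD",
        method, path, canonical_query, host
    )
}

/// Two requests can share a presigned signature only if their canonical
/// requests are byte-for-byte identical, since the signature is derived
/// from a hash of this string.
fn same_signature_input(a: &str, b: &str) -> bool {
    a == b
}
```

Building the canonical request once with "HEAD" and once with "GET" for the same path, query, and host gives two different strings, which is consistent with the curl experiments above.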

@torarvid

torarvid commented Apr 5, 2024

I made a proof-of-concept PR to work around the issue. I do that by passing a modified response to the range reader so that it uses the "original" (gemfury) link and not the 302-redirected S3 link. Works for my local repro test case 😄
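
The idea of the workaround can be sketched roughly as follows. This is not the code from the PR: `HeadOutcome`, `looks_presigned`, and `url_for_range_requests` are hypothetical names, and detecting a presigned URL by its signature query parameter is an assumption added here for illustration.

```rust
/// Hypothetical record of a HEAD request that may have been redirected.
struct HeadOutcome<'a> {
    /// The URL originally requested from the index (e.g. the Gemfury link).
    requested_url: &'a str,
    /// The URL the response finally came from after any redirects.
    final_url: &'a str,
}

/// Heuristic (an assumption, not uv's logic): treat a URL as presigned if its
/// query string has an `X-Amz-Signature` (SigV4) or `Signature` (SigV2) key.
fn looks_presigned(url: &str) -> bool {
    url.split_once('?')
        .map(|(_, query)| {
            query.split('&').any(|pair| {
                let key = pair.split('=').next().unwrap_or(pair);
                key.eq_ignore_ascii_case("X-Amz-Signature")
                    || key.eq_ignore_ascii_case("Signature")
            })
        })
        .unwrap_or(false)
}

/// Pick the URL to use for follow-up GET range requests. A URL presigned for
/// HEAD would be rejected for GET, so go back through the index URL and let
/// it issue a fresh redirect signed for the new method.
fn url_for_range_requests<'a>(outcome: &HeadOutcome<'a>) -> &'a str {
    if outcome.final_url != outcome.requested_url && looks_presigned(outcome.final_url) {
        outcome.requested_url
    } else {
        outcome.final_url
    }
}
```

The trade-off in this sketch is one extra redirect per range request in exchange for a signature that matches the method actually used.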

@elbaro
Contributor

elbaro commented Apr 16, 2024

I have the same issue with pypicloud, and @torarvid's PR did not work.

❯ cargo install --git https://github.com/torarvid/uv.git --rev 6d89c85 uv
❯ uv pip compile pyproject.toml -o requirements.txt --index-url http://internal-pypi:8080/simple/
error: Failed to download: internal-pkg==0.1.1054928
  Caused by: HTTP status client error (403 Forbidden) for url (https://bucket-name.s3.amazonaws.com/pypi10c6/internal-pkg/internal-pkg-0.1.1054928-py3-none-any.whl
    ?Signature=%2FhTEgX6psBoSyuCM9F4BiwpCEbw%3D
    &Expires=1870939460
    &AWSAccessKeyId=FEWNCEFY
    &x-amz-security-token=TP//////////ARNU/OkGQV/8AwxoYm)

If I manually curl the printed URL, GET works but HEAD gets 403.

curl -vvv https://bucket-name.s3..  # 200
curl -vvv -X HEAD https://bucket-name.s3..  # 403 Forbidden

@elbaro
Contributor

elbaro commented Apr 16, 2024

As a workaround for pypicloud, I added 403 Forbidden to https://github.com/astral-sh/uv/pull/2186/files and it worked.
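
The pypicloud case above (GET returns 200, HEAD returns 403 on the same presigned URL) suggests retrying with GET when HEAD is rejected. The following is a minimal sketch of such a fallback, not uv's actual code. `Method`, `head_rejected`, and `fetch_with_fallback` are hypothetical names, the request itself is abstracted as a closure returning a status code, and the set of status codes is an assumption: 403 comes from this thread, and 405 Method Not Allowed is included as the generic response from servers that do not support HEAD.

```rust
/// The two HTTP methods involved in the fallback.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Method {
    Head,
    Get,
}

/// Status codes treated as "HEAD was rejected, GET may still work".
/// Assumption: 403 (seen with presigned S3 URLs in this thread) and 405.
fn head_rejected(status: u16) -> bool {
    matches!(status, 403 | 405)
}

/// Try HEAD first and retry once with GET if HEAD was rejected.
/// `send` stands in for the real HTTP call and returns the status code.
/// Returns the method whose response is used, plus its status.
fn fetch_with_fallback<F>(mut send: F) -> (Method, u16)
where
    F: FnMut(Method) -> u16,
{
    let status = send(Method::Head);
    if head_rejected(status) {
        (Method::Get, send(Method::Get))
    } else {
        (Method::Head, status)
    }
}
```

Other errors, such as 404, are returned from the HEAD attempt unchanged, so the fallback does not mask genuinely missing files.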

@charliermarsh
Member

Interesting, ok, we can add that. Do you want to submit a PR?

@zanieb zanieb reopened this Apr 16, 2024
@zanieb
Member

zanieb commented Apr 16, 2024

@charliermarsh I re-opened as I do not think that addresses all of the cases here.

@charliermarsh
Member

Thanks, sorry, I didn't mean to close this.

@charliermarsh
Member

If anyone is willing to test #3460 I would appreciate it.

charliermarsh added a commit that referenced this issue May 8, 2024
7 participants