Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stop vendoring requests and urllib3 while being backwards compatible #1466

Closed
wants to merge 6 commits into from

Conversation

henriklindstrom
Copy link

Fixes #1258
Fixes #1248
Fixes #1325
Fixes #756
Fixes aws/aws-cli#2994

This updates awsrequest.py to be compatible with the latest urllib3 and rips out the vendored versions of requests and urllib3. In order to be compatible with other libraries that are using the vendored packages, requests and urllib3 are imported and added to sys.modules at the same path as before.

This also stops using select.select for checking sockets, in order to work in processes with more than 1024 open filedescriptors (this is fixed in urllib3 by urllib3/urllib3#1001).

This was used to calculate content-length for some file-like objects
that requests could not handle. With updated version of requests
this is not needed any more.
This in order to support having more than 1024 filedescriptors open
in the process.
@codecov-io
Copy link

codecov-io commented May 22, 2018

Codecov Report

Merging #1466 into develop will increase coverage by 19.15%.
The diff coverage is 100%.

Impacted file tree graph

@@             Coverage Diff              @@
##           develop    #1466       +/-   ##
============================================
+ Coverage    71.38%   90.53%   +19.15%     
============================================
  Files          128       51       -77     
  Lines        14159     9109     -5050     
============================================
- Hits         10107     8247     -1860     
+ Misses        4052      862     -3190
Impacted Files Coverage Δ
botocore/retryhandler.py 99.37% <100%> (ø) ⬆️
botocore/compat.py 93.92% <100%> (ø) ⬆️
botocore/endpoint.py 97.41% <100%> (ø) ⬆️
botocore/stub.py 97.87% <100%> (ø) ⬆️
botocore/exceptions.py 100% <100%> (ø) ⬆️
botocore/awsrequest.py 98.55% <100%> (-0.09%) ⬇️
botocore/vendored/__init__.py 100% <100%> (ø)
botocore/utils.py 98.22% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5a1b666...92918cd. Read the comment docs.

@joguSD
Copy link
Contributor

joguSD commented Jun 1, 2018

Thanks for taking the time to look into this. This is a pretty big PR that addresses a lot of issues. Would it be possible to break this up into smaller PRs that address issues more specifically?

Right now, I think we would be receptive of fixes that address compatibility issues with newer versions of requests and urllib3 so long as they still work with our vendored version (hopefully that makes sense!). But as of right now I don't think we're quite ready to bite the bullet on unvendoring requests as there's likely to be edge cases that cause breaking behavior. For example, exception handling has changed in urllib3 since our vendored version and I know of at least one place that affects botocore.

@henriklindstrom
Copy link
Author

Yes this is a pretty big PR, but it is made in order to solve a single issue, #756 .
In order to do this, three parts is needed:

This is basically what this PR does, but instead of upgrading the vendored versions of request and urllib3, they are removed.
So, if you feel like its a too big leap to remove them, we would have to find a version of urllib3 that you feel comfortable with including, that also includes the needed fix.

I do however think that it would be worth investigating eventual edge cases from unvendoring the packages and taking the leap, since this situation will likely occur again.

@Hernrup
Copy link

Hernrup commented Jun 5, 2018

I think this PR is as small as It can get from the point of view that unvendoring in it itself is a much needed fix.
We can clearly see from the number of open issues related to this that it is not possible to keep up with the fixes in upstream. Frustration aside, we should consider all the possible security patches that might be applied in upstream and that will be hard to refer back while requests is vendored. This makes botocore unusually susceptible to 0-days. We can already see that the included CA-cert list is out of sync with those validated by https://github.com/certifi for example (later version of requests used certifi).

I get that it is a high-risk change but I dont see a viable long-term solution other then biting the bullet.

@joguSD
Copy link
Contributor

joguSD commented Jun 5, 2018

I agree that the current state of things is not a viable long term solution. We've been playing around with a few ideas and have been thinking about potentially unvendoring requests and dropping it as a dependency entirely and taking urllib3 as a direct dependency. After auditing our usage of requests we really don't use most of the features it offers in botocore and it might be simpler for us to just use urllib3 directly in the long run. A large reason we vendor requests is to avoid version conflicts with other popular libraries that also use requests. Maybe the best way to avoid this is to simply not use it. Any thoughts on this?

@joakimkarlsson
Copy link

I agree that in the long run, removing unnecessary dependencies is the way to go. But this needs to be fixed ASAP. We are currently forced to run a fork of botocore based on this PR in our production environment as the problem with exhausting file descriptors due to #756 makes the official release unviable to use.

So I say we should do whatever is necessary to get this PR merged first, and once that is out of the way start looking at removing dependencies.

We'd be happy to help out in any way we can.

@mwilck
Copy link

mwilck commented Jun 6, 2018

@joguSD, Linux distributions are very much against vendor-specific versions of official packages like requests. I've tried to push a package using your vendored requests as a fix for #1258 for openSUSE, and it was rejected because of the security threats. I'm pretty certain other Linux distros have the same preferences. By shipping a custom requests, you essentially take over responsibility for security fixes for that package, which has a significantly larger attack surface than botocore itself. Is that something you want to carry around in the long term?

That said, I, and I guess most users, don't care whether you solve the issue by taking the patches from here or by just ripping out requests. That's fully up to you.

@Hernrup
Copy link

Hernrup commented Jun 12, 2018

Have you decided on a way forward for this @joguSD ?
What can we do to help?

@joguSD
Copy link
Contributor

joguSD commented Jun 19, 2018

I've opened an issue to continue this discussion: #1486
If you have any additional concerns post them on the proposal issue and give it a +1 to help show that we need to get this prioritized.

I'm working on a branch that implements what's described in the above issue that I will hopefully be able to share before too long.

@joguSD
Copy link
Contributor

joguSD commented Jul 4, 2018

I've opened up the following PR that begins the work to address a lot of these issues: #1495

Let me know if these changes solve the issues you guys are facing or if you run into any new issues!

@joguSD
Copy link
Contributor

joguSD commented Aug 25, 2018

We have un-vendored requests/urllib3 and are now using urllib3 directly as of botocore v1.11.0.
Relevant upgrade notes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
6 participants