Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proxies input argument is mutated #6118

Open
milanboers opened this issue May 3, 2022 · 5 comments
Open

proxies input argument is mutated #6118

milanboers opened this issue May 3, 2022 · 5 comments

Comments

@milanboers
Copy link

The input argument to proxies is mutated when environment proxy variables are present. See the reproduction steps.

This may be different than what users are expecting. It can lead to unexpected behavior when re-using the argument that was passed.

Expected Result

No mutation of the input argument:

>>> proxies = {'dummy': 'dummy'}
>>> os.environ['http_proxy'] = 'http://dummy'
>>> requests.get('https://python.org',proxies=proxies)
<Response [200]>
>>> proxies
{'dummy': 'dummy'}

Actual Result / Reproduction steps

>>> proxies = {'dummy': 'dummy'}
>>> os.environ['http_proxy'] = 'http://dummy'
>>> requests.get('https://python.org',proxies=proxies)
<Response [200]>
>>> proxies
{'dummy': 'dummy', 'http': 'http://dummy'}

System Information

$ python -m requests.help
{
  "chardet": {
    "version": null
  },
  "charset_normalizer": {
    "version": "2.0.12"
  },
  "cryptography": {
    "version": ""
  },
  "idna": {
    "version": "3.3"
  },
  "implementation": {
    "name": "CPython",
    "version": "3.10.4"
  },
  "platform": {
    "release": "5.10.102.1-microsoft-standard-WSL2",
    "system": "Linux"
  },
  "pyOpenSSL": {
    "openssl_version": "",
    "version": null
  },
  "requests": {
    "version": "2.27.1"
  },
  "system_ssl": {
    "version": "30000020"
  },
  "urllib3": {
    "version": "1.26.9"
  },
  "using_charset_normalizer": true,
  "using_pyopenssl": false
}
@nateprewitt
Copy link
Member

Hi @milanboers, the behavior you're seeing is documented and standard across other HTTP tools, so I'm not sure it's broadly unexpected. We can't remove this behavior since a significant portion of users rely on it. If you don't want this functionality, you can set trust_env to False to disable this as documented here.

@milanboers
Copy link
Author

Hi @nateprewitt , not sure if I made myself clear enough. The issue is not that proxy settings are taken from the environment, but that the input argument itself is mutated. I don't see this documented and am not aware of other tools that do the same.

@nateprewitt
Copy link
Member

Thanks for clarifying, I agree that behavior is a bit odd in the context of the top-level apis. merge_environment_settings is intended to mutate the input parameters and so I don't think we'd want to change that behavior. We could potentially create a copy of the input proxies here, but I think that also has potential to be breaking.

The best option here for now may be to make sure this is documented. I'd be willing to accept a PR with this information in the proxies section.

@milanboers
Copy link
Author

I see. From the code it appears unintentional though:

  • All other parameters to merge_environment_settings (url, stream, verify, cert) are not mutated, only proxies is.
  • The merged settings are returned from the function. Why return anything if the input arguments are supposed to be mutated instead?
  • The settings that are returned are not the same as the mutation:
>>> session.proxies['session'] = 'http://session'
>>> os.environ['http_proxy'] = 'http://env'
>>> proxies = {'input': 'http://input'}
>>> session.merge_environment_settings('http://myurl', proxies, None, None, None)
{'verify': True, 'proxies': OrderedDict([('session', 'http://session'), ('input', 'http://input'), ('http', 'http://env')]), 'stream': False, 'cert': None}
>>> proxies
{'input': 'http://input', 'http': 'http://env'}

@nateprewitt
Copy link
Member

  • All other parameters to merge_environment_settings (url, stream, verify, cert) are not mutated, only proxies is.

Well, to that point, proxies is the only mutable argument for merge_environment_settings. Everything else is immutable due to booleans and strings being references.

The merged settings are returned from the function. Why return anything if the input arguments are supposed to be mutated instead?

Because not everything is mutable.

The settings that are returned are not the same as the mutation:

Yeah, I agree this is a bit odd, but it's been the behavior for 9 years. I'm not entirely against improving it, but as I said above, this is existing functionality in a tool the majority of the Python ecosystem relies on. We cannot change this without a major version bump which isn't on the immediate horizon.

I think the actions for the moment are documentation updates. I can reopen this as a breaking tracking feature, but there isn't a current timeline for addressing it.

@nateprewitt nateprewitt reopened this May 3, 2022
Ousret added a commit to jawah/niquests that referenced this issue Sep 24, 2023
**Added**
- Static type annotations thorough the whole package.
- `cert` argument for client authentication with certificate can now pass the password/passphrase using a 3-values tuple (cert, key, password).
  The three parameters in the tuple must be of type `str`.
- `verify` argument behavior has been extended and now accept your CA bundle as `str` instead of a path. It also accepts your CA bundle as `bytes` directly.
  This help when you do not have access to the fs.
- Operating system truststore will be used instead of `certifi`. Root CAs are automatically grabbed from your computer configuration.
- Oriented-object headers. Access them through the new property `oheaders` in your `Response`.
- Propagated the argument `retries` in `niquests.api` for all functions.
- Added argument `retries` in the `Session` constructor.
- Property `conn_info` to the `PreparedRequest` and `Response` that hold a reference to a `ConnectionInfo`.
  This class exposes the following properties: `certificate_der` _(bytes)_, `certificate_dict` _(dict)_ as provided by the standard
  library (ssl), `destination_address` _(tuple[ipAddress, portNumber])_, `cipher` _(str)_, `tls_version` _(TLSVersion)_, and `http_version`.
- Two hooks, namely `pre_send` and `pre_request`. The `pre_request` event is fired just after the initial construction of
  a `PreparedRequest` instance. Finally, the `pre_send` will be triggered just after picking a (live) connection
  for your request. The two events receive a `PreparedRequest` instance.

**Changed**
- Calling the method `json` from `Response` when no encoding was provided no longer relies on internal encoding inference.
  We fall back on `charset-normalizer` with a limited set of charsets allowed (UTF-8/16/32 or ASCII).
- No longer will the `text` method from `Response` return str if content cannot be decoded. It returns None instead.
- If specified charset in content-type does not exist (LookupError) the `text` method from `Response` will rely on charset detection.
- If specified charset in content-type is not made for text decoding (e.g. base64), the `text` method from `Response` returns None.
- With above four changes, the `json` method will raise `RequestsJSONDecodeError` when the payload (body) cannot be decoded.
- Passing invalid `files` description no longer _just skip_ invalid entries, it raises `ValueError` from now on.
- Non-str HTTP-Verb are refused.
- Passing `files` with minimal description (meaning no tuple but _just_ the fp) no longer guess its name when `fp.name` return bytes.
- No longer will the default timeout be unset, thus making you waiting indefinitely.
  Functions `get`, `head`, and `options` ships with a default of 30 seconds.
  Then `put`, `post`, `patch` and `delete` uses a default of 120 seconds.
  Finally, the `request` function also have 120 seconds.
- Basic authorization username and password are now encoded using utf-8 instead of latin-1 prior to being base64 encoded.


**Removed**
- Property `apparent_encoding` in favor of a discrete internal inference.
- Support for the legacy `chardet` detector in case it was present in environment.
  Extra `chardet_on_py3` is now unavailable.
- **requests.compat** no longer hold reference to _chardet_.
- Deprecated `requests.packages` that was meant to avoid breakage from people importing `urllib3` or `chardet` within this package.
  They were _vendored_ in early versions of Requests. A long time ago.
- Deprecated function `get_encodings_from_content` from utils.
- Deprecated function `get_unicode_from_response` from utils.
- BasicAuth middleware no-longer support anything else than `bytes` or `str` for username and password.
- `requests.compat` is stripped of every reference that no longer vary between supported interpreter version.
- Charset fall back **ISO-8859-1** when content-type is text and no charset was specified.
- Main function `get`, `post`, `put`, `patch`, `delete`, and `head` no longer accept **kwargs**. They have a fixed list of typed argument.
  It is no longer possible to specify non-supported additional keyword argument from a `Session` instance or directly through `requests.api` functions.
  e.g. function `delete` no-longer accept `json`, or `files` arguments. as per RFCs specifications. You can still override this behavior through the `request` function.
- Mixin classes `RequestEncodingMixin`, and `RequestHooksMixin` due to OOP violations. Now deported directly into child classes.
- Function `unicode_is_ascii` as it is part of the stable `str` stdlib on Python 3 or greater.
- Alias function `session` for `Session` context manager that was kept for BC reasons since the v1.
- pyOpenSSL/urllib3 injection in case built-in ssl module does not have SNI support as it is not the case anymore for every supported interpreters.
- Constant `DEFAULT_CA_BUNDLE_PATH`, and submodule `certs` due to dropping `certifi`.
- Function `extract_zipped_paths` because rendered useless as it was made to handle an edge case where `certifi` is "zipped".
- Extra `security` when installing this package. It was previously emptied in the previous major.
- Warning emitted when passing a file opened in text-mode instead of binary. urllib3.future can overrule
  the content-length if it detects an error. You should not encounter broken request being sent.
- Support for `simplejson` if was present in environment.
- Submodule `compat`.

**Fixed**
- An invalid content-type definition would cause the charset being evaluated to `True`, thus making the program crash.
- Given `proxies` could be mutated when environment proxies were evaluated and injected. This package should not modify your inputs.
  For context see psf#6118
- A server could specify a `Location` header that does not comply to HTTP specifications and could lead to an unexpected exception.
  We try to fall back to Unicode decoding if the typical and expected Latin-1 would fail. If that fails too, a proper exception is raised.
  For context see psf#6026
- Top-level init now specify correctly the exposed api. Fixes mypy error `.. does not explicitly export attribute ..`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants