Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drop X-Forwarded-For header after extracting trusted proxies #451

Closed
simonk52 opened this issue Oct 31, 2024 · 13 comments
Closed

Drop X-Forwarded-For header after extracting trusted proxies #451

simonk52 opened this issue Oct 31, 2024 · 13 comments

Comments

@simonk52
Copy link
Contributor

(I'm not certain this is a good idea - please feel free to close if you don't like it)

My application is deployed behind a load balancer so I'm using the following configuration:

trusted_proxy = *
trusted_proxy_headers = x-forwarded-for x-forwarded-proto
trusted_proxy_count = 1
clear_untrusted_proxy_headers = yes

This fixes REMOTE_ADDR and related values, but it leaves the X-Forwarded-For header in the WSGI environment. I use Pyramid, and request.client_addr still reports the first value from the X-Forwarded-For header, which, while documented, feels like a footgun. I either need to write a WSGI middleware to remove those headers or a lint rule to ensure that I never accidentally use request.client_addr.

I think, if waitress has been configured to clear untrusted proxy headers, then it should also clear X-Forwarded-For and related headers after processing them.

This would be a behaviour change so would probably have to be opt-in.

Thanks for reading this far :-). I can try to submit a PR if you think this would be a worthwhile feature.

@mmerickel
Copy link
Member

mmerickel commented Oct 31, 2024

Typically waitress is assuming that if you're trusting a proxy then a proxy is doing the work for you, validating those headers first before passing them to you.

So the issue you're identifying I believe is that if you set a proxy count to 1 and a proxy forwards you 2 x-forwarded-for values like x-forwarded-for: 1.0.0.1, 1.0.0.2, how does waitress handle the situation?

  1. Waitress will pass the headers through unchanged so a client will see x-forwarded-for: 1.0.0.1, 1.0.0.2.
  2. Waitress will set REMOTE_ADDR to the "first trusted value counting "count" values back from the end of the list" which would be the result value if count is 1. In this case it'd be 1.0.0.2.

In that second scenario you are left with x-forwarded-for in the environ containing a list that a client may not parse consistently. I think this is a legit issue and the PR would be to overwrite the headers in the WSGI environ to match the truncated list that waitress used containing at max trusted_proxy_count values. This would at least expose a consistent view of things.

Secondly, you're proposing stripping the header from the wsgi environ entirely. I would say that this is the responsibility of downstream middleware. If proxy count is > 1 then stripping the value would be losing some valuable information that isn't stored elsewhere in the environ.

If you're interested in doing a PR I'd be happy to review it. From a quick glance at the code in https://github.com/Pylons/waitress/blob/ae949bb428e50cf04152db56460f31c1e6d3a2a9/src/waitress/proxy_headers.py we'd be talking about overwriting the wsgi environ with the data contained in the local forwarded_for, forwarded_host_multiple, and proxies local variables by reconstructing valid header formats for them out of the content.

In the example above, the value then passed to the wsgi environ would be x-forwarded-for: 1.0.0.2.

@simonk52
Copy link
Contributor Author

simonk52 commented Nov 1, 2024

When waitress parses the x-forwarded-for header (for example), it does some normalisation of the values, such as removal of double-quotes and backslashes, and adding square brackets to IPv6 address. If we're going to reconstruct that header using just the trusted values, would it be OK to use the original "raw" values, or should we use the post-processed values?

The former would look something like this:

diff --git a/src/waitress/proxy_headers.py b/src/waitress/proxy_headers.py
index 652ca0b..7b42d98 100644
--- a/src/waitress/proxy_headers.py
+++ b/src/waitress/proxy_headers.py
@@ -76,6 +76,7 @@ def parse_proxy_headers(
     forwarded_host = forwarded_proto = forwarded_port = forwarded = ""
     client_addr = None
     untrusted_headers = set(PROXY_HEADERS)
+    rewritten_headers = {}
 
     def raise_for_multiple_values():
         raise ValueError("Unspecified behavior for multiple values found in header")
@@ -84,7 +85,8 @@ def parse_proxy_headers(
         try:
             forwarded_for = []
 
-            for forward_hop in environ["HTTP_X_FORWARDED_FOR"].split(","):
+            raw_forward_hops = environ["HTTP_X_FORWARDED_FOR"].split(",")
+            for forward_hop in raw_forward_hops:
                 forward_hop = forward_hop.strip()
                 forward_hop = undquote(forward_hop)
 
@@ -100,9 +102,11 @@ def parse_proxy_headers(
                     forwarded_for.append(forward_hop)
 
             forwarded_for = forwarded_for[-trusted_proxy_count:]
+            raw_forward_hops = raw_forward_hops[-trusted_proxy_count:]
             client_addr = forwarded_for[0]
 
             untrusted_headers.remove("X_FORWARDED_FOR")
+            rewritten_headers["HTTP_X_FORWARDED_FOR"] = ",".join(raw_forward_hops)
         except Exception as ex:
             raise MalformedProxyHeader(
                 "X-Forwarded-For", str(ex), environ["HTTP_X_FORWARDED_FOR"]

The latter would be more complicated, presumably requiring a "re-quoting" process for each value.

@mmerickel
Copy link
Member

mmerickel commented Nov 1, 2024

I lean toward (my initial thinking was) using the sanitized data in forwarded_for to reconstruct a proper header containing double quotes and square brackets etc that follows the RFC spec. However, as long as what we are doing is consistent across for/host/proxies then I'm ok if we want to preserve the original header value, albeit truncated to trusted_proxy_count items.

@simonk52
Copy link
Contributor Author

simonk52 commented Nov 1, 2024

I think the raw values should be well-formed (otherwise an exception would be raised during parsing), so I think it would be safe (and quicker) to re-use those raw values. If it's OK with you, I will prepare a PR on that basis.

Do you think the headers should be rewritten

a) unconditionally, or
b) only if clear_untrusted_proxy_headers is set, or
c) only if a different (new) configuration option is set

I think (b) makes the most sense, but it would be a behaviour change.

@mmerickel
Copy link
Member

mmerickel commented Nov 1, 2024

I think they should be rewritten for any headers that match trusted_proxy_headers and nothing else, unconditionally. The untrusted ones are passed through unless clear_untrusted_proxy_headers is set (which is the default now). Also I do not think a new flag is warranted.

@digitalresistor
Copy link
Member

#452 was merged, thank you!

@merwok
Copy link

merwok commented Nov 23, 2024

Is it possible to define these settings with the waitress-serve program?
I don’t have pastedeploy in my stack, but I could write a python script if the only way is calling waitress.serve.

@mmerickel
Copy link
Member

You'll need to write your own script as the trusted proxy configs aren't exposed via the CLI. We'd probably accept a PR to add any of those options to the CLI. Writing a script is incredibly easy for waitress tho.

import logging
from waitress import serve

from myapp import app

logging.basicConfig(...)
serve(app, listen='*', clear_untrusted_proxy_headers=False)

@kgaughan
Copy link
Member

An additional flag could expose this. It's something that can be added in another PR.

kgaughan added a commit to kgaughan/waitress that referenced this issue Nov 23, 2024
A number of options aren't documented on the command line interface.
These settings are usable via waitress-serve already, but are not
documented.

See the comment raised by @merwok on Pylons#451.
@kgaughan
Copy link
Member

@merwok I opened #456 to document these options for waitress-serve.

@mmerickel
Copy link
Member

huh I didn't realize the options were just lacking docs, thank you @kgaughan!

@kgaughan
Copy link
Member

Yep! The flags are all automatically generated from Adjustments, so it's only ever a documentation syncing issue.

kgaughan added a commit that referenced this issue Nov 23, 2024
A number of options aren't documented on the command line interface.
These settings are usable via waitress-serve already, but are not
documented.

See the comment raised by @merwok on #451.
@merwok
Copy link

merwok commented Nov 24, 2024

Woop woop! Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants