Add get_url helper, deprecate base_url #35224
Conversation
|
Hey there @home-assistant/core, mind taking a look at this pull request as it's been labeled with an integration ( |
|
Can't we get it from http context variable if available? |
|
Alternatively one could make use of https://docs.python.org/3/library/contextvars.html to store incoming request info, so we don't need to pass it along as a variable all the time. |
Get what exactly? I think I miss your point? |
|
I may be off base, but on each http incoming request, it should be possible to deduce which hostname was used by the client to do the request. So any response that is the result of a request can provide the same host/ip as was used in the request rather than a static configuration. That said, this will be hard/impossible to use for back end initiated things. |
|
That definitely is a viable option to explore. Currently, this is mainly aimed at the backend, where things like a webhook, a Telegram bot, or Samsung SmartThings are hard to set up or get right with the current configuration possibilities. So let's cover the backend first and get everything migrated (which will be part of this PR). After that, we have a centralized place to improve on. |
This is my use case. The config flow for Plex sends the user to an external site and passes along an HA-generated callback URL to redirect the user back to HA to complete setup. If the If a browser can successfully open the UI, the backend implicitly knows that's a valid URL for that specific session. It's possible for different users to reach the server using different paths, some of which can not be known by the backend ahead of time. For example, if a user is connecting to HA over a proxy or SSH tunnel. Since we have the built-in webserver to leverage, this should be doable automatically without asking the user to configure any hardcoded base URLs. I'd see this part as very useful for certain interactive config flows. |
This goes south with reverse proxies in a lot of cases. While I hear you guys, we now have a single Right now, the focus is the backend case, not the frontend case. That said, an improvement to your use case can already be made by using this helper, as it will catch way more cases. Edit: so your linked case is caused by a NAT loopback / hairpinning problem. That would be resolved by this PR, as it can differentiate between an internal and an external URL, which isn't possible right now. |
Does it, though? If the URL the browser used to hit HA is passed all the way down the chain, then it should be reusable for other calls. I tested quickly behind some reverse proxies and with an SSH tunnel and the same URL I see in my browser was always held in |
It is in many cases, but not all setups do that. Again, for now this is out of scope. |
|
@elupus @jjlawren I've explored the option of adding a context var as a middleware to our aiohttp web application. While it works fine (in most cases), it goes south on a reverse proxy that does SSL offloading. The scheme will not be correct in those cases, which is intended behavior by aiohttp to avoid security issues. |
|
SSL offloading is no problem. There is a header that should be set to indicate the originating protocol. Can't remember it now. |
|
Yes, we have that header, but it relies on being set by the reverse proxy, so it is not 100% reliable. I guess we could add some logic like: when a reverse proxy is detected and the forwarded proto is not set, no URL is available. There are some security concerns around using that, but we can ride along with the real IP detection middleware we already have. I'm going to do some restructuring to this PR, but I'm leaving this part out. However, that will make it easier to add as a next step. I have a POC working and done for the request URL handling, including the X-Forwarded-Proto. |
|
IMHO, the reverse proxy must be set up correctly. Also, those headers should only be respected if the source address is from an expected reverse proxy IP (or there is some secret key in a header). I have not checked whether the code verifies that. |
|
@elupus It is verified by the trusted proxies list (real_ip middleware). |
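The rule discussed in this thread (only honor X-Forwarded-Proto when the request arrives from a trusted proxy, and report no URL when a proxy is detected but the header is missing) could be sketched roughly as below. This is an illustration only, not the code in this PR: `resolve_scheme`, `TRUSTED_PROXIES`, and the example network are assumptions made for the sketch, and headers are modeled as a plain, case-sensitive mapping.

```python
from ipaddress import ip_address, ip_network
from typing import Mapping, Optional

# Hypothetical trusted proxy list, standing in for the real_ip middleware config.
TRUSTED_PROXIES = [ip_network("172.30.33.0/24")]


def resolve_scheme(
    peer_ip: str, request_scheme: str, headers: Mapping[str, str]
) -> Optional[str]:
    """Return the scheme the client most likely used, or None if unknown."""
    peer = ip_address(peer_ip)
    from_trusted_proxy = any(peer in network for network in TRUSTED_PROXIES)
    forwarded_proto = headers.get("X-Forwarded-Proto")

    if from_trusted_proxy:
        # Behind a known proxy, the socket-level scheme is the proxy's, not
        # the client's, so only the forwarded header is meaningful.
        if forwarded_proto in ("http", "https"):
            return forwarded_proto
        # Proxy detected, but no usable header: no URL available.
        return None

    # Direct (or untrusted) connection: ignore forwarded headers entirely.
    return request_scheme
```

With this shape, a spoofed header from an untrusted address has no effect, which matches the concern raised about respecting the header only for expected proxy addresses.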
Co-authored-by: Paulus Schoutsen <balloob@gmail.com>
MartinHjelmare
left a comment
The helper doesn't seem to need async context but we do mark it as such.
  setup_platform(hass, config, add_entities, discovery_info)

- start_url = f"{hass.config.api.base_url}{FITBIT_AUTH_CALLBACK_PATH}"
+ start_url = f"{async_get_url(hass)}{FITBIT_AUTH_CALLBACK_PATH}"
This is not async context here. Looks like the whole platform is sync.
  configurator.notify_errors(_configurator, error_msg)

- start_url = f"{hass.config.api.base_url}{WINK_AUTH_CALLBACK_PATH}"
+ start_url = f"{async_get_url(hass)}{WINK_AUTH_CALLBACK_PATH}"
This integration is sync.
  }

  try:
+     params["external_url"] = async_get_url(hass, allow_internal=False)
Breaking change
The HTTP base_url is deprecated and replaced by an internal_url and external_url core configuration setting.
Proposed change
This PR is aimed at resolving the instance URL juggling/issues across the board. It introduces a new, single point, helper:
Default behavior
By default, when called without parameters, it will try to:
http integration settings).
external_url set by the user, in case that fails; Get a Home Assistant Cloud URL if that is available.
By default, nothing is required and anything is allowed.
base_url fallback during deprecation period
During the deprecation period, base_url will serve as a fallback for the internal & external URLs.
For the internal URL the base_url is only used if we are sure it is an internal URL (based on a local IP address in the host or .local in the domain of the host).
For the external URL the base_url is only used if we are sure it is NOT an internal URL (based on a local IP address in the host or .local in the domain of the host).
Tuning parameters
The method provides a list of parameters to get the URL one needs for the use case. Explanation of the parameters:
require_ssl: Require the returned URL to use the https scheme.
require_standard_port: Require the returned URL to have port 80 for the http scheme, and port 443 for the https scheme.
allow_internal: Allow the URL to be an internal URL set by the user or a detected URL on the internal network. Set this one to False if one requires an external URL exclusively.
allow_external: Allow the URL to be an external URL set by the user or a Home Assistant Cloud URL. Set this one to False if one requires an internal URL exclusively.
allow_cloud: Allow a Home Assistant Cloud URL to be returned. Set to False in case one requires anything but a Cloud URL.
allow_ip: Allow the host part of a URL to be an IP address. Set to False in case that is not usable for the use case.
prefer_external: By default, we prefer internal URLs over external ones. Set this option to True to turn that logic around and prefer an external URL over an internal one.
prefer_cloud: By default, an external URL set by the user is preferred; however, in rare cases a cloud URL might be more reliable. Setting this option to True prefers the Home Assistant Cloud URL above the user-defined external URL.
Using this helper, most parts of Home Assistant should be able to get a working URL for their use case.
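To make the interplay of these parameters concrete, here is a minimal standalone sketch of the selection order they describe. It is not the helper from this PR: `get_url_sketch`, `URLConfig`, `NoURLAvailableError`, and the example URLs are hypothetical names used only for illustration, and network detection and the base_url fallback are deliberately left out.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import Optional
from urllib.parse import urlsplit


class NoURLAvailableError(Exception):
    """Hypothetical error: no candidate URL satisfies the constraints."""


@dataclass
class URLConfig:
    """Hypothetical stand-in for the user-configured and cloud URLs."""

    internal_url: Optional[str] = None
    external_url: Optional[str] = None
    cloud_url: Optional[str] = None


def _is_ip(host: str) -> bool:
    try:
        ip_address(host)
    except ValueError:
        return False
    return True


def _acceptable(url, require_ssl, require_standard_port, allow_ip) -> bool:
    """Check one candidate against the require_* and allow_ip constraints."""
    if url is None:
        return False
    parts = urlsplit(url)
    if require_ssl and parts.scheme != "https":
        return False
    if require_standard_port:
        default = {"http": 80, "https": 443}.get(parts.scheme)
        if (parts.port or default) != default:
            return False
    if not allow_ip and _is_ip(parts.hostname or ""):
        return False
    return True


def get_url_sketch(
    config: URLConfig,
    *,
    require_ssl: bool = False,
    require_standard_port: bool = False,
    allow_internal: bool = True,
    allow_external: bool = True,
    allow_cloud: bool = True,
    allow_ip: bool = True,
    prefer_external: bool = False,
    prefer_cloud: bool = False,
) -> str:
    """Return the first candidate URL that passes all constraints."""
    internal = [config.internal_url] if allow_internal else []
    external = []
    if allow_external:
        external.append(config.external_url)
        if allow_cloud:
            # The cloud URL counts as external; prefer_cloud moves it first.
            external.insert(0 if prefer_cloud else 1, config.cloud_url)
    candidates = external + internal if prefer_external else internal + external
    for url in candidates:
        if _acceptable(url, require_ssl, require_standard_port, allow_ip):
            return url
    raise NoURLAvailableError
```

Under these assumptions, a caller that needs a publicly reachable callback would pass allow_internal=False, as the review snippet for the external_url parameter does, and handle the error when nothing qualifies.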
Context
Originally the PR started out with the following design document:
Frenck's URL helper variant.pdf
Original FlowChart:
Adapting to the real world
During the initial implementation, we quickly discovered that there are many more cases, which have been incorporated. The results of this can be found in the Proposed change section above.
For example, webhooks have 2 implementations: one from the Home Assistant Cloud, the other from the webhook integration itself. The built-in one should not use cloud URLs.
The (now obsolete) get_external_url method actually preferred the Home Assistant Cloud over the base_url.
The general motivation and design of the document and flowchart still apply.
Request URL
In the discussion in this PR, it appears there is a need for access to the request URL the user uses. This is currently out of scope for this PR, as it aims to resolve backend issues.
Getting that request URL is a little more complex than it seems. We need to add a ContextVar to the http integration and inject a bit of middleware to set values on that context. While that is basically not hard, the hard part is handling proxied requests correctly. This involves processing the X-Forwarded-Proto header, which might have security implications.
Some other strategy could be: only retrieve the host from the http context, try to map it to either the internal or external URL, and set the preference to either of those based on the current request.
However, that is something to consider for the future (although I'm not completely sure whether that should be part of this helper).
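As a rough illustration of the ContextVar idea, the sketch below uses only the standard library: a stand-in for the middleware stores the request host, and code further down the call chain reads it without receiving it as an argument. The names `current_request_host`, `handle_request`, and `backend_code` are hypothetical, and no real aiohttp middleware or proxy handling is shown.

```python
import asyncio
from contextvars import ContextVar

# Hypothetical context variable; the name is not taken from this PR.
current_request_host = ContextVar("current_request_host", default=None)


async def backend_code():
    """Code deep in the call chain reads the host without it being passed."""
    await asyncio.sleep(0)  # yield control so concurrent requests interleave
    return current_request_host.get()


async def handle_request(host):
    """Stand-in for middleware: record the host, then run the handler."""
    token = current_request_host.set(host)
    try:
        return await backend_code()
    finally:
        # Restore the previous value once the request is done.
        current_request_host.reset(token)


async def main():
    # gather() wraps each coroutine in its own task, and every task runs in
    # a copy of the context, so the two hosts cannot leak into each other.
    return await asyncio.gather(
        handle_request("homeassistant.local:8123"),
        handle_request("example.com"),
    )


results = asyncio.run(main())
```

Outside of a request, the variable keeps its default of None, which is the situation backend-initiated code would be in and one reason this approach alone does not cover the backend cases this PR targets.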
Migrations
The following integrations need to be migrated onto this new helper:
Those that rely on get_external_url helper:
Those that rely on hass.config.api.base_url directly:
Other things to migrate:
browser.open in __main__.py
Other tasks:
Add back get_external_url (deprecated) with a warning in the logs. GitHub search revealed no usage, our codebase doesn't use it either.
https://github.com/search?l=Python&p=1&q=async_get_external_url&type=Code
base_url property and drop a warning in the logs. Will make a separate PR for this, after this has been merged. It is ready and written, but is rather large. This PR is already huge.