Skip to content

Remove login details before logging stream source#45398

Merged
emontnemery merged 6 commits intohome-assistant:devfrom
uvjustin:remove-login-details-from-log-stream
Mar 23, 2021
Merged

Remove login details before logging stream source#45398
emontnemery merged 6 commits intohome-assistant:devfrom
uvjustin:remove-login-details-from-log-stream

Conversation

@uvjustin
Copy link
Copy Markdown
Contributor

Proposed change

The stream logger currently outputs the entire stream source url which may include camera login details. While login details in the log might be useful for some, it seems like more users would prefer these details to be excluded. This PR uses a simple regex replacement to remove the login details from the url before logging.

Type of change

  • Dependency upgrade
  • Bugfix (non-breaking change which fixes an issue)
  • New integration (thank you!)
  • New feature (which adds functionality to an existing integration)
  • Breaking change (fix/feature causing existing functionality to break)
  • Code quality improvements to existing code or addition of tests

Example entry for configuration.yaml:

# Example configuration.yaml

Additional information

Checklist

  • The code change is tested and works locally.
  • Local tests pass. Your PR cannot be merged unless tests pass
  • There is no commented out code in this PR.
  • I have followed the development checklist
  • The code has been formatted using Black (black --fast homeassistant tests)
  • Tests have been added to verify that the new code works.

If user exposed functionality or configuration variables are added/changed:

If the code communicates with devices, web services, or third-party tools:

  • The manifest file has all fields filled out correctly.
    Updated and included derived files by running: python3 -m script.hassfest.
  • New or updated dependencies have been added to requirements_all.txt.
    Updated by running python3 -m script.gen_requirements_all.
  • Untested files have been added to .coveragerc.

The integration reached or maintains the following Integration Quality Scale:

  • No score or internal
  • 🥈 Silver
  • 🥇 Gold
  • 🏆 Platinum

To help with the load of incoming pull requests:

@probot-home-assistant
Copy link
Copy Markdown

Hey there @hunterjm, mind taking a look at this pull request as its been labeled with an integration (stream) you are listed as a codeowner for? Thanks!
(message by CodeOwnersMention)

Copy link
Copy Markdown
Member

@Kane610 Kane610 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this benefit a global pre-compiled regexp rather than specifying it multiple times?

@pvizeli
Copy link
Copy Markdown
Member

pvizeli commented Jan 21, 2021

@uvjustin uvjustin force-pushed the remove-login-details-from-log-stream branch from a49b9f1 to ca8f0d1 Compare January 22, 2021 01:46
@uvjustin
Copy link
Copy Markdown
Contributor Author

uvjustin commented Jan 22, 2021

Would this benefit a global pre-compiled regexp rather than specifying it multiple times?

Sure, let me do that. I'll look at the log filter too

@uvjustin
Copy link
Copy Markdown
Contributor Author

@pvizeli Not sure how the usage of the log filter is supposed to work. It looks like we need to parse out the sensitive data and then create a HideSensitiveDataFilter for it? But then we need to keep track of and update this filter any time the sensitive data changes (say we get passed a new stream.source).
The other issue is that the replace needs to be careful as it's possible the sensitive data is actually duplicated somewhere else in the logging string eg rtsp://user:pass@apassz.com, and if we try to filter out pass here we're both filtering out the wrong part of the string and leaking information about the password.

self._thread.start()
_LOGGER.info("Started stream: %s", self.source)
_LOGGER.info(
"Started stream: %s", STREAM_SOURCE_RE.sub("//", str(self.source))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to show that information has been redacted?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do that, but I don't think that the fact that it is redacted has any importance. Having the username + password or maybe just the username would add more information, but aside from that just having the rest of the url is enough to identify the camera which is probably the point of the log message.

@uvjustin
Copy link
Copy Markdown
Contributor Author

There are also some cheap cams which don't use standard authentication and instead embed the login details in the url. We can't predict all of these formats so we would still let these through. See the format of the RTSP url here: https://ipcamtalk.com/threads/review-top-201-super-mini-720p-hd-ip-cam-the-cheapest-ip-cam-so-far.1775/post-20322

@uvjustin uvjustin force-pushed the remove-login-details-from-log-stream branch from 634cf1a to 0b9554d Compare February 27, 2021 19:34
Copy link
Copy Markdown
Contributor

@emontnemery emontnemery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good overall.

The stream URL is also printed in homeassistant/components/stream/worker.py, should those also be filtered?

Please update or add tests in tests/components/stream to check that what's printed to the log is as expected, maybe something like this:

    await async_setup_component(hass, "stream", {"stream": {}})
    stream = create_stream(hass, "https://abcd:efgh@foo.bar")
    with patch.object(hass.config, "is_allowed_path", return_value=True):
        await stream.async_record("/example/path")
   assert "https://abcd:efgh@foo.bar" not in caplog.text
   assert "https://foo.bar" in caplog.text

@uvjustin
Copy link
Copy Markdown
Contributor Author

Looks pretty good overall.

The stream URL is also printed in homeassistant/components/stream/worker.py, should those also be filtered?

Please update or add tests in tests/components/stream to check that what's printed to the log is as expected, maybe something like this:

    await async_setup_component(hass, "stream", {"stream": {}})
    stream = create_stream(hass, "https://abcd:efgh@foo.bar")
    with patch.object(hass.config, "is_allowed_path", return_value=True):
        await stream.async_record("/example/path")
   assert "https://abcd:efgh@foo.bar" not in caplog.text
   assert "https://foo.bar" in caplog.text

Thanks, I just applied the same regex redaction to the worker message and added tests.

@uvjustin uvjustin force-pushed the remove-login-details-from-log-stream branch from f337603 to 900308a Compare March 22, 2021 15:47
Comment thread homeassistant/components/stream/worker.py Outdated
uvjustin and others added 2 commits March 23, 2021 09:35
Co-authored-by: Erik Montnemery <erik@montnemery.com>
@emontnemery
Copy link
Copy Markdown
Contributor

Thanks, @uvjustin 👍

@emontnemery emontnemery merged commit cd455e2 into home-assistant:dev Mar 23, 2021
@github-actions github-actions Bot locked and limited conversation to collaborators Mar 24, 2021
@uvjustin uvjustin deleted the remove-login-details-from-log-stream branch November 7, 2021 11:44
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RTSP stream username and password visible in logging

6 participants