Skip to content

debug: redact sensitive info from debug logs#27579

Closed
tsaarni wants to merge 4 commits intoenvoyproxy:mainfrom
Nordix:debug-log-masking
Closed

debug: redact sensitive info from debug logs#27579
tsaarni wants to merge 4 commits intoenvoyproxy:mainfrom
Nordix:debug-log-masking

Conversation

@tsaarni
Copy link
Member

@tsaarni tsaarni commented May 23, 2023

Redact potentially sensitive information from debug logs by masking query string and cookie values.

Fix #9652

Additional Description:

While doing troubleshooting it might be necessary to collect, distribute and store debug level logs from Envoy. It introduces a risk of unintended data leakage, where sensitive information is revealed accidentally as part of the log files to people who should not have access to that data. Most typically, this would happen through authorization header, cookies and query string value, dumped in the log files.

See example of a log entry revealing sensitive information:

':authority', '127.0.0.1:8080'
':path', '/foo?secret'
':method', 'GET'
'user-agent', 'HTTPie/2.6.0'
'accept-encoding', 'gzip, deflate'
'accept', '*/*'
'connection', 'keep-alive'
'cookie', 'sessionid=secret;another=secret'
'authorization', 'Basic am9lOnBhc3N3b3Jk'

This PR proposes an approach where sensitive information is masked by adding [redacted] in place of potentially sensitive values:

':authority', '127.0.0.1:8080'
':path', '/foo?[redacted]'
':method', 'GET'
'user-agent', 'HTTPie/2.6.0'
'accept-encoding', 'gzip, deflate'
'accept', '*/*'
'connection', 'keep-alive'
'cookie', 'sessionid=[redacted];another=[redacted]'
'authorization', '[redacted]'

Any custom header X-MyExampleApp-Token might potentially also contain sensitive information, but it would require rather elaborate configuration support to handle. This case is not considered by this PR.

The proposed functionality is implemented in Http::Utilities since it contains functions for header parsing. Adding explicit dependency from Http::HeaderMap to Http::Utilities would introduce cyclic dependency, therefore forward declaration was used.

The PR adds runtime guard envoy.reloadable_features.debug_include_sensitive_data which keeps the current behavior unchanged. All headers, even sensitive ones, are printed to debug log as they are. Setting the value to false enables redacting sensitive headers.

Here is a list of older issue where sensitive information was redacted from various printouts:

Risk Level: Low

Testing:

TBD

Docs Changes:

Added chapter to "Application logging" configuration chapter, documenting the runtime feature flag.

Release Notes:

debug: sensitive headers can be redacted from debug logs by setting envoy.reloadable_features.debug_include_sensitive_data runtime guard to false.

@repokitteh-read-only
Copy link

CC @envoyproxy/runtime-guard-changes: FYI only for changes made to (source/common/runtime/runtime_features.cc).

🐱

Caused by: #27579 was opened by tsaarni.

see: more, trace.

@tsaarni
Copy link
Member Author

tsaarni commented May 23, 2023

I know that in #9652 @mattklein123 explicitly asks for design proposal and I'm submitting code PR instead. Sorry for that 😳

After looking at previous related issues (linked in the description) I felt it would be overwhelmingly complicated to propose e.g. a new extension points for this purpose. It seems to me that there isn't natural way to configure and control such extensions that do not naturally "fit" into the XDS API, like application logging .

I was also curious for feedback for a "trivial" approach where masking sensitive information would be targeted to cookies and query string in a hardcoded way, and controlled by runtime guard.

Redact potentially sensitive information from debug logs by masking
- query string
- authorization header value
- cookie values

Signed-off-by: Tero Saarni <tero.saarni@est.tech>
@tsaarni tsaarni force-pushed the debug-log-masking branch from fd7bf0e to 4cb131b Compare May 24, 2023 09:44
@yanavlasov
Copy link
Contributor

This should not be enabled by default since debug.log is a developer tool. This needs to be enabled explicitly for people that write debug log for production traffic.

Also this is very brittle. You have fixed one place where header map is printed in log, but there are others and people may add more in the future. It should probably be a part of the output operator of the header map.

/wait

@tsaarni
Copy link
Member Author

tsaarni commented May 24, 2023

Thank you @yanavlasov!

This should not be enabled by default since debug.log is a developer tool. This needs to be enabled explicitly for people that write debug log for production traffic.

Would using FALSE_RUNTIME_GUARD be acceptable? Or would new command line switch better serve this kind of use case? Or new debug level variant?

Also this is very brittle. You have fixed one place where header map is printed in log, but there are others and people may add more in the future. It should probably be a part of the output operator of the header map.

As the first alternative, I attempted to add this in Http::HeaderMap but that introduced cyclic dependency: Http::Utilities depends on Http::HeaderMap, but using utilities from Http::HeaderMap would require the reverse. Moving or duplicating the utility functions into Http::HeaderMap did not seem right to me.

Would you have any suggestions for me from this perspective?

@tsaarni
Copy link
Member Author

tsaarni commented May 24, 2023

Moving or duplicating the utility functions into Http::HeaderMap did not seem right to me.

Many, but not all libraries that depend on //source/common/http:header_map_lib also depend on //source/common/http:utility_lib. Maybe they could be combined to allow the output operator in Http::HeaderMap use the utility functions to parse cookies and query string?

Signed-off-by: Tero Saarni <tero.saarni@est.tech>
@tsaarni
Copy link
Member Author

tsaarni commented May 26, 2023

I have now negated the feature flag. It is now called envoy.reloadable_features.debug_include_sensitive_data and the default value true describes the current behavior. To redacting sensitive data, user needs to set the value to false.

Rationale: I did not add the feature behind FALSE_RUNTIME_GUARD since FALSE guards are expected to be temporary - this feature will never be enabled by default.

This approach might still be slight misuse of RUNTIME_GUARD. Adding new guards is described in context of adding new features. debug_include_sensitive_data is what we had, it is not a new feature,

When it comes to alternative ways of enabling the feature: I assume new command line flags are discouraged so I will not propose putting this behind new command line flag. And debug level names seem come from the log library, so I will not propose new level, such as debug_wihtout_sensitive.

Also this is very brittle. You have fixed one place where header map is printed in log, but there are others and people may add more in the future. It should probably be a part of the output operator of the header map.

I still have no solution for moving functionality from utility.cc into header_map.cc. I experimented a bit with putting both files into single library, making it possible to call HTTP header utility functions from header_map.cc. It would be relatively big change. Furthermore, almost all libraries include just single .cc/.h file.

tsaarni added 2 commits May 29, 2023 12:38
Call Http::Utilities for dumping HeaderMap, to make it possible to redact
potentially sensitive headers.

Signed-off-by: Tero Saarni <tero.saarni@est.tech>
Signed-off-by: Tero Saarni <tero.saarni@est.tech>
@tsaarni
Copy link
Member Author

tsaarni commented May 29, 2023

@yanavlasov,

Also this is very brittle. You have fixed one place where header map is printed in log,

As a straightforward workaround, I've just added forward declaration to call utility.cc from header_map.cc, making redacting less brittle against future changes.

Copy link
Contributor

@yanavlasov yanavlasov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/wait-any

headers.iterate([&os, spaces = spacesForLevel(indent_level)](
const HeaderEntry& header) -> HeaderMap::Iterate {
os << spaces << "'" << header.key().getStringView() << "', '";
if (header.key() == Http::Headers::get().Path) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where the spec would come in handy. Why sanitize just these headers? This change looks to be specific to your requirements. The main concern is that this code would get bigger and bigger as different people would need different headers to be sanitized.

I think the list of headers needs to be determined by configuration, which is how this feature needs to be enabled too. Runtime flag is not a good choice here.

One approach is to define a bootstrap extension for printing header map into the debug log. You can then override it for your product to sanitize the header map the way you want. You can also implement an extension that takes the list of headers that are removed from the log.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where the spec would come in handy.

What type of spec could it be?

For masking three headers addressed in this PR currently, there was a need to write three different parsers: redacting query string, redacting cookie values, redacting complete header.

Would the spec then provide means for user to

  1. apply pre-defined / fixed parsers to mask given headers?
  2. add their own "custom masking" extensions written in C++ (like access log formatters)?

To me (1) sounds reasonable, although only "redact complete header" is generic enough to be applied to any random header.

I guess otherwise we end up in (2) allowing user to write application specific logic, to mask application specific headers. It sounds bit complicated and it might be unlikely people would utilize it that much. After all, masking debug log is quite a rare use case.

This change looks to be specific to your requirements.

I thought the proposed headers were the obvious ones, which would apply to most, not only my requirements.

But I agree, any X-MyExampleApp-Token type of header could contain sensitive information as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a GitHub Issue with a short description of how a feature is going to be implemented is enough.

The key is flexibility. Envoy achieves this through extensions. I think the first step is create an extension point with API - interface with a method that takes header map as a parameter and returns formatted string. This can then plumbed into the header map.

Then you can build on top of this to add functionality that you need. I.e. list of headers that need to be redacted, boolean flag to redact query, cookies, etc. You can add regex matchers, whatever suits your purpose.

This is one of the possible solutions, other maintains may suggest simpler things. The Issue with the spec would help define the solution that meets community needs.

@mattklein123
Copy link
Member

This change seems extremely fragile to me and of dubious utility given how many places sensitive information might come out. Can you just route the logs in your environment through some scrubber that parses them and filters?

@tsaarni
Copy link
Member Author

tsaarni commented Jun 8, 2023

This change seems extremely fragile to me and of dubious utility given how many places sensitive information might come out. Can you just route the logs in your environment through some scrubber that parses them and filters?

Hi @mattklein123! I've added a design doc here #27820. Do I interpret it correctly that anything that could be addressed there will not change the situation, and that #9652 should rather be closed as will-not-implement?

#9652 was specifically about redacting headers, which are centrally handled by Http::HeaderMap and could be addressed. It was obviously not considering any other type of debug printouts that might potentially contain data originating from e.g. headers but not printed through Http::HeaderMap.

@mattklein123
Copy link
Member

I'm mostly saying that I agree with @yanavlasov comments that these feels like a very specific change to deal with a single case of a problem, when in fact there will be many more cases in practice. It just seems like this is better handled as a wrapper in your own environment rather than changes to Envoy. If other maintainers strongly agree this is useful I'm happy to be swayed.

@mattklein123
Copy link
Member

Going to close this out for now given the discussion. We can reopen if there is strong support from other maintainers to go this route.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Prevent sensitive request headers from being logged in debug level

4 participants