Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update scrub_fields documentation #149

Closed
wants to merge 1 commit into from
Closed

Conversation

benkuhn
Copy link
Contributor

@benkuhn benkuhn commented Dec 1, 2016

From reading the code, it appears that the scrub_fields transformer uses suffix matching, not whole-field matching. This updates the documentation to reflect that. (This is important to us so we can have e.g. a convention of suffixing our secret variables with _secret to get them scrubbed!)

From reading the code, it appears that the scrub_fields transformer uses suffix matching, not whole-field matching. This updates the documentation to reflect that. (This is important to us so we can have e.g. a convention of suffixing our secret variables with _secret to get them scrubbed!)
@ezarowny
Copy link
Contributor

ezarowny commented Dec 2, 2016

Hey @benkuhn! Thanks for the PR!

I've done some digging around in the source code and unfortunately I believe we only do exact matching at the moment. You are correct that suffix matching is possible though! In fact, if you import the build_key_matcher function into a Python interpreter, it works quite well with suffix matching:

In [1]: import rollbar

In [2]: from rollbar.lib import build_key_matcher

In [3]: matcher = build_key_matcher(rollbar.SETTINGS['scrub_fields'], type='suffix')

In [4]: matcher('secret_password')
Out[4]: True

In [5]: matcher('a_password')
Out[5]: True

During my investigation, I noticed that we actually pass the suffixes to the build_key_matcher function as a list of single element tuples. I haven't quite figured out why the data is structured that way. If it was just a list, as the above example shows, suffix matching would work as is. There's definitely some further investigative work to do here.

I'd actually prefer suffix matching over exact. It would be extremely useful, if not easier, to scrub fields en masse this way. If you want to take it on you certainly can or I can file an internal ticket for us to look at it.

@benkuhn
Copy link
Contributor Author

benkuhn commented Dec 3, 2016

After reading some more of the code, it looks like the single-element-tuple construct happens because the matching stuff is actually built to operate on key paths, not individual keys. E.g. if you have an object like {'foo':{'bar':{'password':'s3cr3t'}}}, then the transformer will see a key of ('foo', 'bar', 'password') and only care about the 'password' suffix. (source)

However, it looks like the build_key_matcher is kind of overkill for the scrubber, since it only ever looks at the last component of the key. Given that, here are some ways the scrub fields could work:

  • Have the scrub transform check key[-1].endswith(field) instead of constructing a key matcher; this turns every scrub_field into a suffix match.
  • Turn scrub_fields into re.compile('^%s$') for field in scrub_fields and then match each element against key[-1]. This preserves compatibility with existing scrubbers and is more flexible than just suffix matching, which might be nice e.g. if people's codebase uses (prefix-based) Hungarian notation. (If you're worried about performance, you can build one giant regex out of all the sub-regexes, e.g. ^(password|pw|...)$, and it will actually be about as fast as comparing against each string with == even in the base case of non-regex fields.)
  • Allow lambdas (with signature str -> bool) in scrub_fields that get applied to key[-1] (or key). Most flexible, but you can't write these down in a config file; I don't know how much you care about that.
  • Something else that I haven't thought of because I only know our use cases and not everyone else's :)

@ezarowny
Copy link
Contributor

These are pretty good ideas! I'll talk this over with the team.

@benkuhn
Copy link
Contributor Author

benkuhn commented Mar 31, 2017

Closing so this doesn't sit in my list of open PRs forever :)

@rokob
Copy link
Contributor

rokob commented Mar 31, 2017

Hey I created #160 so that this actually gets taken care of. Thanks for the ideas and sorry about the lack of response.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants