Throw errors in tests if PII is logged (LG-5163) #5447
analytics.track_event(
  Analytics::IDV_PHONE_CONFIRMATION_VENDOR,
  form_result.to_h.merge(
    pii_like_keypaths: [
Given how widespread innocent logging by attribute name seems to be, I wonder if it's worth considering one of the other avenues for detecting PII, like common patterns for phone/SSN/etc., or "known" PII values like our mock profile details?
My sense is that that approach trades completeness for convenience? Many tests use that PII, but not all of them do. Do we think it would be close enough to check for those attribute values?
I don't feel too strongly one way or the other. To me, the ticket seemed to be framed as a safety net for peace of mind, and not necessarily the primary safeguard against logging PII. I also wonder if using the mock details could catch logging in situations where the data isn't cleanly structured with attribute names, or where the attribute name varies slightly from the exact one we check.
The current implementation isn't really problematic to me, though I wonder if it could be confusing that pii_like_keypaths is not actually part of the "extra" payload being logged.
We could even do both if we wanted 🤷
Added in 77a2aad
Example error:
FakeAnalytics::PiiDetected: track_event example PII first_name (FAKEY) detected in attributes
event: Trackable Event ()
full event: {:some_benign_key=>"FAKEY MCFAKERSON"}"
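For anyone skimming the thread, the known-value check could be sketched roughly like this. This is not the code from 77a2aad: the standalone `PiiDetected` class, the `EXAMPLE_PII` hash, and `check_for_known_pii!` are hypothetical names chosen to mirror the error message above.

```ruby
# Hypothetical sketch; names are assumptions, not the code added in 77a2aad.
class PiiDetected < StandardError; end

# Stand-in for the mock profile details that tests use.
EXAMPLE_PII = {
  first_name: 'FAKEY',
  last_name: 'MCFAKERSON',
}.freeze

# Raises if any known example PII value appears anywhere in the
# stringified attributes, regardless of which key it sits under.
def check_for_known_pii!(event, attributes)
  serialized = attributes.to_s

  EXAMPLE_PII.each do |key, value|
    next unless serialized.include?(value)

    raise PiiDetected,
          "track_event example PII #{key} (#{value}) detected in attributes\n" \
          "event: #{event}\n" \
          "full event: #{attributes}"
  end
end
```

Because the check runs against the stringified payload, it would flag the value even under an innocuous key such as `some_benign_key`, which is the case the example error shows.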
Another option: we could update FormResponse and the classes similar to it to have a new attribute (next to extra) that gets included in .to_h, like this:
FormResponse.new(
  success: false,
  errors: { ssn: 'is not formatted right' },
  extra: { num_attempts: 14 },
  pii_like_key_paths: [[:errors, :ssn], [:error_details, :ssn]],
)
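A minimal sketch of what that might look like on the class side, assuming a simplified stand-in rather than the real FormResponse. The class name `SketchFormResponse`, the way `extra` is merged, and the choice to emit the list under the same `pii_like_key_paths` key are all assumptions for illustration.

```ruby
# Hypothetical stand-in for FormResponse; the real class may differ.
class SketchFormResponse
  def initialize(success:, errors: {}, extra: {}, pii_like_key_paths: [])
    @success = success
    @errors = errors
    @extra = extra
    @pii_like_key_paths = pii_like_key_paths
  end

  # Includes the allowlisted key paths in the hash so callers do not
  # have to merge them in by hand at every track_event call site.
  def to_h
    hash = { success: @success, errors: @errors }.merge(@extra)
    hash[:pii_like_key_paths] = @pii_like_key_paths unless @pii_like_key_paths.empty?
    hash
  end
end

response = SketchFormResponse.new(
  success: false,
  errors: { ssn: 'is not formatted right' },
  extra: { num_attempts: 14 },
  pii_like_key_paths: [[:errors, :ssn], [:error_details, :ssn]],
)
payload = response.to_h
```

The trade-off is that the allowlist travels with the response object instead of sitting next to the logging call, which may or may not be easier to audit.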
This finally passes! 🎉 🎉 🎉 🎉 🎉 🎉
Had my first encounter with a flaky test with the new checking. I wonder if it could make sense to do a regex match with a word boundary? e.g.
Yup! Makes sense, I can give it a shot. Or we could decide that zipcode on its own is not sensitive enough to warrant a failure, so we could skip that attribute (like we skip state).
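To illustrate the word-boundary idea, here is a small sketch. The helper name `contains_whole_value?` is hypothetical, and the zipcode-inside-a-longer-number scenario is an assumed cause of the flakiness, not something confirmed above.

```ruby
# Hypothetical helper: true only when `value` appears as a whole token
# in `text`, not as a fragment of a longer run of word characters.
def contains_whole_value?(text, value)
  /\b#{Regexp.escape(value)}\b/.match?(text)
end

plain_match = '9123456'.include?('12345')                   # substring check
bounded_match = contains_whole_value?('9123456', '12345')   # boundary check
standalone = contains_whole_value?('zip: 12345', '12345')   # whole token
```

With a plain substring check, a random numeric value that happens to contain the example zipcode would trip the detector; the `\b` anchors avoid that as long as the value starts and ends with word characters.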
If we pass hashes that contain pii anywhere (such as decrypted_pii: "json string"), the tests will raise.
The implementation is shared across Analytics and FakeAnalytics, which should cover most of our tests where we're not otherwise stubbing
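As a rough sketch of the key-based check described above, assuming a recursive walk over the event hash with an allowlist of key paths: the `PiiKeyDetected` class, the `PII_LIKE_FRAGMENTS` list, and `check_for_pii_keys!` are hypothetical and not the names used in this PR.

```ruby
# Hypothetical sketch of a key-name check; names are assumptions.
class PiiKeyDetected < StandardError; end

# Assumed list of key fragments treated as PII-like.
PII_LIKE_FRAGMENTS = %w[pii ssn].freeze

# Walks nested hashes and raises on any PII-like key whose full path
# is not explicitly allowlisted.
def check_for_pii_keys!(hash, allowed_keypaths: [], path: [])
  hash.each do |key, value|
    current = path + [key]
    next if allowed_keypaths.include?(current)

    if PII_LIKE_FRAGMENTS.any? { |fragment| key.to_s.include?(fragment) }
      raise PiiKeyDetected, "PII-like key detected at #{current.inspect}"
    end

    if value.is_a?(Hash)
      check_for_pii_keys!(value, allowed_keypaths: allowed_keypaths, path: current)
    end
  end
end
```

Under these assumptions, `{ decrypted_pii: "json string" }` would raise, while `{ errors: { ssn: '...' } }` would pass only when `[:errors, :ssn]` is in the allowlist.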