[LG-501] Only write to phone configuration table#2478
Conversation
c28707b to
a11e287
Compare
There was a problem hiding this comment.
@jmhooper This is what I finally came to. As far as I can tell, the test that was failing (spec/features/users/sign_in_spec.rb line 244) was testing that the email attribute couldn't be decrypted and didn't have anything to do with this line. When I ran the test against the code in master, the user didn't have any profile and this method never got called with this test.
There was a problem hiding this comment.
This might be due to factory_bot stuff. I ran into a similar situation in the follow-on PR and found a fix. I'll test it later on this PR and roll it in here if it lets me get rid of this particular change.
There was a problem hiding this comment.
Yeah, I can totally believe it is something weird with factory bot.
There was a problem hiding this comment.
I'm looking at this and can't seem to figure out how it works at all without this conditional there.
a11e287 to
4ec2c09
Compare
74e3e5f to
18bd83f
Compare
**Why**: Before we can support multiple phones, we have to stop writing to the `user.phone` and related columns. **How**: Go through the code and make sure we never write/update the `user.phone` and `user.phone_confirmed_at` columns. For now, we leave some of the `user.otp_delivery_preference` updates as they are since they look like they are made without a phone number.
18bd83f to
a0e1cb9
Compare
| email_fingerprint = create_fingerprint(email) | ||
| find_by(email_fingerprint: email_fingerprint) | ||
| resource = find_by(email_fingerprint: email_fingerprint) | ||
| resource if resource&.email == email |
There was a problem hiding this comment.
This seems like an odd change. Why is it necessary, and how is this related to phone configurations?
There was a problem hiding this comment.
It is a result of the changes to the UpdateUser service. With this change, we're not decrypting the phone number there (good) but that means we aren't raising an attribute encryption error if there's an issue decrypting attributes. So, a user can sign in without needing any attributes decrypted and a seemingly unrelated 500 pops up further down the line.
There was a problem hiding this comment.
Do you mean we aren't raising an error when trying to encrypt the phone when updating the user? Why would we need to decrypt attributes when setting them?
It seems strange to me that we are looking for a match twice. In what case would there be a match for the email fingerprint but resource.email would not equal email? This new code is 3.62x slower than the old code, and this is a heavily used method in the app.
There was a problem hiding this comment.
Digging into this, I can confirm this code is not needed. This is in fact masking a bug in our SessionsController. I will open a new PR and explain everything there.
lib/tasks/create_test_accounts.rb
Outdated
| user.create_phone_configuration( | ||
| phone: mfa_phone || phone, | ||
| confirmed_at: user.phone_confirmed_at, | ||
| confirmed_at: user.Time.zone.now, |
There was a problem hiding this comment.
Did you mean Time.zone.now without user. before it?
There was a problem hiding this comment.
Doh. Good catch.
**Why**: These changes were made as part of #2478 in order to make the test in `spec/features/users/sign_in_spec.rb:244` pass, but they masked a bug in `Users::SessionsController`. Moreover, they made the heavily used `find_with_email` method almost 4 times slower. The goal of this particular test is to make sure that the app fails loudly if we mess up the list of keys during key rotation. Preferably, we want to fail as early as possible. Here is the flow that explains why this test was passing before the changes in #2478: 1. User signs in with email and password 2. `Users::SessionsController#create` is called 3. `track_authentication_attempt(auth_params[:email])` is called 4. `user_signed_in_and_not_locked_out?(user)` is called 5. `return false unless current_user` is called 6. Devise calls `current_user`, then Warden calls `UpdateUser` in `config/initializers/session_limitable.rb:9`, which raises `Encryption::EncryptionError` because it is trying to decrypt the `phone` encryptable attribute. With the changes in #2478, `UpdateUser` no longer tries to decrypt the phone, which is a good thing. The flow then continues as follows: 7. `User#need_two_factor_authentication?` is called 8. `two_factor_enabled?` is called and returns true right after it sees that `phone_configuration&.mfa_enabled?` is true. Note that so far, no encrypted attributes have been accessed. 9. After a few more calls, `cache_active_profile` is reached and `cacher.save(auth_params[:password], profile)` is called 10. Within `Pii::Cacher#save`, `stale_attributes?` is called. 11. `user.phone_configuration&.stale_encrypted_phone?` is called, which which raises `Encryption::EncryptionError`. However, `Users::SessionsController#cache_active_profile` rescues this error, and then `profile.deactivate(:encryption_error)` is called, which raises an error because `profile` is `nil`. The reason why we never saw this bug in `SessionsController` is because we haven't had problems rotating keys, and because so far, up until the changes in #2478, `Encryption::EncryptionError` was raised before `cache_active_profile` was reached. For example, before we introduced the PhoneConfiguration table, the error was raised early via `two_factor_enabled?`, which accessed the `phone` encryptable attribute. Similarly, if you change the spec to use a user `:with_authentication_app` instead of `:signed_up`, the test will pass because `two_factor_enabled?` will call `totp_enabled?`, which will try to decrypt the `otp_secret_key`. To make the test pass with an MFA-enabled user, we can wrap the 2 lines (in `cache_active_profile`) after the rescue in an `if profile` block, which we should do anyways. This will then allow the user to sign in, and will then raise the `Encryption::EncryptionError` on the 2FA page when it tries to decrypt the phone number. Ideally, we want to fail as early as possible, but with the current design of `cache_active_profile`, that's not possible because it is coupling actions that only apply to verified users with actions that apply to all users when keys are rotated. In a follow-up PR, I will attempt to extract the key rotation so that decryption attempts fail early and are not rescued.
Why:
Before we can support multiple phones, we have to stop writing
to the
user.phoneand related columns.How:
Go through the code and make sure we never write/update the
user.phoneanduser.phone_confirmed_atcolumns. For now,we leave some of the
user.otp_delivery_preferenceupdatesas they are since they look like they are made without a phone
number.
Hi! Before submitting your PR for review, and/or before merging it, please
go through the checklists below. These represent the more critical elements
of our code quality guidelines. The rest of the list can be found in
CONTRIBUTING.md
Controllers
authenticated, make sure to add
before_action :confirm_two_factor_authenticatedas the first callback.
Database
Unsafe migrations are implemented over several PRs and over several
deploys to avoid production errors. The strong_migrations gem
will warn you about unsafe migrations and has great step-by-step instructions
for various scenarios.
Indexes were added if necessary. This article provides a good overview
of indexes in Rails.
Verified that the changes don't affect other apps (such as the dashboard)
When relevant, a rake task is created to populate the necessary DB columns
in the various environments right before deploying, taking into account the users
who might not have interacted with this column yet (such as users who have not
set a password yet)
Migrations against existing tables have been tested against a copy of the
production database. See LG-228 Make migrations safer and more resilient #2127 for an example when a migration caused deployment
issues. In that case, all the migration did was add a new column and an index to
the Users table, which might seem innocuous.
Encryption
Routes
state or result in destructive behavior).
Session
user_sessionhelperinstead of the
sessionhelper so the data does not persist beyond the user'ssession.
Testing
and invalid inputs.