Deploy RC 64 to int#2424

Merged
jmhooper merged 38 commits into stages/int from stages/rc-2018-08-16
Aug 13, 2018
@jmhooper
No description provided.

stevegsa and others added 30 commits July 21, 2018 21:38
…o generated keys

**Why**: To separate the key creator from the IDP user

**How**: Create the CloudhsmKeySharer script to interface to cloudhsm_mgmt_util using the credentials and key handle output by the CloudhsmKeyGenerator.  Use the greenletters gem to automate interaction with the commandline utility and obscure the password.
**Why**: When a user requests an account reset we tell them their request will be processed in 24 hours. We need to be sure that the asynchronous processing of the grant notifications is occurring in a timely manner.

**How**: Add a health checker to the health_controller.  Search for a single record in account_reset_requests where the request was not serviced in 26 hours (24 hours + a 2 hour buffer to service the requests).  Return a bad health status if such a record is found.  The table has an index that is optimized for this type of query while factoring in timestamps, cancellations, and requests already granted.
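The check described above can be sketched in plain Ruby. This is a minimal illustration, not the code in the PR: `Request`, `AccountResetHealthCheck`, and the column names `requested_at`, `granted_at`, and `cancelled_at` are hypothetical stand-ins for the real `account_reset_requests` schema, and an in-memory list takes the place of the indexed database query.

```ruby
# Hypothetical stand-in for a row in account_reset_requests.
Request = Struct.new(:requested_at, :granted_at, :cancelled_at)

module AccountResetHealthCheck
  WAIT_PERIOD_HOURS = 24 # what users are told
  BUFFER_HOURS = 2       # slack for the async job to run

  # Healthy when no pending request is older than 24 + 2 hours.
  def self.healthy?(requests, now: Time.now)
    cutoff = now - (WAIT_PERIOD_HOURS + BUFFER_HOURS) * 3600
    requests.none? do |r|
      r.granted_at.nil? && r.cancelled_at.nil? && r.requested_at < cutoff
    end
  end
end
```

Finding a single overdue row is enough to report bad health, which is why `none?` (which stops at the first match) mirrors the "search for a single record" approach.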
**Why**: Users can't return to their service provider.  They get stuck at the login.gov account page and don't know what to do.

**How**: After timeout put the request_id back on the url.  Fix the broken feature spec / test.
**Why**: So the SP can offer a custom page when a user fails to proof on LOA3.

**How**: Add a new column to service providers.  Update any supporting classes.
**Why**: The worker instances are no longer used and the devops scripts were renamed
LG-523 Update release script to remove workers and rename recycle script
LG-504 Create a health checker for account reset notifications
**Why**: In #2353 we changed the scrypt cost, which affected the session
encryptor and made sessions encrypted by old and new hosts incompatible.
This commit hardcodes the cost in the deprecated encryptor so that
sessions are compatible between hosts.
**Why**: Users are unable to complete IDV if they set up their account with an authenticator app.

**How**: Allow SMS verification when the user is in the middle of IDV, which is detected by checking a session variable.  Fix the spec / test.
LG-525 Fix IDV for users without phones
**Why**: We are moving away from the user access keys in favor of 2L-KMS,
which involves AES-encrypted ciphertexts wrapped by KMS
**Why**: Because it doesn't make sense to generate 10 byte salts and
digest them when we can generate 32 byte salts directly
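A small sketch of the difference, using only Ruby's standard library. The choice of SHA-256 for the old "digest" step is an assumption made for illustration, and both method names are hypothetical; the commit itself does not say how the old salts were digested.

```ruby
require 'securerandom'
require 'digest'

# Old approach (assumed shape): 10 random bytes stretched by a digest.
# The output looks long, but it still carries only 80 bits of randomness.
def digested_salt
  Digest::SHA256.hexdigest(SecureRandom.random_bytes(10))
end

# New approach: ask for 32 random bytes directly (hex-encoded here).
def direct_salt
  SecureRandom.hex(32)
end
```

Both methods return a 64-character hex string, so callers see the same shape while the second draws all 32 bytes from the random source.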
…lidation" (#2400)

This reverts commit 1087dd8, reversing
changes made to bd698fa.
**Why**: For some reason, the exception notification gem returns `nil`
sometimes for `@kontroller.analytics_user`.

**How**: Use an instance of AnonymousUser when a user doesn't exist
Allow exception logs to capture nil user
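The fallback is a null-object pattern. A minimal sketch, assuming a hypothetical `uuid` method and helper name (`user_for_exception_log`); the real `AnonymousUser` class in the codebase may expose a different interface:

```ruby
# Hypothetical minimal null object; stands in for a missing user.
class AnonymousUser
  def uuid
    'anonymous-uuid'
  end
end

# Returns the given user, or an AnonymousUser when the user is nil,
# so the exception logger can always call user.uuid safely.
def user_for_exception_log(analytics_user)
  analytics_user || AnonymousUser.new
end
```

With this in place the logging code needs no `nil` checks of its own.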
**Why**: This is the root cause for the KMS errors we experienced in
production yesterday. We were storing the OIDC request params in the
session for no reason. The more parameters an agency included in their
initial OIDC request, the bigger the session size. In the case of
USAJOBS, storing this data in the session increased the length of the
string that was sent to KMS after 2FA from 2948 to 4128.

There were exactly 2 SAML requests during the outage, so they played
an insignificant role. This most likely affected every single USAJOBS
session, which accounted for 74% of requests during that time. I don't
believe TTP was affected. The rest of the requests (4% of total
requests) came from 3 other SPs that I haven't looked into yet.

Note that this only solves the problem for LOA1 OIDC requests (as they
are currently made). SAML requests remain much larger than 4096 bytes.
In general, this KMS limit problem can be solved by one or more of these
solutions, which we have considered in the past:

- Storing the SP requests in the DB instead of the session
- Only encrypting the info that needs to be encrypted, as opposed to
the entire session
**Why**:
We can only send 4k of data to KMS for encryption. We need to
make sure we don't exceed that regardless of which method we
use so we know we can use KMS without errors.

**How**:
Raise an argument error regardless of the encryption method.
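A guard of this kind could look like the sketch below. The constant and method names are hypothetical, and the limit is taken as 4096 bytes per the figure quoted earlier in this log; the check runs before any encryption method is chosen, so every path fails the same way.

```ruby
# Assumed limit: 4096 bytes of plaintext per KMS encrypt call.
KMS_MAX_PLAINTEXT_BYTES = 4096

# Returns the plaintext unchanged when it fits, otherwise raises.
# Uses bytesize, not length, because multibyte characters count per byte.
def ensure_within_kms_limit!(plaintext)
  size = plaintext.bytesize
  return plaintext if size <= KMS_MAX_PLAINTEXT_BYTES

  raise ArgumentError,
        "plaintext is #{size} bytes; limit is #{KMS_MAX_PLAINTEXT_BYTES}"
end
```

Using the session sizes mentioned above, a 2948-byte string passes and a 4128-byte string raises `ArgumentError`.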
Remove unused code that inflated our session size
…ts-returning-to-sp-and-loses-branding

LG-519 Fix: session timeout prevents return to SP and loses SP branding
LG-512 Add a failure to proof url to service providers for LOA3
**Why**: Per request from the agency.

**How**: Update service_providers.yml
…ossil-energy-sp

LG-533 Add a New Redirect URI for the DOE - Fossil Energy SP
**Why**:
We want to make sure all phone configurations are present in
the new table before we start reading data from the table.

**How**:
We use a rake task that processes users in batches to make
sure a phone configuration row exists for the user.
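The batching idea can be sketched without Rails. Everything here is illustrative: `User`, the hash standing in for the new table, the method name, and the batch size are hypothetical, and skipping users who have no phone is an assumption rather than something the commit states.

```ruby
# Hypothetical stand-in for a users row.
User = Struct.new(:id, :phone)

# Walks users in fixed-size batches and creates a phone configuration
# for each user who has a phone but no row yet. Returns the number created.
def backfill_phone_configurations(users, phone_configurations, batch_size: 1000)
  created = 0
  users.each_slice(batch_size) do |batch|
    batch.each do |user|
      next if user.phone.nil? || phone_configurations.key?(user.id)

      phone_configurations[user.id] = { phone: user.phone }
      created += 1
    end
  end
  created
end
```

Because existing rows are skipped, the task is safe to run more than once; a second run creates nothing.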
…e-configurations-table

[LG-499] Rake task copies phone info to new table
…grant-user-access

LG-352 Create new CloudHSM key sharing script to grant IDP access to generated keys
monfresh and others added 8 commits August 10, 2018 10:02
* LG-454 Refactor AccountReset::CancelController

**Why**: It was not adhering to our controller design convention, and
was breaking the analytics format contract.

**How**:
- Create an `AccountReset::Cancel` class that validates the token, and if 
successful, notifies the user via email and SMS if the user has a phone, 
then updates the user's account reset request to reflect that it was
cancelled.

- Rename `cancel` to `create` in the controller to adhere to our
CRUD-only guideline.

Benefits: 
- The controller is a lot leaner and cleaner. 
- It keeps all the actions in one place, as opposed to having some 
actions in the controller (the notifications) and some in 
`AccountResetService` (updating the request in the DB).
- It respects the Analytics API, and logs all error scenarios, including
missing tokens.
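The shape of such a class might look like the following sketch. It is not the code from the PR: the `Result` struct, the in-memory `requests` list, the `notifier` callable, and the error messages are all hypothetical, chosen only to show validation, notification, and the update living in one object that always returns a loggable result.

```ruby
module AccountReset
  class Cancel
    Result = Struct.new(:success, :errors)

    # requests: array of hashes with :token, :user, :cancelled_at (hypothetical)
    # notifier: callable taking (channel, user) (hypothetical)
    def initialize(token, requests:, notifier:)
      @token = token.to_s
      @requests = requests
      @notifier = notifier
    end

    def call
      return Result.new(false, { token: ['missing'] }) if @token.empty?

      request = @requests.find { |r| r[:token] == @token }
      return Result.new(false, { token: ['invalid'] }) unless request

      user = request[:user]
      @notifier.call(:email, user)
      @notifier.call(:sms, user) if user[:phone] # SMS only when a phone exists
      request[:cancelled_at] = Time.now
      Result.new(true, {})
    end
  end
end
```

A controller `create` action could then do little more than call this object and pass the result to analytics, including the missing-token case.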
Whenever there is a new PR, the GitHub Slack integration shows a preview image from a piece of documentation we link to in our PR template. It adds a lot of noise to the channel and makes the posts take up a lot of space. Until GitHub changes its integration (it sounds like [others want this too](https://github.com/integrations/slack/issues/487)), we could change the link to a shortened version to purposely break this image preview.
Change the indexes in Rails link in PR template
**Why**: To disable account reset for LOA3 users while we determine how
the LOA3 account reset process will work
@jmhooper jmhooper merged commit 04f8f94 into stages/int Aug 13, 2018
@mitchellhenke mitchellhenke deleted the stages/rc-2018-08-16 branch December 28, 2021 16:05

4 participants