Remove unused code that inflated our session size #2412

Merged
monfresh merged 1 commit into master from mb-remove-unused-code on Aug 6, 2018

monfresh commented Aug 3, 2018

Why: This is the root cause of the KMS errors we experienced in
production yesterday. We were storing the OIDC request params in the
session for no reason. The more parameters an agency included in its
initial OIDC request, the bigger the session. In the case of
USAJOBS, storing this data in the session increased the length of the
string that was sent to KMS after 2FA from 2948 to 4128, which is over
the 4096-byte limit on what KMS will encrypt.
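To make the arithmetic concrete, here is a minimal sketch of a size check against the 4096-byte cap. The constant name and the `fits_in_kms?` helper are hypothetical and are not taken from the codebase; the two lengths are the ones reported above.

```ruby
# Hypothetical constant: AWS KMS Encrypt accepts at most 4096 bytes of plaintext.
KMS_PLAINTEXT_LIMIT = 4096

# Hypothetical helper: true when the payload is small enough to send to KMS.
def fits_in_kms?(payload)
  payload.bytesize <= KMS_PLAINTEXT_LIMIT
end

# Stand-in strings with the lengths reported for USAJOBS sessions.
without_oidc_params = 'a' * 2948
with_oidc_params    = 'a' * 4128

puts fits_in_kms?(without_oidc_params) # session without the stored OIDC params
puts fits_in_kms?(with_oidc_params)    # session with the stored OIDC params
```

A check like this could run in a spec to flag session growth before it reaches production, assuming the spec can build a representative session payload.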

There were exactly 2 SAML requests during the outage, so they played
an insignificant role. This most likely affected every single USAJOBS
session, which accounted for 74% of requests during that time. I don't
believe TTP was affected. The rest of the requests (4% of total
requests) came from 3 other SPs that I haven't looked into yet.

Note that this only solves the problem for LOA1 OIDC requests (as they
are currently made). SAML requests remain much larger than 4096 bytes.
In general, this KMS limit problem can be solved by one or more of these
solutions, which we have considered in the past:

  • Storing the SP requests in the DB instead of the session
  • Only encrypting the info that needs to be encrypted, as opposed to
    the entire session
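The second option can be sketched roughly as follows. This is only an illustration under stated assumptions: the key names (`pii`, `sp_request`), the JSON serialization, and the `split_session` helper are all hypothetical and do not reflect how the app actually stores or encrypts its session.

```ruby
require 'json'

KMS_PLAINTEXT_LIMIT = 4096       # bytes; AWS KMS Encrypt plaintext cap
SENSITIVE_KEYS = %w[pii].freeze  # hypothetical list of keys that need encryption

# Split a session hash into the part that must be encrypted and the rest.
def split_session(session)
  sensitive, plain = session.partition do |key, _value|
    SENSITIVE_KEYS.include?(key.to_s)
  end
  { sensitive: sensitive.to_h, plain: plain.to_h }
end

# Hypothetical session: a small sensitive value plus a large SP request.
session = {
  'pii' => { 'ssn' => '000-00-0000' },
  'sp_request' => 'x' * 5000
}
parts = split_session(session)

# The whole session is over the cap, but the sensitive part alone is not.
puts JSON.generate(session).bytesize > KMS_PLAINTEXT_LIMIT
puts JSON.generate(parts[:sensitive]).bytesize <= KMS_PLAINTEXT_LIMIT
```

Only `parts[:sensitive]` would then go to KMS, so the size of the SP request would no longer count against the limit.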

Hi! Before submitting your PR for review, and/or before merging it, please
go through the checklists below. These represent the more critical elements
of our code quality guidelines. The rest of the list can be found in
CONTRIBUTING.md

Controllers

  • When adding a new controller that requires the user to be fully
    authenticated, make sure to add before_action :confirm_two_factor_authenticated
    as the first callback.

Database

  • Unsafe migrations are implemented over several PRs and over several
    deploys to avoid production errors. The strong_migrations gem
    will warn you about unsafe migrations and has great step-by-step instructions
    for various scenarios.

  • Indexes were added if necessary. This article provides a good overview
    of indexes in Rails.

  • Verified that the changes don't affect other apps (such as the dashboard)

  • When relevant, a rake task is created to populate the necessary DB columns
    in the various environments right before deploying, taking into account the users
    who might not have interacted with this column yet (such as users who have not
    set a password yet)

  • Migrations against existing tables have been tested against a copy of the
    production database. See #2127 (LG-228 Make migrations safer and more
    resilient) for an example of a migration that caused deployment
    issues. In that case, all the migration did was add a new column and an index to
    the Users table, which might seem innocuous.

Encryption

  • The changes are compatible with data that was encrypted with the old code.

Routes

  • GET requests are not vulnerable to CSRF attacks (i.e. they don't change
    state or result in destructive behavior).

Session

  • When adding user data to the session, use the user_session helper
    instead of the session helper so the data does not persist beyond the user's
    session.

Testing

  • Tests added for this feature/bug
  • Prefer feature/integration specs over controller specs
  • When adding code that reads data, write tests for nil values, empty strings,
    and invalid inputs.

stevegsa left a comment

I don't see any references. LGTM

stevegsa commented Aug 3, 2018

I'm always wary that someone might be assembling a token on the fly with a to_sym...
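The concern can be illustrated with a small sketch. The session key and the `dynamic_lookup` helper below are hypothetical; the PR does not name the key that was removed. The point is that a key assembled at runtime never appears as a literal, so a text search for it finds no references.

```ruby
# Hypothetical session contents; the key name is invented for illustration.
SESSION = { sp_request_params: { client_id: 'example' } }.freeze

# The literal `sp_request_params` never appears in this method, so searching
# the codebase for that name would not find this read of the session.
def dynamic_lookup(session, prefix, suffix)
  session["#{prefix}_#{suffix}".to_sym]
end

p dynamic_lookup(SESSION, 'sp_request', 'params')
```

Searching for `to_sym` and `send` calls near session access, in addition to searching for the key name itself, is one way to gain more confidence before deleting a key.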

monfresh merged commit f9e8c6e into master Aug 6, 2018
monfresh deleted the mb-remove-unused-code branch August 6, 2018 17:11