LG-15056 | Socure A/B test by n1zyy · Pull Request #11544 · 18F/identity-idp

n1zyy · 2024-11-22T21:28:46Z

🎫 Ticket

🛠 Summary of changes

We are planning to roll out Socure as an additional proofing vendor. The first step of that will be to use them in "shadow mode," where we send proofing data to them in parallel with existing vendors, but do not take action on the results. This will allow us to evaluate accuracy without adversely impacting users.

We have been asked to limit the number of users send through Socure mode. Rather than my suggestion to build a convoluted tracking system in Redis, we decided to use the existing A/B test framework as it gives us an easy tunable for volume.

This is not a typical usage of A/B testing. There is no user-facing change, and we're not comparing A vs. B per se. The A/B test framework is just the best path to sending a portion of traffic through this flow.

📜 Testing Plan

These are all settings in application.yml(.default). You will absolutely need to restart the app after each change.

Set idv_socure_shadow_mode_enabled: false and socure_idplus_shadow_mode_percent: 100 and ensure that the shadow mode proofer is not called since shadow mode is disabled.
Set idv_socure_shadow_mode_enabled: true and leave ``socure_idplus_shadow_mode_percent: 100`. Create a few users; all should have shadow mode fire.
Change to socure_idplus_shadow_mode_percent: 0 and run through proofing with a few users; none should have shadow mode run.
Now to the real fun, set socure_idplus_shadow_mode_percent: 50 and run through a handful of proofing runs with different accounts. Some should use shadow mode and some should not.

👀 Screenshots

No UI changes

changelog: Upcoming Features, Socure, Model Socure shadow mode as an A/B test

n1zyy · 2024-11-22T21:29:45Z

app/jobs/resolution_proofing_job.rb

  end

+  def use_shadow_mode(user:)
+    IdentityConfig.store.idv_socure_shadow_mode_enabled &&


There is some precedent for putting a feature flag check ☝️ in the A/B test itself, but it seemed like it would just convolute things here.

This method is also needed for testing (see further discussion in the spec for this class).

I like this better than putting the check in the AB test

n1zyy · 2024-11-22T21:30:59Z

config/initializers/ab_tests.rb

+    buckets: {
+      shadow_mode_enabled: IdentityConfig.store.socure_idplus_shadow_mode_percent,
+    },
+  ).freeze


As evidenced by RECOMMEND_WEBAUTHN_PLATFORM_FOR_SMS_USER above this one, the default discriminator for A/B tests is user&.uuid so no block is needed here.

n1zyy · 2024-11-22T21:31:55Z

config/initializers/ab_tests.rb

+
+  SOCURE_IDV_SHADOW_MODE = AbTest.new(
+    experiment_name: 'Socure shadow mode',
+    should_log: ['IdV: doc auth verify proofing results'].to_set,


I talked to @matthinz about this. We don't strictly need to log the bucket on any event, but logging it with "verify proofing results" gives us an additional check that it's working as expected.

n1zyy · 2024-11-22T21:32:36Z

lib/ab_test.rb

  # @param [String,nil] service_provider Issuer string for the service provider associated with
  #                                      the current session.
-  # @params [Hash] session
+  # @param [Hash] session


Apparent typo I found while I was looking through here.

n1zyy · 2024-11-22T21:33:52Z

lib/identity_config.rb

    config.add(:socure_docv_webhook_secret_key, type: :string)
    config.add(:socure_idplus_api_key, type: :string)
    config.add(:socure_idplus_base_url, type: :string)
+    config.add(:socure_idplus_shadow_mode_percent, type: :integer)


Is there an argument that this should be a float? I briefly considered it, since we'll probably be at 1% or so. Every other usage of a percentage is an integer, though.

I kind of wonder if some specialized syntax like "1 / 1000" would be good. This may come back around when we get to smooshing this A/B test in with the docv one a little more, but keeping this consistent is great for now.

Just spitballing, if we did the "1 / 1000" thing I'd probably advocate for these configs ending in _chance rather than _percent, e.g.:

socure_idplus_shadow_mode_chance: "1 / 1000"

n1zyy · 2024-11-22T21:34:58Z

spec/config/initializers/ab_tests_spec.rb

+    end
+
+    context 'when the A/B test is disabled' do
+      it 'does not return a bucket' do


These are the same examples as in the describe '.RECOMMEND_WEBAUTHN_PLATFORM_FOR_SMS_USER' block, but so much bespoke setup is intermingled that it didn't make sense to me to put them in a shared example.

I am 👍 on limiting usage of shared examples

n1zyy · 2024-11-22T21:37:22Z

spec/config/initializers/ab_tests_spec.rb

+      it 'returns a bucket' do
+        expect(bucket).to eq :shadow_mode_enabled
+      end
+    end


I started to add a test to prove that, when the percentage is 50%, the discriminator is user uuid and they're roughly balanced. This became a gross statistical quagmire, though, and I just abandoned it. (And I fear the wrath of the person trying to diagnose a test where 25 coin flips all came up :shadow_mode_enabled and the test failed.)

There is maybe an argument for letting us specify a seed for A/B test bucket assignment for just this purpose.

n1zyy · 2024-11-22T21:44:37Z

spec/jobs/resolution_proofing_job_spec.rb

        before do
-          allow(IdentityConfig.store).to receive(:idv_socure_shadow_mode_enabled).and_return(true)
+          # Is there a cleaner way of doing this?
+          allow_any_instance_of(described_class).to receive(:use_shadow_mode).and_return(true)


I am not thrilled with this change, but this is about the fifth, and cleanest, incarnation of this.

I first tried keeping the removed line and adding a mock for the percentage, setting it to 100. This has no effect, though, because the AbTests are evaluated as boot since they're constants in an initializer.

I then tried to mock AbTests::SOCURE_IDV_SHADOW_MODE to_receive(:bucket).with(anything).and_return(true), but you cannot do that with frozen objects.

Since I already had this in a little use_shadow_mode method, I made it public and just mock that here, which makes everything work nicely. It does mean that the method is public when it otherwise has no reason to be, and that we don't actually test the contents of the method. Arguably, the former is not a big deal, and the latter is OK-ish because it's a simple feature flag check followed by something tested separately, and if it were private like it otherwise ought to be, we wouldn't be testing it directly anyway.

I think someone has pointed out in the past that allow_any_instance_of(described_class) is an antipattern with a much cleaner alternative but I cannot for the life of me remember it.

If you have an elegant solution for how to accomplish this in mind, I am all ears. If you simply think this isn't particularly elegant, you are in good company.

Here I think:

Suggested change

allow_any_instance_of(described_class).to receive(:use_shadow_mode).and_return(true)

allow(instance).to receive(:use_shadow_mode).and_return(true)

would do it?

(I am going to add an agenda item to the next tech sync about the ResolutionProofingJob spec. It is really big)

n1zyy · 2024-11-22T21:58:44Z

app/jobs/resolution_proofing_job.rb

    end
  end

+  def use_shadow_mode(user:)


Suggested change

def use_shadow_mode(user:)

def use_shadow_mode?(user:)

n1zyy · 2024-11-22T22:02:35Z

spec/config/initializers/ab_tests_spec.rb

+        request: nil,
+        service_provider: nil,
+        session: nil,
+        user:,


Idle blather: I generally like the hash value omission feature, but it feels almost backwards in this usage, where the only value I don't specify is the one that I have available and that is not nil.

matthinz

LGTM!

n1zyy added 2 commits November 22, 2024 16:17

LG-15056 | A/B test for Socure shadow mode

154ed96

changelog: Upcoming Features, Socure, Model Socure shadow mode as an A/B test

Merge branch 'main' into mattw/LG-15056_ab_socure

8c88461

n1zyy commented Nov 22, 2024

View reviewed changes

n1zyy requested review from a team and matthinz November 22, 2024 22:04

matthinz approved these changes Nov 25, 2024

View reviewed changes

n1zyy added 2 commits November 26, 2024 12:57

Address a little PR feedback

52c4aa9

Merge branch 'main' into mattw/LG-15056_ab_socure

b53a0e7

n1zyy merged commit 9d8431d into main Nov 26, 2024

n1zyy deleted the mattw/LG-15056_ab_socure branch November 26, 2024 18:30

matthinz mentioned this pull request Dec 3, 2024

Deploy RC 435 to Production #11584

Merged

	allow_any_instance_of(described_class).to receive(:use_shadow_mode).and_return(true)
	allow(instance).to receive(:use_shadow_mode).and_return(true)

Conversation

n1zyy commented Nov 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🎫 Ticket

🛠 Summary of changes

📜 Testing Plan

👀 Screenshots

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

n1zyy Nov 22, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

matthinz left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

n1zyy commented Nov 22, 2024 •

edited

Loading

n1zyy Nov 22, 2024 •

edited

Loading