Skip to content

LG-15655: Submit reCAPTCHA annotations through worker job (Part 2 of 2)#11926

Merged
aduth merged 7 commits intomainfrom
lg-15655-annotate-async
Feb 28, 2025
Merged

LG-15655: Submit reCAPTCHA annotations through worker job (Part 2 of 2)#11926
aduth merged 7 commits intomainfrom
lg-15655-annotate-async

Conversation

@aduth
Copy link
Contributor

@aduth aduth commented Feb 27, 2025

🎫 Ticket

LG-15655

🛠 Summary of changes

Updates RecaptchaAnnotator to asynchronously schedule a worker job to perform the annotation later.

It also removes the feature flag controlling whether to submit annotations for reCAPTCHA assessments performed at sign-in, effectively enabling the behavior everywhere. This is because the asynchronous annotations behavior was the blocker for enabling the behavior, and therefore is satisfied by the changes to RecaptchaAnnotator.

This builds upon:

📜 Testing Plan

Repeat Testing Plan from #11883, noting that you will not have to replace commented code, since those are the changes being implemented in this pull request.

changelog: Internal, Anti-Fraud Tooling, Submit reCAPTCHA annotations through worker job
@aduth aduth requested a review from a team February 27, 2025 13:47
Copy link
Contributor

@mdiarra3 mdiarra3 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. Tested this locally and worked as expected. Had a recaptcha assessment stored in db too.

@aduth
Copy link
Contributor Author

aduth commented Feb 27, 2025

Had a recaptcha assessment stored in db too.

Actually, this reminds me I'd originally wondered if it would make sense to delete the record after it's submitted successfully, as a sort of auto-cleanup for the table, since we don't need it any more 🤔

Already tested with coverage on RecaptchaAnnotator#submit_assessment
@aduth
Copy link
Contributor Author

aduth commented Feb 27, 2025

Actually, this reminds me I'd originally wondered if it would make sense to delete the record after it's submitted successfully, as a sort of auto-cleanup for the table, since we don't need it any more 🤔

I pushed changes for this in ebc7d1e.

Originally considered approach deleting the record inside the annotator class
class RecaptchaAnnotateJob < ApplicationJob
def perform(assessment:)
RecaptchaAnnotator.submit_assessment(assessment)
assessment.destroy
Copy link
Contributor Author

@aduth aduth Feb 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noting that these changes would only be effective for half of the worker instances in the 50/50 state, therefore some stale assessments could remain. That might be acceptable, since this is just a clean-up of unnecessary data, but the other option would be to split out a separate pull request with just these changes to deploy before the main pull request.

A third option could be to move the deletion into the RecaptchaAnnotator class. I wasn't originally a fan of that because it feels like the annotator's job should be to just submit the annotation, but maybe it fits the scope of "annotating" to include the clean-up of the assessment record, and keeps the job implementation lean. Pragmatically, it would also avoid having to deal with these issues or a third pull request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, the third option might still be best, but it doesn't actually work around the 50/50 state, since the new application instances would schedule the job from RecaptchaAnnotator, but the actual submission in RecaptchaAnnotator would still happen in a (possibly-old) worker instance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on a conversation in Slack, I added timestamps to the table in be44d95, which should more easily allow us to delete any stale data in the future.

Comment on lines +3 to +5
add_timestamps :recaptcha_assessments
change_column_comment :recaptcha_assessments, :created_at, from: nil, to: 'sensitive=false'
change_column_comment :recaptcha_assessments, :updated_at, from: nil, to: 'sensitive=false'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally I would have expected to be able to add the comment as part of the add_timestamps call:

Suggested change
add_timestamps :recaptcha_assessments
change_column_comment :recaptcha_assessments, :created_at, from: nil, to: 'sensitive=false'
change_column_comment :recaptcha_assessments, :updated_at, from: nil, to: 'sensitive=false'
add_timestamps :recaptcha_assessments, comment: 'sensitive=false'

But this was causing an error during migration, and it looks like the SQL query was including a stringified Ruby proc. Maybe a bug with add_timestamps? I had even tried it with a change_table { |t| t.add_timestamps ... } and got the same result.

ALTER TABLE "recaptcha_assessments" ADD "created_at" timestamp(6) NOT NULL, #<Proc:0x00000001476d1120 /gems/activerecord-8.0.1/lib/active_record/connection_adapters/postgresql/schema_statements.rb:1048>, ADD "updated_at" timestamp(6) NOT NULL, #<Proc:0x00000001476d0d38 /gems/activerecord-8.0.1/lib/active_record/connection_adapters/postgresql/schema_statements.rb:1048>

@aduth aduth merged commit 621346b into main Feb 28, 2025
2 checks passed
@aduth aduth deleted the lg-15655-annotate-async branch February 28, 2025 15:29
@mdiarra3 mdiarra3 restored the lg-15655-annotate-async branch March 5, 2025 17:31
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants