
Update long-running report queries to reset connection#7104

Merged
zachmargolis merged 3 commits into main from margolis-try-to-recover-big-report-queries on Oct 6, 2022

Conversation

@zachmargolis
Contributor

🎫 Ticket

(none)

🛠 Summary of changes

Why: Occasionally we see PG::UnableToSend errors, at which point the connection stops receiving queries. NewRelic link

In an attempt to recover the job and continue, we call #reconnect! on the connection. However, that resets our session-specific timeouts, so there's a bit of rejiggering to let the connection-timeout setup from the reports be passed into the db query classes.
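A minimal plain-Ruby sketch of the recovery pattern described above (the class and method names are illustrative stand-ins, not the PR's actual code): on a send failure, reset the connection and re-apply the session timeout before retrying, since reconnecting wipes session-level settings.

```ruby
# Sketch only: a stub connection standing in for the real PG/ActiveRecord one.
class UnableToSendError < StandardError; end

class FakeConnection
  attr_reader :timeout, :reconnected

  def initialize
    @broken = true
    @reconnected = false
    @timeout = nil
  end

  def execute(sql)
    raise UnableToSendError, "no connection to the server" if @broken
    "ok: #{sql}"
  end

  def reconnect!
    # A real reconnect! drops session state, including any statement timeout
    @broken = false
    @reconnected = true
    @timeout = nil
  end

  def set_timeout(ms)
    @timeout = ms
  end
end

def run_with_recovery(conn, sql, timeout_ms:)
  conn.set_timeout(timeout_ms)
  conn.execute(sql)
rescue UnableToSendError
  conn.reconnect!
  conn.set_timeout(timeout_ms) # must re-apply: reconnect! reset the session
  conn.execute(sql)
end
```

The key line is the second `set_timeout`: without it, the retried query would run with the adapter's default timeout instead of the report's long one.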

📜 Testing Plan

  • Try patching these changes in a Rails console and see if they work

# We use good_job's concurrency features to cancel "extra" or duplicative runs of the same job
discard_on GoodJob::ActiveJobExtensions::Concurrency::ConcurrencyExceededError

def self.transaction_with_timeout(rails_env = Rails.env)
Contributor Author

rubocop got mad that this was after the `private` keyword, because static (class-level) methods are public regardless of `private`

Time.zone.now.end_of_day
end
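To illustrate the RuboCop point above: in Ruby, `private` only hides instance methods, so a `def self.` method stays public wherever it is defined. The sketch below is hypothetical (a recording stub connection, not the PR's implementation), showing a class-level `transaction_with_timeout`-style wrapper that sets a transaction-scoped statement timeout.

```ruby
# Hypothetical sketch, not the PR's actual code.
class ReportQueryBase
  def self.transaction_with_timeout(conn, timeout_ms)
    # SET LOCAL scopes the timeout to the surrounding transaction in Postgres;
    # this stub connection just records the statements it receives.
    conn.execute("BEGIN")
    conn.execute("SET LOCAL statement_timeout = #{timeout_ms}")
    result = yield conn
    conn.execute("COMMIT")
    result
  end

  private

  # An instance method below `private` is hidden from callers...
  def internal_helper; end
  # ...but a `def self.` method defined here would still be public.
end

class RecordingConnection
  attr_reader :statements

  def initialize
    @statements = []
  end

  def execute(sql)
    @statements << sql
  end
end
```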

def report_timeout
Contributor Author

we never overrode this timeout in any instances, so I just removed the method by inlining the value

[skip changelog]
Contributor

@mitchellhenke left a comment

Would switching from streaming results back to not streaming change anything regarding this?

@zachmargolis
Contributor Author

Would switching from streaming results back to not streaming change anything regarding this?

you know, at this point it's worth a shot. The streaming queries use lower-level connection methods, so they're more likely to mess up the connections. I can try switching to an all-at-once query and see if things retry better
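To make the streaming-vs-all-at-once distinction concrete, here's an illustrative plain-Ruby stub (not the actual ActiveRecord/PG streaming API): a streamed read hands rows to the caller while the query is still in flight, so a mid-stream failure leaves partial results consumed and the connection in an unknown state, while an all-at-once fetch either returns everything or fails before any rows are handed out.

```ruby
# Sketch only: a stub result source, with an optional injected failure point.
class FakeResultSource
  def initialize(rows, fail_at: nil)
    @rows = rows
    @fail_at = fail_at
  end

  # Streaming: rows are yielded one by one while the "query" is mid-flight,
  # so a failure can strike after some rows were already consumed.
  def stream_each
    @rows.each_with_index do |row, i|
      raise IOError, "connection lost mid-stream" if i == @fail_at
      yield row
    end
  end

  # All-at-once: the full result is materialized before anything is returned.
  def fetch_all
    @rows.each_index { |i| raise IOError, "connection lost" if i == @fail_at }
    @rows.dup
  end
end
```

Under this model, retrying an all-at-once query is simpler: there is no half-consumed stream to reconcile, only a clean failure to rerun.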

@zachmargolis
Contributor Author

Would switching from streaming results back to not streaming change anything regarding this?

Ok, so it took almost 1.5 hours, but when I ran the changes in this PR manually in the console, they worked. I'll merge this now because it at least works. I'll have a follow-up PR where I remove streaming and see if it makes a difference in runtime

@zachmargolis zachmargolis merged commit dc57be3 into main Oct 6, 2022
@zachmargolis zachmargolis deleted the margolis-try-to-recover-big-report-queries branch October 6, 2022 23:32
jskinne3 pushed a commit that referenced this pull request Oct 12, 2022

* Move transaction_with_timeout to be static

[skip changelog]