Update long-running report queries to reset connection #7104
zachmargolis merged 3 commits into main
**Why**: Occasionally we see PG::UnableToSend errors, at which point the connection stops receiving queries. In an attempt to recover the job and continue, we call #reconnect! on the connection. However, this would reset our session-specific timeouts, so there's a bit of rejiggering to allow the connection timeout setup from the reports to be called from the db query classes.
app/services/db/monthly_sp_auth_count/total_monthly_auth_counts_within_iaa_window.rb
# We use good_job's concurrency features to cancel "extra" or duplicative runs of the same job
discard_on GoodJob::ActiveJobExtensions::Concurrency::ConcurrencyExceededError

def self.transaction_with_timeout(rails_env = Rails.env)
rubocop got mad this was after the private keyword because static methods are public
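For context on the comment above, here is a minimal sketch of the Ruby behavior being described: a bare `private` keyword only affects instance methods defined after it, so a `def self.` method placed below it is still public. The class name `ExampleReport` and the `helper` method are hypothetical and only exist for illustration; only `transaction_with_timeout` is a name from this PR, and its body here is a placeholder.

```ruby
# Hypothetical class, used only to illustrate method visibility.
class ExampleReport
  private

  # Defined after `private`, but still public: the bare `private`
  # keyword does not apply to singleton (class-level) methods.
  def self.transaction_with_timeout
    :still_public
  end

  # An ordinary instance method defined after `private` is private.
  def helper
    :private
  end
end

# The class method is callable from outside the class.
ExampleReport.transaction_with_timeout

# The instance method is not; calling it from outside raises NoMethodError.
begin
  ExampleReport.new.helper
rescue NoMethodError
  # expected
end
```

Placing the static method above `private` makes the code read the way it actually behaves, which is presumably what the linter was asking for.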
Time.zone.now.end_of_day
end

def report_timeout
we never overrode this timeout in any instances, so I just removed it by inlining it
[skip changelog]
mitchellhenke left a comment
Would switching from streaming results back to not streaming change anything regarding this?
you know, at this point it's worth a shot. the streaming queries use some more low-level connection methods so they're more likely to mess up the connections, I can try switching to an all-at-once query and see if things retry better
Ok so it took almost 1.5 hours, but when I ran the changes in this PR manually in the console, they worked. I'll merge this now because at least it works. I'll have a follow-up PR where I remove streaming and see if it makes a difference in runtime
**Why**: Occasionally we see PG::UnableToSend errors, at which point the connection stops receiving queries. In an attempt to recover the job and continue, we call #reconnect! on the connection. However, this would reset our session-specific timeouts, so there's a bit of rejiggering to allow the connection timeout setup from the reports to be called from the db query classes.
* Move transaction_with_timeout to be static
[skip changelog]
🎫 Ticket
(none)
🛠 Summary of changes
Why: Occasionally we see PG::UnableToSend errors, at which point the connection stops receiving queries. (NewRelic link)
In an attempt to recover the job and continue, we call #reconnect! on the connection. However, this would reset our session-specific timeouts, so there's a bit of rejiggering to allow the connection timeout setup from the reports to be called from the db query classes.
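The recovery pattern described above can be sketched roughly as follows. This is not the code from the PR: `FakeConnection`, `FakeUnableToSend`, `apply_timeout`, and `with_reconnect_retry` are all hypothetical stand-ins so the example runs without Rails or the pg gem. The point it illustrates is that the session-level timeout has to be re-applied after every reconnect, because reconnecting starts a fresh session.

```ruby
# Hypothetical stand-in for PG::UnableToSend.
class FakeUnableToSend < StandardError; end

# Hypothetical stand-in for a database connection. It tracks a
# session-level statement timeout that is lost on reconnect.
class FakeConnection
  attr_reader :statement_timeout, :executed

  def initialize
    @statement_timeout = nil
    @broken = false
    @executed = []
  end

  # Simulate the connection getting into a state where it can't send.
  def break!
    @broken = true
  end

  def execute(sql)
    raise FakeUnableToSend, 'connection is broken' if @broken

    if (match = sql.match(/\ASET statement_timeout = (\d+)\z/))
      @statement_timeout = Integer(match[1])
    end
    @executed << sql
  end

  # Reconnecting opens a new session, so session settings are reset.
  def reconnect!
    @broken = false
    @statement_timeout = nil
  end
end

# Assumed helper: sets the session timeout on the given connection.
def apply_timeout(connection, timeout_ms)
  connection.execute("SET statement_timeout = #{timeout_ms}")
end

# Runs the block with the timeout applied. On a send failure it
# reconnects and retries, re-applying the timeout on each attempt.
def with_reconnect_retry(connection, timeout_ms:, max_attempts: 2)
  attempts = 0
  begin
    attempts += 1
    apply_timeout(connection, timeout_ms)
    yield connection
  rescue FakeUnableToSend
    raise if attempts >= max_attempts

    connection.reconnect!
    retry
  end
end
```

Because the timeout is applied inside the retried section rather than once up front, a query that runs after a reconnect gets the same timeout as the first attempt.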
📜 Testing Plan