Skip to content

LG-13191 rescue faraday servererror#10533

Merged
svalexander merged 7 commits intomainfrom
shannon/lg-13191-update-network-errors
May 2, 2024
Merged

LG-13191 rescue faraday servererror#10533
svalexander merged 7 commits intomainfrom
shannon/lg-13191-update-network-errors

Conversation

@svalexander
Copy link
Copy Markdown
Contributor

@svalexander svalexander commented Apr 30, 2024

🎫 Ticket

Link to the relevant ticket:
LG-13191

🛠 Summary of changes

This pr rescues Faraday::ServerError and logs to our event channel when this error is encountered when running the UspsAuthTokenRefreshJob. Making this change will keep us from alerting the #login-alarms channel when this error is encountered.

📜 Testing Plan

Provide a checklist of steps to confirm the changes.

  • Specs run successfully

@svalexander svalexander marked this pull request as draft April 30, 2024 21:00
NETWORK_ERRORS = [
Faraday::TimeoutError,
Faraday::ConnectionFailed,
Faraday::ServerError,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this to address the UspsAuthTokenRefreshJob errors we've been seeing? If so, that job does not reference this constant so it won't have any effect

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok i think i;m missing something then. The def I found for warning_error_classes referenced NETWORK_ERRORS but i see now i didn't carefully check which job was using that. Where should i be looking instead?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

directly in that job:

rescue Faraday::TimeoutError, Faraday::ConnectionFailed => err

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

alternatively, the UspsAuthTokenRefreshJob could override its warning_error_classes and implement its own list, similar to the RiscDeliveryJob:

def self.warning_error_classes
NETWORK_ERRORS + [RedisRateLimiter::LimitError]
end

@@ -86,7 +86,7 @@
end
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't select it, but I'd expect the test above to error -- if we're catching a ServerError, this should fail:

        expect do
          subject.perform
        end.to raise_error

because there is no error being thrown

Copy link
Copy Markdown
Contributor Author

@svalexander svalexander May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see what you're saying, it is passing though which is curious. I;m trying to figure out why this is happening.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you take off the expect { ... }.to_not raise_error you see that it's throwing an error for a method expectation based on a different logging method called.

so it's technically still throwing an error and the spec technically passes.

If the test was originally written to say expect { ... }.to_not raise_error(Faraday::ServerError) then it would have been a clearer failure here

Copy link
Copy Markdown
Contributor Author

@svalexander svalexander May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when i update the test to ...to_not raise_error(Faraday::ServerError) to check what happens it passes

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It passes with to_not raise_error(Faraday::ServerError) because gets this error instead from the stub

       #<InstanceDouble(Analytics) (anonymous)> received unexpected message :idv_usps_auth_token_refresh_job_network_error with ({:exception_class=>"Faraday::ServerError", :exception_message=>"the server responded with status 500"})

I pushed 8401797 to switch out the instance_double for a FakeAnalytics instance which is a better fake logger and has clearer spec failures when the rescue for Faraday::ServerError is removed

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ahh ok that makes sense! i think we still need the completion log, I'll add it in. Thanks for making this change!


context 'auth request throws network error' do
[Faraday::TimeoutError, Faraday::ConnectionFailed].each do |err_class|
[Faraday::TimeoutError, Faraday::ServerError, Faraday::ConnectionFailed].each do |err_class|
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zachmargolis I'm wondering how i can test this to confirm that #login-alarms wouldnt be alerted

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alarms go off when the jobs have unhandled exceptions, so by handling the exception (and the subject.perform doesn't throw) then the alarms won't go off

@svalexander svalexander marked this pull request as ready for review May 1, 2024 21:10
@svalexander svalexander changed the title add faraday servererror to network errors LG-13191 add faraday servererror to network errors May 1, 2024
@svalexander svalexander changed the title LG-13191 add faraday servererror to network errors LG-13191 rescue faraday servererror May 1, 2024
@svalexander svalexander requested review from a team and gina-yamada May 1, 2024 21:14
@svalexander svalexander merged commit ddcde41 into main May 2, 2024
@svalexander svalexander deleted the shannon/lg-13191-update-network-errors branch May 2, 2024 14:24
@jmdembe jmdembe mentioned this pull request May 7, 2024
samathad2023 pushed a commit that referenced this pull request May 11, 2024
* add faraday servererror to network errors

* move error to appropriate job

* update refresh job spec to include server error

* Use FakeAnalytics instead of stubbing to get clearer specs

* log job completed

* changelog: Internal, UspsAuthTokenRefreshJob, rescue Faraday::ServerError

---------

Co-authored-by: Zach Margolis <zachary.margolis@gsa.gov>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants