You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
As I was working on writing new error handling code for release asset downloading and trying to reuse the existing error handling for that purpose, I noticed that the _get_response and associated functions seem to need some significant refactoring.
For example, it looks like they were meant to be written to collect errors during the retry process, but that's not how they're being used in practice.
For example _request_http_error takes in an errors list which is also returns, but it never actually adds any errors to it. Then in _get_response, there is also an errors array (which is passes to _request_http_error presumably to collect any errors) and also returns, but it never gets added to there either.
Then _request_url_error (which is called whenever a URLError or socket.error happens in _get_response) calls log_error which immediately exits the script before it can return. Then if it could return, _get_response would immediately raise an exception which is not caught, so the script would die before any error handling could happen anyway.
Due to this, there are many cases of errors that should be recoverable that will kill the script, which isn't great if using it as a "set it and forget it" cron job.
I started refactoring the asset downloading to share those same error handling functions, but then those needed refactoring, which meant retrieve_data_gen needed refactoring (it seems the real error handling is actually happening there right now in the cases the script doesn't die first).
I'm going to put this aside for the moment so I can look at it with a fresh head later and work on one of the other simpler tickets I posted for the time being then come back to this.
Ideally, after refactoring this to better handle errors, a command line switch (maybe something like --strict-error-handling or something) should be added to allows users to choose to exit on errors if they want while defaulting to gracefully handling errors, logging what failed, and continuing. To help with that, it may also be worthwhile generating a separate error log file that is stored somewhere that includes only failed items so they can be retried.
Also as mentioned in #73, we should no longer log empty wikis as warnings/errors as it gives the impression of errors when in reality there was no error (it just tried to clone the wiki and it didn't exist, not that it failed to clone a repo).
As I was working on writing new error handling code for release asset downloading and trying to reuse the existing error handling for that purpose, I noticed that the
_get_response
and associated functions seem to need some significant refactoring.For example, it looks like they were meant to be written to collect errors during the retry process, but that's not how they're being used in practice.
For example
_request_http_error
takes in an errors list which is also returns, but it never actually adds any errors to it. Then in_get_response
, there is also an errors array (which is passes to_request_http_error
presumably to collect any errors) and also returns, but it never gets added to there either.Then
_request_url_error
(which is called whenever a URLError or socket.error happens in_get_response
) calls log_error which immediately exits the script before it can return. Then if it could return,_get_response
would immediately raise an exception which is not caught, so the script would die before any error handling could happen anyway.Due to this, there are many cases of errors that should be recoverable that will kill the script, which isn't great if using it as a "set it and forget it" cron job.
I started refactoring the asset downloading to share those same error handling functions, but then those needed refactoring, which meant
retrieve_data_gen
needed refactoring (it seems the real error handling is actually happening there right now in the cases the script doesn't die first).I'm going to put this aside for the moment so I can look at it with a fresh head later and work on one of the other simpler tickets I posted for the time being then come back to this.
Ideally, after refactoring this to better handle errors, a command line switch (maybe something like
--strict-error-handling
or something) should be added to allows users to choose to exit on errors if they want while defaulting to gracefully handling errors, logging what failed, and continuing. To help with that, it may also be worthwhile generating a separate error log file that is stored somewhere that includes only failed items so they can be retried.Also as mentioned in #73, we should no longer log empty wikis as warnings/errors as it gives the impression of errors when in reality there was no error (it just tried to clone the wiki and it didn't exist, not that it failed to clone a repo).
Related to #129, #110, and #130
Also likely related to #140
The text was updated successfully, but these errors were encountered: