retry openaicompatible requests if invalid content received #761

Merged (6 commits) on Jul 1, 2024

@leondz (Collaborator) commented Jun 28, 2024

Came back to a run and found this in the log:

2024-06-27 20:16:29,399  DEBUG  response_closed.started
2024-06-27 20:16:29,399  DEBUG  response_closed.complete
2024-06-27 20:16:29,400  DEBUG  HTTP Response: POST https://integrate.api.nvidia.com/v1/chat/completions "202 Accepted" Headers([('date', 'Thu, 27 Jun 2024 18:16:29 GMT'), ('content-length', '0'), ('connection', 'keep-alive'), ('nvcf-reqid', '69f6d2a5-6564-4178-921f-bbe2fe5874dc'), ('nvcf-status', 'pending-evaluation'), ('referrer-policy', 'no-referrer'), ('strict-transport-security', 'max-age=31536000 ; includeSubDomains'), ('vary', 'Origin'), ('vary', 'Origin'), ('vary', 'Access-Control-Request-Method'), ('vary', 'Access-Control-Request-Headers'), ('x-content-type-options', 'nosniff'), ('x-frame-options', 'DENY'), ('x-xss-protection', '0')])
2024-06-27 20:16:29,400  DEBUG  request_id: None
2024-06-27 20:16:29,400  DEBUG  Could not read JSON from response data due to <class 'json.decoder.JSONDecodeError'> - Expecting value: line 1 column 1 (char 0)
2024-06-27 20:16:29,635  DEBUG  close.started
2024-06-27 20:16:29,636  DEBUG  close.complete

This patch assumes JSON output failures are transient, and so catches them with backoff. Ideally I would like this to be configurable on/off: it's easy to imagine cases where having it either disabled or enabled could be unexpected or frustrating. Conditional decorators look like a pain in Python, though.
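For context, the decode error in the log above is what Python's json module raises when asked to parse an empty string, which is consistent with the 202 response reporting a content-length of 0. A minimal sketch of that failure follows; parse_body is a hypothetical helper written for illustration and is not part of garak or the OpenAI client.

```python
import json


def parse_body(body: str):
    """Hypothetical helper: return (data, error_message) for a response body."""
    try:
        return json.loads(body), None
    except json.JSONDecodeError as e:
        # An empty body has no JSON value at position 0, so decoding fails
        # immediately rather than partway through the payload.
        return None, str(e)


# A zero-length body, as in the 202 Accepted response from the log.
data, err = parse_body("")
print(err)  # Expecting value: line 1 column 1 (char 0)
```

If the endpoint is still evaluating the request, a later attempt may return a real body, which is the reasoning behind treating the failure as transient.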

@leondz leondz added the bug (Something isn't working) and generators (Interfaces with LLMs) labels Jun 28, 2024
@leondz leondz marked this pull request as ready for review June 28, 2024 04:28
@leondz leondz requested a review from jmartin-tech June 28, 2024 05:15
@jmartin-tech (Collaborator) commented

One thought for making this something you can disable or enable: match the pattern restGenerator uses for backoff error codes, and raise a local error type when a JSON parsing error occurs, depending on an instance variable. The wrapped call would then choose which error type to raise, and only the local type would trigger backoff:

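try:
    ...  # existing model generation here
except JSONDecodeError as e:
    logger.exception(e)
    if self.retry_json:
        # convert to the local error type that the backoff decorator retries on
        raise GarakOpenAIBackoff from e
    else:
        # retries disabled: re-raise the original error with its traceback intact
        raise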
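The suggestion above can be sketched end to end without a conditional decorator: the decorator always retries on one local exception type, and the instance flag decides whether a JSON failure is converted into that type. This is only an illustration under stated assumptions. JSONRetrySignal, retry_on, SketchGenerator and the canned bodies are all hypothetical names, and retry_on is a standard-library stand-in for whatever backoff decorator the project actually uses.

```python
import functools
import json
import time


class JSONRetrySignal(Exception):
    """Hypothetical local error type marking a retryable JSON failure."""


def retry_on(exc_type, max_tries=3, base_delay=0.01):
    """Retry the wrapped callable with exponential backoff on exc_type only."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_tries):
                try:
                    return func(*args, **kwargs)
                except exc_type:
                    if attempt == max_tries - 1:
                        raise  # out of attempts: surface the error
                    time.sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator


class SketchGenerator:
    def __init__(self, bodies, retry_json=True):
        self.bodies = list(bodies)  # canned response bodies, consumed in order
        self.retry_json = retry_json
        self.calls = 0

    @retry_on(JSONRetrySignal)
    def call_model(self):
        self.calls += 1
        body = self.bodies.pop(0)
        try:
            return json.loads(body)
        except json.JSONDecodeError as e:
            if self.retry_json:
                # Only this type is in the retry set, so the flag controls retries.
                raise JSONRetrySignal from e
            raise  # flag off: the original error propagates without retrying
```

With retry_json=True, an empty first body followed by a valid one succeeds on the second call; with retry_json=False, the first JSONDecodeError propagates immediately.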

Comment on lines 182 to 188
if self.generator not in (
    self.client.chat.completions,
    self.client.completions,
):
    raise ValueError(
        "Unsupported model at generation time in generators/openai.py - please add a clause!"
    )
Collaborator

It looks like this was just moved, so there is no action to take at this time. This is just a question to think about.

How can this occur? In theory, _load_client would raise an error that is not part of the backoff set if it failed to set self.generator.

Should we look for a way to enforce validation of this earlier in the run?

Collaborator Author

It should be caught before here - it's definitely not intended to be mutable after init (ignoring the load/clear client mechanic). I guess _load_client is a good place to check, yeah.

Collaborator Author

_load_client is a blank method in OpenAICompatible, which seems a distracting place to put this check; the check is already there in the OpenAIAPI class. Moved it to __init__, after the first _load_client call.
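A rough sketch of what validating in __init__ after the first _load_client call could look like is below. The class names and the fake client built from SimpleNamespace are hypothetical stand-ins, not the actual garak classes; the point is only that a subclass whose _load_client leaves self.generator unset fails at construction rather than at generation time.

```python
from types import SimpleNamespace


class CompatibleGeneratorSketch:
    """Hypothetical stand-in for an OpenAI-compatible generator class."""

    def __init__(self, use_chat: bool = True):
        self.use_chat = use_chat
        self.client = None
        self.generator = None
        self._load_client()
        # Validate once, right after the first client load, so a bad
        # configuration fails at construction instead of mid-run.
        if self.generator not in (
            self.client.chat.completions,
            self.client.completions,
        ):
            raise ValueError("Unsupported model: generator is not a known endpoint")

    def _load_client(self):
        # Fake client object; a real subclass would build an API client here.
        self.client = SimpleNamespace(
            chat=SimpleNamespace(completions=object()),
            completions=object(),
        )
        self.generator = (
            self.client.chat.completions if self.use_chat else self.client.completions
        )


class BrokenSketch(CompatibleGeneratorSketch):
    """Simulates a subclass whose _load_client forgets to set the generator."""

    def _load_client(self):
        super()._load_client()
        self.generator = None
```

Constructing CompatibleGeneratorSketch() succeeds, while BrokenSketch() raises ValueError before any request is made.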

@leondz leondz merged commit 1b8f7b8 into main Jul 1, 2024
6 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jul 1, 2024
@leondz leondz deleted the bugfix/openai_jsondecode branch August 15, 2024 15:04