add Google assistants #301
If we are going to hard-code this, then I would suggest a higher value, as this is what most users will require.
It is the other way around: we want to hardcode `0.0` here, because that means determinism. If there is one thing we learned from trying to bring RAG to businesses, it is that they want to get exactly the same answer when they ask the same question twice. Of course, we can't guarantee that since we don't control the model, but we can do our best to at least avoid sampling during generation. This is the same for all the other assistants that we currently have:
ragna/ragna/assistants/_anthropic.py
Line 52 in 1b53e62
ragna/ragna/assistants/_mosaicml.py
Line 43 in 1b53e62
ragna/ragna/assistants/_openai.py
Line 58 in 1b53e62
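To illustrate why a temperature of `0.0` avoids sampling, here is a minimal sketch in plain Python. It is not the code of any assistant or provider: `sample_token` and the example `logits` are hypothetical, and it assumes the common convention that temperature `0.0` means greedy decoding (always pick the highest-scoring token), while any positive temperature draws from a softmax distribution.

```python
import math
import random


def sample_token(logits, temperature, rng):
    """Pick a token index from raw scores (hypothetical helper)."""
    if temperature == 0.0:
        # Assumed convention: no sampling, always take the top-scoring token.
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax with temperature scaling, shifted by the max for numerical stability.
    scaled = [score / temperature for score in logits]
    peak = max(scaled)
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [2.0, 1.5, 0.3]  # made-up scores for three candidate tokens

# Run both settings with 100 differently seeded random generators.
greedy = {sample_token(logits, 0.0, random.Random(seed)) for seed in range(100)}
sampled = {sample_token(logits, 1.0, random.Random(seed)) for seed in range(100)}

print(greedy)        # only index 0, regardless of the seed
print(len(sampled))  # more than one distinct token index
```

With temperature `0.0` the random generator is never consulted, so the choice depends only on the scores. As noted above, a hosted model may still not be perfectly reproducible, since the rest of the pipeline is outside our control.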
Even when streaming, the Google assistants return really large chunks and thus easily exceed the default timeout. The new timeout is in line with what we use for our built-in assistants as well:
ragna/ragna/assistants/_api.py
Lines 19 to 22 in 1b53e62
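The interaction between large chunks and a read timeout can be sketched with `asyncio` alone. This is a simulation, not the code in `_api.py` and not an HTTP client: `slow_chunks`, `read_stream`, and all delay and timeout values are hypothetical. It assumes the timeout applies to each chunk separately, so a stream whose individual chunks take long to arrive can time out even though data keeps flowing.

```python
import asyncio


async def slow_chunks(delay):
    """Simulated streaming response: each chunk takes `delay` seconds to arrive."""
    for text in ("first large chunk", "second large chunk"):
        await asyncio.sleep(delay)
        yield text


async def read_stream(delay, read_timeout):
    """Collect all chunks, applying `read_timeout` to every chunk separately."""
    stream = slow_chunks(delay)
    chunks = []
    while True:
        try:
            # Wait for the next chunk, but no longer than `read_timeout` seconds.
            chunk = await asyncio.wait_for(stream.__anext__(), timeout=read_timeout)
        except StopAsyncIteration:
            break  # the stream is exhausted
        chunks.append(chunk)
    return chunks


# A generous timeout lets both chunks through.
print(asyncio.run(read_stream(delay=0.05, read_timeout=1.0)))

# A timeout shorter than the time a single chunk needs fails on the first chunk.
try:
    asyncio.run(read_stream(delay=0.2, read_timeout=0.01))
except asyncio.TimeoutError:
    print("timed out while waiting for a chunk")
```

Raising the timeout, rather than removing it, keeps a guard against a connection that stalls completely.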