Make all outputs in "streaming" format #806
Conversation
config: serverConfigRes.aiconfig,
});
}
const enableStreaming = isStreamingSupported(
Can we change `isStreamingSupported` to something like `getModelSettingsStream` or similar, and have it return either a boolean or undefined? The thinking here is:
- By default, models that support streaming but don't have a `stream` value set in their settings should still stream in the editor, for the best UX out of the box.
- If the user explicitly sets the stream value to 'no' in settings, then we shouldn't stream.
- Our server implementation can technically support a 'stream'-style response for all models, so we can handle an undefined `enableStreaming` param in the Editor.tsx `runPrompt` function by defaulting to true. Other external implementations might not, so they can handle the undefined case however they want (see the sketch after this list).
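For concreteness, a minimal sketch of that suggestion. `getModelSettingsStream` is the name proposed above; the `PromptLike` shape and the `runPrompt` body are placeholder assumptions, not the actual editor client code:

```typescript
// Sketch only: PromptLike and the runPrompt body are assumptions for illustration.
interface PromptLike {
  metadata?: { model?: { settings?: { stream?: boolean } } };
}

// Return the explicit `stream` setting if the user set one; otherwise undefined.
function getModelSettingsStream(prompt: PromptLike): boolean | undefined {
  return prompt.metadata?.model?.settings?.stream;
}

// In Editor.tsx, runPrompt can treat undefined as "stream by default", since our
// server supports streaming-style responses for all models. External
// implementations are free to handle the undefined case differently.
async function runPrompt(promptName: string, enableStreaming?: boolean): Promise<void> {
  const shouldStream = enableStreaming ?? true;
  console.log(`running ${promptName} with stream=${shouldStream}`);
}
```

With that contract, a prompt whose settings omit `stream` still streams in the editor, while an explicit `stream: false` turns it off.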
c08f51b to bf9b17e
LGTM, just two minor things
Doing this to make it easier for Ryan to parse with a single unified output format, regardless of whether the model parser actually supports streaming or not.

- Moved `runPromptStream` -> `runPrompt`, overriding the old definition of `runPrompt` by now passing in an `enableStreaming` flag.
- Still relying on the `isStreamingSupported()` function to set the `enableStreaming` param to true or false.
- Default the `stream` param to `True` on the backend server (this has no effect for non-streaming models like Dall-E).

A rough sketch of the resulting call path is included after this description.

## Test Plan

Test that both streaming and non-streaming settings work as expected, and that a model which does not support streaming (e.g. Dall-E) still works. When the user hasn't explicitly set the "stream" setting, it now defaults to streaming; if they explicitly toggle it off, streaming is disabled. As a follow-up, we should have the "stream" button auto-enabled to reflect this (it doesn't have to actually be in the config, the UI should just show it as on by default to match user expectations).

Update: the default value for `stream` is now `true` inside `OpenAIChatModelParserPromptSchema`, `HuggingFaceTextGenerationParserPromptSchema` and `AnyscaleEndpointPromptSchema`. Couldn't see it defined in `PaLMTextParserPromptSchema`.

https://github.com/lastmile-ai/aiconfig/assets/151060367/34214a66-0cea-4774-a917-9476359f172c
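A minimal sketch of the unified call path described above. Only `isStreamingSupported`, `runPrompt`, and the `enableStreaming` flag come from this PR; the `PromptLike` shape and the function bodies below are placeholder assumptions, not the real EditorContainer.tsx code:

```typescript
// Sketch only: shapes and bodies here are simplified placeholders.
interface PromptLike {
  name: string;
  metadata?: { model?: { settings?: { stream?: boolean } } };
}

// Placeholder for the existing helper that decides whether a prompt should stream.
function isStreamingSupported(prompt: PromptLike): boolean {
  return prompt.metadata?.model?.settings?.stream !== false;
}

// runPrompt replaces the old runPromptStream and always takes the flag; the server
// responds in streaming format either way, so there is one output path to parse.
async function runPrompt(promptName: string, enableStreaming: boolean): Promise<void> {
  console.log(`running ${promptName} with stream=${enableStreaming}`);
}

async function onRunPrompt(prompt: PromptLike): Promise<void> {
  const enableStreaming = isStreamingSupported(prompt);
  await runPrompt(prompt.name, enableStreaming);
}
```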
Delete `aiconfig_complete` stream response, replace with `aiconfig` (#911)

Previously we did not support streaming, so when we returned `aiconfig` it came from a blocking operation. That meant we had to set the `isRunning` prompt state to true while waiting, but we no longer need to do that now that all run events return in streaming response format, even for non-streaming models: #806

We are also no longer using the `streamApi` helper, since we now use `streamingApiChain`, which was added in #789.

Finally, if you want more resources on how streaming is connected, check out #910, a teaching guide I built explaining how the code is connected.

## Test Plan

Both streaming and non-streaming models work as before.

https://github.com/lastmile-ai/aiconfig/assets/151060367/b62e7887-20af-4c0c-ab85-eeaacaab64e0

---

Stack created with [Sapling](https://sapling-scm.com). Best reviewed with [ReviewStack](https://reviewstack.dev/lastmile-ai/aiconfig/pull/911).
* #912
* __->__ #911
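To illustrate why the blocking path (and the `aiconfig_complete` event) is no longer needed, here is a minimal sketch of a consumer of the unified streaming format. The event shape, type names other than `aiconfig`, and the `onChunk` callback are assumptions, not the actual `streamingApiChain` API:

```typescript
// Illustrative only: event/type names other than `aiconfig` are assumptions.
type AIConfigJSON = Record<string, unknown>; // placeholder for the real AIConfig type

type RunStreamEvent =
  | { type: "output_chunk"; data: string }    // incremental output for streaming models
  | { type: "aiconfig"; data: AIConfigJSON }; // final event; replaces `aiconfig_complete`

async function consumeRunStream(
  events: AsyncIterable<RunStreamEvent>,
  onChunk: (chunk: string) => void
): Promise<AIConfigJSON> {
  let finalConfig: AIConfigJSON | undefined;
  for await (const event of events) {
    if (event.type === "output_chunk") {
      onChunk(event.data); // non-streaming models may emit few or no chunks
    } else {
      finalConfig = event.data; // every run ends with the full updated aiconfig
    }
  }
  if (finalConfig === undefined) {
    throw new Error("stream ended without an aiconfig event");
  }
  // No blocking wait on a hanging request, so there is no need to hold an
  // isRunning flag across the whole run just to know when the config arrives.
  return finalConfig;
}
```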