[gpt-oss] use vLLM instead of openai types for streaming #25186
hmellor merged 6 commits into vllm-project:main
Signed-off-by: Andrew Xia <axia@meta.com>
0c32be0 to 7b82dbf
Signed-off-by: Andrew Xia <axia@meta.com>
cc @alecsolder, @yeqcharlotte, @chaunceyjiang: ready for review. I think the readthedocs CI failure is a flake.
Maybe we need to think about whether to allow input_messages / output_messages for sync responses only or for streaming as well. Appending long messages kills the efficiency that streaming provides. cc @yeqcharlotte
That's a good point! This PR only modifies three events in streaming right now (ResponseCreatedEvent, ResponseInProgressEvent, ResponseCompletedEvent), so none of the deltas / intermediate events will get additional payload. Or I could remove it from ResponseCreatedEvent and ResponseInProgressEvent, but maybe it's fine to keep it for now for consistency?
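To make the trade-off above concrete, here is a minimal sketch of a stream in which only the three lifecycle events carry the message lists while delta events stay small. The classes `SketchResponse`, `LifecycleEvent` and `TextDeltaEvent` and the function `stream` are hypothetical stand-ins written with plain dataclasses; they are not the vLLM or openai types touched by this PR.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SketchResponse:
    """Hypothetical stand-in for the response object (not the vLLM type)."""
    id: str
    input_messages: list = field(default_factory=list)
    output_messages: list = field(default_factory=list)


@dataclass
class LifecycleEvent:
    """Stand-in for the created / in_progress / completed events."""
    type: str
    response: SketchResponse


@dataclass
class TextDeltaEvent:
    """Stand-in for an intermediate event: it has no response attached."""
    type: str
    delta: str


def stream(prompt, tokens):
    # Lifecycle events at the start carry the input messages only.
    start = SketchResponse(id="resp_1", input_messages=[prompt])
    yield LifecycleEvent("response.created", start)
    yield LifecycleEvent("response.in_progress", start)
    # Deltas carry just the new text, so their size does not grow
    # with the length of the conversation.
    for tok in tokens:
        yield TextDeltaEvent("response.output_text.delta", tok)
    # Only the final event carries both input and output messages.
    final = SketchResponse(
        id="resp_1",
        input_messages=[prompt],
        output_messages=["".join(tokens)],
    )
    yield LifecycleEvent("response.completed", final)


events = list(stream("hello", ["Hi", " there"]))
# Serialized size per event, to compare deltas against lifecycle events.
sizes = [(e.type, len(json.dumps(asdict(e)))) for e in events]
```

In this sketch, dropping the message lists from the first two lifecycle events (the alternative floated above) would only mean constructing `start` without `input_messages`; the delta path is unaffected either way.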
yeqcharlotte left a comment
LGTM. Could you add a basic skeleton test that loads the type? cc @chaunceyjiang @aarnphm, PTAL too, thanks!
    ResponseWebSearchCallCompletedEvent,
    ResponseWebSearchCallInProgressEvent,
    ResponseWebSearchCallSearchingEvent)
# yapf: enable
Is the format change intentional?
Yep, yapf/isort had a conflict; it seems to happen pretty often...
Can we keep the original format?
@yeqcharlotte we can't. The import below is what introduces the conflict; I tried to keep the original format, but yapf and isort still disagree. You can see the same "# yapf conflicts with isort for this block" comment in serving_responses.py, so this happens often.
from openai.types.responses import (
    ResponseInProgressEvent as OpenAIResponseInProgressEvent)
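The aliased import suggests why the block is needed: the upstream class is renamed on import so that a local class can reuse the original name. A minimal sketch of that pattern follows, assuming the local class extends the upstream one and swaps in a richer response type. Every class here is a hypothetical dataclass stand-in so the example runs without the openai package; the real definitions in the PR may differ.

```python
from dataclasses import dataclass, field


@dataclass
class OpenAIResponse:
    """Stand-in for the upstream response type (hypothetical, minimal)."""
    id: str


@dataclass
class OpenAIResponseInProgressEvent:
    """Stand-in for the class that the import above brings in under an alias."""
    response: OpenAIResponse
    type: str = "response.in_progress"


@dataclass
class VLLMResponse(OpenAIResponse):
    """Stand-in for a response type that adds message fields (assumed)."""
    input_messages: list = field(default_factory=list)
    output_messages: list = field(default_factory=list)


@dataclass
class ResponseInProgressEvent(OpenAIResponseInProgressEvent):
    """Reuses the upstream event name, which works only because the
    upstream class was imported under a different name."""
    response: VLLMResponse


event = ResponseInProgressEvent(
    response=VLLMResponse(id="resp_1", input_messages=["hi"]))
```

Because the local event is still a subclass of the upstream stand-in, code that checks for the upstream type keeps working while the attached response gains the extra fields.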
yeqcharlotte left a comment
LGTM! Thanks for the change!
Signed-off-by: Andrew Xia <axia@fb.com>
Head branch was pushed to by a user without write access
…t#25186) Signed-off-by: Andrew Xia <axia@meta.com> Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com>
Signed-off-by: Andrew Xia <axia@meta.com> Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com> Signed-off-by: yewentao256 <zhyanwentao@126.com>
…t#25186) Signed-off-by: Andrew Xia <axia@meta.com> Signed-off-by: Andrew Xia <axia@fb.com> Co-authored-by: Andrew Xia <axia@fb.com> Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
Purpose
Test Plan
Test Result
client
ResponseCompletedEvent output (see that the input/output messages are there)