Support for WebSockets, streaming responses to a frontend [Feature Request]: #1199

Closed
tyler-suard-parker opened this issue Jan 10, 2024 · 20 comments · Fixed by #1551

@tyler-suard-parker
Contributor

Is your feature request related to a problem? Please describe.

I am using a Microsoft Teams App for a frontend, and hosting Autogen on my backend. When I send a message to Autogen from my frontend, Autogen does some processing and then sends back the final answer. This can take up to 2 minutes, which is too long for my customers. The vast majority of that time is taken up by GPT-4 generating an answer. I would like it if the final answer generated by Autogen could be streamed back to the frontend, using websockets or another protocol that is compatible with Microsoft Teams Apps. This would give my users something to look at immediately, rather than waiting a long time for the complete answer to pop up.

Describe the solution you'd like

An easy, plug-and-play solution for Microsoft Teams Apps that allows Autogen to stream information to the Teams App.

Additional context

No response

@Risingabhi

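Hi @tyler-suard-parker,
We are facing a similar issue. There is a workaround, though; you can try this:

```python
# Assumed context: `app` is a Flask app, `socket_io` is a Flask-SocketIO instance,
# and `interviewer` / `groupchat_manager` are autogen agents created elsewhere.
from autogen import GroupChatManager, UserProxyAgent
from flask import request


# The opening of this function was missing from the original post; the
# signature is assumed to mirror the method it replaces.
def new_print_received_message(self, message, sender):
    if sender.name == "INTERVIEWER" or sender.name == "candidate":
        print(f"hiii {sender.name}: {message.get('content')}")
        # Forward the complete message to the frontend over the socket.
        socket_io.emit(
            "message", {"sender": sender.name, "content": message.get("content")}
        )
    else:
        print("Ignored")


GroupChatManager._print_received_message = new_print_received_message


@app.route("/run", methods=["GET"])
def run():
    # Take the human reply from the request query string instead of stdin.
    def new_get_human_input(self, prompt):
        reply = request.args.get("stock")
        print("debug reply line 81", reply)

        # reply = input(f'PATCHED{prompt}')
        return reply

    UserProxyAgent.get_human_input = new_get_human_input

    mychat = interviewer.initiate_chat(
        groupchat_manager, message="Hello, I am an AI MANAGER, my name is John Dalley, I will be conducting your interview today."
    )
    return "OK"  # Added: a Flask view must return a response.
```

So basically you need to write a wrapper and assign it to `GroupChatManager._print_received_message`.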

@tyler-suard-parker
Contributor Author

Just a suggestion: by adding "TERMINATE" at the beginning of the last message, rather than at the end of the last message, we could know which message is the final one, so that message could be streamed to the frontend.
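A minimal sketch of why a leading marker helps, assuming the reply arrives as a sequence of text chunks. `stream_final_message` is a hypothetical helper, not part of Autogen: with the marker at the start, the consumer can decide after the first few characters whether to forward the stream, instead of waiting for the end of the message.

```python
from typing import Iterable, Iterator

TERMINATE = "TERMINATE"


def stream_final_message(chunks: Iterable[str]) -> Iterator[str]:
    """Forward chunks only if the message starts with the TERMINATE marker."""
    buffer = ""
    decided = False
    for chunk in chunks:
        if decided:
            yield chunk
            continue
        buffer += chunk
        if len(buffer) < len(TERMINATE):
            # Not enough text yet; stop early if it can no longer match.
            if not TERMINATE.startswith(buffer):
                return
            continue
        if not buffer.startswith(TERMINATE):
            return  # Not the final message, so nothing is forwarded.
        decided = True
        rest = buffer[len(TERMINATE):].lstrip()
        if rest:
            yield rest
```

The buffering step matters because the marker itself may be split across several chunks.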

@tyler-suard-parker
Contributor Author

@Risingabhi that is an interesting approach. I agree, it might be a good idea to modify the print statement to send messages over the websocket instead. Thank you!

@super-syan

super-syan commented Jan 14, 2024

Can anyone share code showing how to implement streaming for an external application?
@Risingabhi, can you briefly explain your implementation?

@ekzhu
Collaborator

ekzhu commented Jan 15, 2024

@Risingabhi @tyler-suard-parker @super-syan we are refactoring and hoping to make it very easy to add streaming. See #1240 and please give your feedback and comments.

@tyler-suard-parker
Contributor Author

@ekzhu I am not familiar with hooks, I was actually hoping for something more like SSE.

@ekzhu
Collaborator

ekzhu commented Jan 16, 2024

What is SSE? As I mentioned in the other discussion #1290 , the goal of the refactoring effort is to modularize functionalities as middleware, so it becomes very easy to extend an agent with things like emitting a message to a frontend receiver.

@davorrunje
Contributor

davorrunje commented Jan 16, 2024

> @ekzhu I am not familiar with hooks, I was actually hoping for something more like SSE.

@tyler-suard-parker the hooks were replaced by the Middleware pattern, which seems like the best design pattern to use. For SSE, you basically need a generator, which can be easily implemented using the Middleware pattern. We will implement it as well while implementing other features.
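To illustrate the generator idea, here is a rough sketch of the SSE wire format, independent of any middleware design. `sse_events` is a hypothetical helper, not an Autogen API, and the `done` event name is an arbitrary convention chosen for this example. The web framework wiring (serving the generator with the `text/event-stream` content type) is left out.

```python
from typing import Iterable, Iterator


def sse_events(chunks: Iterable[str]) -> Iterator[str]:
    """Format text chunks as server-sent events, one event per chunk."""
    for chunk in chunks:
        # Each line of the payload gets its own "data:" field; a blank
        # line terminates the event.
        lines = chunk.split("\n")
        yield "".join(f"data: {line}\n" for line in lines) + "\n"
    # Hypothetical end-of-stream signal so the client knows when to stop.
    yield "event: done\ndata: \n\n"
```

On the client side, the browser's `EventSource` joins the `data:` lines of one event with newlines, so a chunk containing line breaks arrives intact.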

@tyler-suard-parker
Contributor Author

@ekzhu Sorry about that, SSE is server-sent events.

@tyler-suard-parker
Contributor Author

@Risingabhi Could you give some more details about your implementation? Where are you declaring the server, client, etc?

@tyler-suard-parker
Contributor Author

I just tried @Risingabhi's implementation. It does not stream: it sends the complete message to a socket, not the chunks.

@tyler-suard-parker
Contributor Author

I found a solution for this. I am using a singleton class in a file in the autogen/oai directory. I import that singleton class into my top script, and I store my websocket as a variable in that class. I also modify autogen/oai/client.py to import that same script and that same websocket, which I can then use to stream data from the _completions_create function in that file.
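A rough sketch of the singleton idea, for readers who want to try something similar. `StreamSink`, `send`, and `emit` are hypothetical names, not part of Autogen, and the changes to `autogen/oai/client.py` are not shown. The assumption is that the top-level script stores a send callable (for example a websocket's send method) and the streaming loop calls `emit` for each chunk.

```python
from typing import Callable, Optional


class StreamSink:
    """Process-wide holder for the current output channel (hypothetical)."""

    _instance: Optional["StreamSink"] = None
    send: Optional[Callable[[str], None]]

    def __new__(cls) -> "StreamSink":
        # Always hand back the same object, creating it on first use.
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance.send = None
        return cls._instance

    def emit(self, chunk: str) -> None:
        # No-op when no channel has been registered.
        if self.send is not None:
            self.send(chunk)
```

The top-level script would set `StreamSink().send = ...` once, and the code that receives model chunks would call `StreamSink().emit(text)`. Note that a singleton is shared by everything in the process, so concurrent conversations would all write to the same channel.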

@davorrunje
Contributor

I think we need to implement this in the framework for everyone else.

@tyler-suard-parker
Contributor Author

@davorrunje I agree and I would be happy to do it, but my use case is very specific. I am running on Azure Web Apps and returning a stream to an Azure Teams App. Also I am not quite sure how to disable the streaming to a UI if it is not needed.

@davorrunje
Contributor

Let's keep it open and I think I might take it soon. Any help would be appreciated though :)

@GokulrajKS

GokulrajKS commented Jan 22, 2024

Hi, I have implemented streaming for my app; check out my repository. You can enable or disable it using the `use_socket` Boolean.
Refer to the simple _chat.py.
You have to pass the socket server instance as a callback. It still lacks some checks, but it is usable.

davorrunje self-assigned this Jan 22, 2024

@Risingabhi

Risingabhi commented Jan 24, 2024 via email

@ahernandezq

ahernandezq commented Jan 25, 2024

@Risingabhi
Let us know if there is any news.

@tyler-suard-parker
Contributor Author

All, I finally bit the bullet and figured out how to do this. I just instantiated a websocket at the top level, and then passed that down as a parameter through every function between the top level and the _completions_create function in oai/client.py. It works!
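The shape of this approach can be sketched with stand-in functions. Everything below is hypothetical (`run_chat`, `generate_reply`, `create_completion`, and `fake_model_stream` are not Autogen functions); the point is only that an optional `on_chunk` callback is threaded through each layer and invoked where the model output is received.

```python
from typing import Callable, Iterable, Optional

OnChunk = Optional[Callable[[str], None]]


def fake_model_stream(prompt: str) -> Iterable[str]:
    """Stand-in for a streaming LLM response; yields pieces of text."""
    yield from ["Echo: ", prompt]


def create_completion(prompt: str, on_chunk: OnChunk = None) -> str:
    """Lowest layer: forward each piece as it arrives, then return the whole."""
    parts = []
    for piece in fake_model_stream(prompt):
        if on_chunk is not None:
            on_chunk(piece)  # e.g. a websocket send function
        parts.append(piece)
    return "".join(parts)


def generate_reply(prompt: str, on_chunk: OnChunk = None) -> str:
    """Middle layer: only passes the callback along."""
    return create_completion(prompt, on_chunk=on_chunk)


def run_chat(prompt: str, on_chunk: OnChunk = None) -> str:
    """Top layer: where the websocket would be created and handed down."""
    return generate_reply(prompt, on_chunk=on_chunk)
```

Calling `run_chat("hi", on_chunk=my_send)` streams the pieces to `my_send` while still returning the full text; omitting `on_chunk` leaves the behavior unchanged, which keeps streaming optional.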

@Risingabhi

Risingabhi commented Jan 25, 2024 via email

7 participants