
Conversation

@chaitanyarahalkar (Contributor) commented Jun 13, 2025

This PR brings real-time text streaming to the desktop UI:

  1. useTextStreaming.ts – reusable React hook that opens a Server-Sent Events (SSE) connection, parses incremental tokens, and exposes a clean streaming API to components.
  2. GooseMessage.tsx – refactored to consume the new hook and render partial LLM responses as they arrive, giving users immediate feedback instead of waiting for the full completion (a rough sketch of the consumer side follows below).
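
For orientation, here is a rough, hypothetical sketch of the consumer side – how a component like GooseMessage.tsx might use the hook. The prop names, class names, and hook signature below are illustrative assumptions, not the exact code in this PR:

```tsx
import React from 'react';
import { useTextStreaming } from './useTextStreaming';

// Illustrative consumer: render partial text as it arrives,
// falling back to a typing indicator before the first chunk.
function GooseMessage({ url, request }: { url: string; request: unknown }) {
  const { text, isStreaming, error, cancel } = useTextStreaming(url, request);

  if (error) return <div className="message-error">Something went wrong.</div>;
  if (isStreaming && text === '') return <div className="typing-indicator">…</div>;

  return (
    <div className="goose-message">
      {text}
      {isStreaming && <button onClick={cancel}>Stop</button>}
    </div>
  );
}

export default GooseMessage;
```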

Context & Motivation

Large responses that appear all at once feel sluggish and block the conversation. Streaming completions improve perceived performance and match the interactive feel of modern chat apps. This work lays the foundation for richer UX features like inline code execution progress, token-level highlighting, and cancel/resume.

Implementation Details

  • Utilises the Fetch + ReadableStream API wrapped in an ergonomic hook (a rough sketch of this loop follows after this list).
  • Hook manages:
    • automatic abort on unmount / cancel
    • exponential back-off retry (network hiccups)
    • optional final “done” callback
  • GooseMessage now:
    • starts streaming on mount
    • switches from “typing indicator” → live text as soon as the first chunk arrives
    • gracefully handles abort / error states
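
A minimal sketch of what that core loop could look like, assuming a fetch-based hook with an AbortController and a simple back-off loop. All names and details here are illustrative, not the actual implementation, and the optional "done" callback is omitted for brevity:

```ts
import { useCallback, useEffect, useRef, useState } from 'react';

// Hypothetical sketch of the hook's core loop (illustrative names only).
export function useTextStreaming(url: string, body: unknown) {
  const [text, setText] = useState('');
  const [isStreaming, setIsStreaming] = useState(false);
  const [error, setError] = useState<Error | null>(null);
  const controllerRef = useRef<AbortController | null>(null);

  const cancel = useCallback(() => controllerRef.current?.abort(), []);

  useEffect(() => {
    const controller = new AbortController();
    controllerRef.current = controller;

    const run = async () => {
      setIsStreaming(true);
      // Simple exponential back-off for transient network errors.
      for (let attempt = 0; attempt < 3; attempt++) {
        try {
          setText(''); // start each attempt from a clean slate
          const res = await fetch(url, {
            method: 'POST',
            headers: { 'Content-Type': 'application/json' },
            body: JSON.stringify(body),
            signal: controller.signal,
          });
          if (!res.ok || !res.body) throw new Error(`HTTP ${res.status}`);

          // Read the ReadableStream chunk by chunk and append to state.
          const reader = res.body.getReader();
          const decoder = new TextDecoder();
          while (true) {
            const { done, value } = await reader.read();
            if (done) break;
            setText((prev) => prev + decoder.decode(value, { stream: true }));
          }
          return; // finished successfully
        } catch (err) {
          if (controller.signal.aborted) return; // cancelled or unmounted
          if (attempt === 2) setError(err as Error);
          else await new Promise((r) => setTimeout(r, 500 * 2 ** attempt));
        }
      }
    };

    run().finally(() => setIsStreaming(false));
    return () => controller.abort(); // automatic abort on unmount
  }, [url, body]);

  return { text, isStreaming, error, cancel };
}
```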

Testing

  • Tested with Databricks, Ollama, and OpenAI (using an OpenAI API key)

Demo

Before

Screen.Recording.2025-06-12.at.8.47.51.PM.mov

After

Screen.Recording.2025-06-12.at.9.09.18.PM.mov

🔗 Related Issues / PRs

@michaelneale (Collaborator)

very nice @chaitanyarahalkar - didn't require changes to providers? cc @jamadeo - you like this approach? looks good to me

@Kvadratni (Contributor)

finally. I was under the impression that this was impossible due to how the goose loop works. I am so happy to be wrong!

@michaelneale (Collaborator)

one problem is this looks odd with tool results, let me attach a recording (using databricks + claude4):
(and also it is only streaming once things are finished - as I guess the provider is not streaming?)

streaming2.mov

@michaelneale (Collaborator)

I think clock time needs to be no slower than now for this to appear believable - and hopefully we could have real streaming from the providers?

@michaelneale (Collaborator) left a comment

I think for tool cases it shouldn't be slower than current wall clock time - so probably requires streaming support from the provider end first?

@chaitanyarahalkar (Contributor, Author) commented Jun 13, 2025

> I think for tool cases it shouldn't be slower than current wall clock time - so probably requires streaming support from the provider end first?

Yup, I can add this fix here, and probably add streaming from the backend (directly through the API) in a separate PR for models that support it. I think only OpenAI does it for now?
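
For reference, OpenAI's Chat Completions API streams when the request sets `stream: true`, returning Server-Sent Events whose `data:` lines carry incremental deltas and end with `data: [DONE]`. A rough TypeScript sketch of consuming that from the backend side (model name and error handling are illustrative assumptions, not the planned implementation):

```ts
// Minimal sketch of consuming OpenAI's streaming Chat Completions API.
// Each SSE line looks like `data: {...}` and the stream ends with `data: [DONE]`.
async function streamOpenAI(apiKey: string, prompt: string, onToken: (t: string) => void) {
  const res = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model: 'gpt-4o', // illustrative model choice
      messages: [{ role: 'user', content: prompt }],
      stream: true,
    }),
  });
  if (!res.ok || !res.body) throw new Error(`HTTP ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any partial line for the next chunk
    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const payload = line.slice(6).trim();
      if (payload === '[DONE]') return;
      const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
      if (delta) onToken(delta);
    }
  }
}
```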

@chaitanyarahalkar (Contributor, Author) commented Jun 13, 2025

@Kvadratni ideally we need it natively from the LLM API, will take a crack at doing that for OpenAI models.

This is just simulating it perceptually on the UI end
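
Concretely, "perceived" streaming here means the full completion is already in hand and the UI just reveals it incrementally. A minimal illustrative sketch of that effect (not the PR's actual code):

```ts
import { useEffect, useState } from 'react';

// Illustrative "perceived streaming": reveal an already-complete string
// a few characters at a time so it looks like it is being generated.
function useTypewriter(fullText: string, charsPerTick = 3, intervalMs = 20) {
  const [visible, setVisible] = useState('');

  useEffect(() => {
    setVisible('');
    let i = 0;
    const id = setInterval(() => {
      i = Math.min(i + charsPerTick, fullText.length);
      setVisible(fullText.slice(0, i));
      if (i >= fullText.length) clearInterval(id);
    }, intervalMs);
    return () => clearInterval(id);
  }, [fullText, charsPerTick, intervalMs]);

  return visible;
}
```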

@chaitanyarahalkar (Contributor, Author)

> one problem is this looks odd with tool results, let me attach a recording (using databricks + claude4): (and also it is only streaming once things are finished - as I guess the provider is not streaming?)
> streaming2.mov

yup, the provider is not streaming for now

@jamadeo (Collaborator) commented Jun 13, 2025

This was my take on adding streaming: #2677, though I've let the PR get stale. I'll have to merge and fix some conflicts. I started just with Databricks because it's what I typically use, but most of the plumbing is there for any other provider too.

@chaitanyarahalkar (Contributor, Author)

> This was my take on adding streaming: #2677, though I've let the PR get stale. I'll have to merge and fix some conflicts. I started just with Databricks because it's what I typically use, but most of the plumbing is there for any other provider too.

Nice, I'll try to build off the changes here and see if I can make it work with OpenAI's APIs

@DOsinga (Collaborator) commented Jul 16, 2025

sorry for taking so long - it should be fixed in main now! thanks!

@DOsinga closed this Jul 16, 2025