The new speech-to-speech model gpt-realtime is now generally available.
As I understand it, it brings the following improvements:
- Higher performance and lower cost
- Two new voices, Cedar and Marin
- Remote MCP server support
- Image input
I tried the new model, but the message structure appears to have changed, so supporting it will require some work.
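
To make the kind of change concrete, here is a minimal sketch of what I believe the differences look like. Everything in it is an assumption based on my reading of the GA Realtime API and should be checked against the official reference: that `session.update` now expects `"type": "realtime"` inside `session`, that the voice is set under `session.audio.output.voice`, and that a few server event types gained an `output_` prefix. The helper names `build_session_update` and `to_ga_event_type` are hypothetical and exist only for illustration.

```python
import json

# Assumed beta -> GA renames of server event types.
# This list is not exhaustive and should be verified against the official docs.
BETA_TO_GA_EVENT_TYPES = {
    "response.text.delta": "response.output_text.delta",
    "response.audio.delta": "response.output_audio.delta",
    "response.audio_transcript.delta": "response.output_audio_transcript.delta",
}


def to_ga_event_type(event_type: str) -> str:
    """Return the assumed GA name for a beta event type, or the input unchanged."""
    return BETA_TO_GA_EVENT_TYPES.get(event_type, event_type)


def build_session_update(voice: str = "marin", instructions: str = "") -> str:
    """Build a session.update message in the shape I assume the GA API expects."""
    event = {
        "type": "session.update",
        "session": {
            # Assumption: GA sessions must declare their type explicitly.
            "type": "realtime",
            "instructions": instructions,
            # Assumption: audio settings are nested under audio.input / audio.output.
            "audio": {"output": {"voice": voice}},
        },
    }
    return json.dumps(event)


if __name__ == "__main__":
    print(build_session_update(voice="cedar", instructions="Be brief."))
    print(to_ga_event_type("response.audio.delta"))
```

If these assumptions hold, the work is mostly in two places: the code that builds the session configuration, and the handlers that dispatch on event type.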