VoIP: streaming microphone input to other players? #18133
Thoughts: individual players' audio-in streams would be broadcast over the network to the game server, which would then be responsible for routing the streams where they need to go. This would operate on new NetworkAudioStream 'channels' which AudioStreamPlayer nodes would dial into in order to subscribe and receive anything from the game server. A player's audio-in stream could be routed to one or more NetworkAudioStream channels simultaneously, permitting them to radio their team through an established team-radio NetAudioStream while also being overheard and eavesdropped on by anybody near their actual scene player node - which would have another AudioStreamPlayer2D/3D streaming from the NetAudioStream that represents that player's "voice" emanating from their character in the world. If multiple player audio streams try to overwrite each other on something like a shared team-radio NetAudioStream, perhaps there could be an option to mix them together, or a first-come-first-serve rule so that the channel locks on to whoever sends audio first (i.e. pushes the transmit button on their radio).
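A rough sketch of the server-side routing and channel-locking policy described above, in Python for illustration only - every class, method, and policy name here is invented, not an existing Godot API:

```python
class VoiceChannel:
    """One 'channel' that subscribers dial into, with a sender policy."""

    def __init__(self, policy="mix"):
        self.policy = policy          # "mix" or "first-come-first-serve"
        self.subscribers = set()      # peer ids dialed into this channel
        self.active_sender = None     # used only by the lock policy

    def route(self, sender_id, frames):
        # frames: one chunk of audio samples from sender_id.
        if self.policy == "first-come-first-serve":
            if self.active_sender is None:
                self.active_sender = sender_id   # lock onto first sender
            if sender_id != self.active_sender:
                return {}                        # channel locked to another peer
        # "mix" here simply forwards to everyone; actual sample mixing
        # would happen client-side when multiple streams arrive.
        return {peer: frames for peer in self.subscribers if peer != sender_id}


class VoiceRouter:
    """Server-side registry mapping channel names to VoiceChannel objects."""

    def __init__(self):
        self.channels = {}

    def publish(self, channel_name, sender_id, frames):
        ch = self.channels.setdefault(channel_name, VoiceChannel())
        return ch.route(sender_id, frames)
```

With a first-come-first-serve channel, a second simultaneous sender gets silently dropped until the lock is released, which matches the "locks on to whoever sends audio first" idea above.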
I kinda already made a hacky implementation. The thing that really annoys me is that I need multiple network stacks, because the current implementation of ENet doesn't really allow a way to manage multiple ports. I solved the NetworkAudioStream problem by making it a lockless buffer and creating one per character. You should look into Godot servers to handle the network on a separate thread: https://godotengine.org/article/why-does-godot-use-servers-and-rids
You have a choice of using Cap'n Proto or Google Protobuf for your VOIP protocol. I am debating the merits of varints. The lead dev of Cap'n Proto, who previously made Protobuf, has said he regrets the varint encoding. https://capnproto.org/news/2014-06-17-capnproto-flatbuffers-sbe.html
I'm having a hard time following. Are you talking about working in and recompiling the engine, or a plugin? Or worse? Could you provide more details? It sounds like you've made things a bit more complicated than they need to be. libopus is already in the engine's codebase, so you can get the best possible speech compression without adding anything. Just tack compressed speech data onto the end of all outgoing game state packets, and you eliminate the need for any extra ports.
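One possible reading of "tack compressed speech data onto the end of the game state packets": append a 16-bit length prefix plus the opaque Opus payload after the game state bytes. This framing is purely illustrative, not an engine API:

```python
import struct

def append_voice(state_bytes, opus_bytes=b""):
    """Append an optional Opus payload (with length prefix) to a packet."""
    return state_bytes + struct.pack("<H", len(opus_bytes)) + opus_bytes

def split_voice(packet, state_len):
    """Recover (game_state, opus_payload) given the known state length."""
    state = packet[:state_len]
    (n,) = struct.unpack_from("<H", packet, state_len)
    voice = packet[state_len + 2 : state_len + 2 + n]
    return state, voice
```

A zero-length prefix means "no voice this tick", so the same packet shape works whether or not the player is talking.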
You can add your own VOIP protocol as a Godot module. The annoying part is that the network code is not designed to support two different protocols at the same time, since the SceneTree holds the network and takes control of the ENet peer. For Opus, you just need to include the headers and it should work. Note that Opus works at a 48000 Hz sample rate with fixed frame durations (2.5 to 60 ms, i.e. 120 to 2880 samples per frame).
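For reference, the Opus codec defines frame durations of 2.5, 5, 10, 20, 40, and 60 ms; at the usual 48000 Hz sample rate those work out to fixed sample counts per frame, which a sketch like this can enumerate:

```python
# Opus frame durations in milliseconds, per the codec specification.
SAMPLE_RATE = 48000
FRAME_MS = (2.5, 5, 10, 20, 40, 60)

def frame_sizes(rate=SAMPLE_RATE):
    """Samples per Opus frame at the given rate (48000 Hz is standard)."""
    return [int(rate * ms / 1000) for ms in FRAME_MS]
```

So a mic capture buffer has to be chopped into one of these exact sizes (e.g. 960 samples for 20 ms frames) before being handed to the encoder.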
Has there been any development on this? Is there any hope that Godot will get a voice chat plugin anytime soon?
@Aranir I kinda already wrote one, but it's basically a Mumble copy and I have to refactor it as a Godot server. I'm waiting for the mic API. My main issue is that I cannot put multiple streams in the current Godot network implementation, so I must use multiple connections, which is wasteful and crashy.
By "waiting for the mic", are you referring to #19106? As for the multiple streams, is that a known limitation which will be fixed before 3.1?
@Aranir I can make a mic with SDL2. The mic is a non-issue. The larger issue is that I cannot use the high-level network implementation as-is.
@hungrymonkey are there any plans to make this possible? I couldn't find anything on the roadmap... Or will #18827 solve it?
@Aranir I have already been playing around with my own VOIP for a while now. The largest issue is that I need to refactor mine into a Godot server, and the current network implementation doesn't allow me to split channels for my own use.
@Aranir I have to evaluate the change. The way the current networking works is that SceneTree practically dequeues all network packets. I cannot use that behavior because I need packets to be queued in their own channel buffer, which isn't the case right now.
I am in the process of refactoring my old code to make it work with the new MultiplayerAPI: https://github.com/hungrymonkey/godot/tree/up_voip. It doesn't work right now; I have to figure out whether the packets are queued properly.
I asked reduz; he put this feature on the back burner until the microphone API is done.
@hungrymonkey Hi, what's the status of this? Have you checked it now that the microphone API is implemented?
@marcelofg55 I did check. I'd kinda like a more signal-based approach. Oh well, I will have to do buffering in my tree. I asked reduz a while ago and he basically said to wait. I haven't really done much since then. I wouldn't mind refactoring it, but I really don't know if anyone is interested at all. I need to figure out a nice way to associate N peer IDs with N audio streams.
I got stuck trying to access the input from the record bus effect. Any suggestions? My first approach is to make the record bus effect expose a ring buffer, but I don't know.
@fire I just made AudioStream a lockless ring buffer. I think this approach is less complicated since it integrates with the current Godot design.
Well it's cool if it's possible to record a mic at a certain game location and mix in a bit of the in-game background, but that's a bit tricky.
Like what type of mix?
@marcelofg55 Sure, I guess I'll learn a bit more. I probably need to register an IRC nickname.
#19106 - looks like the mic API is settled for 3.1.
@hungrymonkey how do you write to an audio stream?
@Seabass247 I wrote a guide.
Hello! I am trying to create a simple VoIP system for a game I am making, and I am not quite following your custom audio streams guide.
Any help would be greatly appreciated! Or if you have a project I can jump on to help build, so I don't have to start from scratch, I would appreciate that as well. =) Thanks!
In order to create a VOIP module, you subclass AudioStream into a custom single-producer/single-consumer lockless buffer. When microphone data is captured, you compress it with Opus and send it as a generic internet packet to the other players using your custom-defined protocol. When a packet is received, you match the packet data to the correct custom AudioStream. You should refer to my Godot server guide to run the VOIP module on a separate thread. My largest annoyance was that Godot networking does not have the ability to connect to multiple ports with one peer class, so you end up doing strange logic trying to match the game peer and the VOIP peer.
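For illustration, here is a minimal single-producer/single-consumer ring buffer in the spirit of the AudioStream subclass described above. Python is used here only for readability; a real implementation would live in C++ with atomic head/tail indices, and all names below are invented:

```python
class SPSCRingBuffer:
    """Single-producer/single-consumer ring buffer for audio samples.

    The network thread pushes decoded samples; the audio mixer pops them.
    Each index is written by exactly one thread, which is what makes the
    lockless scheme work (with atomics, in a real C++ implementation).
    """

    def __init__(self, capacity):
        self.buf = [0.0] * capacity
        self.capacity = capacity
        self.head = 0   # advanced only by the consumer (audio thread)
        self.tail = 0   # advanced only by the producer (network thread)

    def push(self, sample):
        nxt = (self.tail + 1) % self.capacity
        if nxt == self.head:
            return False        # full: drop rather than block the audio path
        self.buf[self.tail] = sample
        self.tail = nxt
        return True

    def pop(self):
        if self.head == self.tail:
            return None         # empty: the mixer would output silence
        sample = self.buf[self.head]
        self.head = (self.head + 1) % self.capacity
        return sample
```

Note the classic SPSC trade-off: one slot is always left empty to distinguish "full" from "empty", so a buffer of capacity N holds N-1 samples.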
Hi! I'm currently investigating how to compress the microphone's audio input to Opus and send that data over the network. I really don't know where to start. Thanks in advance!
Opus is an audio codec (codec = COmpression/DECompression). Just pass the raw PCM audio data into Opus to get the compressed stream data that you then send off over the network. Just make sure that you're buffering received audio long enough to prevent network transmission jitter/irregularity from causing gaps between received audio chunks. You'll always get a gap here and there no matter what, but you can mostly eliminate them by queuing up received chunks for playback at a consistent interval that's offset by a few hundred milliseconds after the initial audio chunk was received.
There's already a delay incurred by the time spent recording the chunk on the sender's side (determined by the size of your chunks, which is determined by the interval at which you send them out), plus the network transmission latency, plus the buffering delay you're adding on top of that to smooth out the jitter. Let's say you're buffering for 150ms and the first chunk for a stream arrives at 29.300s: you'd play it at 29.450s. If chunks are sent out every 50ms, and are therefore 50ms long, you should be able to queue up at least 2 more chunks while a 3rd one (the oldest in the queue) is already playing. Ideally you'd receive the next two chunks at ~29.350 and ~29.400, and have a 4th chunk arriving right as you're playing back the Opus-decoded 1st chunk (decode on receive and queue up for playback).
I'd create a "voicecast" container object for the queue whenever a player starts broadcasting audio, to keep track of timing and play back chunks. It is destroyed upon receipt of the last chunk (which would carry a "destroy" flag), or auto-destroyed if a last chunk never arrives for more than a few chunks' worth of delay. You could send a timestamp with each chunk and base your delayed playback timing on that, but it really doesn't have to be that complicated: just assume all chunks should play at times derived from the initial chunk's arrival - and the instantiation of the voicecast object. Also, keep track of the player IDs responsible for the inbound audio chunks! You could store your voicecast container object in your player objects to keep things organized.
You could initially just play chunks as they're received to get things up and running - it will sound poppy and glitchy, but it should mostly work - and then do a 2nd pass on your code to add the queuing/buffering. Or, if you feel comfy enough to plan it out and do it all at once, go ahead and do that. Anyway, hope something here helps, good luck!
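The timing rule in that worked example (150ms jitter buffer, 50ms chunks, first arrival at 29.300s, first playback at 29.450s) can be sketched as a tiny scheduler. All names here are invented for illustration:

```python
class VoicecastQueue:
    """Schedules chunk playback purely off the first chunk's arrival time."""

    def __init__(self, first_arrival, buffer_delay=0.150, chunk_len=0.050):
        # Delay the whole stream by a fixed jitter-buffer offset.
        self.start = first_arrival + buffer_delay
        self.chunk_len = chunk_len

    def playback_time(self, chunk_index):
        # Chunk N plays exactly N chunk-lengths after the delayed start,
        # regardless of when it actually arrived (that's the point: arrival
        # jitter is absorbed by the buffer_delay).
        return self.start + chunk_index * self.chunk_len
```

A chunk arriving later than its scheduled `playback_time` is the "gap here and there" case; increasing `buffer_delay` trades latency for fewer such gaps.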
Thanks for your quick response! My main issue now is where to compress the raw data that I get from the AudioEffectRecord. I know that get_recording() retrieves a sample with all the data, but it depends on the audio format (8-bit, 16-bit, IMA_ADPCM - not implemented). So my question is: do I need to create a new FORMAT_OPUS on AudioStreamSample in order to compress that audio? Or am I missing something? Thanks in advance!
Any progress? I am working on a project which needs voice chat (also a multiplayer VR game).
@Wapit1 As far as I know, nobody is currently working on a VoIP implementation.
@Wapit1 Just a fair warning: in my experience, an annoyance is that ENet manages only one socket. When you load your VOIP Godot server, you end up syncing between two different game IDs. You can mitigate it by forking Godot's ENet and making it open two different sockets.
@hungrymonkey how do you make a VOIP server? Any link to documentation?
https://docs.godotengine.org/en/3.1/development/cpp/custom_godot_servers.html
No one has made a working voice chat system for Godot before?
@Wapit1, I made a basic 1-channel VOIP work, but I ended up disliking my implementation because I was creating two network trees and syncing data between them. It made both the GDScript and the C++ code ugly.
@hungrymonkey it is still better than no VOIP.
@Wapit1 I have to clean it up with the addition of the mic. I did not mind open-sourcing the project at the time I made it, but reduz was not that interested a few years ago. I will need to find time to clean it up to make it remotely acceptable.
Has anyone investigated wrapping an existing VOIP service using GDNative? This looks like it has a pretty generous free tier and a fully cross-platform SDK; I might take some time and investigate to see if it could be integrated with Godot.
Just gonna leave that here... just sayin'. It would probably make sense to just interface with the standardized, non-proprietary, cross-platform, open-source VOIP stack that's already being used everywhere anyway, and wrapping the required libwebrtc / JavaScript functions doesn't seem too impossible a task.
My quick approach for getting VoIP into my multiplayer project, with spatial 3D audio output support, is to use the Godot Python binding together with https://github.com/spatialaudio/python-sounddevice/ (a PortAudio wrapper) and https://github.com/orion-labs/opuslib (a thin wrapper around libopus).
A rather simple Python class, represented as a node in Godot, is responsible for getting raw microphone data from the PortAudio wrapper and feeding it directly into libopus, producing chunks of encoded Opus data. This data can then be used by a node with a GDScript on it, which transfers the Opus data to remote machines using an rpc_unreliable call (to keep latency low in my use case). On the remote machines, the data is fed into a node with a Python script on it that internally calls libopus and fills a PoolRealArray; this data is then used to drive an AudioStreamGenerator.
This was just a prototype implementation, but it seems to work well enough for now that I'm considering keeping it like this and carrying the Python overhead with me instead of diving into how to do this with GDNative. This is the actual essential GDScript code that's left with this approach to enable this VoIP functionality:
From this approach I would conclude that it would take only small changes to what's exposed to GDScript via the API to allow implementing VoIP with very simple GDScript code like this. If one could, for example, get encoded Opus* audio from any audio bus (similar to how the record effect works), then sending those data chunks with an RPC call and pushing them into some AudioStreamOpus* playback would be very simple to use, imho. *Replace Opus with any other audio codec suitable for this approach, or even allow sending raw audio data.
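As a small illustration of the remote-side step described above (Opus-decoded 16-bit PCM converted to floats before filling a PoolRealArray for the AudioStreamGenerator), here is a sketch of just the conversion, assuming little-endian mono s16 input; this is not the commenter's actual code:

```python
import struct

def pcm16_to_floats(pcm_bytes):
    """Convert little-endian signed 16-bit PCM bytes to floats in [-1, 1)."""
    n = len(pcm_bytes) // 2
    samples = struct.unpack("<%dh" % n, pcm_bytes[: n * 2])
    return [s / 32768.0 for s in samples]
```

Each resulting float is what would be appended to the PoolRealArray driving the generator's playback buffer.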
On another note: |
We are moving proposals to the Godot proposals repository. There's already a proposal for this feature (godotengine/godot-proposals#870) and it has this issue linked. Any further discussion should be moved there.
I'm working on a multiplayer VR game in Godot, and one of the must-have features when in the presence of other players in a virtual world is being able to speak directly to one another. I see that Godot still lacks microphone support, but it also lacks streaming any kind of audio over the network.
I've been mulling over how this functionality should best fit into Godot's existing audio and network interfaces. In many instances players could have an AudioStream associated with them - or perhaps a new type, a NetAudioStream - and then their physics body would have a child AudioStreamPlayer3D to emit their voice. There's also the possibility of players radioing their team or other players, which means non-spatial playback of a network-inbound audio stream.
Also, there's the question of how devs want this to actually work. Should there be more fine-grained control? Should there be any of this high-level stuff at all, or just microphone input and audio compression capability, whose output devs would manually transmit across the network?
I'm still learning how Godot does things, so I'm a bit fuzzy about how this functionality would best be exposed, but judging by what I've read in the docs so far, it looks like it would fit right into the existing audio system, and then it's just a matter of how devs want players to communicate (i.e. in person, via radio, with their team only, with everybody, etc.). I'm just wondering what would best support the largest variety of possibilities. Maybe I want to add a radio-static effect or distortion over a player's incoming 'radio' voice stream, based on a raytrace that determined there's a bunch of stuff in the way causing interference.
Thanks!