Add support for recording and sending VoIP data #870

IoneGod · 2020-05-22T06:03:16Z

Describe the project you are working on:
Multiplayer shooting game

Describe the problem or limitation you are having in your project:
I am trying to send sound packets over the network to the other player to improve interaction in my multiplayer game.

Describe the feature / enhancement and how it helps to overcome the problem or limitation:
Add a SoundRecorder node that records sound and saves it as a temporary .wav file (or any other supported audio format) that can be sent over the internet or other networks. It could be useful for other purposes as well.

Describe how your proposal will work, with code, pseudocode, mockups, and/or diagrams:
A SoundRecorder node would be able to save sounds such as voice data. Players could send spoken greetings over the network instead of having to type a message during gameplay. Voice data could also be used to authenticate users, to prevent data manipulation by hackers and give users better security.

If this enhancement will not be used often, can it be worked around with a few lines of script?:
A few lines of code wouldn't cut it.

Is there a reason why this should be core and not an add-on in the asset library?:
Different platforms have different sound recorder classes. A SoundRecorder node would give access to all of those platforms at once, without having to rewrite the code for each one. A plugin wouldn't be able to manage that.

Jummit · 2020-05-22T15:35:32Z

Current workflow for recording: https://docs.godotengine.org/en/latest/tutorials/audio/recording_with_microphone.html
Lengthy discussion about VoIP: godotengine/godot#18133
VoIP demo: https://github.com/cbarsugman/godot-voip-demo

Calinou · 2020-05-22T15:49:49Z

Audio recording is already supported since Godot 3.1. That said, it may not work on all platforms due to bugs (see godotengine/godot#33184).

nonchip · 2020-05-24T17:36:25Z

even though @Calinou reacted with a 😕 to my proposal over in the already mentioned lengthy discussion (without commenting, so not sure which part about it was confusing/bad, sorry, but i guess it might've been my not-so-professional wording as a result of reading through various overcomplicated proposed third party service bindings), i just wanna mention again, godot already successfully implements webrtc data channels for multiplayer on (afaik) all platforms, so it might be a good idea to just wrap the webrtc audio channels too and expose those to allow for VoIP.

Calinou · 2020-05-24T19:33:28Z

@nonchip WebRTC adds a significant amount of complexity on its own. I'd advise not relying on it unless you need to support HTML5 exports somehow. Most networked games don't need to support HTML5 exports, so I would prefer an easier to set up solution. (Do you have STUN/TURN servers at hand? To my knowledge, this is pretty much required for WebRTC.)

nonchip · 2020-05-24T20:29:14Z

@Calinou good point, might be worthy to take into consideration for the html5 export of the voip solution though.
also about stun/turn you technically don't need it but you really want to because it's a pain to do it in any other way. i'm running a spreed WebRTC service myself, that has it all included, but is pretty much meant as a "go to that website and start talking" kinda thing, i looked into setting up stun/turn manually and decided my sanity was more important :P

Wavesonics · 2020-05-28T17:05:27Z

I believe that in order to implement real time voice streaming, we're still missing one part of WebRTC: #813

That is of course if you want to do it using WebRTC

Wavesonics · 2020-05-29T18:49:30Z

I've been digging into this a little more recently. I implemented a VOIP demo similar to the one @Jummit linked, and that highlighted some of the issues here to me.

I have some time right now where I could probably get a real solution implemented, so I thought I'd get some input on what a real solution would actually look like.

Here is the breakdown of the problem as I see it:

1) Recording audio:

This was solved in 3.1. Maybe it could be made a little more friendly with something like a SoundRecorder node as @iapps207 suggested. But even if not, it does work as it exists today.

2) Sending the data over the wire:

There's a variety of ways we could accomplish this, and we probably don't want to be too prescriptive here. But the problem with how my demo and the cbarsugman demo work is that they are not truly streaming the audio. They record, then send the whole audio buffer. It's simple, but pretty terrible for real-time communication. So here are the options as I see them:

A) Send via existing network methods (rset or rpc argument). Depending on how this is configured, all data will pass through the server on the way to each client. As far as I can figure, this will work like my existing demo and not be truly real time.
B) WebRTC MediaStreams: Not implemented yet (#813). This is very prescriptive, but it would be extremely easy to set up, and using STUN/TURN would allow peer-to-peer when available, saving lots of bandwidth on the server.
C) WebRTC without MediaStreams: I haven't gone too deep here yet, but I can't see why we couldn't just use the existing WebRTC data channel. This just puts the burden on the sender and receiver to properly handle the data like audio.
D) Some sort of lower-level system based on existing Godot networking. I don't think this would require any new features, but in my opinion it would have to be able to work alongside the existing high-level multiplayer API. I haven't tried to combine the high-level API with low-level networking. Is it possible?
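Whichever transport is chosen, the difference between "record, then send the whole buffer" and streaming comes down to cutting the audio into small frames and sending each one as soon as it is available. A minimal sketch of that idea follows. The 4-byte sequence-number header, the `packetize`/`unpack` helper names, and the 3528-byte frame size (20 ms of 16-bit stereo PCM at 44.1 kHz) are assumptions made for illustration, not anything Godot or the demos define.

```python
import struct
from typing import Iterator, Tuple

# Hypothetical wire format: 4-byte big-endian sequence number, then the frame.
HEADER = struct.Struct("!I")

# Assumed frame size: 20 ms of 16-bit stereo PCM at 44.1 kHz.
FRAME_BYTES = 3528


def packetize(pcm: bytes, frame_bytes: int = FRAME_BYTES,
              first_seq: int = 0) -> Iterator[bytes]:
    """Yield one packet per complete frame found in `pcm`.

    A trailing partial frame is not emitted; a real sender would keep it
    and prepend it to the next chunk of captured audio.
    """
    seq = first_seq
    for offset in range(0, len(pcm) - frame_bytes + 1, frame_bytes):
        yield HEADER.pack(seq) + pcm[offset:offset + frame_bytes]
        seq += 1


def unpack(packet: bytes) -> Tuple[int, bytes]:
    """Split a packet back into (sequence number, frame payload)."""
    (seq,) = HEADER.unpack_from(packet)
    return seq, packet[HEADER.size:]


if __name__ == "__main__":
    captured = bytes(FRAME_BYTES * 2 + 100)  # two full frames plus a remainder
    for packet in packetize(captured):
        seq, frame = unpack(packet)
        print(seq, len(frame))
```

The sequence number is what lets the receiver notice lost or reordered frames, which matters for options that use unreliable delivery.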

3) Encoding the data for transit:

All of this is a moot point at the moment, because the data returned from the Microphone API is (as it should be) a wav. Obviously the data will be far too large to use for a real-time VOIP application. And as it stands right now, there is no Audio encoder exposed to the scripting interface (or from my discussion with @Calinou even in the engine at all). So I think this is the first issue that must be addressed. From some research it looks like Opus is the best open codec for voice data, so I cloned their repo and have been poking around the docs.
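To put a number on "far too large", here is a quick back-of-the-envelope calculation. It assumes 16-bit stereo PCM at 44.1 kHz and 20 ms frames; the bit depth and frame length are assumptions for illustration.

```python
def pcm_bytes_per_second(sample_rate_hz: int, channels: int,
                         bytes_per_sample: int) -> int:
    """Raw (uncompressed) PCM data rate in bytes per second."""
    return sample_rate_hz * channels * bytes_per_sample


# Assumed capture format: 44.1 kHz, stereo, 16-bit samples.
raw_bytes_per_second = pcm_bytes_per_second(44_100, 2, 2)
raw_kbit_per_second = raw_bytes_per_second * 8 / 1000

# Bytes carried by a single 20 ms frame of that stream.
bytes_per_20ms_frame = raw_bytes_per_second * 20 // 1000

if __name__ == "__main__":
    print(raw_bytes_per_second)   # 176400
    print(raw_kbit_per_second)    # 1411.2
    print(bytes_per_20ms_frame)   # 3528
```

Under those assumptions, each speaking player would produce roughly 1.4 Mbit/s of uncompressed audio.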

My question to anyone here is: should we have libOpus in the engine itself for this purpose? I would certainly lean toward yes, but I can see this being only for VOIP, so it may be too specific a use case.

I asked around on the opus IRC channel, and it looks like the higher level opus libs are specifically for file access, or http streams. So we'll probably be stuck with just the base libOpus...

Last thing to note: If we do go with opus, it has the added advantage of being the codec used by all browsers for WebRTC Media Streams. So it might pair nicely with an implementation of that.

I'm definitely looking for feedback/suggestions. Am I totally off the mark on anything here? Is this even a thing people are interested in?

Wavesonics · 2020-06-02T07:36:00Z

Ok, I've taken the past few days to start figuring out libOpus.
What I've got here is a proof of concept, mainly a way for me to learn how Opus works, and how we might be able to integrate it with Godot.

Here is a proof of concept GDNative library wrapping libOpus:
libopus-gdnative

And here is a demo project using the gdnative library:
libopus-gdnative-demo (only compiled for windows x64 currently)

One issue I ran into is that libOpus only accepts a select few sample rates (8, 12, 16, 24, and 48 kHz), and Godot's 44.1 kHz is not one of them. The closest Opus has is 48 kHz.

It looks like Godot has a bicubic resampler for playback: AudioStreamPlaybackResampled. Ideally, though, we'd be able to resample the input from the Microphone. If that's possible with AudioStreamPlaybackResampled, I haven't figured it out yet.

This was causing me some problems, until I found a great hack. I get 44.1 kHz audio from Godot's microphone API. Then I tell libOpus that this is in fact 48 kHz audio. The resulting compression is distorted due to this. Then on the decode side, libOpus produces the distorted 48 kHz audio, which I hand off to Godot, but I tell Godot it is actually 44.1 kHz, and thus it de-warps it xD
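The arithmetic behind the relabeling trick can be sketched as follows. The 20 ms frame length is an assumption for illustration (it is one of the frame sizes Opus supports); nothing here calls libOpus itself.

```python
from fractions import Fraction

GODOT_RATE = 44_100   # rate the microphone data really has
OPUS_RATE = 48_000    # rate libOpus is told it has

# A 20 ms Opus frame at 48 kHz holds this many samples per channel.
samples_per_frame = OPUS_RATE * 20 // 1000

# From the codec's point of view the audio is sped up by this factor.
apparent_speedup = Fraction(OPUS_RATE, GODOT_RATE)

# Real duration covered by one "20 ms" frame of relabeled 44.1 kHz audio.
real_frame_ms = Fraction(samples_per_frame * 1000, GODOT_RATE)

# Relabeling the decoded output back to 44.1 kHz undoes the speed-up.
round_trip = apparent_speedup * Fraction(GODOT_RATE, OPUS_RATE)

if __name__ == "__main__":
    print(samples_per_frame)         # 960
    print(apparent_speedup)          # 160/147
    print(float(real_frame_ms))      # about 21.77
    print(round_trip)                # 1
```

The timing cancels exactly on the round trip, but the codec still analyzes audio that appears about 8.8% faster and higher-pitched than it really is, so its tuning for speech is slightly off.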

As great as that is, if we really wanted 1st class support for VOIP, I think we'd need the Microphone API to allow us to specify the sample rate, as well as mono VS stereo. There is no need for Stereo PCM data for a microphone. For VOIP anyway.
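Until the capture side can deliver mono directly, the stereo data can be downmixed before encoding, which halves the input the encoder has to process. A minimal sketch, assuming interleaved signed 16-bit samples (left, right, left, right, ...); the `stereo_to_mono` helper is hypothetical and the sample layout is an assumption, not a description of what Godot's API returns.

```python
from array import array


def stereo_to_mono(interleaved: array) -> array:
    """Average each left/right pair of 16-bit samples into one mono sample."""
    if len(interleaved) % 2 != 0:
        raise ValueError("expected an even number of interleaved samples")
    mono = array("h")
    for i in range(0, len(interleaved), 2):
        # The average of two 16-bit values always fits in 16 bits.
        mono.append((interleaved[i] + interleaved[i + 1]) // 2)
    return mono


if __name__ == "__main__":
    stereo = array("h", [100, 200, -50, 50, 32767, 32767])
    print(stereo_to_mono(stereo).tolist())  # [150, 0, 32767]
```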

Lastly, as expected, the compression ratio is just fantastic. In simple demos, I was seeing greater than 100x size reduction over the raw PCM audio.

Wavesonics · 2020-06-02T18:20:10Z

Just some notes here from looking around at solutions to the Sample Rate Conversion problem.

The most common one I've found is: libsamplerate
It's written in C and looks good overall. ~~The problem is it's GPL~~ (It switched to BSD 2 in 2018)

Here is a C++ library which is MIT license: r8brain

That might be a good option if any of this work was ever considered for inclusion in Godot.

Lots of info about sample rate libs here: https://ccrma.stanford.edu/~jos/resample/Free_Resampling_Software.html
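For a sense of what sample rate conversion involves, here is a deliberately naive sketch using linear interpolation. It is not what libsamplerate or r8brain do: proper resamplers use band-limited filtering and give much better quality. The `resample_linear` helper is hypothetical.

```python
from typing import List, Sequence


def resample_linear(samples: Sequence[float], src_rate: int,
                    dst_rate: int) -> List[float]:
    """Resample mono audio by linear interpolation between neighbors.

    Naive on purpose: there is no anti-aliasing filter, so this is only
    suitable as an illustration or for rough prototyping.
    """
    if not samples:
        return []
    n_out = len(samples) * dst_rate // src_rate
    step = src_rate / dst_rate  # input samples advanced per output sample
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        pos = i * step
        idx = int(pos)
        frac = pos - idx
        nxt = samples[min(idx + 1, last)]  # clamp at the end of the input
        out.append(samples[idx] * (1.0 - frac) + nxt * frac)
    return out


if __name__ == "__main__":
    # 10 ms of audio at 44.1 kHz becomes 10 ms at 48 kHz.
    ramp = [float(i) for i in range(441)]
    converted = resample_linear(ramp, 44_100, 48_000)
    print(len(converted))  # 480
```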

Calinou · 2020-06-02T19:47:41Z

Doesn't Opus include its own resampler? I read that somewhere on a forum while searching for a solution to this specific issue.

Wavesonics · 2020-06-02T20:59:00Z

libOpus doesn't appear to? At least not that I could find looking through its docs.

I think its derivatives like opusenc or opusfile might.

Wavesonics · 2020-06-02T21:09:14Z

I was discussing this with iFire on discord, and he apparently had a PR which not only added libOpus support, but added the next component which I have not yet addressed: being able to actually stream the decoded audio into Godot's audio system.

fire/godot@37ec390

He said it was rejected, but we didn't have time to get into the details.

If any of the core contributors have any insight into what was wrong with it, how or if it could be changed to be acceptable I'd love to discuss it!

Lastly, from my discussion with iFire, the existing AudioEffectRecord will probably not be ideal for streaming audio in its current form, as it does a large buffer re-allocation as part of it.

I'm going to pursue my current approach as a stop-gap (providing libOpus as a gdnative library), but that PR looks like a much more comprehensive starting point for providing true 1st class support for streaming VOIP.

Wavesonics · 2020-06-09T02:17:12Z

@Calinou found this in their FAQ, maybe opus-tools is what you had read about?

> How do I use 44.1 kHz or some other sampling rate not directly supported by Opus?
>
> Tools which read or write Opus should inter-operate with other sampling rates by transparently performing sample rate conversion behind the scenes whenever necessary. In particular, software developers should not use Opus Custom for 44.1 kHz support, except in the very specific circumstances outlined above.
>
> Note that it's generally preferable for a decoder to output at 48kHz, even when you know the original input was 44.1kHz. This is not only because you can skip resampling, but also because many cheaper audio interfaces have poor quality output for 44.1kHz.
>
> The opus-tools package source code contains a small, high quality, high performance, BSD licensed resampler which can be used where resampling is required.

So maybe their small BSD licensed resampler would be a good option:
https://opus-codec.org/release/dev/2018/09/18/opus-tools-0_2.html

Wavesonics · 2020-06-15T20:46:34Z

I polished up my addon and put it on the library here:
https://godotengine.org/asset-library/asset/650

It's certainly far from the real streaming solution we'd like to get to, but I'm using it to pretty good effect I think in my project.

The lag is obvious, but people seem to adjust to it pretty quickly, if you want to see what the experience is like my project is here: https://github.com/FugitiveTheGame/Fugitive3D/

On the path to true streaming audio, the next biggest blocking factor is the need for direct access to an audio buffer inside Godot which we can write frames to directly. @fire 's PR that I linked above looks like a good starting point to me, but I honestly haven't dug into that part of the issue much yet.

Transport of the audio is still an issue, but much more solvable in various ways.

IoneGod · 2020-06-25T20:30:48Z

@Calinou @Wavesonics @nonchip you guys should take a look at Vivox, a voice chat service from Unity that can be integrated into other game engines: https://unity.com/products/vivox

Calinou · 2020-06-25T20:56:04Z

@iapps207 Vivox is a proprietary library, which is therefore unsuitable for inclusion in Godot. Nothing prevents a third party from publishing a module for it though.

NEO97online · 2020-09-27T23:47:36Z

@Wavesonics have you made any discoveries on this front since then?

For reference, I found the relevant PR here with a few more details: godotengine/godot#35402
It includes some messages from @reduz on the topic, but that's about it.

Here's the original issue from @fire: #399

It seems they were closed in favor of a better implementation due to packet ordering and delays.

Wavesonics · 2020-10-02T00:48:40Z

@auderer no, I haven't spent any time on this recently. The opus plugin I released on the asset store allows some simple forms of voip to work. But the road block to true voip now is lower level access to an audio buffer like that PR you linked provides. For opus in particular, we need to stream individual opus packets, decompress them and insert them directly into an audio stream buffer. As far as I'm aware that's not possible with the current audio system implementation.
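On the receiving side, streaming individual packets usually means holding a few of them briefly so that late or reordered ones can still be played in order. A minimal sketch of such a jitter buffer follows. The `JitterBuffer` class, its `depth` parameter, and the use of `None` to mark a lost slot are all assumptions for illustration; decoding and feeding Godot's audio system are left out.

```python
import heapq
from typing import List, Optional, Tuple


class JitterBuffer:
    """Reorders packets by sequence number before playback.

    Packets are released only while more than `depth` are buffered, which
    gives late packets a chance to arrive. A missing sequence number is
    reported as (seq, None) so the caller can conceal the gap.
    """

    def __init__(self, depth: int = 3) -> None:
        self.depth = depth
        self._heap: List[Tuple[int, bytes]] = []
        self._next_seq = 0

    def push(self, seq: int, payload: bytes) -> None:
        # Packets whose slot has already been played are useless: drop them.
        if seq >= self._next_seq:
            heapq.heappush(self._heap, (seq, payload))

    def drain(self) -> List[Tuple[int, Optional[bytes]]]:
        out: List[Tuple[int, Optional[bytes]]] = []
        while len(self._heap) > self.depth:
            seq, payload = self._heap[0]
            if seq < self._next_seq:       # duplicate of a played slot
                heapq.heappop(self._heap)
                continue
            if seq == self._next_seq:      # the packet we are waiting for
                heapq.heappop(self._heap)
                out.append((seq, payload))
            else:                          # gap: treat the slot as lost
                out.append((self._next_seq, None))
            self._next_seq += 1
        return out


if __name__ == "__main__":
    jb = JitterBuffer(depth=2)
    for seq in (0, 2, 1):                  # packets arrive out of order
        jb.push(seq, bytes([seq]))
    print(jb.drain())                      # [(0, b'\x00')]
```

A larger `depth` tolerates more network jitter at the cost of added latency, which is the trade-off any real-time voice path has to tune.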

fire · 2021-01-30T20:07:52Z

Announcement: godotengine/godot#45593 was approved for voip usage in #2013.

Faless · 2021-07-23T17:44:00Z

The facilities to capture and process audio have been implemented via godotengine/godot#45593. Supporting a specific VoIP protocol is outside the scope of Godot core. It is now possible to create addons that implement VoIP via the new AudioEffectCapture API.

dreadpon · 2023-10-18T10:21:32Z

@Faless
Sorry to bring it up again after all this time, but do the

> facilities to capture and process audio have been implemented via godotengine/godot#45593

also provide

> the Microphone API to allow us to specify the sample rate, as well as mono VS stereo

Because afaik they don't.

I'm currently porting a project to 4.2, and was surprised to discover that my custom opus compression code performs slower than AudioEffectCapture fills up its buffer.
Now, I could of course lower the mix_rate in ProjectSettings, but that would affect all other audio we might use in the project, resulting in poor music/sfx quality.

While my compression code likely isn't ideal, I think reducing the amount of input data is one of the most important optimizations I can do. I surely don't need my input audio to be 44100/48000 Hz when I intend to lossy compress it anyway.

Edit:
After some debugging I realized I overstated the importance of mix rate in my particular case, but I think this concern is still valid and might become an issue if someone would actually want to do any sort of processing on the input audio data
(the problem in my case was related to PackedByteArray.resize(), it became quite a bit slower in Godot 4.x)

Calinou added topic:audio topic:network labels May 22, 2020

Calinou changed the title ~~Sound Recorder~~ Add support for recording and sending VoIP data May 22, 2020

KoBeWi mentioned this issue May 28, 2020

VoIP: streaming microphone input to other players? godotengine/godot#18133

Closed

troy-lamerton mentioned this issue Jul 6, 2021

.OGG files are played back with rolled-off high frequencies (low-pass) godotengine/godot#49131

Closed

Faless closed this as completed Jul 23, 2021

Calinou added the archived label Jul 23, 2021

follower mentioned this issue Sep 6, 2021

Fix the "AudioEffectRecord" descriptions. godotengine/godot#52441

Merged
