
AudioStreamGeneratorPlayback delay #46490

Open
Tracked by #76797
ikbencasdoei opened this issue Feb 27, 2021 · 5 comments

Comments

@ikbencasdoei

ikbencasdoei commented Feb 27, 2021

Godot version:
3.2.4-rc4

OS/device including version:
Windows 10

Issue description:
I'm currently developing a new version of my godot-voip plugin, which uses the new AudioEffectCapture for real-time voice input (see: ikbencasdoei/godot-voip#7). To take advantage of this I switched from a regular AudioStreamPlayer to an AudioStreamGeneratorPlayback to play the voice input in real time. However, this introduced a significant amount of latency, even when used locally. This does not seem like expected behavior, and I'm not sure what causes it or what can be done about it.

Steps to reproduce:
Push audio frames into an AudioStreamGeneratorPlayback.

Minimal reproduction project:
godot-voip-e119e3bcadf98e37f0de2e3e3d1bfa4bba59dd7e.zip

@Calinou
Member

Calinou commented Feb 27, 2021

cc @lyuma as they implemented AudioEffectCapture in #45593.

It would be helpful if you could try decreasing the Output Latency in the Project Settings, but this setting won't be available on Windows until #38210 is merged.
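As a quick sanity check (this is just a generic snippet, not taken from the reproduction project), the driver-side latency can be printed at runtime; it is separate from any buffering done by an AudioStreamGenerator:

func _ready():
	# Driver/output latency in seconds, as configured by audio/output_latency.
	print("Output latency: %s s" % AudioServer.get_output_latency())
	# Mix rate the audio server is actually running at.
	print("Mix rate: %s Hz" % AudioServer.get_mix_rate())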

@ikbencasdoei
Author

It's very much an issue with the AudioStreamGenerator, because I did not experience this latency when using a regular AudioStreamPlayer. The AudioEffectCapture has been working great so far.

@lyuma
Contributor

lyuma commented Feb 28, 2021

This reply got a bit lengthy. TL;DR: I believe that the code in your reproduction project is responsible for the delay, and this behavior does not indicate a bug in Godot.

But you've pretty much hit what makes writing real-time audio code so complex and challenging, so I'll go into detail on the problem you ran into and some possible solutions (I'm sure there are other approaches, too).

What causes the extra delay

So the reason for that latency is the combination of two things.
First, this code:

func _process_input():
	for i in range(_playback.get_frames_available()):
		if _receive_buffer.size() > 0:
			_playback.push_frame(Vector2(_receive_buffer[0], _receive_buffer[0]))
			_receive_buffer.remove(0)
		else:
			_playback.push_frame(Vector2.ZERO)

This code in the sample project deliberately fills up the playback buffer. That is technically fine in terms of quality, but it guarantees that you incur the maximum possible latency.

Second, the default buffer length defined in AudioStreamGenerator's constructor:

AudioStreamGenerator::AudioStreamGenerator() {
	mix_rate = 44100;
	buffer_len = 0.5;
}

The combination of the way you use it and the default buffer size ensures that you will always incur a latency of half a second. I suspect this is what you are observing (in a VoIP round-trip test, you'll see this on both ends of the connection, so you might notice a whole second of latency, depending on how you are testing).
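To put rough numbers on it (a back-of-the-envelope sketch, using the defaults above):

var mix_rate = 44100.0   # frames per second (default)
var buffer_len = 0.5     # seconds (default)
var queued_frames = mix_rate * buffer_len      # 22050 frames kept queued by the loop above
var one_way_delay = queued_frames / mix_rate   # 0.5 s per direction, so ~1 s round trip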

Mitigations

The godot_speech GDNative plugin has been successfully running with a smaller buffer length of 0.1, as follows:

var new_generator: AudioStreamGenerator = AudioStreamGenerator.new()
new_generator.set_mix_rate(48000)
new_generator.set_buffer_length(0.1)

Actually, the application we're working on is still using the above code at 0.1 seconds, with a loop similar to the one in your example project, and it's "good enough" for now: while a 100 ms delay is "acceptable" for VoIP, it's not as good as we can do.
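For completeness, the generator only does anything once it is assigned to a playing AudioStreamPlayer and you fetch its playback; a minimal hookup sketch (the node path is just an example) looks like this:

onready var player: AudioStreamPlayer = $AudioStreamPlayer  # example node path

var playback: AudioStreamGeneratorPlayback

func _ready():
	var generator = AudioStreamGenerator.new()
	generator.set_mix_rate(48000)
	generator.set_buffer_length(0.1)  # smaller buffer -> lower worst-case latency
	player.stream = generator
	player.play()
	playback = player.get_stream_playback()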

You could stop reading here, or keep going if you want to know how to do even better.

How to avoid filling up the buffer

However, even the above is not perfect. One thing that can be done instead, to dynamically determine the delay, is to write no frames until the AudioStreamGeneratorPlayback reports skips, as a way of learning that the playback thread has started processing.
In your demo, it would be:

var skips = _playback.get_skips()
if skips < 1:
	return

However, since the code fills up the buffer on every _process tick, it would need to be restructured.

One idea would be to use the amount of data available in the capture buffer to determine exactly how much to push to playback. However, this may lead to clicking if you buffer too little data.
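A rough sketch of that idea, assuming a local loopback where `_capture` is the AudioEffectCapture on the record bus and `_playback` is the generator playback (names are illustrative, not from the reproduction project):

func _process_input():
	# Push only what was actually captured, bounded by the playback buffer's
	# free space, instead of topping the playback buffer up with silence.
	var frames = int(min(_capture.get_frames_available(), _playback.get_frames_available()))
	if frames > 0 and _capture.can_get_buffer(frames) and _playback.can_push_buffer(frames):
		_playback.push_buffer(_capture.get_buffer(frames))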

Another idea is to make everything time-based: every _process, check what time it is and how many frames should have been inserted since the last call to _process. You'd still need to track skips and make sure there is enough data in the buffer so that you don't underrun.
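A sketch of the time-based variant (names like `_mix_rate` and `_last_push_usec` are illustrative; `_playback` and `_receive_buffer` are as in your demo):

var _mix_rate = 48000.0  # must match the generator's mix rate
var _last_push_usec = OS.get_ticks_usec()

func _process(_delta):
	var now = OS.get_ticks_usec()
	# How much audio should have elapsed since the last call. Note that the
	# integer truncation below drifts slowly; a real implementation would
	# carry the fractional remainder over to the next call.
	var frames_due = int((now - _last_push_usec) * _mix_rate / 1000000.0)
	_last_push_usec = now
	var frames = int(min(frames_due, _playback.get_frames_available()))
	for i in range(frames):
		if _receive_buffer.size() > 0:
			_playback.push_frame(Vector2(_receive_buffer[0], _receive_buffer[0]))
			_receive_buffer.remove(0)
		# No silence padding here; track get_skips() to detect underruns instead.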

However, with these modifications, you're still tied to the game framerate. You will necessarily have additional delay if your game skips frames for any reason, and that delay will end up in your playback or capture buffers unless you have a means to clear it out.

Threads to the rescue

Finally, this brings me to what I would ultimately recommend doing: use a thread to drive the AudioStreamGeneratorPlayback and the AudioEffectCapture buffers.

This thread can be decoupled from the main thread's processing loops, which allows processing at a fixed interval.

With this approach:

  • Make a Thread to handle audio, and feed it references to the AudioEffectCapture and the AudioStreamGenerator.
  • Use get_playback_position() (BUG: this is not exported to GDScript) or get_skips() to determine when playback has started.
  • Use get_frames_available() on the AudioEffectCapture to determine when capture has started.
  • Finally, create a tight loop that delays based on your VoIP packet size (e.g. 10 ms). You'll want to use an absolute timer instead of a fixed sleep call, so the loop can catch up if it sleeps longer than desired.
  • In the loop, call get_buffer and push_buffer as needed. (NOTE: push_buffer and get_buffer are specifically designed to be usable from threads, as long as only one thread uses them. This is due to the single-producer, single-consumer RingBuffer architecture common to most audio systems.) A sketch of such a loop follows the list.
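Putting the pieces together, here is a minimal sketch of such a thread (the packet size and names are illustrative; `_capture` and `_playback` are assumed to be set up before the thread starts):

var _thread = Thread.new()
var _running = true

func _start_audio_thread():
	_thread.start(self, "_audio_loop")

func _audio_loop(_userdata):
	var frames_per_packet = int(0.01 * 48000)  # 10 ms of audio at 48 kHz
	var next_tick = OS.get_ticks_usec()
	while _running:
		# Absolute timer: schedule relative to the previous deadline so the
		# loop can catch up after sleeping longer than intended.
		next_tick += 10000  # 10 ms in microseconds
		var wait = next_tick - OS.get_ticks_usec()
		if wait > 0:
			OS.delay_usec(wait)
		# Only this thread touches the ring buffers (single producer/consumer).
		if _capture.can_get_buffer(frames_per_packet) and _playback.can_push_buffer(frames_per_packet):
			_playback.push_buffer(_capture.get_buffer(frames_per_packet))

func _exit_tree():
	_running = false
	_thread.wait_to_finish()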

@ikbencasdoei
Author

Thank you so much for this explanation!

@Calinou
Member

Calinou commented Feb 28, 2021

@lyuma We should probably amend the documentation and/or class reference for this 🙂
