AudioEffectCapture: attach a scriptable audio processing pipeline to a bus. #2013

Closed
lyuma opened this issue Dec 23, 2020 · 2 comments · Fixed by godotengine/godot#45593

lyuma commented Dec 23, 2020

Describe the project you are working on

Godot Speech, a VoIP module for Godot

Describe the problem or limitation you are having in your project

Previously, there was no way to capture microphone input, or the output of an audio bus, in a reliable and performant way.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

This proposal exposes a simple interface: attach an AudioEffectCapture to an audio bus. The captured audio is shared through an internal RingBuffer with a StreamAudio node, and the application can then collect AudioFrames in bulk, either from a callback such as _process() (less ideal) or from a dedicated thread (better, lower latency), and process them as needed.

This solves the problem by bridging the gap between the audio bus (carrying mic input or the result of other in-game sounds) and application processing code such as a VoIP engine.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

To start with, here is an existing implementation of this system, built on the Godot engine master branch with the following changes applied:
https://github.com/lyuma/godot/tree/audio_effect_capture (An implementation of this proposal)
https://github.com/lyuma/godot/tree/speech_master (A port of our GDNative plugin to a godot module, as GDNative is currently non-functional in master. Thanks, fire for the help!)

Then, the example project available here: https://github.com/lyuma/godot_voip_experimental (godot4 branch)
To run, play multiple copies of the project. Have one copy host, and the other copy join. You should see the other client in the room and audio should transmit.

About the proposal itself:

AudioEffectCapture is an AudioEffect reference type which basically contains a ring buffer. When attached to the bus, it will begin collecting frames in its internal ring buffer.

StreamAudio is the node counterpart to AudioEffectStream (the name this effect had before the proposal was retitled to AudioEffectCapture), extensible by GDScript and accessible in the main thread. Audio is collected by the effect and exposed via StreamAudio through a shared RingBuffer reference.

Viewed as a sort of "audio pipe", the AudioEffectCaptureInstance is the write end of the pipe, collecting audio frames from the attached bus, and the AudioEffectCapture resource is the read end of the pipe, interacting with application code.

Application code (usually GDNative or a Godot Module) will then consume data from this ring buffer and process it as needed, possibly providing application defined effects, or transmitting it over the network.

Because VoIP and other real-time audio processing scenarios have tight latency tolerances, this solution addresses reliability and performance as follows:

  • reliable: Using a ring buffer permits application code to safely process audio frames outside of the audio effect bus, with no risk of locking the AudioServer threads.
  • Additionally, the ring buffer is of fixed size, and will not consume infinite memory if misused.
  • performant: we use a non-locking ring buffer to allow audio frames to be transmitted in bulk from the audio server to application code.
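
The two properties above can be illustrated with a minimal sketch of a fixed-size, single-producer / single-consumer ring buffer. This is not the code from the linked branches: `Frame`, `FrameRing`, and every method name here are hypothetical, and the sketch assumes exactly one writer thread (the audio side) and one reader thread (the application side).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Stereo sample pair, standing in for Godot's AudioFrame (hypothetical type).
struct Frame {
    float l;
    float r;
};

// Hypothetical lock-free ring buffer for one writer and one reader.
// Only the audio thread may call write(); only the application thread
// may call read() and frames_available().
class FrameRing {
public:
    // The capacity must be a power of two so wrapping is a single bit mask.
    explicit FrameRing(std::size_t capacity_pow2)
        : buf_(capacity_pow2), mask_(capacity_pow2 - 1) {
        assert(capacity_pow2 != 0 && (capacity_pow2 & mask_) == 0);
    }

    // Number of frames the reader can currently take.
    std::size_t frames_available() const {
        const std::size_t r = read_pos_.load(std::memory_order_relaxed);
        const std::size_t w = write_pos_.load(std::memory_order_acquire);
        return w - r;
    }

    // Write end: copies as many frames as fit and returns that count.
    // Frames that do not fit are dropped, so memory use stays bounded.
    std::size_t write(const Frame *src, std::size_t count) {
        const std::size_t w = write_pos_.load(std::memory_order_relaxed);
        const std::size_t r = read_pos_.load(std::memory_order_acquire);
        const std::size_t space = buf_.size() - (w - r);
        const std::size_t n = count < space ? count : space;
        for (std::size_t i = 0; i < n; ++i) {
            buf_[(w + i) & mask_] = src[i];
        }
        // Publish the new frames only after they are fully written.
        write_pos_.store(w + n, std::memory_order_release);
        return n;
    }

    // Read end: copies up to count frames into dst and returns that count.
    std::size_t read(Frame *dst, std::size_t count) {
        const std::size_t r = read_pos_.load(std::memory_order_relaxed);
        const std::size_t w = write_pos_.load(std::memory_order_acquire);
        const std::size_t avail = w - r;
        const std::size_t n = count < avail ? count : avail;
        for (std::size_t i = 0; i < n; ++i) {
            dst[i] = buf_[(r + i) & mask_];
        }
        // Hand the consumed slots back to the writer.
        read_pos_.store(r + n, std::memory_order_release);
        return n;
    }

private:
    std::vector<Frame> buf_;
    std::size_t mask_;
    std::atomic<std::size_t> write_pos_{0};
    std::atomic<std::size_t> read_pos_{0};
};
```

Neither side ever waits on the other: each thread only stores to its own position counter, so a slow application drops frames instead of stalling the audio thread.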

If this enhancement will not be used often, can it be worked around with a few lines of script?

The closest workaround is to misuse AudioEffectRecord for this purpose. It is a misuse because the primary function of AudioEffectRecord is to record to a file on disk.

AudioEffectRecord is also unreliable and even crash-prone, as it was not designed for infinite length data streams, and was also not designed for data to be polled in real time.

We have found no other way to access the audio bus safely. Even if a Godot module provided this functionality, we would need, at a minimum, the implementation of AudioEffectCapture.

Is there a reason why this should be core and not an add-on in the asset library?

Realtime audio processing is a necessity for some game genres. Additionally, VoIP is a common requirement or use case in many multiplayer games.

While VoIP itself might not be desirable as a core feature, this proposal needs to be in core: it would enable VoIP frameworks to be created as a custom module, or built from a combination of GDScript and GDNative and distributed via the Asset Library.

Member

reduz commented Dec 23, 2020

You can output to an audio stream using AudioStreamGenerator (name may not be great, suggestions accepted for renaming in 4.0). To capture from a bus, I was thinking an AudioEffectCapture might make more sense, using a similar API to Generator (frame based or chunk based).
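
As a rough sketch of what a chunk-based capture API mirroring the generator could look like from the consumer side: the class and method names below (`CaptureSketch`, `can_get_buffer`, `get_buffer`, `push_from_bus`, `drain_packets`) are illustrative assumptions, not a finalized engine API. The sketch is single-threaded and uses a plain queue, so it shows only the shape of the calls, not the thread handoff.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Stereo sample pair, standing in for Godot's AudioFrame (hypothetical type).
struct Frame {
    float l;
    float r;
};

// Hypothetical capture-side mirror of a generator-style playback API.
class CaptureSketch {
public:
    explicit CaptureSketch(std::size_t max_frames) : max_frames_(max_frames) {}

    // Bus side: queue captured frames; count and drop whatever does not fit.
    void push_from_bus(const std::vector<Frame> &frames) {
        for (const Frame &f : frames) {
            if (queue_.size() < max_frames_) {
                queue_.push_back(f);
            } else {
                ++discarded_;
            }
        }
    }

    std::size_t frames_available() const { return queue_.size(); }
    std::size_t discarded() const { return discarded_; }

    // True if a chunk of exactly n frames can be read right now.
    bool can_get_buffer(std::size_t n) const { return queue_.size() >= n; }

    // Chunk-based read: returns exactly n frames, or an empty vector
    // (leaving the queue untouched) if fewer than n are queued.
    std::vector<Frame> get_buffer(std::size_t n) {
        std::vector<Frame> out;
        if (!can_get_buffer(n)) {
            return out;
        }
        const auto end = queue_.begin() + static_cast<std::ptrdiff_t>(n);
        out.assign(queue_.begin(), end);
        queue_.erase(queue_.begin(), end);
        return out;
    }

private:
    std::size_t max_frames_;
    std::size_t discarded_ = 0;
    std::deque<Frame> queue_;
};

// Application side: take whole packets only; a partial packet stays queued
// until the next poll. Returns the number of packets taken in this call.
std::size_t drain_packets(CaptureSketch &cap, std::size_t packet_frames,
                          std::vector<std::vector<Frame>> &packets) {
    if (packet_frames == 0) {
        return 0;
    }
    std::size_t taken = 0;
    while (cap.can_get_buffer(packet_frames)) {
        packets.push_back(cap.get_buffer(packet_frames));
        ++taken;
    }
    return taken;
}
```

A VoIP encoder would typically call something like `drain_packets` once per poll with its codec's packet size, for example 480 frames if one assumes 10 ms packets at 48 kHz.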

lyuma changed the title from "AudioEffectStream: attach a scriptable audio processing pipeline to a bus." to "AudioEffectCapture: attach a scriptable audio processing pipeline to a bus." on Jan 30, 2021
Author

lyuma commented Jan 30, 2021

Created a PR at godotengine/godot#45593 with a drastically simplified API, more analogous to the playback API, AudioStreamGenerator.

AudioEffectCapture now lives standalone, and no longer requires a node or auxiliary class to use.
