Describe the problem or limitation you are having in your project
Previously, there was no way to capture microphone input, or the output of an audio bus, in a reliable and performant way.
Describe the feature / enhancement and how it helps to overcome the problem or limitation
This proposal is to expose a simple interface: attach an AudioEffectCapture to the effect bus. The audio is shared via an internal RingBuffer to a StreamAudio node, and the application can then collect AudioFrames in bulk, either from a callback such as _process() (less ideal) or from a dedicated thread (better, lower latency), and process them as needed.
This solves the problem by bridging the gap between the audio effect bus (carrying mic input or the result of other in-game sounds) and application processing code such as a VoIP engine.
Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams
An example project is available here: https://github.com/lyuma/godot_voip_experimental (godot4 branch)
To run, play multiple copies of the project. Have one copy host, and the other copy join. You should see the other client in the room and audio should transmit.
About the proposal itself:
AudioEffectCapture is an AudioEffect reference type which basically contains a ring buffer. When attached to the bus, it will begin collecting frames in its internal ring buffer.
StreamAudio is the node counterpart to the AudioEffectStream, extensible by GDScript and accessible in the main thread. Audio is collected by AudioEffectStream and exposed via StreamAudio through use of a shared RingBuffer reference.
Viewed as a sort of "audio pipe", the AudioEffectCaptureInstance is the write end of the pipe, collecting audio frames from the attached bus, and the AudioEffectCapture resource is the read end of the pipe, interacting with application code.
Application code (usually GDNative or a Godot Module) will then consume data from this ring buffer and process it as needed, possibly providing application-defined effects, or transmitting it over the network.
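To make the "audio pipe" picture concrete, here is a minimal C++ sketch of the two ends sharing one fixed-size buffer. It is not the code from the linked branches: `AudioFrame`, `SharedRing`, `CaptureInstance`, `CaptureEffect` and all method names are hypothetical stand-ins, and the sketch assumes exactly one writer (the audio thread) and one reader (application code).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for the engine's stereo frame type.
struct AudioFrame {
    float l;
    float r;
};

// Fixed-size storage shared by both ends of the pipe (illustrative only).
// Safe for exactly one writer thread and one reader thread.
struct SharedRing {
    explicit SharedRing(std::size_t capacity) : data(capacity) {
        assert(capacity > 0);
    }
    std::vector<AudioFrame> data;
    std::atomic<std::uint64_t> written{0};  // total frames ever written
    std::atomic<std::uint64_t> consumed{0}; // total frames ever read
};

// Write end: runs on the audio thread and never blocks.
class CaptureInstance {
public:
    explicit CaptureInstance(std::shared_ptr<SharedRing> ring) : ring_(std::move(ring)) {}

    void process(const AudioFrame *src, AudioFrame *dst, std::size_t n) {
        // Pass the audio through unchanged so the bus output is unaffected.
        for (std::size_t i = 0; i < n; ++i) dst[i] = src[i];

        const std::uint64_t w = ring_->written.load(std::memory_order_relaxed);
        const std::uint64_t r = ring_->consumed.load(std::memory_order_acquire);
        const std::size_t cap = ring_->data.size();
        if (cap - static_cast<std::size_t>(w - r) < n) {
            discarded_ += n; // reader fell behind: drop the chunk, do not wait
            return;
        }
        for (std::size_t i = 0; i < n; ++i) ring_->data[(w + i) % cap] = src[i];
        ring_->written.store(w + n, std::memory_order_release);
    }

    // Meant to be read from the audio thread only.
    std::uint64_t discarded() const { return discarded_; }

private:
    std::shared_ptr<SharedRing> ring_;
    std::uint64_t discarded_ = 0;
};

// Read end: the object application code talks to.
class CaptureEffect {
public:
    explicit CaptureEffect(std::size_t capacity_frames)
        : ring_(std::make_shared<SharedRing>(capacity_frames)) {}

    // Hands the same ring to the write end.
    CaptureInstance instantiate() const { return CaptureInstance(ring_); }

    std::size_t frames_available() const {
        return static_cast<std::size_t>(
            ring_->written.load(std::memory_order_acquire) -
            ring_->consumed.load(std::memory_order_acquire));
    }

    // Removes and returns up to max_frames frames, oldest first.
    std::vector<AudioFrame> get_buffer(std::size_t max_frames) {
        const std::uint64_t r = ring_->consumed.load(std::memory_order_relaxed);
        const std::uint64_t w = ring_->written.load(std::memory_order_acquire);
        std::size_t n = static_cast<std::size_t>(w - r);
        if (n > max_frames) n = max_frames;
        std::vector<AudioFrame> out(n);
        const std::size_t cap = ring_->data.size();
        for (std::size_t i = 0; i < n; ++i) out[i] = ring_->data[(r + i) % cap];
        ring_->consumed.store(r + n, std::memory_order_release);
        return out;
    }

private:
    std::shared_ptr<SharedRing> ring_;
};
```

In this sketch the audio thread only ever calls `process()`, and application code only ever calls `frames_available()` and `get_buffer()`, so neither side needs a mutex. When the reader falls behind, the writer drops the incoming chunk rather than waiting, which is one possible overflow policy and not necessarily the one the real implementation uses.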
VoIP and other real-time audio processing scenarios have tight latency tolerances, so here is how this solution approaches those demands:
reliable: Using a ring buffer permits application code to safely process audio frames outside of the audio effect bus, with no risk of locking the AudioServer threads.
Additionally, the ring buffer is of fixed size, and will not consume infinite memory if misused.
performant: we use a non-locking ring buffer to allow audio frames to be transmitted in bulk from the audio server to application code.
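As a rough illustration of the fixed-size point, the arithmetic below shows how a buffer duration translates into a bounded number of frames and bytes. The helper names, the two-float `AudioFrame` layout, the rounding up to a power of two, and the 0.1 s / 44100 Hz figures used in the note afterwards are all assumptions made for this example, not values taken from the implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical stereo frame: two 32-bit floats.
struct AudioFrame {
    float l;
    float r;
};

// Smallest power of two that is >= v (assumed rounding policy).
inline std::size_t next_power_of_two(std::size_t v) {
    std::size_t p = 1;
    while (p < v) p <<= 1;
    return p;
}

// Number of frames the ring must hold to cover `seconds` of audio.
inline std::size_t ring_capacity_frames(double seconds, unsigned mix_rate_hz) {
    assert(seconds > 0.0 && mix_rate_hz > 0);
    const std::size_t wanted =
        static_cast<std::size_t>(std::ceil(seconds * mix_rate_hz));
    return next_power_of_two(wanted);
}

// Memory used by the frame storage; this is the upper bound on growth.
inline std::size_t ring_bytes(std::size_t frames) {
    return frames * sizeof(AudioFrame);
}

// Longest stretch of audio that can be waiting in the buffer at once.
inline double max_buffered_seconds(std::size_t frames, unsigned mix_rate_hz) {
    return static_cast<double>(frames) / mix_rate_hz;
}
```

Under these assumptions, asking for 0.1 s at 44100 Hz needs 4410 frames, which rounds up to 8192 frames, or 64 KiB of frame storage. That bound holds no matter how slowly the application drains the buffer, because frames that do not fit are simply not stored.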
If this enhancement will not be used often, can it be worked around with a few lines of script?
The closest workaround is to misuse AudioEffectRecord for this purpose. It is a misuse because the primary function of AudioEffectRecord is to record to a file on disk.
AudioEffectRecord is also unreliable and even crash-prone, as it was not designed for infinite length data streams, and was also not designed for data to be polled in real time.
We have found no other way to access the audio bus safely. Even if a Godot module were used to provide this functionality, we would still need, at a minimum, the implementation of AudioEffectCapture.
Is there a reason why this should be core and not an add-on in the asset library?
Real-time audio processing is a necessity for some game genres. Additionally, VoIP is a common requirement or use case in many multiplayer games.
While VoIP itself might not be desirable as a core feature, this proposal needs to be in core, because it enables VoIP frameworks to be created as a custom module, or built using a combination of GDScript and GDNative, and distributed via the Asset Library.
You can output to an audio stream using AudioStreamGenerator (name may not be great, suggestions accepted for renaming in 4.0). To capture from a bus, I was thinking an AudioEffectCapture might make more sense, using a similar API to Generator (frame based or chunk based).
lyuma changed the title from "AudioEffectStream: attach a scriptable audio processing pipeline to a bus." to "AudioEffectCapture: attach a scriptable audio processing pipeline to a bus." on Jan 30, 2021
Describe the project you are working on
Godot Speech, a VoIP module for Godot
To start with, I will point at an existing implementation of this system for Master.
Godot engine master branch with the following changes applied:
https://github.com/lyuma/godot/tree/audio_effect_capture (An implementation of this proposal)
https://github.com/lyuma/godot/tree/speech_master (A port of our GDNative plugin to a godot module, as GDNative is currently non-functional in master. Thanks, fire for the help!)