Implement "mono" AudioEffectRecord to ease integration with speech recognition 3rd party library #7105

Open
allan-simon opened this issue Jun 17, 2023 · 4 comments

Comments

@allan-simon

Describe the project you are working on

A 2D space shooter aimed at children, to help them learn addition, multiplication and subtraction.

The twist is that the enemies have a basic math operation on them, like 2 x 4.

Your player has an energy ball weapon, where each energy ball has a number on it.

If you shoot the 2 x 4 enemy with an 8 energy ball, it explodes; otherwise it is immune to it.

The goal is to help children work out the operation to know which number to shoot at which enemy.

image

Describe the problem or limitation you are having in your project

To make the UX as smooth as possible, I want the player to say the number they want to shoot out loud.

So I've written a GDExtension that links with Vosk: https://alphacephei.com/vosk/

However, Vosk, like most if not all speech recognition libraries, works on mono PCM, while AudioEffectRecord produces stereo PCM.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

Add a "stereo/mono" property (checkbox or enum) to AudioEffectRecord that lets you choose between the two.

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

  1. A stereo/mono property (a boolean?) that replaces the hardcoded boolean here: https://github.com/godotengine/godot/blob/master/servers/audio/effects/audio_effect_record.cpp#L206
  2. If mono is selected, we reuse the same code as in https://github.com/godotengine/godot/blob/master/editor/import/resource_importer_wav.cpp#L447-L458

(I actually wonder whether one could simply use the left channel rather than computing a mean, as I'm pretty sure both channels are equal for a microphone anyway?)
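To make the two options above concrete, here is a minimal standalone C++ sketch of both approaches: averaging the two channels, and keeping only the left one. It is not Godot's code and does not reproduce the linked importer; the function names are hypothetical, and it assumes the input is interleaved stereo floats in [-1, 1] and that the consumer wants signed 16-bit mono PCM.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; not part of Godot or Vosk.

// Convert one float sample in [-1, 1] to signed 16-bit PCM, clamping out-of-range input.
static std::int16_t to_pcm16(float sample) {
    const float clamped = std::max(-1.0f, std::min(1.0f, sample));
    return static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
}

// Downmix interleaved stereo (L, R, L, R, ...) to mono by averaging each frame.
// A trailing unpaired sample, if any, is ignored.
std::vector<std::int16_t> downmix_to_mono_pcm16(const std::vector<float> &interleaved_stereo) {
    const std::size_t frame_count = interleaved_stereo.size() / 2;
    std::vector<std::int16_t> mono;
    mono.reserve(frame_count);
    for (std::size_t i = 0; i < frame_count; ++i) {
        const float left = interleaved_stereo[2 * i];
        const float right = interleaved_stereo[2 * i + 1];
        mono.push_back(to_pcm16((left + right) * 0.5f));
    }
    return mono;
}

// Alternative: keep only the left channel and drop the right one.
std::vector<std::int16_t> left_channel_to_mono_pcm16(const std::vector<float> &interleaved_stereo) {
    const std::size_t frame_count = interleaved_stereo.size() / 2;
    std::vector<std::int16_t> mono;
    mono.reserve(frame_count);
    for (std::size_t i = 0; i < frame_count; ++i) {
        mono.push_back(to_pcm16(interleaved_stereo[2 * i]));
    }
    return mono;
}
```

If both channels really carry the same signal, the two functions return identical output. They differ as soon as the channels diverge: averaging two out-of-phase channels cancels to silence, while the left-only version ignores the right channel entirely.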

If this enhancement will not be used often, can it be worked around with a few lines of script?

It could be done in GDScript, but I don't think it would be performant, since the conversion has to happen live, especially on mobile.

Is there a reason why this should be core and not an add-on in the asset library?

It's a simple change in a class where it feels natural to find it.

@allan-simon
Author

Another question (my Godot knowledge is pretty limited): why couldn't AudioEffectRecord directly receive a mono stream from AudioStreamMicrophone? 🤔

@fire
Member

fire commented Jun 18, 2023

We created a new tool called AudioEffectCapture because the old tool, AudioEffectRecord, didn't work well when we needed to quickly capture sound. The old tool was better for recording sounds that lasted for several minutes.

Here's a link to learn more about it:

Godot Proposal #2013

@fire
Member

fire commented Jun 20, 2023

I think the better design is an AudioEffect that makes each channel mono. This keeps the original design of dual channels. I was sure it existed, but I can't find it.

@Calinou
Member

Calinou commented Jun 21, 2023

> I think the better design is an AudioEffect that makes each channel mono. This keeps the original design of dual channels. I was sure it existed, but I can't find it.

There's AudioEffectStereoEnhance which can use negative values.
