Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong format on mumble_onAudioInput #6464

Closed
guillermo-menguez-alvarez opened this issue Jun 12, 2024 · 2 comments · Fixed by #6499
Closed

Wrong format on mumble_onAudioInput #6464

guillermo-menguez-alvarez opened this issue Jun 12, 2024 · 2 comments · Fixed by #6499
Labels
bug A bug (error) in the software client

Comments

@guillermo-menguez-alvarez

Description

Acording to the documentation, in the mumble_onAudioInput callback I should get 16 bit PCM audio, but no matter how I save it I don't manage to get the right sound. I'm accumulating 100 calls (480 samples at 48000), which is a second, and I always get the second half of each group of samples (480) as random/unassigned values:

image

Focusing on 10ms, 480 samples:

image

Seems that the first part is "ok" but the second one is just random values.

Maybe I'm doing something wrong, see my code below.

Steps to reproduce

Write a mumble plugin, try to save the audio on the mumble_onAudioInput callback.

Mumble version

1.5.634

Mumble component

Client

OS

Windows

Reproducible?

Yes

Additional information

This is my code for all this:

// Function to write inputPCM as a wav file
void writeInputPCMAsWavFile(short* inputPCM, uint32_t sampleCount, uint16_t channelCount, uint32_t sampleRate) {
    // Open the file
    FILE* file = fopen("C:\\Users\\MenguezG\\Desktop\\input.wav", "wb");
    if (file == NULL) {
        mumbleAPI.log(ownID, "Failed to open file");
        return;
    }

    // Write wav header
    uint16_t bitsPerSample = 16;
    uint32_t byteRate = sampleRate * channelCount * bitsPerSample / 8;
    uint16_t blockAlign = channelCount * bitsPerSample / 8;
    uint32_t dataSize = sampleCount * channelCount * bitsPerSample / 8;
    uint32_t fileSize = 36 + dataSize;

    // Write RIFF header
    fwrite("RIFF", 1, 4, file);
    fwrite(&fileSize, 4, 1, file);
    fwrite("WAVE", 1, 4, file);

    // Write fmt subchunk
    fwrite("fmt ", 1, 4, file);
    uint32_t fmtSize = 16;
    fwrite(&fmtSize, 4, 1, file);
    uint16_t audioFormat = 1;
    fwrite(&audioFormat, 2, 1, file);
    fwrite(&channelCount, 2, 1, file);
    fwrite(&sampleRate, 4, 1, file);
    fwrite(&byteRate, 4, 1, file);
    fwrite(&blockAlign, 2, 1, file);
    fwrite(&bitsPerSample, 2, 1, file);

    // Write data subchunk
    fwrite("data", 1, 4, file);
    fwrite(&dataSize, 4, 1, file);

    // Write audio data
    for (uint32_t i = 0; i < sampleCount; i++) {
        for (uint16_t j = 0; j < channelCount; j++) {
            short sample = inputPCM[i * channelCount + j];
            fwrite(&sample, sizeof(short), 1, file);
        }
    }

    fclose(file);
}


// Global variables to accumulate samples
std::vector<short> accumulatedSamples;
uint32_t accumulatedSampleCount = 0;
static const uint32_t MAX_ACCUMULATED_SAMPLES = 48000;

// Function to accumulate samples and call writeInputPCMAsWavFile
void accumulateAndWriteInputPCM(short* inputPCM, uint32_t sampleCount, uint16_t channelCount, uint32_t sampleRate, bool isSpeech) {
    for (uint32_t i = 0; i < sampleCount; i++) {
        for (uint16_t j = 0; j < channelCount; j++) {
            accumulatedSamples.push_back(inputPCM[i * channelCount + j]);
        }
    }

    accumulatedSampleCount += sampleCount;

    // Check if 48000 samples have been accumulated
    if (accumulatedSamples.size() >= MAX_ACCUMULATED_SAMPLES * channelCount) {
        mumbleAPI.log(ownID, "Writing accumulated samples as a wav file");

        // Call the function to write accumulated samples as a wav file
        writeInputPCMAsWavFile(accumulatedSamples.data(), accumulatedSamples.size() / channelCount, channelCount, sampleRate);

        // Clear the accumulated samples
        accumulatedSamples.clear();
    }
}

// On audio input
bool mumble_onAudioInput(short* inputPCM, uint32_t sampleCount, uint16_t channelCount, uint32_t sampleRate, bool isSpeech) {
    if (isSpeech) {
        std::string message = "Speech detected (number of samples: " + std::to_string(sampleCount) + ", number of channels: " + std::to_string(channelCount) + ", rate: " + std::to_string(sampleRate) + ")";
        mumbleAPI.log(ownID, message.c_str());
    }
    else {
        std::string message = "Noise detected (number of samples: " + std::to_string(sampleCount) + ", number of channels: " + std::to_string(channelCount) + ", rate: " + std::to_string(sampleRate) + ")";
        mumbleAPI.log(ownID, message.c_str());
    }

    // Call the function to accumulate samples and write inputPCM as a wav file
    accumulateAndWriteInputPCM(inputPCM, sampleCount, channelCount, sampleRate, isSpeech);
    //writeInputPCMAsWavFile(inputPCM, sampleCount, channelCount, sampleRate);

    // release the memory
    mumbleAPI.freeMemory(ownID, inputPCM);

    return false;
}

However if I save just 480 samples I get the same. I tried to use libraries to save the wav file, same result. I saved just the PCM data without wav header and loaded it on Audacity as signed 16 bit PCM, interleaved audio at 48k, same result.

Relevant log output

No response

Screenshots

No response

@guillermo-menguez-alvarez guillermo-menguez-alvarez added bug A bug (error) in the software triage This issue is waiting to be triaged by one of the project members labels Jun 12, 2024
@Krzmbrzl
Copy link
Member

Krzmbrzl commented Jun 23, 2024

What is the channelCount that you observe? If it is 2, I think I have an explanation of what is going on:

We pass iFrameSize as sampleCount to the plugin callback:

emit audioInputEncountered(psSource, iFrameSize, iMicChannels, SAMPLE_RATE, bIsSpeech);

However, iFramesize is the total length of the passed audio data buffer as illustrated e.g.
opusBuffer.insert(opusBuffer.end(), psSource, psSource + iFrameSize);

Therefore, I think this is a bug on Mumble side where the passed sampleCount is actually the total buffer length (sampleCount * channelCount). However, I think that Mumble doesn't even support more than a single channel for audio input so if the above is true, I'd have to dig a bit further to see where the downmixing to mono happens 🤔


Some things I noticed in your code:

// release the memory
mumbleAPI.freeMemory(ownID, inputPCM);

You must not free the audio data - it is still used by Mumble itself. Functionally, this just does nothing as the plugin backend does some checking and only deletes objects that it knows have to be deleted. Thus, this call will always return MUMBLE_EC_POINTER_NOT_FOUND.

for (uint32_t i = 0; i < sampleCount; i++) {
   for (uint16_t j = 0; j < channelCount; j++) {
       accumulatedSamples.push_back(inputPCM[i * channelCount + j]);
   }
}

This is functionally equivalent to simply iterating over the sampleCount * channelCount entries of inputPCM and sequentially copying them to accumulatedSamples. Thus, the nested loop is unnecessary.

@Krzmbrzl Krzmbrzl added client needs-more-input and removed triage This issue is waiting to be triaged by one of the project members labels Jun 23, 2024
@guillermo-menguez-alvarez
Copy link
Author

Hi Robert,

You are right, that is exactly the case, channelCount is being passed by mumble as 2, however I've done a quick test now and if I assume there is only one channel (with 480 samples) and save it as mono audio, now I get the audio correctly saved. Thanks a lot!

Also, thank you for your other comments about my code.

Krzmbrzl added a commit that referenced this issue Jul 8, 2024
…llback

In the mumble_onAudioInput callback that plugins can use, the sampleCount parameter was actually wrong for all cases in which the input had more than a single channel. Instead of the sample count per channel, the total sample count (across all channels) was passed along.

Fixes #6464
Krzmbrzl added a commit that referenced this issue Jul 8, 2024
In the mumble_onAudioInput callback that plugins can use, the
sampleCount parameter was actually wrong for all cases in which the
input had more than a single channel. Instead of the sample count per
channel, the total sample count (across all channels) was passed along.

Fixes #6464

(cherry picked from commit e48dd5c)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug A bug (error) in the software client
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants