AgoraIO-Extensions/Agora-Java-Server-SDK

Agora Java Server SDK

System Requirements

Hardware Requirements

  • Operating System: Ubuntu 18.04+ or CentOS 7.0+
  • CPU Architecture: x86-64
  • Performance Requirements:
    • CPU: 8 cores at 1.8 GHz or higher
    • Memory: 2 GB (4 GB+ recommended)
  • Network Requirements:
    • Public IP
    • Access to .agora.io and .agoralab.co domains

Software Requirements

  • Apache Maven or other build tools
  • JDK 8+

Quick Start

Refer to the official example documentation.

SDK Acquisition

API Examples

For detailed examples, please refer to examples/README.md.

API Reference

Basic API Reference

For complete API documentation, please visit the Agora Java Server SDK API Reference.

AgoraAudioVadV2 Class

Overview

AgoraAudioVadV2 is a Voice Activity Detection (VAD) module for processing audio frames. It detects voice activity in an audio stream and handles each frame according to its configuration parameters.

Classes and Methods

AgoraAudioVadV2
Constructor
public AgoraAudioVadV2(AgoraAudioVadConfigV2 config)
  • Parameters

    • config: AgoraAudioVadConfigV2 type, VAD configuration.
    AgoraAudioVadConfigV2 Properties
| Property Name | Type | Description | Default Value | Range |
| --- | --- | --- | --- | --- |
| preStartRecognizeCount | int | Number of audio frames saved before starting speech state | 16 | [0, Integer.MAX_VALUE] |
| startRecognizeCount | int | Number of audio frames in speech state | 30 | [1, Integer.MAX_VALUE] |
| stopRecognizeCount | int | Number of audio frames in stop speech state | 20 | [1, Integer.MAX_VALUE] |
| activePercent | float | Percentage of active frames in startRecognizeCount frames | 0.7 | [0.0, 1.0] |
| inactivePercent | float | Percentage of inactive frames in stopRecognizeCount frames | 0.5 | [0.0, 1.0] |
| startVoiceProb | int | Probability threshold for starting voice detection | 70 | [0, 100] |
| stopVoiceProb | int | Probability threshold for stopping voice detection | 70 | [0, 100] |
| startRmsThreshold | int | RMS threshold for starting voice detection | -50 | [-100, 0] |
| stopRmsThreshold | int | RMS threshold for stopping voice detection | -50 | [-100, 0] |
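One plausible reading of startRecognizeCount together with activePercent is a sliding-window vote: speech starts once enough of the most recent frames were judged active. The sketch below illustrates that reading only. It is an assumption, not the SDK's implementation, and the class WindowVote and its offer method are hypothetical names that do not exist in the SDK.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only: this is NOT the SDK's implementation.
final class WindowVote {
    private final int windowSize;       // plays the role of startRecognizeCount
    private final float requiredRatio;  // plays the role of activePercent
    private final Deque<Boolean> window = new ArrayDeque<>();
    private int activeCount = 0;

    WindowVote(int windowSize, float requiredRatio) {
        if (windowSize < 1) {
            throw new IllegalArgumentException("windowSize must be >= 1");
        }
        this.windowSize = windowSize;
        this.requiredRatio = requiredRatio;
    }

    // Records one per-frame decision. Returns true once the window is full
    // and the share of active frames in it reaches requiredRatio.
    boolean offer(boolean frameActive) {
        window.addLast(frameActive);
        if (frameActive) {
            activeCount++;
        }
        // Drop the oldest decision when the window overflows.
        if (window.size() > windowSize && window.removeFirst()) {
            activeCount--;
        }
        return window.size() == windowSize
                && activeCount >= requiredRatio * windowSize;
    }

    public static void main(String[] args) {
        // Default values from the table: 30 frames, 70% active.
        WindowVote vote = new WindowVote(30, 0.7f);
        boolean started = false;
        for (int i = 0; i < 30; i++) {
            started = vote.offer(i % 4 != 0); // every fourth frame is inactive
        }
        System.out.println("speech started: " + started);
    }
}
```

Under this reading, a larger window or a higher ratio delays the start decision but makes it more robust to isolated noisy frames. The same structure would apply to stopRecognizeCount and inactivePercent with the vote inverted.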
Notes
  • startVoiceProb: The lower the value, the higher the probability that the frame is judged as active, and the earlier the start phase begins. Lower it for more sensitive voice detection.
  • stopVoiceProb: The higher the value, the higher the probability that the frame is judged as inactive, and the earlier the stop phase begins. Increase it for quicker end of voice detection.
  • startRmsThreshold and stopRmsThreshold:
    • The higher the value (closer to 0), the louder a frame must be to count as voice, so detection is less sensitive; lower values make detection more sensitive.
    • In quiet environments, the default value of -50 is recommended.
    • In noisy environments, it can be adjusted to between -40 and -30 to reduce false positives.
    • Fine-tune according to the actual usage scenario and audio characteristics for optimal results.
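When tuning the RMS thresholds, it can help to measure the level of your own audio first. The sketch below computes the RMS level of one frame in dBFS. It assumes 16-bit little-endian PCM and a full-scale reference of 32768; this README does not state how the SDK computes RMS, so treat the result as a rough guide. RmsLevel and rmsDbfs are hypothetical names, not SDK APIs.

```java
// Hypothetical helper, not part of the SDK.
// Assumes 16-bit little-endian PCM and dBFS relative to full scale (32768).
final class RmsLevel {
    static double rmsDbfs(byte[] pcm16le) {
        int samples = pcm16le.length / 2;
        if (samples == 0) {
            return Double.NEGATIVE_INFINITY;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < samples; i++) {
            int lo = pcm16le[2 * i] & 0xFF;   // low byte, unsigned
            int hi = pcm16le[2 * i + 1];      // high byte, sign-extended
            int sample = (hi << 8) | lo;      // signed 16-bit sample
            double normalized = sample / 32768.0;
            sumSquares += normalized * normalized;
        }
        double rms = Math.sqrt(sumSquares / samples);
        // Pure silence has no finite level on a logarithmic scale.
        return rms == 0.0 ? Double.NEGATIVE_INFINITY : 20.0 * Math.log10(rms);
    }
}
```

Measuring a few seconds of background noise this way gives a noise floor to compare against startRmsThreshold and stopRmsThreshold before changing them.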
Methods
public synchronized VadProcessResult processFrame(AudioFrame frame)
  • Parameters
    • frame: AudioFrame type, the audio frame.
  • Returns
    • VadProcessResult type, the result of the VAD process.
public synchronized void destroy()
  • Destroys the VAD module and releases resources.
VadProcessResult

Stores the VAD process result.

Constructor
public VadProcessResult(byte[] result, Constants.VadState state)
  • Parameters
    • result: byte[] type, the processed audio data.
    • state: Constants.VadState type, the current VAD state.
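A common way to consume these results is to accumulate the returned audio between the start and the end of speech. The sketch below shows that pattern with a stand-in enum so that it compiles without the SDK. Only START_SPEAKING and STOP_SPEAKING are named in this README; the NONE and SPEAKING values, and the SegmentCollector class itself, are assumptions made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the SDK.
final class SegmentCollector {
    // Stand-in for Constants.VadState. NONE and SPEAKING are assumed values.
    enum State { NONE, START_SPEAKING, SPEAKING, STOP_SPEAKING }

    private final ByteArrayOutputStream current = new ByteArrayOutputStream();
    private final List<byte[]> segments = new ArrayList<>();

    // Feeds one VAD result into the collector.
    void accept(State state, byte[] data) {
        if (state == State.NONE || data == null) {
            return; // nothing to keep outside of speech
        }
        if (state == State.START_SPEAKING) {
            current.reset(); // begin a fresh segment
        }
        current.write(data, 0, data.length);
        if (state == State.STOP_SPEAKING) {
            segments.add(current.toByteArray()); // close the segment
            current.reset();
        }
    }

    // Returns the completed speech segments collected so far.
    List<byte[]> segments() {
        return segments;
    }
}
```

In application code the stand-in enum would be replaced by Constants.VadState, and accept would be called with the values from result.getState() and result.getResult() after each processFrame call.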

Usage Example

Here is a simple example demonstrating how to use AgoraAudioVadV2 to process audio frames:

import io.agora.rtc.AgoraAudioVadV2;
import io.agora.rtc.AgoraAudioVadConfigV2;
import io.agora.rtc.Constants;
import io.agora.rtc.AudioFrame;
import io.agora.rtc.VadProcessResult;

public class Main {
    public static void main(String[] args) {
        // Create VAD configuration
        AgoraAudioVadConfigV2 config = new AgoraAudioVadConfigV2();
        config.setPreStartRecognizeCount(16);
        config.setStartRecognizeCount(30);
        config.setStopRecognizeCount(20);
        config.setActivePercent(0.7f);
        config.setInactivePercent(0.5f);
        config.setStartVoiceProb(70);
        config.setStopVoiceProb(70);
        config.setStartRmsThreshold(-50);
        config.setStopRmsThreshold(-50);

        // Create VAD instance
        AgoraAudioVadV2 vad = new AgoraAudioVadV2(config);

        // Simulate audio frame processing
        AudioFrame frame = new AudioFrame();
        // Set frame properties...

        VadProcessResult result = vad.processFrame(frame);
        if (result != null) {
            System.out.println("VAD State: " + result.getState());
            System.out.println("Processed Data Length: " + result.getResult().length);
        }

        // Destroy VAD instance
        vad.destroy();
    }
}

Changelog

v4.4.31 (2024-12-23)

New Features

  • Added DomainLimit configuration option in AgoraServiceConfig for domain restriction management.
  • Added VadDumpUtils utility class to support exporting VAD process debug data for troubleshooting.
  • Added AudioConsumerUtils class, providing an optimized PCM data transmission mechanism that helps prevent audio distortion.
  • Modified registerAudioFrameObserver method in AgoraLocalUser to support AgoraAudioVadConfigV2 parameter configuration.
  • Added vadResult parameter in onPlaybackAudioFrameBeforeMixing callback of IAudioFrameObserver to provide more detailed VAD processing results.
  • Added sendAudioMetaData method in AgoraLocalUser class for sending audio metadata.
  • Added onAudioMetaDataReceived callback in ILocalUserObserver class for receiving audio metadata.
  • Added ColorSpace property in the ExternalVideoFrame class to support custom color space configuration.

Performance Improvements

  • Optimized code logic architecture to significantly improve memory efficiency.
  • Fixed multiple memory leak issues to enhance system stability.
  • Enhanced memory access security mechanism to effectively prevent memory corruption.

v4.4.30.2 (2024-11-20)

  • Enhanced the processFrame handling in AgoraAudioVadV2 with new START_SPEAKING and STOP_SPEAKING state callbacks.
  • Improved parameter types for encoded frame callbacks. onEncodedAudioFrameReceived, onEncodedVideoImageReceived, and onEncodedVideoFrame now use ByteBuffer instead of Byte arrays, enhancing performance and flexibility.
  • Optimized VAD plugin startup; enableExtension is now implemented within the SDK, so applications no longer need to call this method manually.
  • Fixed issues with the handling of alphaBuffer and metadataBuffer in VideoFrame.

Developer Notes

  • Please update the code using encoded frame callbacks to accommodate the new ByteBuffer parameter type.
  • If you previously called the enableExtension method for the VAD plugin manually, you can now remove that call.
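For code that still expects byte arrays, one conservative way to adapt to the ByteBuffer parameters is to copy the buffer's remaining bytes inside the callback. The sketch below uses only java.nio. BufferCopy and toByteArray are hypothetical names, and the sketch assumes nothing about whether the SDK passes direct or heap buffers or how long a buffer stays valid after the callback returns.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of the SDK.
final class BufferCopy {
    // Copies the remaining bytes into a new array without moving
    // the position or limit of the caller's buffer.
    static byte[] toByteArray(ByteBuffer buffer) {
        ByteBuffer view = buffer.duplicate(); // independent position and limit
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }
}
```

Copying costs an allocation per frame, so code on a hot path may prefer to read from the ByteBuffer directly once the migration is complete.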

v4.4.30.1 (2024-11-12)

  • Added Vad2 interfaces related to AgoraAudioVad2 and removed Vad interfaces related to AgoraAudioVad.
  • Added a new callback interface for receiving encoded audio frames: IAudioEncodedFrameObserver.
  • Fixed crashes related to LocalAudioDetailedStats callbacks.
  • Modified the parameter types for the onAudioVolumeIndication callback.

v4.4.30 (2024-10-24)

FAQ

If you encounter any issues, please refer to the Documentation Center or search for related issues on GitHub Issues.

Support
