Work on whistle algorithm (#477)
* Work on whistle algorithm

Issue #470 Endeavour to fix an error where an event was found outside the search frequency band. I believe this error was caused by the rounding of the search bounds from Hertz to a frequency bin. I have endeavored to tighten up the conversion of the Hertz bounds to frequency bins. Also added more components to the unit test for whistle events.
Also added two lines to write spectrogram results for visualization. This is a temporary fix - to be removed later.

* Fixed two unit tests. And fixed inconsistent variable use.

Issue #470 Fixed the broken whistle unit test that resulted from changing the names of two variables. Fixed the broken Australasian Pipit test that resulted from previous changes, as noted by Truskinger.

In addition, added two new parameters, SearchbandMinHertz and SearchbandMaxHertz, because the parameters MinHertz and MaxHertz were being used to specify two different kinds of bound: the min and max bounds of the search band in which to find events, and the min and max bounds of an actual event. This issue has been fixed for the Whistle and the Whip generic events, but I request Truskinger to determine if he is happy with how this has been done before proceeding with other generic events where this problem also occurs.

* Update MinAndMaxBandwidthParameters.cs

#470 Add more detailed summary to variables SearchbandMinHertz and SearchbandMaxHertz.

* Revert two changes of variable names

Issue #477
Removed the two variables that set min and max search bounds. Instead added comments to the existing variables MinHertz and MaxHertz indicating that these variables are in fact used to set the min and max search bounds, i.e. they are not the bounds of an acoustic event except where the discovered events occupy the entire search bandwidth.

* Corrections as requested by Anthony.

Issue #447 I have also written an extensive class summary for OneBinTrackAlgorithm i.e. the algorithm used to find whistles.
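
The Hertz-to-bin conversion that the first commit above tightens up can be sketched as follows. This is a minimal illustration modelled on the new code in OnebinTrackAlgorithm.cs, not the repository's own helper: the class name SearchBandSketch and the method ToSearchBins are hypothetical, and the default sideband allowance of six bins is copied from the diff.

```csharp
using System;

public static class SearchBandSketch
{
    // Hypothetical helper (not part of the repository): converts search-band bounds
    // in Hertz to inclusive frequency-bin indices, flooring rather than rounding.
    public static (int MinBin, int MaxBin) ToSearchBins(
        int minHertz, int maxHertz, int nyquist, int binCount, int topSideband = 6)
    {
        double binWidth = nyquist / (double)binCount;

        // Bottom of the search band: never lower than bin 1, because the
        // whistle profile reads bin - 1.
        int minBin = Math.Max(1, (int)Math.Floor(minHertz / binWidth));

        // Top of the search band: step one bin inside the bound and leave
        // room for the top sideband (bins +3, +4, +5) within the spectrogram.
        int maxBin = Math.Min(binCount - topSideband, (int)Math.Floor(maxHertz / binWidth) - 1);

        return (minBin, maxBin);
    }
}
```

With a Nyquist of 8000 Hz and 256 bins (bin width 31.25 Hz), a 3020 Hz upper bound falls at 96.64 bins. Math.Round would map it to bin 97, whose lower edge (3031.25 Hz) already lies above the requested bound, whereas the floor-based conversion stays inside the band.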
towsey authored May 24, 2021
1 parent 2ffce6f commit 9a2b560
Showing 7 changed files with 204 additions and 56 deletions.
@@ -37,7 +37,7 @@ PostProcessing:
# Step 2: Combine possible syllable sequences and filter on excess syllable count.
# Step 3: Remove events whose bandwidth is too small or large.
# Step 4: Remove events that have excessive noise in their side-bands.
PostProcessInDecibelGroups: true
PostProcessInDecibelGroups: false
# 1: Combine overlapping events
CombineOverlappingEvents: true

12 changes: 9 additions & 3 deletions src/AudioAnalysisTools/CommonParameters.cs
@@ -47,13 +47,19 @@ public abstract class CommonParameters : IValidatableObject
/// </summary>
public double? BgNoiseThreshold { get; set; }

/// <summary>snr
/// Gets or sets the bottom bound of the rectangle. Units are Hertz.
/// <summary>
/// Gets or sets the bottom bound of a search band. Units are Hertz.
/// A search band is the frequency band within which an algorithm searches for a particular track or event.
/// This is to be carefully distinguished from the top and bottom bounds of a specific event.
/// A search band consists of two parallel lines/frequency bins.
/// An event is represented by a rectangle.
/// Events will/should always lie within a search band. There may be exceptions in edge cases, i.e. where an event sits on a search bound.
/// </summary>
public int? MinHertz { get; set; }

/// <summary>
/// Gets or sets the the top bound of the rectangle. Units are Hertz.
/// Gets or sets the top bound of a search band. Units are Hertz.
/// A search band is the frequency band within which an algorithm searches for a particular track or event.
/// </summary>
public int? MaxHertz { get; set; }
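
The distinction drawn in the summary above, between the bounds of a search band and the bounds of an individual event, can be illustrated with a small check. This is a sketch only: SearchBandCheck and EventLiesWithinSearchBand are hypothetical names that do not appear in the repository, and an event that sits exactly on a search bound is treated here as lying inside the band.

```csharp
using System;

public static class SearchBandCheck
{
    // Hypothetical helper for illustration: the search band is the pair
    // (searchMinHertz, searchMaxHertz); the event is a rectangle whose
    // frequency extent is (eventLowHertz, eventHighHertz).
    public static bool EventLiesWithinSearchBand(
        double eventLowHertz, double eventHighHertz, int searchMinHertz, int searchMaxHertz)
    {
        if (eventLowHertz > eventHighHertz)
        {
            throw new ArgumentException("Event low bound must not exceed its high bound.");
        }

        // Inclusive comparison: an event may sit on a search bound.
        return eventLowHertz >= searchMinHertz && eventHighHertz <= searchMaxHertz;
    }
}
```

An event only shares its bounds with the search band in the special case where it fills the entire search bandwidth.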

6 changes: 4 additions & 2 deletions src/AudioAnalysisTools/Tracks/MinAndMaxBandwidthParameters.cs
@@ -11,17 +11,19 @@ namespace AnalysisPrograms.Recognizers.Base
public class MinAndMaxBandwidthParameters : CommonParameters
{
/// <summary>
/// Gets or sets the minimum bandwidth, units = Hertz.
/// Gets or sets the minimum allowed bandwidth of a spectrogram track or event, units = Hertz.
/// </summary>
public int? MinBandwidthHertz { get; set; }

/// <summary>
/// Gets or sets maximum bandwidth, units = Hertz.
/// Gets or sets the maximum allowed bandwidth of a spectrogram track or event, units = Hertz.
/// </summary>
public int? MaxBandwidthHertz { get; set; }

public override IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
{
yield return this.MinHertz.ValidateNotNull(nameof(this.MinHertz));
yield return this.MaxHertz.ValidateNotNull(nameof(this.MaxHertz));
yield return this.MinBandwidthHertz.ValidateNotNull(nameof(this.MinBandwidthHertz));
yield return this.MaxBandwidthHertz.ValidateNotNull(nameof(this.MaxBandwidthHertz));

109 changes: 82 additions & 27 deletions src/AudioAnalysisTools/Tracks/OnebinTrackAlgorithm.cs
@@ -17,6 +17,26 @@ namespace AudioAnalysisTools.Tracks
using TowseyLibrary;
using TrackType = AudioAnalysisTools.Events.Tracks.TrackType;

/// <summary>
/// This class searches a spectrogram for whistles, that is, for tones or spectral peaks that persist in one frequency bin.
/// In practice, the whistles of birds and other natural sources do not occupy a single frequency bin,
/// although this statement is confounded by the choice of recording sample rate and frame size.
/// But typically, a bird whistle spreads itself across three or more frequency bins using typical values for SR etc.
/// In this class, we make an assumption about the spectral profile of a whistle and the user is expected to find the appropriate
/// sample rate, frame size and frame step such that the target whistle is detected using the profile.
/// We define a whistle profile that is 11 bins wide. The actual whistle occupies the centre three bins, i.e. bins -1, 0, +1.
/// Bins -2 and +2 are ignored to allow for some flexibility in getting the right combination of sample rate, frame size and frame step.
/// To establish that the centre three bins contain a spectral peak (i.e. are part of a potential whistle),
/// we define top and bottom sidebands, each of width three bins.
/// These are used to establish a baseline intensity which must be less than that of the centre three bins.
/// The bottom sideband = bins -3, -4, -5. The top sideband = bins +3, +4, +5.
/// Defining a whistle this way introduces edge effects at the top and bottom of the spectrogram.
/// In the case of the low frequency edge, in order to get as close as possible to frequency bin zero, we do not incorporate a bottom sideband into the calculations.
/// Also note that a typical bird whistle is not exactly a pure tone. It typically fluctuates slightly from one frequency bin to an adjacent bin and back.
/// Consequently a final step in this whistle detection algorithm is to merge adjacent whistle tracks.
/// The algorithm is not perfect but it does detect constant tone sounds. This algorithm is designed so as not to pick up chirps,
/// i.e. gradually rising and falling tones. However, here again the right choice of SR, frame size and frame step are important.
/// </summary>
public static class OnebinTrackAlgorithm
{
private static readonly ILog Log = LogManager.GetLogger(MethodBase.GetCurrentMethod().DeclaringType);
@@ -40,6 +60,11 @@ public static (List<EventCommon> Events, List<Plot> DecibelPlots) GetOnebinTrack
segmentStartOffset,
decibelThreshold.Value);

foreach (var ev in events)
{
ev.Name = profileName;
}

spectralEvents.AddRange(events);

var plot = Plot.PreparePlot(decibelArray, $"{profileName} (Whistles:{decibelThreshold.Value:F0}dB)", decibelThreshold.Value);
@@ -49,72 +74,100 @@ public static (List<EventCommon> Events, List<Plot> DecibelPlots) GetOnebinTrack
}

/// <summary>
/// This method returns whistle (spectral peak) tracks enclosed in spectral events.
/// This method returns whistle (spectral peak) tracks enclosed as spectral events.
/// It averages dB log values incorrectly but it is faster than doing many log conversions.
/// </summary>
/// <param name="sonogram">The spectrogram to be searched.</param>
/// <param name="spectrogram">The spectrogram to be searched.</param>
/// <param name="parameters">The parameters that determine the search.</param>
/// <param name="segmentStartOffset">Enables assignment of a start time (relative to recording) to any valid event.</param>
/// <param name="decibelThreshold">The threshold for detection of a track.</param>
/// <returns>A list of acoustic events containing whistle tracks.</returns>

public static (List<EventCommon> ListOfevents, double[] CombinedIntensityArray) GetOnebinTracks(
SpectrogramStandard sonogram,
SpectrogramStandard spectrogram,
OnebinTrackParameters parameters,
TimeSpan segmentStartOffset,
double decibelThreshold)
{
var sonogramData = sonogram.Data;
int frameCount = sonogramData.GetLength(0);
int binCount = sonogramData.GetLength(1);
int nyquist = sonogram.NyquistFrequency;
var spectroData = spectrogram.Data;
int frameCount = spectroData.GetLength(0);
int binCount = spectroData.GetLength(1);
int nyquist = spectrogram.NyquistFrequency;
double binWidth = nyquist / (double)binCount;
int minBin = (int)Math.Round(parameters.MinHertz.Value / binWidth);
int maxBin = (int)Math.Round(parameters.MaxHertz.Value / binWidth);

// calculate the frequency bin for bottom of search band
// Allow for whistle sideband = one bin
int minSearchBin = (int)Math.Floor(parameters.MinHertz.Value / binWidth);
if (minSearchBin < 1)
{
minSearchBin = 1;
}

// calculate the frequency bin for top of search band, allowing for the top sideband.
// see class summary above.
int topSideband = 6;
int maxSearchBin = (int)Math.Floor(parameters.MaxHertz.Value / binWidth) - 1;
if (maxSearchBin > binCount - topSideband)
{
maxSearchBin = binCount - topSideband;
}

// get max and min duration for the whistle event.
double minDuration = parameters.MinDuration.Value;
double maxDuration = parameters.MaxDuration.Value;

var converter = new UnitConverters(
segmentStartOffset: segmentStartOffset.TotalSeconds,
sampleRate: sonogram.SampleRate,
frameSize: sonogram.Configuration.WindowSize,
frameOverlap: sonogram.Configuration.WindowOverlap);
sampleRate: spectrogram.SampleRate,
frameSize: spectrogram.Configuration.WindowSize,
frameOverlap: spectrogram.Configuration.WindowOverlap);

//Find all bin peaks and place in peaks matrix
var peaks = new double[frameCount, binCount];
// tf = timeframe and bin = frequency bin.
var peaksMatrix = new double[frameCount, binCount];
for (int tf = 0; tf < frameCount; tf++)
{
for (int bin = minBin + 1; bin < maxBin - 1; bin++)
for (int bin = minSearchBin; bin <= maxSearchBin; bin++)
{
if (sonogramData[tf, bin] < decibelThreshold)
//skip spectrogram cells below threshold
if (spectroData[tf, bin] < decibelThreshold)
{
continue;
}

// here we define the amplitude profile of a whistle. The buffer zone around whistle is five bins wide.
var bandIntensity = ((sonogramData[tf, bin - 1] * 0.5) + sonogramData[tf, bin] + (sonogramData[tf, bin + 1] * 0.5)) / 2.0;
var topSidebandIntensity = (sonogramData[tf, bin + 3] + sonogramData[tf, bin + 4] + sonogramData[tf, bin + 5]) / 3.0;
// Here we define the amplitude profile of a whistle. The profile is 11 bins wide.
// The whistle occupies the centre three bins, i.e. bins -1, 0, +1. Bins -2 and +2 are ignored.
// Top and bottom sidebands, each of width three bins, are used to establish a baseline intensity.
// The bottom sideband = bins -3, -4, -5. The top sideband = bins +3, +4, +5.
// For more detail see the class summary.
var bandIntensity = ((spectroData[tf, bin - 1] * 0.5) + spectroData[tf, bin] + (spectroData[tf, bin + 1] * 0.5)) / 2.0;
var topSidebandIntensity = (spectroData[tf, bin + 3] + spectroData[tf, bin + 4] + spectroData[tf, bin + 5]) / 3.0;
var netAmplitude = 0.0;
if (bin < 4)
if (bin < 5)
{
// if bin < 5, i.e. too close to the bottom bin of the spectrogram, then only subtract intensity of the top sideband.
// see class summary above.
netAmplitude = bandIntensity - topSidebandIntensity;
}
else
{
var bottomSideBandIntensity = (sonogramData[tf, bin - 3] + sonogramData[tf, bin - 4] + sonogramData[tf, bin - 5]) / 3.0;
var bottomSideBandIntensity = (spectroData[tf, bin - 3] + spectroData[tf, bin - 4] + spectroData[tf, bin - 5]) / 3.0;
netAmplitude = bandIntensity - ((topSidebandIntensity + bottomSideBandIntensity) / 2.0);
}

if (netAmplitude >= decibelThreshold)
{
peaks[tf, bin] = sonogramData[tf, bin];
peaksMatrix[tf, bin] = spectroData[tf, bin];
}
}
}

var tracks = GetOnebinTracks(peaks, minDuration, maxDuration, decibelThreshold, converter);
var tracks = GetOnebinTracks(peaksMatrix, minDuration, maxDuration, decibelThreshold, converter);

// Initialise tracks as events and get the combined intensity array.
var events = new List<WhistleEvent>();
var combinedIntensityArray = new double[frameCount];
var scoreRange = new Interval<double>(0, decibelThreshold * 5);
int scalingFactor = 5; // used to make plot easier to interpret.
var scoreRange = new Interval<double>(0, decibelThreshold * scalingFactor);

foreach (var track in tracks)
{
@@ -138,14 +191,16 @@ public static (List<EventCommon> ListOfevents, double[] CombinedIntensityArray)
{
SegmentStartSeconds = segmentStartOffset.TotalSeconds,
SegmentDurationSeconds = frameCount * converter.SecondsPerFrameStep,
Name = "Whistle",
Name = "Whistle", // this name can be overridden later.
};

events.Add(ae);
}

// This algorithm tends to produce temporally overlapped whistle events in adjacent channels.
// Combine overlapping whistle events
// This is because a typical bird whistle is not exactly horizontal.
// Combine overlapping whistle events if they are within four frequency bins of each other.
// The value 4 is somewhat arbitrary but is consistent with the whistle profile described in the class comments above.
var hertzDifference = 4 * binWidth;
var whistleEvents = WhistleEvent.CombineAdjacentWhistleEvents(events, hertzDifference);

@@ -169,8 +224,8 @@ public static List<Track> GetOnebinTracks(double[,] peaks, double minDuration, d
var tracks = new List<Track>();

// Look for possible track starts and initialise as track.
// Cannot include edge rows & columns because of edge effects.
// Each row is a time frame which is a spectrum. Each column is a frequency bin
// Cannot include the three edge columns/frequency bins because of edge effects when determining a valid peak.
for (int row = 0; row < frameCount; row++)
{
for (int col = 3; col < bandwidthBinCount - 3; col++)
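
The whistle profile described in the OnebinTrackAlgorithm class summary can be sketched for a single time frame as below. This is an illustrative reduction of the loop body in GetOnebinTracks to a one-dimensional spectrum, assuming decibel values; WhistleProfileSketch and NetAmplitude are hypothetical names, and the caller is assumed to keep `bin` within the search bins so that `bin - 1` and `bin + 5` are valid indices.

```csharp
using System;

public static class WhistleProfileSketch
{
    // Hypothetical helper: net amplitude (dB) of the 11-bin whistle profile
    // centred on `bin` within one spectrum (one time frame).
    // Assumes 1 <= bin and bin + 5 < spectrum.Length.
    public static double NetAmplitude(double[] spectrum, int bin)
    {
        // Centre three bins (-1, 0, +1), with the outer two at half weight.
        double bandIntensity =
            ((spectrum[bin - 1] * 0.5) + spectrum[bin] + (spectrum[bin + 1] * 0.5)) / 2.0;

        // Top sideband = bins +3, +4, +5. Bins -2 and +2 are ignored.
        double topSideband = (spectrum[bin + 3] + spectrum[bin + 4] + spectrum[bin + 5]) / 3.0;

        if (bin < 5)
        {
            // Too close to bin zero for a bottom sideband: use the top one only.
            return bandIntensity - topSideband;
        }

        // Bottom sideband = bins -3, -4, -5.
        double bottomSideband = (spectrum[bin - 3] + spectrum[bin - 4] + spectrum[bin - 5]) / 3.0;
        return bandIntensity - ((topSideband + bottomSideband) / 2.0);
    }
}
```

A bin is kept as a candidate peak when this net amplitude reaches the decibel threshold. As the method summary in the diff notes, averaging decibel values this way is not strictly correct, but it avoids many log conversions.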
8 changes: 4 additions & 4 deletions src/AudioAnalysisTools/Tracks/UpwardTrackAlgorithm.cs
@@ -70,8 +70,8 @@ public static (List<EventCommon> Events, double[] CombinedIntensity) GetUpwardTr
var frameStep = sonogram.FrameStep;
int nyquist = sonogram.NyquistFrequency;
double binWidth = nyquist / (double)binCount;
int minBin = (int)Math.Round(parameters.MinHertz.Value / binWidth);
int maxBin = (int)Math.Round(parameters.MaxHertz.Value / binWidth);
int minSearchBin = (int)Math.Round(parameters.MinHertz.Value / binWidth);
int maxSearchBin = (int)Math.Round(parameters.MaxHertz.Value / binWidth);
var minBandwidthHertz = parameters.MinBandwidthHertz ?? throw new ArgumentNullException($"{nameof(UpwardTrackParameters.MinBandwidthHertz)} must be set. Check your config file?");
var maxBandwidthHertz = parameters.MaxBandwidthHertz ?? throw new ArgumentNullException($"{nameof(UpwardTrackParameters.MinBandwidthHertz)} must be set. Check your config file?");

@@ -90,7 +90,7 @@ public static (List<EventCommon> Events, double[] CombinedIntensity) GetUpwardTr
var peaks = new double[frameCount, binCount];
for (int row = 1; row < frameCount - 1; row++)
{
for (int col = minBin; col < maxBin; col++)
for (int col = minSearchBin; col < maxSearchBin; col++)
{
if (sonogramData[row, col] < decibelThreshold)
{
@@ -107,7 +107,7 @@ public static (List<EventCommon> Events, double[] CombinedIntensity) GetUpwardTr
}

//NOTE: the Peaks matrix is same size as the sonogram.
var tracks = GetUpwardTracks(peaks, minBin, maxBin, minBandwidthHertz, maxBandwidthHertz, decibelThreshold, converter);
var tracks = GetUpwardTracks(peaks, minSearchBin, maxSearchBin, minBandwidthHertz, maxBandwidthHertz, decibelThreshold, converter);

// initialise tracks as events and get the combined intensity array.
var events = new List<SpectralEvent>();
@@ -66,15 +66,16 @@ public void TestRecognizer()
// events[2] should be a composite event.
var ev = (CompositeEvent)events[2];
Assert.IsInstanceOfType(events[2], typeof(CompositeEvent));
Assert.AreEqual(22, ev.EventStartSeconds, TestHelper.AllowedDelta);
Assert.AreEqual(22, ev.EventEndSeconds, TestHelper.AllowedDelta);
Assert.AreEqual(22.000, ev.EventStartSeconds, TestHelper.AllowedDelta);
Assert.AreEqual(22.368, ev.EventEndSeconds, TestHelper.AllowedDelta);
Assert.AreEqual(4743, ev.BandWidthHertz);

var componentEvents = ev.ComponentEvents;
Assert.AreEqual(3, componentEvents.Count);
//Assert.AreEqual(3, componentEvents.Count);
Assert.AreEqual(13, componentEvents.Count);

// This tests that the component tracks are correctly combined.
//This can also be tested somewhere else, starting with just the comosite event in json file.
//This can also be tested somewhere else, starting with just the composite event in json file.
var points = EventExtentions.GetCompositeTrack(componentEvents.Cast<WhipEvent>()).ToArray();
Assert.AreEqual(22.016, points[1].Seconds.Minimum, TestHelper.AllowedDelta);
Assert.AreEqual(5456, points[1].Hertz.Minimum);