Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue471 check formant gap #498

Merged
merged 21 commits into from
Jun 16, 2021
Merged

Issue471 check formant gap #498

merged 21 commits into from
Jun 16, 2021

Conversation

towsey
Copy link
Contributor

@towsey towsey commented Jun 13, 2021

Title of PR

Detection of Harmonic Events.

What is the purpose of this PR?
Fix error in detection of Harmonic Events.

Fixes #471

Changes

There are three major changes:
1: I detrended the array of dct coefficients before determining the maximum coefficient.
2: I removed coefficients 0 to 3 from contention as maximum coefficients. We expect there to be a minimum of three formants and three or more formants can only be detected by coefficients 4 and above.
3: I fill in gaps of one or two frames in a sequence of harmonics. This is a hack. -> this is the new SmoothingWindow parameter

I also made various other changes to unit tests and refactoring of methods.

Issues

Detection of stacked harmonics still requires careful consideration of the appropriate parameter values.

Visual Changes

There are no visual changes to the spectrograms produced by this recognizer.

Final Checklist

  • [Anthony ] Assign reviewers if you have permission
  • Assign labels if you have permission
  • Link issues related to PR
  • Link any PRs or issues blocking this PR from being merged
  • Remove/Reduce warnings from edited files
  • [ Yes] Unit tests have been added for changes
  • Ensure CI build is passing

atruskie and others added 13 commits June 9, 2021 16:54
Also adds some test assertions for validating objects.

Related to #471
Issue #471 In order to condition the array of DCT coefficient values nicely, I have  removed the linear trend. This helps to lower the long-wavelenth coefficients making it more likely to find the correct maximum.
Issue #471 When searching for harmonic events, sometimes small gaps appear in the harmonic score array which means the events are too short and therefore rejected. This hack fills in gaps of one or two frames. This problem arises because the maximum DCT coefficient is sometimes of a longer wavelength (lower array index) than the the expected wavelength.
Issue #471 Reworked these test. They now both pass. The problem lies in the return DCT coefficients being unduly sensitive to noise. The harmonic recognizer still appears to be sensitive to the minHertz and maxHertz parameter values in ways that I would not expect. THis remains to be checked.
Issue #471 Create a new method to detect harmonic events in a score array. Create new property in the HarmonicEvent class.
Issue #471 Add more detailed comments on the harmonic detection and DCT. Also clean out some unused code.
Issue #471 More explanatory comments.
Issue #471 The Harmonic Algorithm tests are now working as one would expect.
Issue #471 Refactor the method that draws a matrix as an image so that the method returns the image. This had side effect on the Oscillations2010 class.
Issue #471 Create a unit test for the method that draws a matrix of cosine basis functions.
Issue #471 Adjust this unit test following all previous changes to the methods that find harmonic events.
Issue #471 Removed the only useful method from this class and placed in HarmonicPArameters.cs.
THis CrossCorrelations.cs class is now redundant and could possibly be removed except that it contains methods previously used in recognition of crow calls and human speech. However I doubt they of use any longer.
Issue #471 Shifted method DetectHarmonicsInSpectrogramData() to a location where it better belongs.
@codecov
Copy link

codecov bot commented Jun 13, 2021

Codecov Report

Merging #498 (35bbc6c) into master (2099caf) will decrease coverage by 38.01%.
The diff coverage is 0.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #498       +/-   ##
==========================================
- Coverage   38.72%   0.70%   -38.02%     
==========================================
  Files         482     481        -1     
  Lines       67466   48881    -18585     
  Branches     7870    7659      -211     
==========================================
- Hits        26123     344    -25779     
- Misses      41252   48487     +7235     
+ Partials       91      50       -41     
Impacted Files Coverage Δ
...coustics.Shared/Extensions/EnumerableExtensions.cs 0.00% <0.00%> (-67.91%) ⬇️
.../AnalysisPrograms/Recognizers/GenericRecognizer.cs 0.00% <0.00%> (-78.36%) ⬇️
src/AudioAnalysisTools/CrossCorrelation.cs 0.00% <ø> (-45.55%) ⬇️
src/AudioAnalysisTools/DSP/MFCCStuff.cs 0.00% <0.00%> (-74.00%) ⬇️
...c/AudioAnalysisTools/Events/Types/HarmonicEvent.cs 0.00% <0.00%> (ø)
...AudioAnalysisTools/Harmonics/HarmonicParameters.cs 0.00% <0.00%> (-67.65%) ⬇️
...AudioAnalysisTools/Ocillations/Oscillations2010.cs 0.00% <0.00%> (-14.25%) ⬇️
...AudioAnalysisTools/Ocillations/Oscillations2012.cs 0.00% <ø> (-92.97%) ⬇️
src/TowseyLibrary/ImageTools.cs 0.00% <0.00%> (-38.28%) ⬇️
src/TowseyLibrary/MatrixTools.cs 0.00% <ø> (-40.73%) ⬇️
... and 470 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2099caf...35bbc6c. Read the comment docs.

@towsey
Copy link
Contributor Author

towsey commented Jun 13, 2021

I believe the two failing tests are to do with ProcessRunner.

Copy link
Member

@atruskie atruskie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good change.

  • we should adjust for spectrogram smoothing temporal shifts ourselves when we report the events
  • The changes to the specification tests should be rolled back
  • I don't like that we define a harmonic as 3 or more formants (technically only one other frequency is needed other than the fundamental), but if you insist, then we should document the change: edit /docs/technical/apidoc/HarmonicParameters.md .

src/AudioAnalysisTools/Events/Types/HarmonicEvent.cs Outdated Show resolved Hide resolved
Comment on lines 281 to 292
public static List<EventCommon> ConvertScoreArray2Events(
SpectrogramStandard spectrogram,
double[] scores,
double[] dBArray,
int[] maxIndexArray,
double minDuration,
double maxDuration,
int minHz,
int maxHz,
int bandBinCount,
double scoreThreshold,
TimeSpan segmentStartOffset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, this method seems like a copy of an existing method.

  • It seems like you should have passed a constructor function into the particular method you wanted to change?
  • If it must stay as a copy, then it should be improved. E.g. the UnitCoverters are an object that is not meant to be reconstructed in every iteration.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have shifted the UnitConverter as you suggested. I did consider trying to refactor the code so as not to copy previous method. However, the HarmonicEvent detection requires is own treatment within the various loops.

var parameters = new HarmonicParameters
{
MinHertz = 400,
MaxHertz = 6000,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value was set at 5500 - why has it been bumped to 6000?
The 5500 value is measured from the source recording.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The harmonic detection should work without the top and bottom frequency bounds being tight. In fact it is a desirable feature that it should perform well with loose frequency bounds. The TestHarmonicsAlgorithm() in GenericRecognizerTests works well with loose very loose bounds. But this is an artificial example. Your test recording is more difficult to deal with because there appears to be a lot of "structured noise" in the background.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this code with the original values - it almost works.

Comment on lines 57 to 58
MaxFormantGap = 550,
MinFormantGap = 300,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the test still work with tighter bounds? The harmonic interval is 440Hz, I'd really expect tighter bounds to work. Like 400 and 480.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes the test works with tighter bounds. But it is still not exactly 440.

Comment on lines +60 to +63
//Need to make allowance for a longer than actual duration.
//because of smoothing of the spectrogram frames prior to the auto-crosscorrelation.
MinDuration = 1.0,
MaxDuration = 1.16,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking from a user perspective here, really we should adjust for smoothing concerns. That process is invisible to a user. Any user making a recogniser will measure their target event and set a range. The event is one second long.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, even in the debug spectrograms we can see the event is advanced forward by a frame. Another good reason to fix the problem in the algorithm.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made smoothing of the spectrogram prior to detecting harmonics accessible to the user through a "SmoothingWindow" parameter. Its default value is zero, i.e. no smoothing. I have removed all the trimming "fixes" that were previously done to compensate for the smoothing that was done by default.

Comment on lines 98 to 106
{
MinHertz = 400,
MaxHertz = 6000,
MaxFormantGap = 1100,
MinFormantGap = 950,
MinDuration = 0.95,
MaxDuration = 1.17,
DecibelThresholds = new double?[] { this.decibelThreshold },
DctThreshold = 0.3,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comments on parameters from the first test changing from the supplied tests apply here.

double[,] cosineBasisFunctions = MFCCStuff.Cosines(8, 8);

//following line writes matrix of cos values for checking.
FileTools.WriteMatrix2File_Formatted(cosineBasisFunctions, this.TestOutputDirectory.FullName + "Cosines.txt", "F3");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use the SaveTestOutput function and use the Csv.WriteMatrix function

@towsey
Copy link
Contributor Author

towsey commented Jun 15, 2021

Concerning your comment - "I don't like that we define a harmonic as 3 or more formants (technically only one other frequency is needed other than the fundamental)". The harmonic detection relies upon a DCT which can give spurious results where there are other components contributing to the spectrum. The justification for requiring a minimum of three formants is that this gives two intervals. It is very difficult for a DCT to determine the presence of harmonics given only one interval. Requiring two intervals narrows down the possibility for a spurious result. You previously had to set a high formant gap because the DCT was picking up background components at the low index end of the DCT coefficient array. I have had this problem before but not had time to fix it. The things I have done in this round of fixes has helped a lot. But harmonic detection remains prone to error in presence of background noise.

towsey added 4 commits June 15, 2021 12:14
Issue #471 Make spectrogram smoothing accessible to user. The default is no spectrogram smoothing.
Issue $471 Adjust expected values in unit tests. Returned frequency bounds to previous values. Added in the HarmonicInterval as a value to be checked.
Issue #471 Remove hard coded path.
Copy link
Member

@atruskie atruskie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you've introduced new functionality in the change since the last review, which need some tweaks.

You also didn't add to the documentation file the notes about the limits of the harmonic algorithm requiring three or more formants.

/// This is used to run a moving average filter along each of the frequency bins.
/// It can help to smooth over discontinuous formants.
/// </summary>
public int? SmoothingWindow { get; set; } = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should probably default to null if it is nullable

Also add validation to new properties so we can give users nice feedback for bad input

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added new functionality due to your comment about adjusting start/end frames to compensate for smoothing. Making the smoothing explicit means I could do away with hidden adjusting. I will remove the nullable because it is best to have a default value = 0.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do I add validation?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like this:

public override IEnumerable<ValidationResult> Validate(ValidationContext validationContext)
{
return base
.Validate(validationContext)
.Append(
this.BottomHertzBuffer.ValidateNotNull(nameof(this.BottomHertzBuffer)))
.Append(
this.TopHertzBuffer.ValidateNotNull(nameof(this.TopHertzBuffer)));
}
}

/// </summary>
/// <value>
/// The ?average/calculated? gap between formant peaks, in hertz.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the question marks meant I didn't know, clarify please

@@ -53,6 +62,7 @@ public class HarmonicParameters : CommonParameters
spectrogram,
hp.MinHertz.Value,
hp.MaxHertz.Value,
hp.SmoothingWindow.Value,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could be null, don't do .Value rather pass the nullable into the function and use is null as your test

for (int t = 2; t < rowCount - 2; t++)
// Run a moving average filter along each frequency bin.
// This may help to fill noise gaps in formants. Ignore values <3.
if (smoothingWindow > 2)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if (smoothingWindow > 2)
if (smoothingWindow is not null)

assuming you take the above suggestion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is meant to be >2 because there is no point doing a moving average with window = 0 or 1. And the best minimum value if it is to be done is 3.

Copy link
Member

@atruskie atruskie Jun 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, there are two concerns, should we smooth? and is the smoothing value valid?

If you add the validation from the comment above, we can ensure a bad value (<3) is not supplied by the user.

Suggested change
if (smoothingWindow > 2)
if (smoothingWindow is not null) {
var smoothing = Math.Max(3, smoothingWindow.Value);

or

Suggested change
if (smoothingWindow > 2)
if (smoothingWindow is not null) {
if (smoothingWindow.Value < 3) { throw new ArgumentException("cannot be less than 3", nameof(smoothingWindow)' }

dealer's choice

towsey added 2 commits June 15, 2021 14:53
Issue #471 Small changes requested by Anthony.
Issue #471 Editing of the md file for Harmonics, taking account of recent changes.
@towsey
Copy link
Contributor Author

towsey commented Jun 15, 2021

I have just committed a few more changes including changes to the md documentation. I think this is it for this issue.??

Restored Value in xml doc
@atruskie atruskie merged commit dd5e5e1 into master Jun 16, 2021
@atruskie atruskie deleted the Issue471_CheckFormantGap branch June 16, 2021 03:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

The MaxFormantGap parameter is broken or needs clarification
2 participants