-
Notifications
You must be signed in to change notification settings - Fork 12
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue471 check formant gap #498
Conversation
Also adds some test assertions for validating objects. Related to #471
Issue #471 In order to condition the array of DCT coefficient values nicely, I have removed the linear trend. This helps to lower the long-wavelenth coefficients making it more likely to find the correct maximum.
Issue #471 When searching for harmonic events, sometimes small gaps appear in the harmonic score array which means the events are too short and therefore rejected. This hack fills in gaps of one or two frames. This problem arises because the maximum DCT coefficient is sometimes of a longer wavelength (lower array index) than the the expected wavelength.
Issue #471 Reworked these test. They now both pass. The problem lies in the return DCT coefficients being unduly sensitive to noise. The harmonic recognizer still appears to be sensitive to the minHertz and maxHertz parameter values in ways that I would not expect. THis remains to be checked.
Issue #471 Create a new method to detect harmonic events in a score array. Create new property in the HarmonicEvent class.
Issue #471 Add more detailed comments on the harmonic detection and DCT. Also clean out some unused code.
Issue #471 More explanatory comments.
Issue #471 The Harmonic Algorithm tests are now working as one would expect.
Issue #471 Refactor the method that draws a matrix as an image so that the method returns the image. This had side effect on the Oscillations2010 class.
Issue #471 Create a unit test for the method that draws a matrix of cosine basis functions.
Issue #471 Adjust this unit test following all previous changes to the methods that find harmonic events.
Issue #471 Removed the only useful method from this class and placed in HarmonicPArameters.cs. THis CrossCorrelations.cs class is now redundant and could possibly be removed except that it contains methods previously used in recognition of crow calls and human speech. However I doubt they of use any longer.
Issue #471 Shifted method DetectHarmonicsInSpectrogramData() to a location where it better belongs.
Codecov Report
@@ Coverage Diff @@
## master #498 +/- ##
==========================================
- Coverage 38.72% 0.70% -38.02%
==========================================
Files 482 481 -1
Lines 67466 48881 -18585
Branches 7870 7659 -211
==========================================
- Hits 26123 344 -25779
- Misses 41252 48487 +7235
+ Partials 91 50 -41
Continue to review full report at Codecov.
|
I believe the two failing tests are to do with ProcessRunner. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like a good change.
- we should adjust for spectrogram smoothing temporal shifts ourselves when we report the events
- The changes to the specification tests should be rolled back
- I don't like that we define a harmonic as 3 or more formants (technically only one other frequency is needed other than the fundamental), but if you insist, then we should document the change: edit /docs/technical/apidoc/HarmonicParameters.md .
public static List<EventCommon> ConvertScoreArray2Events( | ||
SpectrogramStandard spectrogram, | ||
double[] scores, | ||
double[] dBArray, | ||
int[] maxIndexArray, | ||
double minDuration, | ||
double maxDuration, | ||
int minHz, | ||
int maxHz, | ||
int bandBinCount, | ||
double scoreThreshold, | ||
TimeSpan segmentStartOffset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, this method seems like a copy of an existing method.
- It seems like you should have passed a constructor function into the particular method you wanted to change?
- If it must stay as a copy, then it should be improved. E.g. the UnitCoverters are an object that is not meant to be reconstructed in every iteration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have shifted the UnitConverter as you suggested. I did consider trying to refactor the code so as not to copy previous method. However, the HarmonicEvent detection requires is own treatment within the various loops.
var parameters = new HarmonicParameters | ||
{ | ||
MinHertz = 400, | ||
MaxHertz = 6000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This value was set at 5500
- why has it been bumped to 6000
?
The 5500
value is measured from the source recording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The harmonic detection should work without the top and bottom frequency bounds being tight. In fact it is a desirable feature that it should perform well with loose frequency bounds. The TestHarmonicsAlgorithm() in GenericRecognizerTests works well with loose very loose bounds. But this is an artificial example. Your test recording is more difficult to deal with because there appears to be a lot of "structured noise" in the background.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this code with the original values - it almost works.
MaxFormantGap = 550, | ||
MinFormantGap = 300, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the test still work with tighter bounds? The harmonic interval is 440Hz, I'd really expect tighter bounds to work. Like 400
and 480
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the test works with tighter bounds. But it is still not exactly 440.
//Need to make allowance for a longer than actual duration. | ||
//because of smoothing of the spectrogram frames prior to the auto-crosscorrelation. | ||
MinDuration = 1.0, | ||
MaxDuration = 1.16, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm thinking from a user perspective here, really we should adjust for smoothing concerns. That process is invisible to a user. Any user making a recogniser will measure their target event and set a range. The event is one second long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, even in the debug spectrograms we can see the event is advanced forward by a frame. Another good reason to fix the problem in the algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made smoothing of the spectrogram prior to detecting harmonics accessible to the user through a "SmoothingWindow" parameter. Its default value is zero, i.e. no smoothing. I have removed all the trimming "fixes" that were previously done to compensate for the smoothing that was done by default.
{ | ||
MinHertz = 400, | ||
MaxHertz = 6000, | ||
MaxFormantGap = 1100, | ||
MinFormantGap = 950, | ||
MinDuration = 0.95, | ||
MaxDuration = 1.17, | ||
DecibelThresholds = new double?[] { this.decibelThreshold }, | ||
DctThreshold = 0.3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comments on parameters from the first test changing from the supplied tests apply here.
double[,] cosineBasisFunctions = MFCCStuff.Cosines(8, 8); | ||
|
||
//following line writes matrix of cos values for checking. | ||
FileTools.WriteMatrix2File_Formatted(cosineBasisFunctions, this.TestOutputDirectory.FullName + "Cosines.txt", "F3"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use the SaveTestOutput
function and use the Csv.WriteMatrix
function
Co-authored-by: Anthony Truskinger <[email protected]>
Concerning your comment - "I don't like that we define a harmonic as 3 or more formants (technically only one other frequency is needed other than the fundamental)". The harmonic detection relies upon a DCT which can give spurious results where there are other components contributing to the spectrum. The justification for requiring a minimum of three formants is that this gives two intervals. It is very difficult for a DCT to determine the presence of harmonics given only one interval. Requiring two intervals narrows down the possibility for a spurious result. You previously had to set a high formant gap because the DCT was picking up background components at the low index end of the DCT coefficient array. I have had this problem before but not had time to fix it. The things I have done in this round of fixes has helped a lot. But harmonic detection remains prone to error in presence of background noise. |
Issue #471 Make spectrogram smoothing accessible to user. The default is no spectrogram smoothing.
Issue $471 Adjust expected values in unit tests. Returned frequency bounds to previous values. Added in the HarmonicInterval as a value to be checked.
…coustics/audio-analysis into Issue471_CheckFormantGap
Issue #471 Remove hard coded path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, you've introduced new functionality in the change since the last review, which need some tweaks.
You also didn't add to the documentation file the notes about the limits of the harmonic algorithm requiring three or more formants.
/// This is used to run a moving average filter along each of the frequency bins. | ||
/// It can help to smooth over discontinuous formants. | ||
/// </summary> | ||
public int? SmoothingWindow { get; set; } = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should probably default to null
if it is nullable
Also add validation to new properties so we can give users nice feedback for bad input
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added new functionality due to your comment about adjusting start/end frames to compensate for smoothing. Making the smoothing explicit means I could do away with hidden adjusting. I will remove the nullable because it is best to have a default value = 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do I add validation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like this:
audio-analysis/src/AudioAnalysisTools/Events/BlobParameters.cs
Lines 30 to 39 in 9a2b560
public override IEnumerable<ValidationResult> Validate(ValidationContext validationContext) | |
{ | |
return base | |
.Validate(validationContext) | |
.Append( | |
this.BottomHertzBuffer.ValidateNotNull(nameof(this.BottomHertzBuffer))) | |
.Append( | |
this.TopHertzBuffer.ValidateNotNull(nameof(this.TopHertzBuffer))); | |
} | |
} |
/// </summary> | ||
/// <value> | ||
/// The ?average/calculated? gap between formant peaks, in hertz. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the question marks meant I didn't know, clarify please
@@ -53,6 +62,7 @@ public class HarmonicParameters : CommonParameters | |||
spectrogram, | |||
hp.MinHertz.Value, | |||
hp.MaxHertz.Value, | |||
hp.SmoothingWindow.Value, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could be null, don't do .Value
rather pass the nullable into the function and use is null
as your test
for (int t = 2; t < rowCount - 2; t++) | ||
// Run a moving average filter along each frequency bin. | ||
// This may help to fill noise gaps in formants. Ignore values <3. | ||
if (smoothingWindow > 2) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (smoothingWindow > 2) | |
if (smoothingWindow is not null) |
assuming you take the above suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is meant to be >2 because there is no point doing a moving average with window = 0 or 1. And the best minimum value if it is to be done is 3.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, there are two concerns, should we smooth? and is the smoothing value valid?
If you add the validation from the comment above, we can ensure a bad value (<3
) is not supplied by the user.
if (smoothingWindow > 2) | |
if (smoothingWindow is not null) { | |
var smoothing = Math.Max(3, smoothingWindow.Value); |
or
if (smoothingWindow > 2) | |
if (smoothingWindow is not null) { | |
if (smoothingWindow.Value < 3) { throw new ArgumentException("cannot be less than 3", nameof(smoothingWindow)' } |
dealer's choice
I have just committed a few more changes including changes to the md documentation. I think this is it for this issue.?? |
Restored Value in xml doc
Title of PR
Detection of Harmonic Events.
What is the purpose of this PR?
Fix error in detection of Harmonic Events.
Fixes #471
Changes
There are three major changes:
1: I detrended the array of dct coefficients before determining the maximum coefficient.
2: I removed coefficients 0 to 3 from contention as maximum coefficients. We expect there to be a minimum of three formants and three or more formants can only be detected by coefficients 4 and above.
3: I fill in gaps of one or two frames in a sequence of harmonics. This is a hack. -> this is the new
SmoothingWindow
parameterI also made various other changes to unit tests and refactoring of methods.
Issues
Detection of stacked harmonics still requires careful consideration of the appropriate parameter values.
Visual Changes
There are no visual changes to the spectrograms produced by this recognizer.
Final Checklist