36 changes: 23 additions & 13 deletions doc/BufPitch.rst
@@ -1,37 +1,48 @@
:digest: A Selection of Pitch Descriptors on a Buffer
:digest: Pitch Descriptor
:species: buffer-proc
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulationToolkit, Classes/SpecCentroid, Classes/SpecFlatness, Classes/SpecCentroid, Classes/SpecPcile
:see-also: Pitch, BufLoudness, BufMelBands, BufMFCC, BufSpectralShape, BufStats
:description: Implements three pitch descriptors, computed as frequency and the confidence in its value.
:discussion: The process will return a multichannel buffer with two channels per input channel, one for pitch and one for the pitch tracking confidence. A pitch of 0 Hz is yield (or -999.0 when the unit is in MIDI note) when the algorithm cannot find a fundamental at all. Each sample represents a value, which is every hopSize. Its sampling rate is sourceSR / hopSize.
:process: This is the method that calls for the pitch descriptor to be calculated on a given source buffer.
:output: Nothing, as the destination buffer is declared in the function call.
:description: Three popular pitch descriptors, all of which compute frequency and the confidence that a pitch is present.
:discussion:

:fluid-obj:`Pitch` returns both ``pitch`` and ``confidence`` values. When no pitch can be detected, a pitch of 0 Hz is returned (or -999.0 when the unit is in MIDI note mode).

For information about the pitch descriptor algorithms, see the ``algorithm`` parameter below.

The "confidence" output is a value between 0 and 1 indicating how confident the algorithm is in the pitch that it is reporting. In effect this can be an estimation of how "noisy" (closer to 0) or "harmonic" (closer to 1) the spectrum is. The confidence may also be low when a signal contains polyphony, as the algorithms are not intended for multiple pitch streams.

The ``unit`` argument indicates whether the pitch output should be in hertz (indicated by 0) or MIDI note numbers (indicated by 1). MIDI note numbers may be useful, not only because of their direct relationship to MIDI-based synthesis systems, but also because of the logarithmic relationship to hertz, making them perceptually evenly-spaced units (1 MIDI note = 1 semitone).

For more information visit https://learn.flucoma.org/reference/pitch/.

:process: This is the method that calls for the descriptor to be calculated on a given source buffer.

:output: Nothing, as the destination buffer is declared in the function call.

:control source:

The index of the buffer to use as the source material to be pitch-tracked. The different channels of multichannel buffers will be processed sequentially.

:control startFrame:

Where in the srcBuf should the process start, in sample.
Where in ``source`` to start the analysis, in samples. The default is 0.

:control numFrames:

How many frames should be processed.
How many samples to analyse. The default of -1 indicates to analyse through to the end of the buffer.

:control startChan:

For multichannel srcBuf, which channel should be processed first.
For multichannel ``source``, from which channel to begin analysing. The default is 0.

:control numChans:

For multichannel srcBuf, how many channel should be processed.
For multichannel ``source``, how many channels should be processed. The default of -1 indicates to analyse through the last channel in the buffer.

:control features:

The destination buffer for the pitch descriptors.
The destination buffer for the descriptors.

:control algorithm:

@@ -50,11 +61,11 @@

:control minFreq:

The minimum frequency that the algorithm will search for an estimated fundamental. This sets the lowest value that will be generated.
The minimum frequency that the algorithm will search for an estimated fundamental. This sets the lowest value that will be generated. The default is 20.

:control maxFreq:

The maximum frequency that the algorithm will search for an estimated fundamental. This sets the highest value that will be generated.
The maximum frequency that the algorithm will search for an estimated fundamental. This sets the highest value that will be generated. The default is 10000.

:control unit:

@@ -83,4 +94,3 @@
:control action:

A Function to be evaluated once the offline process has finished and all Buffer's instance variables have been updated on the client side. The function will be passed [features] as an argument.
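
The relationship between the two ``unit`` settings described in the discussion above can be sketched in sclang. This is a minimal sketch that assumes the standard equal-tempered convention (A4 = 440 Hz = MIDI note 69); ``~hzToMidi`` is a hypothetical helper name introduced only for illustration, while ``cpsmidi`` and ``midicps`` are sclang's built-in conversions. It runs in the language only and does not need the server.

```supercollider
(
// Hz -> MIDI as 69 + 12 * log2(hz / 440).
// Assumption: A4 = 440 Hz corresponds to MIDI note 69.
~hzToMidi = { |hz| 69 + (12 * (hz / 440).log2) };

// Three frequencies an octave apart: the MIDI values are 12 apart,
// which is the "perceptually evenly-spaced" property mentioned above.
[220, 440, 880].collect(~hzToMidi).postln;

// sclang's built-in conversion follows the same convention
[220, 440, 880].cpsmidi.postln;

// and back from MIDI note numbers to hertz
[57, 69, 81].midicps.postln;
)
```

Because the mapping is logarithmic, 0 Hz has no finite MIDI equivalent, which is consistent with the documentation using a separate sentinel (-999.0) for "no pitch" in MIDI note mode.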

30 changes: 20 additions & 10 deletions doc/Pitch.rst
@@ -1,26 +1,37 @@
:digest: A Selection of Pitch Descriptors in Real-Time
:digest: Real-time Pitch Descriptor
:species: descriptor
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulationToolkit, Classes/Pitch
:see-also: BufPitch, MFCC, MelBands, Loudness, SpectralShape
:description: Three popular pitch descriptors, computed as frequency and the confidence in its value.
:discussion: The process will return a multichannel control steam with [pitch, confidence] values, which will be repeated if no change happens within the algorithm, i.e. when the hopSize is larger than the signal vector size. A pitch of 0 Hz is yield (or -999.0 when the unit is in MIDI note) when the algorithm cannot find a fundamental at all.
:description: Three popular monophonic pitch descriptors, all of which compute frequency and confidence.
:discussion:

:fluid-obj:`Pitch` returns both ``pitch`` and ``confidence`` values. When no pitch can be detected, a pitch of 0 Hz is returned (or -999.0 when the unit is in MIDI note mode).

For information about the pitch descriptor algorithms, see the ``algorithm`` parameter below.

The "confidence" output is a value between 0 and 1 indicating how confident the algorithm is in the pitch that it is reporting. In effect this can be an estimation of how "noisy" (closer to 0) or "harmonic" (closer to 1) the spectrum is. The confidence may also be low when a signal contains polyphony, as the algorithms are not intended for multiple pitch streams.

The ``unit`` argument indicates whether the pitch output should be in hertz (indicated by 0) or MIDI note numbers (indicated by 1). MIDI note numbers may be useful, not only because of their direct relationship to MIDI-based synthesis systems, but also because of the logarithmic relationship to hertz, making them perceptually evenly-spaced units (1 MIDI note = 1 semitone).

For more information visit https://learn.flucoma.org/reference/pitch/.

:process: The audio rate in, control rate out version of the object.
:output: A 2-channel KR signal with the [pitch, confidence] descriptors. The latency is windowSize.

:output: The two descriptors: [pitch, confidence]. The latency is windowSize.

:control in:

The audio to be processed.

:control algorithm:

The algorithm to estimate the pitch. The options are:
The algorithm to estimate the pitch. (The default is 2.) The options are:

:enum:

:0:
Cepstrum: Returns a pitch estimate as the location of the second highest peak in the Cepstrum of the signal (after DC).
Cepstrum: Returns a pitch estimate as the location of the highest peak (not including DC) in the Cepstrum of the signal.

:1:
Harmonic Product Spectrum: Implements the Harmonic Product Spectrum algorithm for pitch detection. See e.g. A. Lerch, "An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics." John Wiley & Sons, 2012. https://onlinelibrary.wiley.com/doi/book/10.1002/9781118393550
@@ -30,15 +41,15 @@

:control minFreq:

The minimum frequency that the algorithm will search for an estimated fundamental. This sets the lowest value that will be generated.
The minimum frequency that the algorithm will search for. This sets the lowest value that can be generated. The default is 20.

:control maxFreq:

The maximum frequency that the algorithm will search for an estimated fundamental. This sets the highest value that will be generated.
The maximum frequency that the algorithm will search for. This sets the highest value that can be generated. The default is 10000.

:control unit:

The unit of the estimated value. The default of 0 is in Hz. A value of 1 will convert to MIDI note values.
The unit of the pitch output. The default of 0 indicates to output in Hz. A value of 1 will output MIDI note values.

:control windowSize:

@@ -55,4 +66,3 @@
:control maxFFTSize:

How large can the FFT be, by allocating memory at instantiation time. This cannot be modulated.
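
The "no pitch" value and the confidence output described in the discussion above are typically used together when post-processing analysis frames. The following is a minimal sclang sketch under stated assumptions: the ``~frames`` data is made up for illustration, and ``~minConf`` is an arbitrary threshold, not a recommended value. It runs in the language only.

```supercollider
(
// Hypothetical [pitch, confidence] frames (values are made up for illustration)
~frames = [[440.0, 0.99], [0.0, 0.0], [523.25, 0.42], [659.26, 0.97]];

// Illustrative threshold only
~minConf = 0.9;

// Keep frames where a pitch was reported and the confidence is high enough.
// Testing pitch > 0 rejects the 0 Hz "no pitch" value in hertz mode and
// also the -999.0 value used in MIDI note mode.
~voiced = ~frames.select({ |frame|
	var pitch = frame[0], conf = frame[1];
	(pitch > 0) and: { conf >= ~minConf };
});

~voiced.postln;
)
```

Filtering on both values matters because a frame can report a plausible pitch with low confidence, for example on noisy or polyphonic material.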

17 changes: 12 additions & 5 deletions example-code/sc/BufPitch.scd
@@ -3,8 +3,10 @@ code::
(
// load a sound file
~scratchy = Buffer.read(s,FluidFilesPath("Tremblay-ASWINE-ScratchySynth-M.wav"));

// and a buffer to write the FluidBufPitch features into
~pitch_features_buf = Buffer.new(s);

// specify some params for the analysis (these are the defaults, but we'll specify them here so we can use them later)
~windowSize = 1024;
~hopSize = 512;
@@ -15,14 +17,17 @@ code::
FluidBufPitch.processBlocking(s,~scratchy,features:~pitch_features_buf,windowSize:~windowSize,hopSize:~hopSize);
~pitch_features_buf.loadToFloatArray(action:{
arg fa;
~pitch_features_array = fa.clump(2);
~pitch_features_array = fa.clump(~pitch_features_buf.numChannels);
"done".postln;
});
)

//look at the retrieved formatted array of [pitch,confidence] values
~pitch_features_array.postln

//iterate and make an array of the indices which are fitting the conditions
//iterate and make an array of the indices which are fitting the conditions:
// - pitch > 500 hz
// - confidence > 0.98
(
~selected_indices = List.new;
~pitch_features_array.do({
@@ -36,7 +41,7 @@ FluidBufPitch.processBlocking(s,~scratchy,features:~pitch_features_buf,windowSiz
)

(
// In order to granulate the frames, we need to convert our indices to centerPos.
// In order to granulate the frames, we need to convert our indices to centerPos in seconds for TGrains to use.
~selected_center_pos = ~selected_indices.collect({arg i; (i * ~hopSize) / ~scratchy.sampleRate});
~selected_center_pos.postln;
// Load this list of center positions into a buffer so we can look them up later on the server
@@ -67,8 +72,10 @@ CODE::
)

// composite one on left one on right as test signals
(
FluidBufCompose.processBlocking(s,~piano, destination:~both,action:{"done".postln});
FluidBufCompose.processBlocking(s,~guitar,numFrames:~piano.numFrames,startFrame:555000,destStartChan:1,destination:~both,action:{"done".postln});
)

// listen
~both.play
@@ -77,10 +84,10 @@ FluidBufCompose.processBlocking(s,~guitar,numFrames:~piano.numFrames,startFrame:
~pitch_analysis = Buffer(s);

//run the process on them, with limited bandwidth
FluidBufPitch.process(s, ~both, features: ~pitch_analysis, minFreq:60, maxFreq:4000,action:{"done".postln});
FluidBufPitch.processBlocking(s, ~both, features: ~pitch_analysis, minFreq:60, maxFreq:4000,action:{"done".postln});

// look at the buffer: [pitch,confidence] for left then [pitch,confidence] for right
FluidWaveform(~both,featureBuffer:~pitch_analysis,stackFeatures:true,bounds:Rect(0,0,1600,400));
FluidWaveform(~both,featuresBuffer:~pitch_analysis,stackFeatures:true,bounds:Rect(0,0,1600,400));

// blue is piano pitch
// orange is piano pitch confidence
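
The conversion from analysis frame indices to positions in seconds, used earlier in BufPitch.scd to compute centerPos values, can be sketched in isolation. This is a minimal sclang sketch: the 44100 Hz source sample rate is an assumption made for illustration, the hop size of 512 matches the value used in the example, and ``~frameToSeconds`` is a hypothetical helper name. It runs in the language only.

```supercollider
(
// Assumed source sample rate, for illustration only
~sourceSR = 44100;
// Hop size as used in the example above
~hopSize = 512;

// The features buffer holds one frame per hop, so its effective
// sampling rate is sourceSR / hopSize frames per second.
~featureSR = ~sourceSR / ~hopSize;
~featureSR.postln;

// A frame index maps back to a time in the source, in seconds.
// sclang evaluates binary operators left to right: (index * hop) / sr.
~frameToSeconds = { |frameIndex| frameIndex * ~hopSize / ~sourceSR };
~frameToSeconds.(100).postln;
)
```

In the example itself the sample rate is read from the loaded buffer (``~scratchy.sampleRate``) rather than hard-coded, which is the safer choice when the file's rate is not known in advance.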
4 changes: 2 additions & 2 deletions example-code/sc/Pitch.scd
@@ -24,7 +24,7 @@ code::
// load some audio
~scratchy = Buffer.read(s,FluidFilesPath("Tremblay-ASWINE-ScratchySynth-M.wav"));

// This synth sends the source sound to the delay only when the pitch confidence is above 0.95.
// This synth sends the source sound to the delay only when the pitch confidence is above a threshold.
// This way the scratchy, distorted parts of the sound file are not heard in the delay.
(
{
@@ -33,7 +33,7 @@ code::
var latency = windowSize / SampleRate.ir;
# freq, conf = FluidPitch.kr(src,windowSize:windowSize);
src = DelayN.ar(src,latency,latency);
sig = CombC.ar(src * (conf > 0.98).lag(0.01),0.5,0.1,3);
sig = CombC.ar(src * (conf > 0.99).lag(0.005),0.5,0.1,3);
[src,sig];
}.play;
)