diff --git a/doc/BufPitch.rst b/doc/BufPitch.rst index 2261376..78c9c74 100644 --- a/doc/BufPitch.rst +++ b/doc/BufPitch.rst @@ -1,13 +1,24 @@ -:digest: A Selection of Pitch Descriptors on a Buffer +:digest: Pitch Descriptor :species: buffer-proc :sc-categories: Libraries>FluidDecomposition :sc-related: Guides/FluidCorpusManipulationToolkit, Classes/SpecCentroid, Classes/SpecFlatness, Classes/SpecCentroid, Classes/SpecPcile :see-also: Pitch, BufLoudness, BufMelBands, BufMFCC, BufSpectralShape, BufStats -:description: Implements three pitch descriptors, computed as frequency and the confidence in its value. -:discussion: The process will return a multichannel buffer with two channels per input channel, one for pitch and one for the pitch tracking confidence. A pitch of 0 Hz is yield (or -999.0 when the unit is in MIDI note) when the algorithm cannot find a fundamental at all. Each sample represents a value, which is every hopSize. Its sampling rate is sourceSR / hopSize. -:process: This is the method that calls for the pitch descriptor to be calculated on a given source buffer. -:output: Nothing, as the destination buffer is declared in the function call. +:description: Three popular pitch descriptors, all of which compute frequency and the confidence that a pitch is present. +:discussion: + + :fluid-obj:`Pitch` returns both ``pitch`` and ``confidence`` values. When no pitch can be detected, a pitch of 0 Hz is returned (or -999.0 when the unit is in MIDI note mode). + + For information about the pitch descriptor algorithms, see the ``algorithm`` parameter below. + + The "confidence" output is a value between 0 and 1 indicating how confident the algorithm is in the pitch that it is reporting. In effect this can be an estimation of how "noisy" (closer to 0) or "harmonic" (closer to 1) the spectrum is. The confidence may also be low when a signal contains polyphony, as the algorithms are not intended for multiple pitch streams. 
+ + The ``unit`` argument indicates whether the pitch output should be in hertz (indicated by 0) or MIDI note numbers (indicated by 1). MIDI note numbers may be useful, not only because of their direct relationship to MIDI-based synthesis systems, but also because of the logarithmic relationship to hertz, making them perceptually evenly-spaced units (1 MIDI note = 1 semitone). + For more information visit https://learn.flucoma.org/reference/pitch/. + +:process: This is the method that calls for the descriptor to be calculated on a given source buffer. + +:output: Nothing, as the destination buffer is declared in the function call. :control source: @@ -15,23 +26,23 @@ :control startFrame: - Where in the srcBuf should the process start, in sample. + Where in ``source`` to start the analysis, in samples. The default is 0. :control numFrames: - How many frames should be processed. + How many samples to analyse. The default of -1 indicates to analyse through to the end of the buffer. :control startChan: - For multichannel srcBuf, which channel should be processed first. + For multichannel ``source``, from which channel to begin analysing. The default is 0. :control numChans: - For multichannel srcBuf, how many channel should be processed. + For multichannel ``source``, how many channels should be processed. The default of -1 indicates to analyse through the last channel in the buffer. :control features: - The destination buffer for the pitch descriptors. + The destination buffer for the descriptors. :control algorithm: @@ -50,11 +61,11 @@ :control minFreq: - The minimum frequency that the algorithm will search for an estimated fundamental. This sets the lowest value that will be generated. + The minimum frequency that the algorithm will search for an estimated fundamental. This sets the lowest value that will be generated. The default is 20. :control maxFreq: - The maximum frequency that the algorithm will search for an estimated fundamental. 
This sets the highest value that will be generated. + The maximum frequency that the algorithm will search for an estimated fundamental. This sets the highest value that will be generated. The default is 10000. :control unit: @@ -83,4 +94,3 @@ :control action: A Function to be evaluated once the offline process has finished and all Buffer's instance variables have been updated on the client side. The function will be passed [features] as an argument. - diff --git a/doc/Pitch.rst b/doc/Pitch.rst index 795ee07..3ab502c 100644 --- a/doc/Pitch.rst +++ b/doc/Pitch.rst @@ -1,13 +1,24 @@ -:digest: A Selection of Pitch Descriptors in Real-Time +:digest: Real-time Pitch Descriptor :species: descriptor :sc-categories: Libraries>FluidDecomposition :sc-related: Guides/FluidCorpusManipulationToolkit, Classes/Pitch :see-also: BufPitch, MFCC, MelBands, Loudness, SpectralShape -:description: Three popular pitch descriptors, computed as frequency and the confidence in its value. -:discussion: The process will return a multichannel control steam with [pitch, confidence] values, which will be repeated if no change happens within the algorithm, i.e. when the hopSize is larger than the signal vector size. A pitch of 0 Hz is yield (or -999.0 when the unit is in MIDI note) when the algorithm cannot find a fundamental at all. +:description: Three popular monophonic pitch descriptors, all of which compute frequency and confidence. +:discussion: + + :fluid-obj:`Pitch` returns both ``pitch`` and ``confidence`` values. When no pitch can be detected, a pitch of 0 Hz is returned (or -999.0 when the unit is in MIDI note mode). + + For information about the pitch descriptor algorithms, see the ``algorithm`` parameter below. + + The "confidence" output is a value between 0 and 1 indicating how confident the algorithm is in the pitch that it is reporting. In effect this can be an estimation of how "noisy" (closer to 0) or "harmonic" (closer to 1) the spectrum is. 
The confidence may also be low when a signal contains polyphony, as the algorithms are not intended for multiple pitch streams. + + The ``unit`` argument indicates whether the pitch output should be in hertz (indicated by 0) or MIDI note numbers (indicated by 1). MIDI note numbers may be useful, not only because of their direct relationship to MIDI-based synthesis systems, but also because of the logarithmic relationship to hertz, making them perceptually evenly-spaced units (1 MIDI note = 1 semitone). + + For more information visit https://learn.flucoma.org/reference/pitch/. + :process: The audio rate in, control rate out version of the object. -:output: A 2-channel KR signal with the [pitch, confidence] descriptors. The latency is windowSize. +:output: The two descriptors: [pitch, confidence]. The latency is windowSize. :control in: @@ -15,12 +26,12 @@ :control algorithm: - The algorithm to estimate the pitch. The options are: + The algorithm to estimate the pitch. (The default is 2.) The options are: :enum: :0: - Cepstrum: Returns a pitch estimate as the location of the second highest peak in the Cepstrum of the signal (after DC). + Cepstrum: Returns a pitch estimate as the location of the highest peak (not including DC) in the Cepstrum of the signal. :1: Harmonic Product Spectrum: Implements the Harmonic Product Spectrum algorithm for pitch detection . See e.g. A. Lerch, "An Introduction to Audio Content Analysis: Applications in Signal Processing and Music Informatics." John Wiley & Sons, 2012.https://onlinelibrary.wiley.com/doi/book/10.1002/9781118393550 @@ -30,15 +41,15 @@ :control minFreq: - The minimum frequency that the algorithm will search for an estimated fundamental. This sets the lowest value that will be generated. + The minimum frequency that the algorithm will search for. This sets the lowest value that can be generated. The default is 20. :control maxFreq: - The maximum frequency that the algorithm will search for an estimated fundamental. 
This sets the highest value that will be generated. + The maximum frequency that the algorithm will search for. This sets the highest value that can be generated. The default is 10000. :control unit: - The unit of the estimated value. The default of 0 is in Hz. A value of 1 will convert to MIDI note values. + The unit of the pitch output. The default of 0 outputs Hz. A value of 1 outputs MIDI note values. :control windowSize: @@ -55,4 +66,3 @@ :control maxFFTSize: How large can the FFT be, by allocating memory at instantiation time. This cannot be modulated. - diff --git a/example-code/sc/BufPitch.scd b/example-code/sc/BufPitch.scd index 84d0f44..84f2371 100644 --- a/example-code/sc/BufPitch.scd +++ b/example-code/sc/BufPitch.scd @@ -3,8 +3,10 @@ code:: ( // load a sound file ~scratchy = Buffer.read(s,FluidFilesPath("Tremblay-ASWINE-ScratchySynth-M.wav")); + // and a buffer to write the FluidBufPitch features into ~pitch_features_buf = Buffer.new(s); + // specify some params for the analysis (these are the defaults, but we'll specify them here so we can use them later) ~windowSize = 1024; ~hopSize = 512; @@ -15,14 +17,17 @@ code:: FluidBufPitch.processBlocking(s,~scratchy,features:~pitch_features_buf,windowSize:~windowSize,hopSize:~hopSize); ~pitch_features_buf.loadToFloatArray(action:{ arg fa; - ~pitch_features_array = fa.clump(2); + ~pitch_features_array = fa.clump(~pitch_features_buf.numChannels); + "done".postln; }); ) //look at the retrieved formatted array of [pitch,confidence] values ~pitch_features_array.postln -//iterate and make an array of the indices which are fitting the conditions +//iterate and make an array of the indices which are fitting the conditions: +// - pitch > 500 Hz +// - confidence > 0.98 ( ~selected_indices = List.new; ~pitch_features_array.do({ @@ -36,7 +41,7 @@ FluidBufPitch.processBlocking(s,~scratchy,features:~pitch_features_buf,windowSiz ) ( -// In order to granulate the frames, we need to convert our indices to 
centerPos. +// In order to granulate the frames, we need to convert our indices to centerPos in seconds for TGrains to use. ~selected_center_pos = ~selected_indices.collect({arg i; (i * ~hopSize) / ~scratchy.sampleRate}); ~selected_center_pos.postln; // Load this list of center positions into a buffer so we can look them up later on the server @@ -67,8 +72,10 @@ CODE:: ) // composite one on left one on right as test signals +( FluidBufCompose.processBlocking(s,~piano, destination:~both,action:{"done".postln}); FluidBufCompose.processBlocking(s,~guitar,numFrames:~piano.numFrames,startFrame:555000,destStartChan:1,destination:~both,action:{"done".postln}); +) // listen ~both.play @@ -77,10 +84,10 @@ FluidBufCompose.processBlocking(s,~guitar,numFrames:~piano.numFrames,startFrame: ~pitch_analysis = Buffer(s); //run the process on them, with limited bandwidth -FluidBufPitch.process(s, ~both, features: ~pitch_analysis, minFreq:60, maxFreq:4000,action:{"done".postln}); +FluidBufPitch.processBlocking(s, ~both, features: ~pitch_analysis, minFreq:60, maxFreq:4000,action:{"done".postln}); // look at the buffer: [pitch,confidence] for left then [pitch,confidence] for right -FluidWaveform(~both,featureBuffer:~pitch_analysis,stackFeatures:true,bounds:Rect(0,0,1600,400)); +FluidWaveform(~both,featuresBuffer:~pitch_analysis,stackFeatures:true,bounds:Rect(0,0,1600,400)); // blue is piano pitch // orange is piano pitch confidence diff --git a/example-code/sc/Pitch.scd b/example-code/sc/Pitch.scd index a4d60b6..f5e74c7 100644 --- a/example-code/sc/Pitch.scd +++ b/example-code/sc/Pitch.scd @@ -24,7 +24,7 @@ code:: // load some audio ~scratchy = Buffer.read(s,FluidFilesPath("Tremblay-ASWINE-ScratchySynth-M.wav")); -// This synth sends the source sound to the delay only when the pitch confidence is above 0.95. +// This synth sends the source sound to the delay only when the pitch confidence is above a threshold. 
// This way the scratchy, distorted parts of the sound file are not heard in the delay. ( { @@ -33,7 +33,7 @@ code:: var latency = windowSize / SampleRate.ir; # freq, conf = FluidPitch.kr(src,windowSize:windowSize); src = DelayN.ar(src,latency,latency); - sig = CombC.ar(src * (conf > 0.98).lag(0.01),0.5,0.1,3); + sig = CombC.ar(src * (conf > 0.99).lag(0.005),0.5,0.1,3); [src,sig]; }.play; )
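Two small conversions come up in the docs and examples touched by this patch: turning an analysis frame index into a position in seconds, and moving between hertz and MIDI note numbers. A minimal sclang sketch follows. It is not part of the patch: `hopSize` matches the value used in the BufPitch example, `sampleRate` is an assumed value for illustration, and `frameToSeconds` is a hypothetical helper name.

```supercollider
(
// Illustrative values only: hopSize matches the BufPitch example,
// sampleRate is an assumption (use ~scratchy.sampleRate once the buffer is loaded).
var hopSize = 512;
var sampleRate = 44100;

// Frame index -> position in seconds, mirroring the ~selected_center_pos line of the example.
var frameToSeconds = { |i| (i * hopSize) / sampleRate };
frameToSeconds.(100).postln;

// Number of analysis frames per second of audio (source sample rate / hopSize).
(sampleRate / hopSize).postln;

// Hz <-> MIDI note numbers using sclang's built-in converters.
440.cpsmidi.postln; // 69.0
69.midicps.postln;  // 440.0
)
```

Because MIDI note numbers are logarithmic in frequency, one unit corresponds to one semitone wherever it falls in the range, which is why `unit: 1` can be convenient when comparing pitches.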