Constant Q-Transform #588

Open · 2 tasks
vincentqb opened this issue Apr 26, 2020 · 9 comments

@vincentqb (Contributor) commented Apr 26, 2020

We would like to have in torchaudio:

  • constant Q-transform, as in librosa
  • inverse constant Q-transform, as in librosa
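For reference, a minimal sketch of the librosa API this request points at, assuming librosa ≥ 0.8; the example clip, hop length, and bin count are arbitrary choices for illustration:

import librosa

# Load a short example clip bundled with librosa.
y, sr = librosa.load(librosa.ex("trumpet"))

# Forward constant-Q transform: complex values, shape (n_bins, n_frames).
C = librosa.cqt(y, sr=sr, hop_length=512, fmin=librosa.note_to_hz("C1"),
                n_bins=84, bins_per_octave=12)

# Inverse CQT: reconstructs an approximate time-domain signal from C.
y_hat = librosa.icqt(C, sr=sr, hop_length=512, fmin=librosa.note_to_hz("C1"))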
@dhgrs commented Sep 20, 2020

Hi, I'm interested in implementing CQT and I have questions about it.

  • As of librosa 0.8, the links in your post are outdated. The right references are probably cqt and griffinlim_cqt.
  • librosa has several variants of CQT. If we test the code by comparing results with librosa, there is a difficulty: librosa.cqt and librosa.griffinlim_cqt perform kaiser_* resampling internally, but torchaudio doesn't provide those resampling modes.
  • How about focusing on librosa.pseudo_cqt? It doesn't involve resampling, but it also doesn't support inverse conversion. (A minimal sketch follows this list.)
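For concreteness, a hedged sketch of the pseudo-CQT variant mentioned above; the parameters are illustrative, and as far as I know pseudo_cqt returns magnitudes only, which is why no inverse is available:

import librosa

y, sr = librosa.load(librosa.ex("trumpet"))

# Pseudo-CQT: a single STFT resolution, no internal resampling,
# magnitude-only output of shape (n_bins, n_frames).
C_mag = librosa.pseudo_cqt(y, sr=sr, hop_length=512,
                           fmin=librosa.note_to_hz("C1"),
                           n_bins=84, bins_per_octave=12)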

@KinWaiCheuk

I have implemented librosa's CQT in my project; I hope it's useful for you guys.
https://github.com/KinWaiCheuk/nnAudio/blob/master/Installation/nnAudio/Spectrogram.py#L990

Several improvements can be made:

  1. The for loop at line 1223 iterates over octaves and is currently the bottleneck.
  2. The low-pass filter at line 1032 is not as good as the librosa version.
  3. In librosa, a sparse matrix stores the frequency-domain CQT kernels. I did not use any sparse matrix in my implementation. Instead, I realized that obtaining frequency-domain CQT kernels is not necessary: time-domain CQT kernels work just as well and are even faster. This improved version is implemented as another PyTorch class, CQT2010v2, in my project. (A toy sketch of the time-domain idea follows below.)

Regarding pseudo_cqt: I know it does not use any downsampling, but I am not sure how it differs from the CQT algorithm proposed in 1992. If they are the same, then I also have that version, named CQT1992, in my project.

I hope this is useful for you guys; I am also curious about how to implement the inverse CQT.
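To illustrate the time-domain-kernel idea from point 3, here is a toy sketch, not the nnAudio code: it builds one Hann-windowed complex sinusoid per bin at the full sample rate (the direct, 1992-style approach, with no octave downsampling) and applies it with conv1d. Normalization, padding/centering, and kernel lengths are all simplified relative to librosa and nnAudio:

import numpy as np
import torch
import torch.nn.functional as F

def make_cqt_kernels(sr, fmin, n_bins, bins_per_octave=12):
    # Constant Q factor; each kernel spans about Q periods of its center frequency.
    Q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)
    lengths = np.ceil(Q * sr / freqs).astype(int)
    kernels = np.zeros((n_bins, int(lengths.max())), dtype=np.complex64)
    for k, (f, l) in enumerate(zip(freqs, lengths)):
        t = np.arange(l)
        # Hann-windowed complex sinusoid, zero-padded to the longest kernel.
        kernels[k, :l] = (np.hanning(l) / l) * np.exp(2j * np.pi * f * t / sr)
    return torch.from_numpy(kernels)

def cqt_conv1d(y, kernels, hop_length=512):
    # Correlate the signal with the real and imaginary kernel banks.
    x = y.reshape(1, 1, -1)
    real = F.conv1d(x, kernels.real.contiguous().unsqueeze(1), stride=hop_length)
    imag = F.conv1d(x, kernels.imag.contiguous().unsqueeze(1), stride=hop_length)
    return torch.sqrt(real ** 2 + imag ** 2).squeeze(0)  # (n_bins, n_frames)

y = torch.randn(22050)  # one second of noise at 22.05 kHz, just for shape-checking
C = cqt_conv1d(y, make_cqt_kernels(sr=22050, fmin=32.703, n_bins=72))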

mthrok pushed a commit to mthrok/audio that referenced this issue Feb 26, 2021
@ktatar commented Mar 23, 2021

Hi all,
It seems like there is an implementation provided by @KinWaiCheuk. Do you plan to add this to the library?

@vincentqb (Contributor · Author)

There is no plan currently, but I'd welcome a pull request from the community that implements CQT and its inverse. If you are interested in working on such a pull request, please feel free to do so :)

  • The pull request does need to test against librosa as a reference.
  • For performance, we should compare the librosa and torchaudio implementations using timeit and see what the speed difference is (a minimal benchmark sketch follows this list).
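As a concrete, hedged example of the kind of benchmark meant here; proposed_cqt is a hypothetical stand-in for whatever the pull request implements:

import timeit

import librosa

y, sr = librosa.load(librosa.ex("trumpet"))

n = 10
t_ref = timeit.timeit(lambda: librosa.cqt(y, sr=sr), number=n) / n
print(f"librosa.cqt: {t_ref * 1e3:.1f} ms/call")

# `proposed_cqt` is hypothetical -- swap in the torchaudio implementation under review:
# t_new = timeit.timeit(lambda: proposed_cqt(y_tensor, sample_rate=sr), number=n) / n
# print(f"torchaudio CQT: {t_new * 1e3:.1f} ms/call")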

@d-dawg78 commented Jun 26, 2024

Hey everyone,

I am currently wrapping up torchaudio implementations of the VQT, CQT, and iCQT that test against librosa. (torchaudio resampling changes the signal too much compared to librosa after a few iterations, but the first few octaves have the same or similar values; the proposed version is also much, much quicker than librosa; all details in a PR to come.) Do I have the green light to PR? Just wrapping up the last batch of tests 🧪 Let's get these wonderful transforms to torchaudio!

Edit: link to the forked repo with changes is here

@d-dawg78 commented Jun 27, 2024

Hey everyone,

A quick follow-up on the above. The librosa cqt call (and likewise vqt and icqt) being matched in my fork is the following:

librosa_vqt = cqt(
    y=y,
    sr=<SAMPLE_RATE>,
    hop_length=<HOP_LENGTH>,
    fmin=<F_MIN>,
    n_bins=<N_BINS>,
    bins_per_octave=<BINS_PER_OCTAVE>,
    sparsity=0.,
    res_type="sinc_best",
    scale=False,
)

Here's a sample figure comparing the proposed and librosa versions using the audio snippet from here, with:

SAMPLE_RATE = 44100
HOP_LENGTH = 512
F_MIN = 32.703
N_BINS = 108
BINS_PER_OCTAVE = 12

[figure: CQT spectrograms, proposed torchaudio implementation vs. librosa]

The results are pretty much identical 😃 Opening a draft PR for now.
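For anyone reproducing the comparison: since the resampler is the main source of divergence mentioned above, here is a small runnable sketch that quantifies per-octave deviation between two librosa res_type settings with otherwise identical parameters (librosa's bundled trumpet clip stands in for the unlinked audio snippet):

import numpy as np
import librosa

y, sr = librosa.load(librosa.ex("trumpet"), sr=44100)

kwargs = dict(sr=44100, hop_length=512, fmin=32.703, n_bins=108,
              bins_per_octave=12, sparsity=0.0, scale=False)
# "sinc_best" and "kaiser_fast" need the optional samplerate / resampy backends.
C_a = np.abs(librosa.vqt(y, res_type="sinc_best", **kwargs))
C_b = np.abs(librosa.vqt(y, res_type="kaiser_fast", **kwargs))

# Max relative difference per octave: shows how the resampler choice propagates.
for o in range(108 // 12):
    sl = slice(o * 12, (o + 1) * 12)
    err = np.max(np.abs(C_a[sl] - C_b[sl]) / (np.abs(C_a[sl]) + 1e-8))
    print(f"octave {o}: max relative difference {err:.3e}")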

@mthrok (Collaborator) commented Jun 28, 2024

Hi

I no longer maintain this library, so I'm in a bit of an awkward position, but with the unit testing and such, this looks like a low-risk, low-maintenance-cost addition.

@nateanl thoughts?

@nateanl (Member) commented Jun 28, 2024

I'm down to add this feature to TorchAudio. Although librosa already has an implementation of it, enabling the feature with GPU computation can boost training speed.

@d-dawg78

Cool, thanks for the quick answers! I'll finish up the last few details and request your review in the coming days.
