
Fast Legendre Transform #206

Open
ioannisPApapadopoulos opened this issue Sep 3, 2024 · 15 comments

Comments

@ioannisPApapadopoulos (Member)

@dlfivefifty
The performance of Legendre's plan_transform sometimes seems to degrade once the number of grid points becomes large. In particular I noticed that:

  1. Synthesis \ is slower than analysis *. After some profiling @dlfivefifty noticed that plan_th_leg2cheb! in FastTransforms is being called during the run of synthesis. This is likely the cause.

  2. Sometimes bundling multiple transforms together is not faster than running the transforms sequentially in a for loop. Some of this is probably attributable to the call to plan_th_leg2cheb! in the synthesis, but the behaviour also occurs during analysis. E.g.

F = plan_transform(P, (2000, 1000), 1)
F_1 = plan_transform(P, 2000, 1)
c = rand(2000, 1000)
@time c̃ = F * c; # 2.986899 seconds (15 allocations: 15.259 MiB)
d = similar(c);
@time for k in axes(c, 2)
    d[:, k] = F_1 * c[:, k]
end # 2.758650 seconds (10.93 k allocations: 30.913 MiB)

And the effect is worse in 2D (as the size of the arrays increases):

F = plan_transform(P, (100, 100, 400), (1, 2))
F_1 = plan_transform(P, (100, 100), (1, 2))
c = rand(100, 100, 400)
@time c̃ = F * c; # 3.856632 seconds (30 allocations: 30.519 MiB)
d = similar(c);
@time for k in axes(c, 3)
    d[:, :, k] = F_1 * c[:, :, k]
end # 3.332100 seconds (14.00 k allocations: 61.450 MiB, 0.11% gc time)

@MikaelSlevinsky Is the Legendre transform in FastTransforms going to change or is it currently stable?

@MikaelSlevinsky (Member)

Yep! I just need to go through the process of creating new binaries for https://github.com/MikaelSlevinsky/FastTransforms/releases/tag/v0.6.3

@MikaelSlevinsky (Member)

Let's see how this goes JuliaPackaging/Yggdrasil#9359

@MikaelSlevinsky (Member)

Looks like the binaries are cooked! Just waiting for the merge, then we can work on the Julia side of things.

@ioannisPApapadopoulos (Member Author)

Amazing, thank you @MikaelSlevinsky!

@MikaelSlevinsky (Member)

So, I've got a simple prototype built in this PR (JuliaApproximation/FastTransforms.jl#251) for using the new code, but there must be some overhead somewhere in your code above. Using the current plan_leg2cheb, your first block of code executes like so:

julia> using FastTransforms

julia> c = randn(2000, 1000);

julia> p = plan_leg2cheb(c);

julia> @time p*c; # 18 cores
  0.058025 seconds (2 allocations: 15.259 MiB)

julia> FastTransforms.ft_set_num_threads(1)

julia> @time p*c; # 1 core
  1.141138 seconds (2 allocations: 15.259 MiB, 12.43% gc time)

@ioannisPApapadopoulos (Member Author)

Yes, the overhead seems to come from the choice of algorithm for cheb2leg in ClassicalOrthogonalPolynomials:

c = randn(2000, 1000);
F = plan_transform(P, c, 1)
@time r1 = F * c; # 2.997582 seconds (15 allocations: 15.259 MiB)
th_p = FastTransforms.plan_th_cheb2leg!(c, 1);
p = plan_cheb2leg(c)
pT = plan_transform(Chebyshev(), c, 1)
@time r2 = th_p * (pT * copy(c)); # 2.837106 seconds (17 allocations: 30.518 MiB, 1.86% gc time)
@time r3 = p * (pT * c); # 0.097129 seconds (8 allocations: 45.777 MiB, 2.90% gc time)
r1 ≈ r2 ≈ r3 # true

The plan_transform in ClassicalOrthogonalPolynomials uses the Toeplitz–Hankel route, which really slows things down, presumably because of the large array allocation that @dlfivefifty has mentioned before.

Looking at FastTransforms.jl, I see that FastTransforms.plan_th_cheb2leg! is implemented in Julia and does not call into libfasttransforms; is that right? That would explain why the run times are roughly the same no matter what I pick for FastTransforms.ft_set_num_threads, whereas they do vary if I use plan_cheb2leg.
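One generic way to check this kind of thing (standard Julia introspection, nothing FastTransforms-specific; the `sum` call below is just a stand-in example):

```julia
# `@which` returns the Method object for a call, including the file where it
# is defined; a pure-Julia implementation points into the package's src/,
# while a thin C wrapper points at the file containing the ccall.
using InteractiveUtils  # provides @which (auto-loaded in the REPL)

m = @which sum([1, 2, 3])  # stand-in call; works the same for any function
println(m.file)
```

Running the same on FastTransforms.plan_th_cheb2leg! would show whether it lives in pure-Julia source or a wrapper file.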

What's the default algorithm for plan_cheb2leg? Is it a direct algorithm?

@dlfivefifty (Member)

I used the Toeplitz–Hankel (th) plans only because the old plans had bad complexity; now that's fixed, we should switch to the default.

It uses H-matrices/FMM, I think.

@ioannisPApapadopoulos (Member Author) commented Sep 4, 2024

OK! So as far as I can tell, FastTransforms.plan_cheb2leg does not support the dims argument, is that right? If so, that is something that would need to be added.

@ioannisPApapadopoulos (Member Author)

What's the history with plan_cheb2leg? When did it change from bad to good complexity?

@dlfivefifty (Member)

Yesterday?

@dlfivefifty (Member)

> Ok! so as far as I can tell FastTransforms.plan_cheb2leg does not support the dims argument, is that right? So that is something that would need to be added.

We could do this in Julia by just looping over the columns etc.
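That looping idea can be sketched generically. Here `apply_along` is a hypothetical helper (not an existing FastTransforms function) that applies any plan-like object `P` supporting `P * vector` along one dimension; a plain diagonal matrix stands in for the transform plan.

```julia
# Hypothetical helper: apply a 1D plan-like object P along dimension `dim`
# of A by mapping it over the 1D slices, as suggested above.
apply_along(P, A; dim) = mapslices(v -> P * v, A; dims=dim)

# Demonstration with a plain matrix standing in for a cheb2leg plan:
M = [2.0 0.0; 0.0 3.0]       # scales the two entries of each column
A = ones(2, 4)
B = apply_along(M, A; dim=1) # every column becomes [2.0, 3.0]
```

An in-place loop writing into a preallocated output would avoid the allocations `mapslices` makes, but the shape of the idea is the same.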

@ioannisPApapadopoulos (Member Author)

> Yesterday?

That would be FastTransforms.plan_fmm_leg2cheb, not FastTransforms.plan_leg2cheb. I am saying I still get much better performance using the original FastTransforms.plan_cheb2leg.

@dlfivefifty (Member)

What’s your p? If it’s low, it’ll be faster.

Try p = 1 million.

@ioannisPApapadopoulos (Member Author)

Yeah, I had p = 2000 applied to 1000 vectors, and it's 30 times faster. But you're right: I cannot even create the plan_cheb2leg with p = 1 million (though I can with FastTransforms.plan_th_cheb2leg!).
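A back-of-envelope estimate suggests why plan creation fails at that size (assuming, purely for illustration, a dense O(p^2) representation; the actual storage of plan_cheb2leg may differ):

```julia
# Dense p × p Float64 matrix at p = 1 million: p^2 entries × 8 bytes each,
# which is far beyond typical RAM, whereas the Toeplitz–Hankel plan's
# storage grows much more slowly with p.
p = 1_000_000
bytes = p^2 * 8
println(bytes / 2^40, " TiB")  # ≈ 7.28 TiB
```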

@dlfivefifty (Member)

Right. So the choice was made for 1D adaptive transforms, where one would need lots of plans.
