
new DFT api #12087

Closed
wants to merge 1 commit into from

Conversation

stevengj
Member

@stevengj stevengj commented Jul 9, 2015

This is a rewritten DFT API that provides:

  • p = plan_fft(x) (and similar) now returns a subtype of Base.DFT.Plan. It acts like a linear operator: you can apply it to an existing array with p * x, and apply it to a preallocated output array with A_mul_B!(y, p, x). You can apply the inverse plan with p \ x or inv(p) * x. (A usage sketch follows after this list.)
  • It is easy to add new FFT algorithm implementations for specific numeric and array types...
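
A minimal usage sketch of the interface described above (the array contents and the variable names x, y, X are just illustrative):

    x = complex(rand(1024), rand(1024))  # a Complex{Float64} vector
    p = plan_fft(x)                      # a subtype of Base.DFT.Plan
    X = p * x                            # apply the plan like a linear operator
    y = similar(X)
    A_mul_B!(y, p, x)                    # transform into the preallocated output y
    x2 = p \ X                           # inverse transform, equivalent to inv(p) * X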

Unlike #6193, this patch contains only the API changes, not the pure-Julia FFT implementation (which can easily be added later). Like #11994, this should give a non-negligible performance improvement (by eliminating anonymous functions and allowing for pre-allocated output).

cc: @yuyichao, @ViralBShah.

@stevengj
Member Author

stevengj commented Jul 9, 2015

(Note that the patch is large, but most of this is due merely to rearranging the FFT stuff into an fft/ subdirectory. Unfortunately, git does not track this kind of change well.)

@stevengj
Member Author

stevengj commented Jul 9, 2015

Also, it was convenient to have real(Complex{Float64}) == real(Float64) == Float64 and similar, so this exports a new method of the real function that works on types rather than values.
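
Concretely, per the description above, the new method behaves like this at the REPL:

    julia> real(Complex{Float64})
    Float64

    julia> real(Float64)
    Float64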

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

I'd definitely prefer this to #11994.

Is there a summary of the API changes, and should this be accompanied by a documentation update? (I only see a very small update to math.rst, and I doubt it describes all the changes in this PR.)

Without reading through the code and the discussion in the old PR (#6193), I'd like to know whether this contains the interface I found most useful for my simulations. Is there a way to do fft!(A, B) (either planned or not) such that the transform of A is stored in B?

Edit:

allowing for pre-allocated output

I guess this is what I want?

@stevengj
Member Author

stevengj commented Jul 9, 2015

@yuyichao, I believe that I documented all of the user-visible API changes. There weren't many.

I didn't provide an fft!(x, y) function, but you can do p = plan_fft(x) followed by A_mul_B!(y, p, x) to perform a transformation with a pre-allocated output array (similar to matrix multiply with preallocated output).

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

@yuyichao, I believe that I documented all of the API changes. There weren't many.

Ah, for some reason I thought the second paragraph in math.rst was irrelevant... Sorry about that.

followed by A_mul_B!(y, p, x) to perform a transformation with a pre-allocated output array (similar to matrix multiply with preallocated output).

That's good enough. I don't really care about function names. Thanks!

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

Hmm, this seems to break PyPlot???

Warning: Method definition call(Type{Base.IPv4}, AbstractString) in module Base at socket.jl:35 overwritten in module Compat at /usr/share/julia/site/v0.4/Compat/src/Compat.jl:41.
Warning: Method definition call(Type{Base.IPv6}, AbstractString) in module Base at socket.jl:78 overwritten in module Compat at /usr/share/julia/site/v0.4/Compat/src/Compat.jl:42.
Warning: Method definition isless(#T<:Base.IPAddr, #T<:Base.IPAddr) in module Base at socket.jl:6 overwritten in module Compat at /usr/share/julia/site/v0.4/Compat/src/Compat.jl:47.
Warning: Method definition call(Type{Base.Dict}, Any) in module Base at dict.jl:430 overwritten in module Compat at /usr/share/julia/site/v0.4/Compat/src/Compat.jl:59.
ERROR: LoadError: LoadError: LoadError: LoadError: TypeError: apply_type: in Type, expected Type{T}, got Tuple{TypeVar,TypeVar}
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in reload_path at ./loading.jl:157
 in _require at ./loading.jl:69
 in require at ./loading.jl:55
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in reload_path at ./loading.jl:157
 in _require at ./loading.jl:69
 in require at ./loading.jl:55
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in reload_path at ./loading.jl:157
 in _require at ./loading.jl:69
 in require at ./loading.jl:52
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in process_options at ./client.jl:306
 in _start at ./client.jl:406
while loading /usr/share/julia/site/v0.4/Compat/src/Compat.jl, in expression starting on line 60
while loading /usr/share/julia/site/v0.4/PyCall/src/PyCall.jl, in expression starting on line 25
while loading /usr/share/julia/site/v0.4/PyPlot/src/PyPlot.jl, in expression starting on line 3
while loading /home/yuyichao/projects/nacs-lab/yyc-data/calculations/single-atom/ode-test5.jl, in expression starting on line 5

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

Actually it breaks Compat

Warning: Method definition call(Type{Base.IPv4}, AbstractString) in module Base at socket.jl:35 overwritten in module Compat at /usr/share/julia/site/v0.4/Compat/src/Compat.jl:41.
Warning: Method definition call(Type{Base.IPv6}, AbstractString) in module Base at socket.jl:78 overwritten in module Compat at /usr/share/julia/site/v0.4/Compat/src/Compat.jl:42.
Warning: Method definition isless(#T<:Base.IPAddr, #T<:Base.IPAddr) in module Base at socket.jl:6 overwritten in module Compat at /usr/share/julia/site/v0.4/Compat/src/Compat.jl:47.
Warning: Method definition call(Type{Base.Dict}, Any) in module Base at dict.jl:430 overwritten in module Compat at /usr/share/julia/site/v0.4/Compat/src/Compat.jl:59.
ERROR: LoadError: TypeError: apply_type: in Type, expected Type{T}, got Tuple{TypeVar,TypeVar}
 in include at ./boot.jl:254
 in include_from_node1 at ./loading.jl:133
 in reload_path at ./loading.jl:157
 in _require at ./loading.jl:69
 in require at ./loading.jl:52
while loading /usr/share/julia/site/v0.4/Compat/src/Compat.jl, in expression starting on line 60

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

It also breaks the old API: p_fft! = plan_fft!(tmp, 1:1, FFTW.MEASURE)

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

It's also a little strange to use the p * A syntax for an in-place transformation.

@stevengj
Member Author

stevengj commented Jul 9, 2015

@yuyichao, I can't reproduce the Compat problem: using Compat works for me.

@stevengj
Member Author

stevengj commented Jul 9, 2015

Right, I forgot that I changed the flags and timelimit to keywords.

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

Other than the issues mentioned above, this branch yields the same performance as #11994.

A summary of the issues I've seen:

  1. Somehow Compat didn't work; I don't understand why. The problem disappeared after a rebuild.
  2. The p * A and p \ A syntax is a little weird for in-place operation. Not too big a deal, but maybe we don't need to deprecate the old interface?
  3. A_mul_B! (and A_ldiv_B!) allocates about twice as much memory as * (and \) for in-place operation; is this expected? (Caused by the type instability of assert_applicable.)

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

Oh, yeah, and the keyword arguments make compilation slower (~200ms), but that doesn't matter too much for real calculations.

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

@yuyichao, I can't reproduce the Compat problem: using Compat works for me.

I did a git clean and compiled again, and it works for me too. The error message didn't make sense to me either, so I guess it must have been something weird during my first compilation... no idea what.

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

@stevengj The allocation seems to come from the type instability of cFFTWPlan.{sz,osz,istride,ostride}. Their type is Tuple{Vararg{Int64}}, so Julia emits a boxing of the array size and a jl_apply_generic(!) for the size comparison. Is it possible to parametrize the plan on the dimensions as well?

(The allocation does not increase with this PR so I guess this issue isn't new).
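
A rough sketch of the kind of parametrization I mean (MyPlan and its field set are purely illustrative here, not the actual cFFTWPlan definition):

    # Parametrizing on the dimensionality N makes the size/stride fields
    # concretely typed as NTuple{N,Int}, so the size check needs no boxing
    # and no jl_apply_generic.
    immutable MyPlan{T,N}
        sz::NTuple{N,Int}
        osz::NTuple{N,Int}
        istride::NTuple{N,Int}
        ostride::NTuple{N,Int}
    end

    assert_applicable{T,N}(p::MyPlan{T,N}, X::Array{T,N}) =
        size(X) == p.sz || error("FFTW plan applied to an array of the wrong size")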

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

Relevant code_typed:

        GenSym(2) = (top(tuple))((top(arraysize))(X::Array{Complex{Float64},1},1)::Int64)::Tuple{Int64}
        GenSym(1) = (top(getfield))(p::Base.DFT.FFTW.cFFTWPlan{Complex{Float64},-1,true},:sz)::Tuple{Vararg{Int64}}
        GenSym(3) = GenSym(2) == GenSym(1)::Bool

and relevant code_llvm:

define void @julia_assert_applicable_21150(%jl_value_t*, %jl_value_t*) {
top:
  %const = bitcast i64 140728651486960 to i64
  %2 = alloca [6 x %jl_value_t*], align 8
  %.sub26 = bitcast [6 x %jl_value_t*]* %2 to %jl_value_t**
  %3 = getelementptr [6 x %jl_value_t*]* %2, i64 0, i64 2
  %4 = getelementptr [6 x %jl_value_t*]* %2, i64 0, i64 4
  %5 = bitcast [6 x %jl_value_t*]* %2 to i64*
  store i64 8, i64* %5, align 8
  %6 = getelementptr [6 x %jl_value_t*]* %2, i64 0, i64 1
  %7 = bitcast %jl_value_t** %6 to %jl_value_t***
  %8 = load %jl_value_t*** @jl_pgcstack, align 8
  store %jl_value_t** %8, %jl_value_t*** %7, align 8
  store %jl_value_t** %.sub26, %jl_value_t*** @jl_pgcstack, align 8
  store %jl_value_t* null, %jl_value_t** %3, align 8
  %9 = getelementptr [6 x %jl_value_t*]* %2, i64 0, i64 3
  store %jl_value_t* null, %jl_value_t** %9, align 8
  store %jl_value_t* null, %jl_value_t** %4, align 8
  %10 = getelementptr [6 x %jl_value_t*]* %2, i64 0, i64 5
  store %jl_value_t* null, %jl_value_t** %10, align 8
  %11 = getelementptr inbounds %jl_value_t* %1, i64 3, i32 0
  %12 = bitcast %jl_value_t** %11 to i64*
  %13 = load i64* %12, align 8
  %14 = insertvalue [1 x i64] undef, i64 %13, 0
  %15 = getelementptr inbounds %jl_value_t* %0, i64 1, i32 0
  %16 = load %jl_value_t** %15, align 8
  store %jl_value_t* %16, %jl_value_t** %3, align 8
  %17 = call %jl_value_t* @jl_gc_allocobj(i64 8)
  %18 = getelementptr inbounds %jl_value_t* %17, i64 -1, i32 0
  %const_mat19 = add i64 %const, -1036320
  %19 = inttoptr i64 %const_mat19 to %jl_value_t*
  store %jl_value_t* %19, %jl_value_t** %18, align 8
  %20 = bitcast %jl_value_t* %17 to [1 x i64]*
  store [1 x i64] %14, [1 x i64]* %20, align 16
  store %jl_value_t* %17, %jl_value_t** %4, align 8
  store %jl_value_t* %16, %jl_value_t** %10, align 8
  %const_mat24 = add i64 %const, 4382528
  %21 = inttoptr i64 %const_mat24 to %jl_value_t*
  %22 = call %jl_value_t* @jl_apply_generic(%jl_value_t* %21, %jl_value_t** %4, i32 2)
  %23 = bitcast %jl_value_t* %22 to i8*
  %24 = load i8* %23, align 1
  %25 = and i8 %24, 1
  %26 = icmp eq i8 %25, 0
  br i1 %26, label %if, label %L

%21 is ==

@stevengj
Member Author

stevengj commented Jul 9, 2015

FFTWPlan could be parameterized on the dimension N of the array. I guess sz etc. would then be NTuple{N,Int}; would that be sufficient to fix the problem?

I don't quite see why Julia can't emit fast code for Vararg{Int} tuple comparisons. What happens if you just inline the == comparison into a loop?
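
For example, something along these lines (a hypothetical helper, not code from this patch; indexing a Vararg{Int} tuple is still inferred as Int, so the loop should avoid the generic ==):

    function sizes_match(sz::Tuple{Vararg{Int}}, X::AbstractArray)
        length(sz) == ndims(X) || return false
        for i = 1:length(sz)
            sz[i] == size(X, i) || return false
        end
        return true
    end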

@stevengj
Member Author

stevengj commented Jul 9, 2015

@yuyichao, I just pushed a modified patch that documents the change of flags and timelimit to keyword arguments and adds a deprecation so that plan_fft!(tmp, 1:1, FFTW.MEASURE) works (with a warning).
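
In other words, using the snippet from above (a quick sketch; the keyword names are flags and timelimit):

    tmp = complex(rand(256), rand(256))
    p_fft! = plan_fft!(tmp, 1:1; flags=FFTW.MEASURE)   # new keyword form
    # plan_fft!(tmp, 1:1, FFTW.MEASURE)                # old positional form: deprecated, still works with a warning
    p_fft! * tmp                                       # apply the in-place plan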

@stevengj
Member Author

stevengj commented Jul 9, 2015

The main problem with not deprecating the old plan(x) interface (vs. plan * x) is that you would then be supporting two syntaxes for the same thing, which seems odd as well.

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

FFTWPlan could be parameterized on the dimension N of the array. I guess sz etc. would then be NTuple{N,Int}; would that be sufficient to fix the problem?

Depending on what kinds of input the plan accepts, it might be necessary to have more than one parameter (or not). I think something like this should be sufficient.

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

I was going to implement it myself, and then I hit #12089.

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

BTW, why is the type assertion in the plan_* functions necessary? I.e., which type parameter can type inference not figure out?

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

With this patch (and after reverting #11996), I can get all the tests to pass and the allocation down by a lot. plan_ifft! still seems to be type-unstable due to ScaledPlan, though.

@yuyichao
Contributor

yuyichao commented Jul 9, 2015

Given that this (type stability) is blocked by another bug and doesn't give a huge performance gain, I think it's not really necessary to include it in this PR...

@yuyichao
Contributor

@stevengj I pushed a rebased version with a fix for the type stability of ScaledPlan to my yyc/dftnew_api branch. Could you please check whether the change is correct? Feel free to take it.

@yuyichao
Contributor

The yyc/dftnew_api branch now runs my benchmark script without any allocation!

@ViralBShah ViralBShah added this to the 0.4.0 milestone Jul 12, 2015
@yuyichao
Contributor

So the reasons I think we can keep the old interface plan_fft(A)(A) are that:

  1. The new one is a little strange for in-place operation (see the sketch after this list).
  2. It'll be harder to support both 0.3 and 0.4 at the same time. (depwarn is really expensive, even if it's not printing anything.)
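
To be concrete about point 1, the two styles look like this (illustrative only):

    A = complex(rand(64), rand(64))
    p! = plan_fft!(A)
    p!(A)      # old 0.3-style interface: calling the plan transforms A in place
    p! * A     # this PR: operator syntax, which reads oddly for an in-place transform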

The first one I guess I can get used to. The second one depends on how many public packages are using it (assuming other usages usually don't require compatibility with multiple Julia versions).

I just did a grep on all registered packages and the only direct user of the plan_* functions is ApproxFun. So it might not be too bad.

Other than this, I vote to rebase and merge this one (and I'll submit another PR on top of it to fix the leftover type stability issues).

@ViralBShah
Member

@stevengj Does this sound like a reasonable plan? Seems good to me.

@yuyichao
Contributor

@stevengj I'd like to merge this PR this coming weekend if there's no objection. (Note that since this branch is not in the JuliaLang repo, I would need to open a new PR with the rebased commit to trigger CI builds.)

As for package breakage, I had a look at ApproxFun, and it seems that the bigger issue there is that it only expects a Function (which means even my more conservative PR breaks it). I've opened an issue there (JuliaApproximation/ApproxFun.jl#216) and proposed a temporary solution (wrapping the plan in an anonymous function), which I might submit as a PR depending on the reply there.
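
The workaround is essentially this (a hedged sketch, not the actual ApproxFun code):

    x = complex(rand(128), rand(128))
    p = plan_fft(x)
    fftfun = v -> p * v   # wrap the new Plan object so code expecting a Function keeps working
    y = fftfun(x)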

yuyichao added a commit to yuyichao/ApproxFun.jl that referenced this pull request Jul 17, 2015
@yuyichao
Contributor

With the ApproxFun.jl fix ready and my cleanly rebased branch of this PR in place, I'm going to submit a PR to replace this one tomorrow and merge it once CI is happy.

@yuyichao yuyichao mentioned this pull request Jul 18, 2015
@ViralBShah
Member

I think this is conservative and good to do.

@yuyichao
Contributor

Merged as #12201

@stevengj
Member Author

Thanks!
