Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

keyword arguments are very slow #9551

Closed
yuyichao opened this issue Jan 2, 2015 · 18 comments · Fixed by #25936
Closed

keyword arguments are very slow #9551

yuyichao opened this issue Jan 2, 2015 · 18 comments · Fixed by #25936
Assignees
Labels
keyword arguments f(x; keyword=arguments) performance Must go faster

Comments

@yuyichao
Copy link
Contributor

yuyichao commented Jan 2, 2015

The script that demonstrate this can be found here. Pasted below

#!/usr/bin/julia -f

f() = nothing

forward_nothing(func) = func()
forward_pos_only(func, args...) = func(args...)
forward_kw(func, args...; kws...) = func(args...; kws...)

do_sth(a, b, c) = begin
    global g = a * b * c
end

macro time_forward(ex)
    quote
        println($(Expr(:quote, ex)))
        gc()
        @time for i in 1:10000000
            $(esc(ex))
        end
    end
end

@time_forward forward_nothing(f)
@time_forward forward_pos_only(f)
@time_forward forward_kw(f)
@time_forward do_sth("a", "b", "c")
@time_forward do_sth(1, 2, 3)
@time_forward do_sth(1.2, 2.3, 3.4)
@time_forward (do_sth(1, 2, 3); do_sth(1.2, 2.3, 3.4))

The output of the script on my laptop

forward_nothing(f)
elapsed time: 0.019369954 seconds (0 bytes allocated)
forward_pos_only(f)
elapsed time: 6.2e-8 seconds (0 bytes allocated)
forward_kw(f)
elapsed time: 0.95345393 seconds (960000000 bytes allocated, 33.34% gc time)
do_sth("a","b","c")
elapsed time: 1.014413397 seconds (800000000 bytes allocated, 28.80% gc time)
do_sth(1,2,3)
elapsed time: 0.04512597 seconds (0 bytes allocated)
do_sth(1.2,2.3,3.4)
elapsed time: 0.125168819 seconds (160000000 bytes allocated, 37.53% gc time)
begin 
    do_sth(1,2,3)
    do_sth(1.2,2.3,3.4)
end
elapsed time: 0.157331348 seconds (160000000 bytes allocated, 30.24% gc time)

There are several interesting things here,

  1. It seems that the compiler can inline f(args...) and figure out it is useless and eliminate the loop altogether (I don't think my cpu can excute a million loops in tens of nanosecond) but it cannot do it for f(). This particular case isn't very interesting but I'm wondering if there's any missing optimizations here.
  2. Forwarding keyword argument is very costy, even when there's nothing to be forwarded (probably a big part of it is memory allocation?). Although it would be nice to make keyword argument fast in general, the slowness of forwarding empty keyword argument means that it is very costy to support keyword arguments. One example (also how I discover this issue) is a version of invoke that supports keyword arguments. This version is ~3 times slower than the builtin invoke and a version without keyword argument support and simply forward to the builtin invoke is as fast as the builtin one.
  3. Just for comparison, I've also included some other random functions and it seems that keyword arguments are much more expensive than global variables and is as expensive as concatenating strings. What I don't quite understand either is that why is it necessary to allocate memory when assigning floating point numbers to a global variable but not for integers? (Edit: it seems that the integer version does no allocation because of the cache of boxed value.)

Anothing thing I've noticed is that the extra keyword argument got by the functions is in a different format (Array{(Symbol, Any), 1}) from the format of argument (Array{Any, 1}) to the .env.kwsorter. Is it necessary to do this conversion? This probably won't affect the performance for empty forwarding but feels inefficient in general.

Test done on current master as of a few hours ago.

julia> versioninfo()
Julia Version 0.4.0-dev+2369
Commit bab9ec5* (2015-01-02 01:47 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Core(TM) i7-4702HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  BLAS: libblas
  LAPACK: liblapack
  LIBM: libm
  LLVM: libLLVM-3.3

The llvm version installed is 3.5 and I have no idea why it prints llvm-3.3....

@jiahao
Copy link
Member

jiahao commented Jan 2, 2015

The performance penalty associated with keyword arguments is documented as a performance tip and is a duplicate of #8843.

@yuyichao
Copy link
Contributor Author

yuyichao commented Jan 2, 2015

I see. Now I remember that I've read those tip before but didn't really understand/remember them very well because I didn't really know how kwargs work at that time......

@yuyichao
Copy link
Contributor Author

yuyichao commented Jan 2, 2015

Please feel free to rename or open a new one for the other minor issues mentioned above if you think it is not the time yet to optimize keyword argument performance.

@yuyichao
Copy link
Contributor Author

yuyichao commented Jan 2, 2015

Although actually I think it might be possible to get a lot better performance if the argument to the kwsorter changes to ::Array{(Symbol, Any), 1}. I just hacked my version of invoke that support keyword argument by extending .env.kwsorter directly and the performance is much better (3x).

@JeffBezanson JeffBezanson added the performance Must go faster label Jan 2, 2015
@tkelman
Copy link
Contributor

tkelman commented Jan 5, 2015

The llvm version installed is 3.5 and I have no idea why it prints llvm-3.3....

I fixed this bit in #9566

@yuyichao
Copy link
Contributor Author

yuyichao commented Jan 5, 2015

@tkelman Thanks. I noticed that it prints the right version yesterday but didn't figured out why..... =)

@mauro3
Copy link
Contributor

mauro3 commented Feb 7, 2015

Slightly off topic: I just stumbled across this as well and, as recommended above, had a look at the performance tips. I found this sentence confusing: "Functions are specialized on the types of keyword arguments, so these declarations will not affect performance of code inside the function." Is it not contradictory? Is there a "not" missing, although, that wouldn't make sense either. If the function is specialized on the types of the keyword it should be faster, no? Anyways, maybe something to clean up.

@yuyichao
Copy link
Contributor Author

yuyichao commented Feb 7, 2015

AFAIK, what it means is that the function body is specialized so the function body will be compiled to the same level as if those were normal arguments.

julia> f(;a=0, b=1) = a + b
f (generic function with 1 method)

julia> @code_llvm f()

define i64 @julia_f_42853() {
top:
  ret i64 1
}

The overhead here is that when the function is called, it goes through another dispatch function which is slow but that's not part of the "function body" and you don't need to write a separate function with only normal arguments to let julia do type inference and increase performance.

I don't know if this is really what it mean and I do find that confusing too...

@mauro3
Copy link
Contributor

mauro3 commented Feb 7, 2015

Thanks, that kind of makes sense. But I don't think that sentence helps anyone who doesn't know about the internals of kw-arg functions. And those who know probably don't need to read the performance tips.

(BTW, @code_llvm and friends often don't work properly with kw-args, see #9818, but should be ok for your example)

@yuyichao
Copy link
Contributor Author

yuyichao commented Feb 7, 2015

Yeah, I figured what I understand now after I figured out how kw-args works.

And yeah, I know @code_* and code_* doesn't work for keyword arguments. Here's a more "complete" example for that...

julia> f(;a=0, b=1) = if a > 0 && b < 1
       a * b + b + a
       else
       a - b * a - b
       end
f (generic function with 1 method)

julia> @code_llvm f()

define i64 @julia_f_42805() {
top:
  %0 = call i64 @"julia___f#0___42806"(i64 0, i64 1)
  ret i64 %0
}                                                                                  

julia> @code_typed f.env.kwsorter(Any[])
1-element Array{Any,1}:                                                            
 :($(Expr(:lambda, Any[symbol("#s2")], Any[Any[:a,:b,symbol("#s8"),symbol("#s9"),symbol("#s10"),symbol("#s1"),symbol("#s3"),symbol("#s4"),:_var1,:_var2,:_var3,:_var0,:_var4],Any[Any[symbol("#s2"),Array{Any,1},0],Any[:a,Any,2],Any[:b,Any,2],Any[symbol("#s8"),UnitRange{Int64},18],Any[symbol("#s9"),Int64,2],Any[symbol("#s10"),(Int64,Int64),18],Any[symbol("#s1"),Int64,18],Any[symbol("#s3"),Int64,18],Any[symbol("#s4"),Any,18],Any[:_var1,Int64,18],Any[:_var2,Int64,18],Any[:_var3,(ASCIIString,Any,ASCIIString),18],Any[:_var0,Int64,18],Any[:_var4,Int64,18]],Any[]], :(begin $(Expr(:line, 0, symbol(""), symbol("")))                                                    
        a = 0                                                                      
        b = 1                                                                      
        _var1 = (top(arraylen))(#s2::Array{Any,1})::Int64                          
        _var2 = (top(box))(Int64,(top(ashr_int))(_var1::Int64,(top(box))(Int32,(top(checked_trunc_sint))(Int32,1))))                                                  
        #s8 = $(Expr(:new, UnitRange{Int64}, 1, :(((top(getfield))(Intrinsics,:select_value))((top(sle_int))(1,_var2::Int64)::Bool,_var2::Int64,(top(box))(Int64,(top(sub_int))(1,1)))::Int64)))                                                         
        #s9 = (top(getfield))(#s8::UnitRange{Int64},:start)::Int64                 
        unless (top(box))(Bool,(top(not_int))(#s9::Int64 === (top(box))(Int64,(top(add_int))((top(getfield))(#s8::UnitRange{Int64},:stop)::Int64,1))::Bool)) goto 1   
        2:                                                                         
        _var0 = #s9::Int64
        _var4 = (top(box))(Int64,(top(add_int))(#s9::Int64,1))
        #s1 = _var0::Int64
        #s9 = _var4::Int64
        #s3 = (top(box))(Int64,(top(sub_int))((top(box))(Int64,(top(mul_int))(#s1::Int64,2)),1))
        #s4 = (top(arrayref))(#s2::Array{Any,1},#s3::Int64)::Any
        unless #s4 === :b::Bool goto 4
        b = (top(arrayref))(#s2::Array{Any,1},(top(box))(Int64,(top(add_int))(#s3::Int64,1)))::Any
        goto 6
        4: 
        unless #s4 === :a::Bool goto 5
        a = (top(arrayref))(#s2::Array{Any,1},(top(box))(Int64,(top(add_int))(#s3::Int64,1)))::Any
        goto 6
        5: 
        _var3 = $(Expr(:call1, :(top(tuple)), "unrecognized keyword argument \"", symbol("#s4"), "\""))
        throw(ErrorException((top(_apply))(call,string,_var3::(ASCIIString,Any,ASCIIString)))::ErrorException)
        6: 
        3: 
        unless (top(box))(Bool,(top(not_int))((top(box))(Bool,(top(not_int))(#s9::Int64 === (top(box))(Int64,(top(add_int))((top(getfield))(#s8::UnitRange{Int64},:stop)::Int64,1))::Bool)))) goto 2
        1: 
        0: 
        return __f#0__(a,b)::Any
    end::Any))))

julia> code_typed(eval(symbol("__f#0__")), (Int, Int))
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:a,:b], Any[Any[symbol("#s13")],Any[Any[:a,Int64,0],Any[:b,Int64,0],Any[symbol("#s13"),Bool,2]],Any[]], :(begin $(Expr(:line, 1, :none, :f))
        unless (top(slt_int))(0,a::Int64)::Bool goto 0
        #s13 = (top(slt_int))(b::Int64,1)::Bool
        goto 1
        0: 
        #s13 = false
        1: 
        unless #s13::Bool goto 2 # line 2:
        return (top(box))(Int64,(top(add_int))((top(box))(Int64,(top(add_int))((top(box))(Int64,(top(mul_int))(a::Int64,b::Int64)),b::Int64)),a::Int64))
        goto 3
        2:  # line 4:
        return (top(box))(Int64,(top(sub_int))((top(box))(Int64,(top(sub_int))(a::Int64,(top(box))(Int64,(top(mul_int))(b::Int64,a::Int64)))),b::Int64))
        3: 
    end::Int64))))

@vtjnash
Copy link
Member

vtjnash commented Sep 15, 2016

This would be a breaking feature, so it's not eligible for a point release

@diegozea
Copy link
Contributor

diegozea commented Feb 3, 2017

@vtjnash said that:

This would be a breaking feature, so it's not eligible for a point release

So, maybe this issue should change its actual milestone (0.6.x).

@JeffBezanson
Copy link
Member

What here would be a breaking feature? I'm confused.

@tkelman
Copy link
Contributor

tkelman commented Feb 4, 2017

Depends how it gets implemented. If we can make them faster without changing semantics at all then it could be backported, but who knows how likely that is at this point.

@JeffBezanson
Copy link
Member

Keyword args need a redesign anyway, so I don't think it would be worth the effort to optimize this for 0.6, then again after the redesign, so moving milestone.

@ViralBShah
Copy link
Member

ViralBShah commented Nov 18, 2017

Given this is a performance issue, perhaps it can be addressed post 1.0.

@ViralBShah ViralBShah added the triage This should be discussed on a triage call label Nov 18, 2017
@rfourquet
Copy link
Member

This would be one of the few things I had wished would make it for 1.0: indeed, many APIs have been designed with their slowness constraint, so they have been often avoided. I occasionally see a function in base which would benefit from a redesign to use keywords, so a pass could be done before freezing the APIs to update such functions (if keywords get faster).

@StefanKarpinski
Copy link
Member

We'll improve keyword argument performance in the future. This may open up some API changes, at which point we can add keyword methods of those functions and keep the positional argument versions around as legacies. We'll probably want an opt-in mechanism to make sure your code is up to the standards of your current Julia version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
keyword arguments f(x; keyword=arguments) performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants