Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid undefined checks for fields that are always initialized #8827

Merged
merged 2 commits into from
Nov 10, 2014

Conversation

simonster
Copy link
Member

The approach is as follows:

  • Types get an undeffree field that specifies whether they may contain undefined references.
  • The front-end checks for calls to new in the type block where the number of arguments is not equal to the number of fields. (Not sure I did this right, since I don't really know Scheme.) If no such cases exist (or the type is a bits type), undeffree is set to true.
  • In the codegen, getfield skips the undefined check for pointer fields if undeffree is true.

Another option would be to count the minimum number of arguments passed to new and skip undefined checks for fields at indices <= the minimum the type can be instantiated with. I could probably do that (as long as the rest of this approach seems reasonable).

Ref #3440

@JeffBezanson JeffBezanson mentioned this pull request Oct 27, 2014
65 tasks
@JeffBezanson
Copy link
Member

This is great. I have almost nothing to add.

You might want to use contains instead of pattern-lambda and manual recursion; it's simpler.

I think we will eventually want to use the minimum number of initialized fields instead of a boolean, but we can start with what you have here.

@StefanKarpinski
Copy link
Member

If boolean flags were passed to jl_new_datatype using power-of-two enum then the signature wouldn't change every time you add some kind of metadata like this. Of course, if you were tracking minimum number of initialized fields, then this wouldn't be a boolean flag. So not sure if it's really a win.

@jakebolewski
Copy link
Member

Did you run any simple benchmarks? I'm curious what the performance overhead of this check is.

@StefanKarpinski
Copy link
Member

It's pretty bad sometimes – it often forces a GC frame because there might be an exception.

@JeffBezanson
Copy link
Member

The UndefRefError is a singleton so throwing it doesn't allocate:

julia> type Foo
       x
       end

julia> f(x) = x.x
f (generic function with 2 methods)

julia> code_llvm(f,(Foo,))

define %jl_value_t* @"julia_f;40589"(%jl_value_t*) {
top:
  %1 = getelementptr inbounds %jl_value_t* %0, i64 1, i32 0, !dbg !330
  %2 = load %jl_value_t** %1, align 8, !dbg !330
  %3 = icmp eq %jl_value_t* %2, null, !dbg !330
  br i1 %3, label %fail, label %pass, !dbg !330

fail:                                             ; preds = %top
  %4 = load %jl_value_t** @jl_undefref_exception, align 8, !dbg !330, !tbaa %jtbaa_const
  call void @jl_throw_with_superfluous_argument(%jl_value_t* %4, i32 1), !dbg !330
  unreachable, !dbg !330

pass:                                             ; preds = %top
  ret %jl_value_t* %2, !dbg !330
}

@jakebolewski
Copy link
Member

That is what I thought, so it is just the cost of a pointer load / comparison? This is usually not that expensive.

@JeffBezanson
Copy link
Member

It will get rid of a lot of branches and calls, which is good for code size if nothing else. Several changes like this could eventually make a dent in compilation time and/or memory use.

@simonster
Copy link
Member Author

Here's an example where it makes a difference:

 type A
    x::Vector{Int}
end # undef-free

type B
    x::Vector{Int}
    B() = new()
    B(x) = new(x)
end # not undef-free

function f(x)
    y = 0
    for i = 1:length(x.x)
        @inbounds y += x.x[i]
    end
    y
end

a = rand(0:1000, 100000000);

julia> @time for i = 1:10; f(A(a)); end
elapsed time: 0.637712013 seconds (160 bytes allocated)

julia> @time for i = 1:10; f(B(a)); end
elapsed time: 0.889516562 seconds (160 bytes allocated)

In this case, getting rid of the undefined check allows the array pointer load to be hoisted, which lets the loop vectorizer do its magic.

@JeffBezanson
Copy link
Member

Yes, this can help a lot in simple loops on "wrapped" types.

@jakebolewski
Copy link
Member

Cool! Just curious.

@timholy
Copy link
Member

timholy commented Oct 27, 2014

In fact, #8809 might be a good real-world testcase...

@simonster
Copy link
Member Author

This alone is not quite enough to achieve good performance with ArrayViews in the test case from #8809. The other necessary optimization is "hoist access to fields in immutable values" from #3440. LLVM can hoist the pointer load in the case above because there are no stores, but it can't hoist the pointer load in the case from #8809 because it thinks the array can alias the ContiguousView object.

@simonster
Copy link
Member Author

I tried to add an int32_t ninitialized field to jl_datatype_t, but when I do that bootstrapping breaks with error during bootstrap: LoadError(at "sysimg.jl" line 30: LoadError(at "reflection.jl" line 143: ErrorException("invalid type for argument m in method definition for next at reflection.jl:143"))), unless I either don't add ninitialized to julia_datatype_type or add an extra field of padding to jl_datatype_t. That has me confused.

OTOH, maybe both size and ninitialized can be uint16_t? It doesn't seem like size can really be larger, since field offsets are uint16_t. Then we'd avoid making DataTypes take more memory than they do now, although this would appear to require another special case bitstype in alloc.c/dump.c.

@simonster
Copy link
Member Author

Bump. @JeffBezanson, what's the right approach here?

@JeffBezanson
Copy link
Member

I don't see why adding int32_t ninitialized wouldn't work. If necessary, you can leave it out of julia_datatype_type and then some time later I can look at why that part doesn't work.

@simonster
Copy link
Member Author

Done.

@simonster simonster changed the title Avoid undefined checks for types that are always fully initialized Avoid undefined checks for fields that are always initialized Oct 31, 2014
@StefanKarpinski
Copy link
Member

Let's merge this unless there are objections.

@StefanKarpinski StefanKarpinski merged commit 2740bc8 into master Nov 10, 2014
@timholy
Copy link
Member

timholy commented Nov 10, 2014

👍. But I am getting some warnings:

codegen.cpp: In function ‘llvm::Value* emit_getfield(jl_value_t*, jl_sym_t*, jl_codectx_t*)’:
codegen.cpp:1494:37: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
                     if (idx >= sty->ninitialized) {
                                     ^
codegen.cpp:1519:64: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
                 else if (sty->fields[idx].isptr && idx >= sty->ninitialized) {
                                                                ^
In file included from codegen.cpp:147:0:
codegen.cpp: In function ‘llvm::Value* emit_known_call(jl_value_t*, jl_value_t**, size_t, jl_codectx_t*, llvm::Value**, jl_function_t**, jl_value_t*)’:
julia.h:435:54: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
 #define jl_tuple_len(t)   (((jl_tuple_t*)(t))->length)
                                                      ^
codegen.cpp:2149:50: note: in expansion of macro ‘jl_tuple_len’
                         if (stt->ninitialized != jl_tuple_len(stt->types)) {

@simonster simonster deleted the sjk/undef branch November 10, 2014 16:42
simonster added a commit that referenced this pull request Nov 10, 2014
@simonster
Copy link
Member Author

@timholy Huh, I get those warnings with gcc but not clang. Fixed in e2500db

@timholy
Copy link
Member

timholy commented Nov 10, 2014

Thanks!

waTeim pushed a commit to waTeim/julia that referenced this pull request Nov 23, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants