A more diligent name resolution? #10403

ninegua · 2015-03-05T07:16:32Z

Before I raise the problem, please help clarifying two of my understandings:

Julia is strictly a lexically scoped language. That is, a variable or function name only resolves to a declaration in its enclosing (or parent or ancestor) lexical scope.
Julia doesn't do any late binding. That is, once a name is resolved, it shall always point to the same object (or reference).

If both the above are true, then the following shall be their logical consequences:

It is safe to always resolve a name "statically", i.e., at the compilation time of its enclosing function.
If a name cannot be resolved "statically", then it is an error.

I put "statically" in quotes because Julia is certainly dynamic, but on the other hand, in the current implementation, once a Julia function is compiled, it stays compiled, and a compiled function does not attempt to resolve names at runtime.

With the above understanding, may I request the following feature to be implemented?

All names (of Symbol type) shall be resolved during (or before) type inference, and it is an error to have unresolved names remaining in typed AST.

This will save a lot of trouble of implementing optimizations leveraging type information available in typed AST, because typed AST will no longer contain unresolved names, and all resolved names will only be of 3 forms: TopNode (for system definitions), GetfieldNode (for module-level definitions), SymbolNode (for local definitions). In all cases, it would be straightforward to obtain the type of a name.

JeffBezanson · 2015-03-05T09:15:57Z

There are non-constant global variables. Storage locations for such variables are resolved early, but their values (including type) can change at run time.

Given a Symbol or SymbolNode in an AST, you need to look at the function's list of local and closed variables. If the symbol does not appear there, static parameters are examined next. These are stored in the sparams field of LambdaStaticData. If the symbol isn't there either, then it is a global variable available in the module referenced by the module field of the function's LambdaStaticData object.

Using that procedure, I believe all variable references can be resolved. In case I haven't understood, it would help to provide an example of an "unresolved" name.

ninegua · 2015-03-06T05:03:18Z

I'm not saying that names cannot be resolved following the procedure you described, I'm suggesting that all names (of Symbol type) be resolved to either SymbolNode or GetfieldNode when Julia first produce typed ASTs, so that no Symbol remains in the them (except in Expr head positions).

Since type inference already can find out exact location (which module) of a name, it can save future effort of repeating this process, especially when dealing with situations where in a AST rewriting function, we don't know which module a piece of AST comes from. It is also quite troublesome to safely copy ASTs from one place to another.

The current implementation is not very consistent about resolving names. For example:

julia> f(x,y)=sin(x) * sin(y)
f (generic function with 2 methods)
julia> code_typed(f, (Array{Float32,1}, Array{Float32,1}))
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:x,:y], Any[Any[:_var0,:_var1,:_var8,:_var9,:_var10,:_var11,:_var12,:_var13,:_var14],Any[Any[:x,Array{Float32,1},0],Any[:y,Array{Float32,1},0],Any[:_var0,(Array{Float32,1},Array{Float32,1}),0],Any[:_var1,(Array{Float32,1},Array{Float32,1}),0],Any[:_var8,(Array{Float32,1},),0],Any[:_var9,(Int64,),18],Any[:_var10,Array{Float32,1},18],Any[:_var11,Array{Float32,1},18],Any[:_var12,Function,18],Any[:_var13,Array{Float32,1},18],Any[:_var14,Array{Float32,1},18]],Any[]], :(begin  # none, line 1:
        _var14 = sin(x::Array{Float32,1})::Array{Float32,1}
        _var13 = sin(y::Array{Float32,1})::Array{Float32,1}
        _var12 = (*)::Any
        _var11 = _var14::Array{Float32,1}
        _var10 = _var13::Array{Float32,1}
        _var9 = (GetfieldNode(Base.Broadcast,:broadcast_shape,Any))(_var11::Array{Float32,1},_var10::Array{Float32,1})::(Int64,)
        return broadcast!(_var12::F,(top(ccall))(:jl_new_array,$(Expr(:call1, :(top(apply_type)), :Array, Float32, 1)),$(Expr(:call1, :(top(tuple)), :Any, :Any)),Array{Float32,1},0,_var9::(Int64,),0)::Array{Float32,1},_var11::Array{Float32,1},_var10::Array{Float32,1})::Array{Float32,1}
    end::Array{Float32,1}))))

We can see a few inconsistencies above:

both sin are not resolved.
(*) is not resolved, and is given Any type, which doesn't match the type of _var12.
broadcast! is not resolved, but broadcast_shape is resolved to GetfieldNode.

Another example would be:

f(x)=copy(x)
g(x)=begin copy!(x)=x; copy(x) end

If we dump the typed AST of both, we can see that f contains an unresolved copy! call, while the AST of g contains a fully resolved copy! as GetfieldNode. So it appears that somehow name is only resolved to avoid conflict? Why not always resolve them?

PS: I don't quite understand the decision of allowing globals to change type, though this is a different topic. In an interactive shell, newer definitions of a non-function variable should shadow previous definitions, which I believe is a different case from running code defined in a fully enclosed module. If a user really want a variable's type to be anything, he/she should explicitly say it is of Any type.

JeffBezanson · 2015-03-06T05:32:22Z

The reason we do this is efficiency. The number of symbols over all ASTs is very large. If a function contains, say, 100 function calls, we would have to create 100 extra nodes that simply repeat the information that all of these names are found in the same module as the enclosing function. Instead we can just refer to an AST, and a single pointer to a module. It's much more efficient to remember what module the AST came from than to repeat the information in every global symbol reference.

JeffBezanson · 2015-03-06T05:37:15Z

Also there is a function resolve_globals in inference.jl that adds GetfieldNodes as needed to move an AST from one place to another. With suitable arguments (e.g. an empty destination module) it would qualify everything.

ninegua · 2015-03-06T05:46:28Z

If GetfieldNode is immutable, then we can use techniques like hashconsing to reduce memory overhead. A similar situation is with type annotation, is it really necessary to introduce SymbolNode when Symbol's type can be looked up in the environment? There are always trade-offs, and more convincing arguments can be made by measuring real overheads.

JeffBezanson · 2015-03-06T06:01:36Z

SymbolNodes are used because a variable's type can be different at different points.

We keep lots of compiler data structures in memory during execution, and it's kind of a problem already. Better than hash-consing is to eliminate unnecessary objects entirely. Why is it so hard to keep track of what module an AST came from?

If you want I can write a small wrapper around resolve_globals that will add GetfieldNodes everywhere.

ninegua · 2015-03-06T06:57:52Z

Like I said, sometimes it is very difficult to figure out where a piece of code comes from. Here is a short example using macros:

julia> macro expand(expr)
         @eval function g()
           $(esc(expr))
         end
         return quote g() end
       end

julia> module A
         import ..@expand
         sin(x) = x + 1.0f
         f() = @expand sin(100)
       end

julia> A.f()
ERROR: error compiling f: unsupported or misplaced expression "escape" in function f

julia> code_typed(A.f, ())
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[], Any[Any[],Any[],Any[]], :(begin  # none, line 4: # none, line 5:
        return $(Expr(:escape, :((GetfieldNode(Main,:sin,Any))(100))))
    end::Any))))

If I understand it correctly, the purpose of using esc(x) is to allow the names in the AST (value of x) resolve in the call site of applying the macro. It is apparently not working (causing an error), and sin didn't resolve to the intended A.sin function.

I admit the example was abusing the macro facility a bit. But situations like this do come up often in our use cases because names in an AST are not always accompanied by their locations.

JeffBezanson · 2015-03-06T07:23:53Z

esc only serves a purpose when it appears in the expression returned from a macro. It's used by the macro expander, and not anything else. Macros are not supposed to call eval or have side effects; they're just supposed to return a transformed expression.

Once you have the A.f object, you can find out what module the function is in.

ninegua · 2015-03-06T07:26:11Z

Actually after thinking about it more, I'd argue that resolved names should stay in AST, and resolved types should stay out of AST. Once you know the location of a variable (local or module), it is easy to look up its type. But the reverse is not as easy, because you have to walk the module hierarchy and perform a lookup in every module you visit.

Also, after inference, type castings become explicit, so it is wasteful to annotate every variable occurrences with a type.

JeffBezanson · 2015-03-06T07:41:46Z

Also, after inference, type castings become explicit, so it is wasteful to annotate every variable occurrences with a type.

I'm pretty sure this is not true. What do you mean by "type castings"? Probably some SymbolNodes are redundant, but certainly not all. A julia local variable does not always have a fixed type; different occurrences can have different types.

In fact, we already avoid creating SymbolNodes in many cases, for example for constant globals, since their types can easily be looked up just as you say.

I don't believe you ever need to walk the module hierarchy yourself. A module and a symbol are sufficient, e.g. to call eval(module, sym). And if a bare symbol occurs in an AST, that means it will be looked up in the module in that AST's LambdaStaticData. Imports and so on are handled automatically. For example you can do eval(Main, :sin) even though sin is actually defined in Base, not Main.

ninegua · 2015-03-06T07:54:20Z

Sorry, I wasn't very clear. I was talking about the cost of doing name resolution, not what Julia function one has to call to resolve names.

Like I said, if a local variable can be assigned different types within the same body, it should either be declared as an union type or Any type. Its use can be accompanied by an explicit casting (for now, you can think of a SymbolNode where the type is different than the declared type of this symbol).

If you are concerned about space usage, adding one GetfieldNode per definition (in a module) and just returning the same object in every successful lookup sounds to me like a reasonable solution. I doubt it will blow up space usage compared to the current approach. If you are willing to give it a try, we can find someone who has free cycles to do this kind of experiment, and back it up with real data.

JeffBezanson · 2015-03-06T08:03:06Z

Removing the SymbolNodes for variables that are always the same type is a good idea. I'll plan to do that.

I still don't see why the GetfieldNodes are needed. Given an AST's module, putting a GetfieldNode with that module on every symbol inside it seems equally silly to me as putting a SymbolNode with the same type on every use of a variable.

ninegua · 2015-03-06T08:17:26Z

At least in the language implementations that I'm familiar with, both MLton and GHC do this in their ASTs. Local definitions have short names, but otherwise are always qualified with where it comes from. In addition to making things more consistent (name resolution is done once and for all), I think it is also a nice trade off between space (actually quite minimum if same name objects are reused) and efficiency (having to walk module hierarchy for a lookup). Overall it reduces complexity of working with ASTs.

I understand that Julia strives to bring JIT performance, and perhaps it has not required sophisticated transformations over typed ASTs.

JeffBezanson · 2015-03-06T08:24:35Z

Is the problem that you want the information at macro-expansion time?

ninegua · 2015-03-06T08:27:07Z

Yes, we are abusing macros to quickly generate functions like the example above, and want to be safe at handling AST pieces going into it.

ninegua · 2015-03-06T08:29:28Z

Whether we should have all names resolved at macro expansion time is kind of debatable. But I do think it is a bit more hygiene, especially when used with esc.

JeffBezanson · 2015-03-06T08:46:35Z

I think you should try to find a different approach than using eval in a macro. This gets very tricky. For example the argument to a macro might contain un-expanded macro calls, and you can't be sure what those symbols will ultimately refer to without doing the expansions. Before the lowering pass, it is even difficult to tell what will be a local or global variable. Macro-expand time might simply be too early for the kind of analysis you want to do.

As an example, we might have

function f()
    @expand sin(100)
    @foo sin
end

where @foo sin can expand into sin = x, making sin a local variable. This makes it very hard for @expand to reason about what sin in its argument means.

ninegua · 2015-03-06T08:53:34Z

Points taken. Name resolution should be after lowering. Actually the perfect place is in type inference, because it has to resolve all names anyway to figure out their definitions and types.

In referring to the specific example of using esc in macros that I gave above, a solution would be to keep track of the location (call site of a macro application) in the :escape Expr, so that at the type inference stage, it knows where to look up the escaping names.

JeffBezanson · 2015-03-06T09:01:51Z

IIUC, in your example @expand macro you want g() to contain A.sin. However that assumes that sin inside f refers to the global sin, but my example shows this is not necessarily so. I believe you will always have this problem if you try to move macro argument expressions out of their original context, into the separate g() function.

ninegua · 2015-03-06T09:16:39Z

It indeed depends on what should be the intended semantics of using esc with a macro argument expression out of its original context. Here is my attempt at defining this behavior. When used in a left hand side like you have pointed out, maybe it should not resolve at all and become a local variable, maybe it should just resolve and then become an error. When used in a non-LHS position, it resolve as if it were in the original context and again two situations would follow: (1) if the name resolves to a local definition in the original context, it can either become an error or cause the local definition to become an environment variable (as if captured by a closure); (2) if the name resolves to module level definition, it becomes a GetfieldNode.

ninegua · 2015-03-06T09:47:13Z

Actually thinking about it again, in the LHS case, it should always resolve to either a local or global definition in the original context, the same as in the non-LHS case. It is just that when it resolves to a local, the original definition should now become something that can be captured by closures. Now, to implement such semantics may not be very straightforward, and it is pretty much a corner case anyway.

ninegua · 2015-03-06T10:05:31Z

I modified the example to not use @eval, but it is still not supported:

macro expand(expr)
  quote
  function g() 
    $(esc(expr))
  end
  g() 
  end
end

module A
  import ..@expand
  sin(x) = x + 1.0f
  f() = @expand sin(100)
end

julia> A.f()
ERROR: MethodError: `*` has no method matching *(::Float64, ::Function)
Closest candidates are:
  *(::Any, ::Any, ::Any)
  *(::Any, ::Any, ::Any, ::Any...)
  *(::Number, ::Bool)
  ...

 in #3#g at /mnt/home/hliu54/PSE/tmp/julia/test.jl:4
 in f at /mnt/home/hliu54/PSE/tmp/julia/test.jl:6

julia> code_typed(A.f, ())
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[], Any[Any[symbol("#3#g")],Any[Any[symbol("#3#g"),Function,2]],Any[]], :(begin  # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 13: # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 3:
        $(Expr(:method, symbol("#3#g"), :((top(tuple))((top(tuple))()::Any,(top(tuple))()::Any)::Any), AST(:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[]], :(begin  # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 4:
        return sin(100)::Any
    end::Any))))), false)) # line 6:
        return (#3#g::F)()::Any
    end::Any))))

If I remove the esc, it becomes more interesting:

julia> code_typed(A.f, ())
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[], Any[Any[symbol("#3#g")],Any[Any[symbol("#3#g"),Function,2]],Any[]], :(begin  # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 13: # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 3:
        $(Expr(:method, symbol("#3#g"), :((top(tuple))((top(tuple))()::Any,(top(tuple))()::Any)::Any), AST(:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[]], :(begin  # /mnt/home/hliu54/PSE/tmp/julia/test.jl, line 4:
        return ((top(getfield))(Main,:sin)::Any)(100)::Any
    end::Any))))), false)) # line 6:
        return (#3#g::F)()::Any
    end::Any))))

julia> A.f()
-0.5063656411097588

Apparently it didn't resolve sin to A.sin. Is it a bug? I believe macro expansions are allowed to contain local function definitions, but somehow the name resolution is messed up when splicing ASTs into their body. UPDATE: my bad, this is not a bug (the one above it is), because without esc, names should resolve with respect to the macro's environment.

JeffBezanson · 2015-03-06T17:01:06Z

1.0f is 1.0 * f. Maybe you mean 1.0f0 (Float32 1.0) instead?

ninegua · 2015-03-06T17:18:21Z

My bad. So it is really only a problem when used out of context, like @eval.

Anyway, I hope Julia developers can consider the option of having name resolved once and for all.

quinnj · 2015-03-06T17:20:51Z

@ninegua Please be careful when talking about macros on github; writing @eval without the backticks will actually notify the github user "eval" that he/she is being mentioned here. Just always put backticks around macros.

JeffBezanson · 2015-03-06T17:23:54Z

I think having names fully resolved at macro-expand time is inherently incompatible with the kind of macros we have.

ninegua · 2015-03-06T18:12:53Z

@quinnj I didn't know about this, thanks for the tip.

@JeffBezanson I wasn't saying at macro expansion time, I meant it to be during type inference, and all typed ASTs will then have either fully resolved names, or locally defined names.

JeffBezanson · 2015-03-06T18:22:28Z

Ok, how about the utility function I offered (wrapping resolve_globals) to convert an AST to that form?

JeffBezanson · 2015-03-06T18:28:09Z

Though I still don't really see the problem. I agree with your argument about type annotations, but I think the same argument applies to GetfieldNodes. Somebody else might find it very convenient to have the types marked everywhere, to save a lookup step. Given a symbol, you will have to do a lookup to find its type. By doing that, you also discover whether it is a local variable. If it's not a local variable, you know what module to look in. I just don't think this procedure is difficult or expensive.

JeffBezanson · 2015-03-06T18:34:55Z

To be clear, when you see sin in the AST, its location is specified by the module field of the LambdaStaticData the AST came from. Any time that doesn't hold, there will be a GetfieldNode already.

JeffBezanson · 2015-03-06T19:37:39Z

However what we could do is rename GenSym to LocalVar, and use it for all locals, replacing all symbols with indexes early on. A few LocalVars would need to be wrapped with a HasType node to handle variables with changing types.

timholy · 2015-03-06T19:41:39Z

If there's going to be some effort put into this, is there any potential synergy in addressing #1334 simultaneously? At one point there was the suggestion that we should just include complete function/file info for each line number, rather than only for the first line.

JeffBezanson · 2015-03-06T19:44:01Z

I think it's separate from line numbers, but very much related to #9023 (variable renaming).

part of #10403

part of JuliaLang#10403

… params part of #10403 ast.args[2][1] has been removed since all of its information was also in the varinfo lists ast.args[2][2]. the varinfo lists are now at ast.args[2][1]. args[2][2] is captured var info, args[2][3] is gensym info, and now args[2][4] is a list of static parameter names. GlobalRef is now built in so it can be used more extensively.

part of #10403

ninegua · 2015-07-11T03:25:00Z

Thanks, Jeff! I'm trying out the latest master, and I'm glad to see the changes in typed ASTs. However, I notice the following expression is still not named resolved:

julia> f(x)= x .* 0.5
f (generic function with 1 method)

julia> code_typed(f, (Array{Int,1},))
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:x], Any[Any[Any[:x,Array{Int64,1},0]],Any[],Any[],Any[]], :(begin  # none, line 1:
        return x::Array{Int64,1} .* 0.5::Array{Float64,1}
    end::Array{Float64,1}))))

Would appreciate if .* can be resolved to Base module. Actually a related question that I have is, how do you write something like methods(Base..*) with correct syntax?

EDIT:

After a closer look, actually the above .* is pretty printed as ., but the real AST is Main... Why is it not Base..*?

JeffBezanson · 2015-07-11T05:20:37Z

I decided to print operators as just their names, since otherwise the printed ASTs are hard to read.

I admit it's awkward, but you can write Base.(:(.*)).

I'd like to resolve it to Base..*, but currently in some cases this forces bindings to be resolved too early. I'll keep working on it.

ninegua · 2015-07-14T18:07:44Z

It looks like resolution for operators is different than normal names. Here is a program that demonstrate this issue. I attempt to re-define sin and - (negation) in a separate module:

julia> module Test
         for f in (:sin, :-)
           g = QuoteNode(f)
           @eval function ($f)(x::Float64)
             getfield(Base, $g)(x)
           end
         end
       end
Warning: Method definition -(Float64) in module Base at float.jl:201 overwritten in module Test at none:5.
Test

julia> -(10.0)
ERROR: StackOverflowError:
 in - at none:5 (repeats 79996 times)

julia> Test.(:(-))(10.0)
ERROR: StackOverflowError:
 in - at none:5 (repeats 79996 times)

julia> sin(10.0)
-0.5440211108893698

julia> Test.sin(10.0)
-0.5440211108893698

julia> code_typed(Test.sin, (Float64,))
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:x], Any[Any[Any[:x,Float64,0]],Any[],Any[],Any[]], :(begin  # none, line 5:
        return ((Test.getfield)(Test.Base,:sin))(x::Float64)
    end))))

julia> code_typed(Test.(:(-)), (Float64,))
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[:x], Any[Any[Any[:x,Float64,0]],Any[],Any[],Any[]], :(begin  # none, line 5:
        return ((Test.getfield)(Test.Base,:-))(x::Float64)
    end))))

As seen above, the definition of - in Test module overrides top-level definition (which then creates an infinite loop when called), but this is not the case for sin (which is what I'd expect).

An additional question is, why isn't getfield resolved to top(getfield)? There is obviously no function named getfield in either Test module, or Base.

JeffBezanson · 2015-07-14T18:19:31Z

This is because modules do importall Base.Operators by default. This behavior is slightly controversial, but it's there because people basically always want to extend operators rather than define them anew.

The AST contains GlobalRef(Test, :getfield) because at a certain point, all we know is that the variable reference occurs in the Test module. We don't yet always know where this variable will ultimately resolve to. At some point, evaluating that GlobalRef will give you the correct value, however doing so also forces the binding to be resolved. There is a small possibility that the module was going to do e.g. a using statement that would have changed where the variable resolves to.

ninegua · 2015-07-14T19:18:22Z

Thanks for the explanation. I'd appreciate if there is a way to turn off importall for operators so that we can define operators that reside locally in a module rather than overriding existing definitions in Base.

On the second issue, I believe here is what you meant:

julia> module Test
         module B
           export g
           function g(x)
             println(x, " world!")
           end
         end
         module A
           function f()
             g("hello")
           end
           println(code_typed(A.f, ()))
           using ..B
         end
       end 
Any[:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[],Any[]], :(begin  # none, line 10:
        return (Test.A.g)("hello")
    end))))]
Test

julia> Test.A.f()
hello world!

At printing, the AST of Test.A.f contains a call to Test.A.g, which is yet to be defined. Once the code for module Test.A is completely loaded, we can safely resolve Test.A.g to Test.B.g.

So, am I correct to assume that once a module is completely loaded, evaluating all GlobalRef in the ASTs of this module should give me the final resolution? Do we have a function that does this? I'm thinking something of type GlobalRef -> Union(GlobalRef, TopNode) would do.

quinnj · 2015-07-14T19:22:13Z

Paul,

Check out baremodules:
http://docs.julialang.org/en/latest/manual/modules/#default-top-level-definitions-and-bare-modules
For cases where you need just the bare minimum.

-Jacob

On Tue, Jul 14, 2015 at 1:18 PM, Paul Liu [email protected] wrote:

Thanks for the explanation. I'd appreciate if there is a way to turn off
importall for operators so that we can define operators that reside
locally in a module rather than overriding existing definitions in Base.

On the second issue, I believe here is what you meant:

julia> module Test
module B
export g
function g(x)
println(x, " world!")
end
end
module A
function f()
g("hello")
end
println(code_typed(A.f, ()))
using ..B
end
end
Any[:($(Expr(:lambda, Any[], Any[Any[],Any[],Any[],Any[]], :(begin # none, line 10:
return (Test.A.g)("hello")
end))))]
Test

julia> Test.A.f()
hello world!

At printing, the AST of Test.A.f contains a call to Test.A.g, which is
yet to be defined. Once the code for module Test.A is completely loaded,
we can safely resolve Test.A.g to Test.B.g.

So, am I correct to assume that once a module is completely loaded,
evaluating all GlobalRef in the ASTs of this module should give me the
final resolution? Do we have a function that does this? I'm thinking
something of type GlobalRef -> Union(GlobalRef, TopNode) would do.

—
Reply to this email directly or view it on GitHub
#10403 (comment).

ninegua · 2015-07-14T20:19:39Z

@quinnj Thanks for the suggestion, Here is what I tried:

julia> baremodule Test
         import Base, Base.warn, Base.call, Base.@eval
         export sin, -
         eval(x) = Core.eval(Test, x)
         for f in (:sin, :-)
           g = QuoteNode(f)
           @eval function ($f)(x::Float64)
             warn("in Test!")
             getfield(Base, $g)(x)
           end
         end
       end
Test

julia> importall Test
Warning: ignoring conflicting import of Test.- into Main

julia> sin(10.0)
WARNING: in Test!
-0.5440211108893698

julia> -(10.0)
-10.0

julia> Test.sin(10.0)
WARNING: in Test!
-0.5440211108893698

julia> Test.(:(-))(10.0)
WARNING: in Test!
-10.0

So I've managed to define a - operator in Test module that doesn't override its default definition in Base, but neither can I import it and run it without specifically writing Test.(:(-)). Any suggestion in getting around this?

ninegua · 2015-07-14T22:53:22Z

I'm trying eval(GlobalRef(...)), it's close to what I wanted. Let me rephrase what I want to achieve:

Being able to define and use both operators and functions local to a module that override definitions in Base.
Being able to locate where a definition comes from.

Both points can be illustrated by the following program:

julia> baremodule Test
                import Base, Base.warn, Base.call, Base.@eval, Base.@noinline
                export sin, -, abc
                eval(x) = Core.eval(Test, x)
                for f in (:sin, :-)
                  g = QuoteNode(f)
                  @eval @noinline function ($f)(x::Float64)
                    warn("in Test!")
                    getfield(Base, $g)(x)
                  end
                end
                abc = 10
              end
Test

julia> module Prog
                importall ..Test
                f() = sin(0.5) + abc
                g() = -(0.5)
              end
Warning: ignoring conflicting import of Test.- into Prog
Prog

julia> code_typed(Prog.g, ())
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[], Any[Any[],Any[],Any[],Any[]], :(begin  # none, line 4:
        return (Base.box)(Base.Float64,(Base.neg_float)(0.5))::Float64
    end::Float64))))

julia> code_typed(Prog.f, ())
1-element Array{Any,1}:
 :($(Expr(:lambda, Any[], Any[Any[],Any[],Any[],Any[]], :(begin  # none, line 3:
        return (Prog.sin)(0.5) + Prog.abc
    end))))

We can see that sin is resolved to Prog.sin, but - is somehow resolved to Base.(:(-)) (and inlined). I guess the warning indicates something is preventing this from happening. I guess the solution is to make Prog also a baremodule, but this would discourage people from using operators defined in modules other than Base (but with the same name as those in Base). Is there a possible fix for this?

What I had hoped for was to resolve both to module Test, but as explained by Jeff, I now understand it cannot be done until compile time. That being said, I can always find the real location (of a GlobalRef to a function) like below:

julia> eval(code_typed(Prog.f, ())[1].args[3].args[2].args[1].args[1]).env.defs.func.code.module
Test

julia> eval(code_typed(Prog.f, ())[1].args[3].args[2].args[1].args[1]).env.defs.func.code.name
:sin

For non-functions such as the use of abc in function Prog.f, I'd appreciate a way to find their exact locations.

The fact that Julia's name resolution is only "final" until compile time is also a bit surprising. I don't know if there is a solution to make everybody happy, but the following demonstrates what is going on:

julia> module A
         export abc
         abc = 10
       end
A

julia> 

julia> module B
         using ..A
         f() = abc
         println(f())
         abc = 20
       end
10
Warning: imported binding for abc overwritten in module B
B

julia> 

julia> module C
         using ..A
         f() = abc
         abc = 20
         println(f())
       end
20
C

I find it intriguing that there is a warning for module B, but not for module C.

ninegua · 2015-07-14T23:15:11Z

Actually here is what happens when I use using instead of importall:

julia> module Prog
         using ..Test
         f() = sin(0.5) + abc
       end    
Warning: using Test.- in module Prog conflicts with an existing identifier.
Prog

julia> Prog.f()
Warning: both Test and Base export "sin"; uses of it in module Prog must be qualified
ERROR: UndefVarError: sin not defined
 in f at none:3

If I understand it correctly, the intended use of import and importall is the ability to extend functions that are being imported. But what happens when the imported names are a conflict with existing names? This wasn't explained in documentation.

The same goes to using. The above gives a warning message and eventually an error when Prog.f is called. However, changing using to importall would make the same call successful. I'd appreciate some clarification on this difference in behavior, when both are dealing with name conflicts between what was already defined in current scope and what is defined in the module to be used or imported.

ninegua · 2015-07-16T01:06:27Z

Just want to add a comment that I really do think the name resolution should be final at type inference time (instead of compile time in codgen), otherwise we risk the danger of resolving it to a different type. The follow shows this bug:

julia> module A
         export abc
         @noinline abc(x::Any) = "hello"
       end
A

julia> 

julia> module B
         import ..A.abc
         f() = begin x = abc(1) end
         println(code_typed(f,()))
         @noinline abc(x::Int) = x
       end
Any[:($(Expr(:lambda, Any[], Any[Any[Any[:x,ASCIIString,18]],Any[],Any[ASCIIString],Any[]], :(begin  # none, line 3: # none, line 3:
        GenSym(0) = (B.abc)(1)::ASCIIString
        x = GenSym(0)
        return GenSym(0)
    end::ASCIIString))))]
B

julia> 

julia> B.f()

signal (11): Segmentation fault

One may argue that the above is not a valid program because function abc is being extended after I force the type inference on function f, and the inferred String type caused a conflict with the actual method B.abc that is being called when f is run. However, the following works fine:

julia> module A
         export abc
         @noinline abc(x::Any) = "hello"
       end
A

julia> 

julia> module B
         import ..A.abc
         f() = begin x = abc(1) end
         f()
         @noinline abc(x::Int) = x
       end
B

julia> 

julia> B.f()
"hello"

So instead of merely doing type inference, I forced compilation of f, then it stays to permanently resolve function abc to the method defined in module A.

I might be splitting the hair here, but the above illustrates a name resolution gap between inference and codegen. The thing caused the crash is because type inference is making an assumption that the call to abc resolves to module A when it infers its return type, but the generated code still has GlobalRef(B, :abc), which later during codegen locates a method defined in module B. This is a mismatch, and I would argue it really should produce GlobalRef(A, :abc) because that is all the knowledge that the inferencer has at the time when it works on function f. This would also match the behavior in the second example, so all would be fine.

I understand that forcing inference and compilation can sometimes result in surprising behaviors, but at the same time I hope I've just made the case that name resolution should be final at type inference time, so that GlobalRefs would never be resolved to different things when being compiled later. By "final", I mean it resolves to the exact place where the matching method is defined. This is where inferencer finds the type, and where codegen finds the method.

JeffBezanson added a commit that referenced this issue Mar 8, 2015

rename GetfieldNode to GlobalRef and simplify it a bit. part of #10403

b80f7db

vtjnash mentioned this issue May 16, 2015

"UndefVarError: call not defined" when doing call = Expr(x...) #11295

Closed

JeffBezanson self-assigned this Jun 5, 2015

JeffBezanson added a commit that referenced this issue Jun 8, 2015

allow GlobalRef nodes in more places (mostly assignments)

39a369e

part of #10403

tkelman pushed a commit to tkelman/julia that referenced this issue Jun 8, 2015

allow GlobalRef nodes in more places (mostly assignments)

6379704

part of JuliaLang#10403

JeffBezanson added a commit that referenced this issue Jun 11, 2015

resolve all globals to GlobalRef very early

6ad9892

part of #10403

JeffBezanson mentioned this issue Jun 11, 2015

WIP: resolve all globals to GlobalRef very early #11672

Merged

JeffBezanson added a commit that referenced this issue Jun 15, 2015

resolve all globals to GlobalRef very early

80911c1

part of #10403

JeffBezanson added a commit that referenced this issue Jun 22, 2015

resolve all globals to GlobalRef very early

706b082

part of #10403

JeffBezanson added a commit that referenced this issue Jun 22, 2015

resolve all globals to GlobalRef very early

904e7cd

part of #10403

JeffBezanson added a commit that referenced this issue Jun 30, 2015

resolve all globals to GlobalRef very early

8578b15

part of #10403

JeffBezanson added a commit that referenced this issue Jul 10, 2015

resolve all globals to GlobalRef very early

9c6a71e

part of #10403

JeffBezanson added a commit that referenced this issue Jul 10, 2015

resolve all globals to GlobalRef very early

7908246

part of #10403

JeffBezanson closed this as completed Jul 10, 2015

JeffBezanson added a commit that referenced this issue Jul 21, 2015

replace getfield calls with GlobalRef in more cases. ref #10403

2502f27

vtjnash mentioned this issue Dec 6, 2016

Fix macro hygiene when calling the macro in the same module. #15850

Merged

A more diligent name resolution? #10403

A more diligent name resolution? #10403

Comments

ninegua commented Mar 5, 2015

JeffBezanson commented Mar 5, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

ninegua commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

quinnj commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

timholy commented Mar 6, 2015

JeffBezanson commented Mar 6, 2015

ninegua commented Jul 11, 2015

JeffBezanson commented Jul 11, 2015

ninegua commented Jul 14, 2015

JeffBezanson commented Jul 14, 2015

ninegua commented Jul 14, 2015

quinnj commented Jul 14, 2015

ninegua commented Jul 14, 2015

ninegua commented Jul 14, 2015

ninegua commented Jul 14, 2015

ninegua commented Jul 16, 2015