rename ref etc. to something less likely to conflict #1484

Closed
kmsquire opened this issue Oct 31, 2012 · 24 comments
Labels
breaking This change will break code needs decision A decision on this change is needed

@kmsquire
Member

I just got bit by redefining ref as a string:

julia> line = "1\t2\t3\t4\t5\t6"
"1\t2\t3\t4\t5\t6"

julia> (name, pos, ref, depth, bases, quals) = split(line)
6-element String Array:
 "1"
 "2"
 "3"
 "4"
 "5"
 "6"

julia> name[1]
type error: apply: expected Function, got ASCIIString
@toivoh
Contributor

toivoh commented Nov 1, 2012

That example is a bit troubling when it comes to namespace isolation. Normally, I would expect that in a namespace I create myself, I will be able to (re)define the names of anything that I don't import and explicitly use from another namespace.

Of course, the code does use ref, because name[1] is just shorthand for ref(name,1), but that's not obvious when you read the code. There might be a time when many Julia users don't even know (or need to know) that indexing with [] is a call to ref. It's also a pretty short name, which I guess is why Kevin happened to pick it.

On the other hand, you want to be able to overload ref in the current scope, and renaming it to __ref__ doesn't seem very compelling either. Or perhaps there should be a list of pseudo-reserved words, such as

ref, assign, tuple, hcat, vcat, hvcat, etc

that you should not redefine unless you really know what you are doing. (Are there others that might get called in the current scope without appearing in the source?)

I have no idea how this should best be handled.

@JeffBezanson
Member

One thing I can do is make a[i] mean Base.ref(a,i) instead of ref(a,i). Maybe that would be better.

@staticfloat
Member

Perhaps that behavior should be the default? E.g. if I have foo() defined inside a module, and I have code that calls foo() from inside that module, it should always be equivalent to calling Bar.foo() where Bar is the name of the module.

Another way of saying this is that perhaps we should have code located in a module search that module for a binding first, and then move outward. This would be similar to the idea of "scope" discussed in #1342

@JeffBezanson
Member

In this situation (at the prompt), we are in Main, not Base. So what you're describing is in fact happening: after ref="3" brackets no longer work since they use the local ref, which is now a string.

@staticfloat
Member

I think I was misunderstanding the boundary between parser and standard library.

I was assuming that somewhere in Base, there was an operator overload for the [] operator that subbed out to the function ref that was also defined in Base, and the problem was that the binding was being overwritten by this new ref in Main. However, I think what you're saying is that the translation from the [] operator to ref is done in the parser, before these higher-level rules come into effect?

@toivoh
Contributor

toivoh commented Nov 1, 2012

@JeffBezanson: I thought about making it mean Base.ref also. But then you would never be able to replace ref in the local context. Would the same go for +, -, etc? These I think should be replaceable, at least.

@JeffBezanson
Member

Yes, a[i] is equivalent to writing ref(a,i). After parsing they can't be distinguished.

The argument I can think of for making it Base.ref is that the word ref is not apparent in a[i], whereas in a+b you can see that + is being called.

@staticfloat
Member

Quoting @JeffBezanson: "Yes, a[i] is equivalent to writing ref(a,i). After parsing they can't be distinguished."

Hmm, that does pose a problem, doesn't it, because as @toivoh states above, if I then have a module Bar that overrides ref() for whatever reason, we have two options: Either it is allowed to break code by overriding Base.ref outside of Bar, or it is inaccessible because Julia must know somehow that barVar[i] should translate to Bar.ref(barVar,i).

Is this a result of the fact that we don't have a [] operator, and must translate the square brackets syntax to a ref() call?

@JeffBezanson
Member

We don't have operators at all, only function names and syntax for them. Maybe it just needs to be renamed to something less likely to be used, like __ref__ or bracket_operator. After all, the only reason we don't worry about this in the case of + is that nobody would write (+) = 2, but as we see here you might very well write ref = 2.

@staticfloat
Member

I guess the reason this seems strange to me is that I'm still thinking in an "Object-Oriented C++" way, where data and functionality are tied together. E.g. I feel that since the data is a String, which is defined in base/string.jl, then myStr[i] should invoke the ref that was defined in base/string.jl. I think the part that throws me off is the much more C-like separation between functionality and data. (This viewpoint is underscored by the fact that myStr[i] gets translated to ref(myStr, i), regardless of the type of myStr.)

@JeffBezanson
Member

You seem to be confusing calling ref on a string (myStr[i]), which is unproblematic, with ref itself being a string. If ref is a string, as it is here, then ref(foo,bar) is an attempt to call a string like a function.
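
To make the distinction concrete, here is a minimal sketch, assuming a current Julia 1.x release rather than the 2012 build discussed in this thread. The function name call_a_string is hypothetical. In current releases the failure surfaces as a MethodError, not the "type error" printed in the original report.

```julia
# Sketch only: `call_a_string` is a hypothetical name used for illustration.
function call_a_string()
    ref = "3"                 # ref is now a String, as in the original report
    try
        ref("abc", 1)         # an attempt to call a String like a function
        return :no_error
    catch err
        return typeof(err)    # the kind of error that calling a String raises
    end
end

println(call_a_string())
```

The variable is kept local to a function so that running the sketch does not leave a global named ref behind.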

@staticfloat
Member

I think that's the whole problem: it is possible for these two to be confused. The issue that we really should be talking about (my rambling about object-orientedness notwithstanding) is the fact that it's surprising to users that ref has anything to do with square brackets. The root problem is that we have two pieces of syntax related to a single concept: the syntax myStr[i] and the syntax ref( myStr, i ), where the lookup for the binding ref is not performed in the namespace String, but is instead performed in Main. This is in contrast to a function such as +, where + directly corresponds to the function box defined in float.jl, for instance. This is desirable because the binding to box is locked away in the Base namespace and can't be touched by Main without overriding +.

Ideally, I think it'd be most transparent to have a function called [], e.g.:

[](s::String, i::Int) = next(s,i)[1]

Something like that probably makes it much easier for a user to figure out what is going on, and "fits in" with the definitions of other overloadable syntax like +, .*, etc. It also removes the "middleman", and hence is "transparent" in that users will know exactly what they are overloading if they want to overload the bracket operators. (next can't be overridden because it's tucked away in the Base module.)

I realize this syntax might be really messy to implement, so I'm willing to agree that something like __ref__ might be the next best thing, but I've always felt Julia's syntax to be unusually straightforward, so I'd like to try for that as much as possible.

@JeffBezanson
Member

Correct, the problem is simply that ref does not visually appear in a[i]. Nobody would find it odd that the code ref=2;ref() gives an error.

I'm all in favor of a rename if we can figure out a good one. It turns out [] is a bit of a wart as it is, since it gives a None array that is basically useless. Making [] an identifier name would be strange, but it would be a better name than ref. Then we also have to think about assign, as well as hcat, vcat, hvcat, and colon. The cat functions are especially tricky since their syntax is also square brackets! : is in fact a valid identifier, but it is used for the "Colon object" for passing to indexing-like functions e.g. f(a, :).

@kmsquire
Member Author

kmsquire commented Nov 2, 2012

I haven't felt that I've had anything useful to add to the discussion until now, but how about using _ref or _ref_, and asserting that variables/functions with names starting with underscores are discouraged and/or reserved for internal use? I will agree that [] is pretty useless right now though. I was very confused when I tried to use it. The thing is, it's quite likely to be used by someone coming from Python hoping to use it as an empty (and expandable) array. Setting my_array = [] might become equally confusing if [] becomes the new identifier for ref!

(It might have been obvious, but ref in the example was short for reference, a common abbreviation in bioinformatics. Not that it really needed more justification for use as an identifier.)

@JeffBezanson
Member

Yes, making [] a function would probably be more confusing than anything else mentioned here :)

We will probably have to do something like the underscore convention.

Best thing to do with [] might be to make it equivalent to {} and give an empty Vector{Any}.

@staticfloat
Member

Yeap, you're totally right. Having [] as a function would completely screw me up as well. Thanks for the patience guys, still getting used to working without object methods. :)

JeffBezanson added a commit that referenced this issue Nov 21, 2012
consistently wrap all calls generated by the front end in (top ). this
makes them behave like macros defined in Base, which is reasonable.
related to #1484. this at least prevents such collisions for now, but
it may not be the final behavior we settle on. we may want some names
(like ref) evaluated in the context where they occur.
@kmsquire
Member Author

kmsquire commented Mar 6, 2013

This can probably be closed. The current situation isn't perfect (one can't import Base.ref and use ref as a variable name), but given the scale of the breakage if ref were renamed, and that warnings are properly issued, it's probably not that important any more.

@JeffBezanson
Member

The other thing that scares me is that we should technically rename assign to assign!, but that will cause widespread devastation. It might just get grandfathered in.

@toivoh
Contributor

toivoh commented Mar 6, 2013

@kmsquire: I don't think that it's just Base.ref. If you define ref in your own module, x[k] will call that one instead. And we want it to, in order to be able to redefine indexing inside a specific namespace (just like you can with e.g. +).

@kmsquire
Member Author

kmsquire commented Mar 6, 2013

Of course, sorry, wasn't clear. The original issue was that I used ref as a variable, and it's not possible to import Base.ref and then use it as a variable.

@kmsquire
Member Author

kmsquire commented Mar 6, 2013

Which is only true if you try to use ref in global scope. So maybe it's actually fine...

Nothing to see here... move along...

I'm going to close this.

@kmsquire kmsquire closed this as completed Mar 6, 2013
@JeffBezanson
Member

I changed a[i] to always call Base.ref to get around this problem. But we probably want it to refer to whatever the global ref in the current context is. Unfortunately there is no way to do that yet.
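
A sketch of what pinning brackets to Base looks like from the user's side, assuming a current Julia 1.x release, where the function behind a[i] is named getindex rather than ref (that rename is not part of this thread). The function name bracket_vs_plus is hypothetical. It contrasts brackets, which keep working when the name is shadowed, with +, which is visible in the source and is looked up in the enclosing scope.

```julia
# Sketch only: `bracket_vs_plus` is a hypothetical name used for illustration.
function bracket_vs_plus()
    getindex = "shadowed"     # local variable with the name that brackets map to
    v = [10, 20, 30]
    elem = v[2]               # still indexes: the lowered call targets Base's getindex

    +(a, b) = "local plus"    # local function that shadows Base.+ inside this function
    total = 1 + 2             # resolves to the local + defined just above

    return elem, total
end

println(bracket_vs_plus())
```

The trade-off raised in this comment is visible here: shadowing the name cannot break indexing, but a scope also cannot swap in its own indexing function just by defining that name.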

@JeffBezanson
Member

Interactive use is also an issue. If I changed a[i] to just use the current global ref, then ref=1 at the prompt would break all indexing. So we still might want to rename it.

@kmsquire kmsquire reopened this Mar 6, 2013
@kmsquire
Member Author

kmsquire commented Mar 6, 2013

Okay, I'll leave it then.
