string() not vectorized like other small caps type conversions #8389

Closed
nalimilan opened this issue Sep 17, 2014 · 37 comments

Comments

@nalimilan
Member

julia> int([0:2])
3-element Array{Int64,1}:
 0
 1
 2

julia> float([0:2])
3-element Array{Float64,1}:
 0.0
 1.0
 2.0

julia> bool([0:2])
3-element Array{Bool,1}:
 false
  true
  true

julia> string([0:2])
"[0,1,2]"

I'd find it more consistent if string returned what is currently [string(x) for x in 0:2]. And it seems to me the latter operation is much more common than expecting "[0,1,2]", which is useful only in very special contexts. Maybe call it deparse, stringrep or something like that? Or simply use the String constructor, which by definition must return a String object, not an Array{<:String}?

This is related to #1470.
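For reference, the element-wise result asked for here can be written without any vectorized string() method. This is a minimal sketch; the variable names are illustrative only and are not part of the proposal.

```julia
# Element-wise conversion written out explicitly, without a vectorized string().
strs_comprehension = [string(x) for x in 0:2]
strs_map = map(string, 0:2)

# Joining the pieces back into one string is a separate, explicit step.
joined = join(strs_map, ",")
```

Both spellings produce a one-dimensional array of strings, one per element of the range.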

@JeffBezanson
Member

This is exhibit A for why "vectorization" is so incredibly awful.

There isn't really any such thing as a "small caps type conversion". There is just a set of functions with that kind of behavior. string is not a member of that set. In any case, I think vectorizing more functions will just make the problem worse. There isn't going to be any general agreement on exactly which functions should be vectorized.

@StefanKarpinski
Member

This is the best argument against vectorization that I've seen in a while.

@nalimilan
Member Author

Yeah, but then let's just make int(), bool() and float() not vectorized -- and cry! ;-)

More seriously, the debate around vectorization is always the same: there's a feeling that vectorization is evil, and yet many vectorized functions are present and used in Julia. Either they should be consistent, or they should not exist.

> There isn't really any such thing as a "small caps type conversion".

I had the impression that in #1470 people tried finding a rule for them, precisely because currently it's not very clear.

@StefanKarpinski
Member

In general, numerical functions are vectorized. No string functions are. This is quite consistent already.

@johnmyleswhite
Member

FWIW, I wish we didn't have any "small caps type conversion" functions.

@JeffBezanson
Member

Wanting to increase consistency points in only one direction: fewer vectorized functions. You can't get more consistency by picking a few more functions to vectorize, because it is not clear where to stop that process.

I would love to remove most uses of vectorization, such as sin. Some cases, like scalar*vector are actually mathematically defensible and should stay.

I would also prefer to remove int, bool, etc. I wonder if there is a workable notation more convenient than map(iround, a) or convert(Array{Int}, a).

@johnmyleswhite
Member

Personally, I think it's really helpful to distinguish functions that necessarily apply to vectors (like scalar * vector) and functions that are just the result of applying a scalar function to each element of an array like sin. Only the former should exist; the latter set of functions just leaves you with a Sisyphean task of vectorizing every function you ever write for every container you ever define.

One argument in defense of convert(Array{Int}, a): it generalizes easily to convert(DataArray{Int}, a), whereas the current type converters are just confusing when applied to DataArray objects.
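The two spellings discussed above can be compared side by side. This sketch assumes a recent Julia 1.x release, where round(Int, x) plays the role that iround had when this thread was written; the variable names are illustrative.

```julia
a = [1.0, 2.0, 3.0]

# Container-level conversion: the target type names both container and element type.
# It throws an InexactError if some value cannot be represented exactly as an Int.
ints_convert = convert(Array{Int}, a)

# Element-level conversion: apply a scalar function to each element explicitly.
ints_map = map(x -> round(Int, x), [1.4, 2.6, 3.0])
```

The convert form only needs a different target type to work with another container, which is the generality argument made in this comment.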

@andreasnoack
Member

I'd love that. Then we could use exp(Matrix) and sin(Matrix) for their matrix versions.
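To make the distinction concrete, here is a small sketch of the two meanings exp could have for a matrix argument. It assumes a Julia 1.x release with the LinearAlgebra standard library, where exp(A) on a square matrix is the matrix exponential and exp.(A) applies exp to each entry; neither was settled at the time of this comment.

```julia
using LinearAlgebra

A = [1.0 0.0;
     0.0 2.0]

# Matrix exponential: for a diagonal matrix the off-diagonal entries stay zero.
matrix_exp = exp(A)

# Element-wise exponential: every entry is exponentiated, so exp(0.0) == 1.0 off the diagonal.
elementwise_exp = exp.(A)
```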

@johnmyleswhite
Member

@JeffBezanson: The power to kill vectorization forever is in your hands. Just make map fast.

@JeffBezanson
Member

I totally agree. convert(Array{Int}, a) is certainly better, since it is so much more general. The sooner you get used to that, the easier it is to use a wide variety of containers. Unfortunately the convenience of int(x) is highly compelling to many people.

@rfourquet
Member

Why not:

Base.getindex(f::Function, a::AbstractArray) = map(f, a)
x = [1.1, 2.2]
@assert round[x] == round(x)
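A variant of the same idea that does not add a getindex method to every Function is to go through a small wrapper type. This is only a sketch: Elementwise is a hypothetical name introduced here for illustration, and it assumes Julia 1.x struct syntax.

```julia
# Hypothetical wrapper: indexing it with an array maps the wrapped function over the array.
struct Elementwise{F}
    f::F
end

Base.getindex(e::Elementwise, a::AbstractArray) = map(e.f, a)

x = [1.1, 2.2]
rounded = Elementwise(round)[x]
```

The trade-off is an extra wrapper at the call site in exchange for leaving indexing on plain functions undefined.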

@StefanKarpinski
Member

Let's pretend that we didn't have the now-widely-regretted space sensitivity in array context. Then there are a lot of things we could do. [T] x for vectorized conversion would be one possible syntax then.

@JeffBezanson
Member

That's an interesting idea. Or if you like punctuation, x .|> round is a nice syntax for map. Though I admit .|> is a pain to type.

@StefanKarpinski
Member

What about f.(x)? It also has a now-regretted prior meaning, but you can see why it makes sense.

@JeffBezanson
Member

That's nice. map is certainly a better use for f.(x) than the current meaning.
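The spellings floated in the last few comments can be compared directly. The sketch below assumes a Julia version in which f.(x) and .|> are accepted as element-wise call and pipe syntax (current 1.x releases do accept them); at the time of this thread f.(x) still had its older meaning.

```julia
x = [1.1, 2.2]

via_map  = map(round, x)   # explicit higher-order function
via_dot  = round.(x)       # proposed f.(x) spelling
via_pipe = x .|> round     # punctuation-heavy alternative

# The case from the issue title, written with the dot syntax.
strs = string.(0:2)
```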

@toivoh
Contributor

toivoh commented Sep 17, 2014

+1

@toivoh
Contributor

toivoh commented Sep 17, 2014

Btw, would that call map or broadcast?

@pao
Member

pao commented Sep 17, 2014

> That's nice. map is certainly a better use for f.(x) than the current meaning.

Should I plan on this possibility? Revising some things in StrPack right now where this comes up; I'll change them over to setfield()/getfield() if I have to.

@nalimilan
Member Author

f.(x) would be great, if that's possible. Then most (all?) explicitly vectorized functions could go away.

@dcjones
Contributor

dcjones commented Sep 17, 2014

👍👍 for fewer vectorized functions and more emphasis on higher order functions like map and the like.

A high precedence binary as operator might make convert calls more friendly.

x as Vector{Int}

Taking that further, if we had a unary version like:

function as(T::Type)
    return x -> x as T
end

And used Stefan's syntax for map we could write f.(as Int).
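The `x as T` operator above is proposed syntax and does not run as written. The curried half of the idea can be approximated today with an ordinary closure over convert; as_type is a hypothetical helper name chosen for this sketch.

```julia
# Hypothetical curried converter: as_type(T) returns a function that converts its argument to T.
as_type(T::Type) = x -> convert(T, x)

# Element-level use: convert each element through map.
ints = map(as_type(Int), [1.0, 2.0])

# Container-level use: convert the whole array in one call.
vec_ints = as_type(Vector{Int})([1.0, 2.0])
```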

@eschnett
Contributor

Some other languages already have a syntax for "map", e.g. "/@" in Mathematica. Maybe Matlab has something as well? It may make sense to borrow from there.

-erik


@JeffBezanson
Member

I don't think there's much to be said in favor of /@.

@quinnj
Member

quinnj commented Sep 17, 2014

+1 to f.(x). It could go hand in hand with #1470, so you could do Int.(x) instead of the current equivalents convert(Array{Int}, x) and int(x).

I've also been a previous advocate for the x as Vector{Int} syntax as well.

@vchuravy
Member

I find the f.() syntax hard to read, and it would be confusing to newcomers to Julia.
I would like some syntactic sugar for map application, but I would prefer it to be something which allows white space, like f $ x (sadly that is already defined as XOR)

@timholy
Member

timholy commented Sep 17, 2014

I like f.(x), since . is already used to do stuff over elements. I don't think there's any confusion wrt broadcasting, since f doesn't have a size but x does.

@staticfloat
Member

I like f.(x) for succinctly expressing mapping of a function over elements. It's so nicely general and easy to type, and has a nice parallel with the other dot operators. I assume that this would work only for single-argument functions such as those you pass in to map?

@StefanKarpinski
Member

That is rather a significant issue.

@JeffBezanson
Member

map(f, x, y, ...) is well-defined; it maps over everything. Of course sometimes you want to iterate over some arguments and not others, but that probably gets too detailed for special syntax.

@staticfloat
Member

The other way this could work is to do something like we do for do blocks where the first argument is "special". That could be nice, as it could allow you to map a function across the first argument and pass the same values in for the others.... but as Jeff points out this diverges from map syntax. Not sure if that's a good thing, (different syntaxes for different things) or a bad thing (could argue it's less consistent).

If we do go down the path of mapping over only the first argument, I imagine the f.(x) syntax would become equivalent to creating anonymous functions that wrap f to pass in the static arguments, e.g. f.(x, y, z) would correspond to map( container -> f(container, y, z), x). That doesn't seem terrible to me, and could be nice to expose easily.

@johnmyleswhite
Member

For multiargument functions, I feel like the right approach is to improve Julia's ability to define closures so that we can do more currying.

@toivoh
Contributor

toivoh commented Sep 18, 2014

@johnmyleswhite: Which behavior for multi argument functions are you proposing?

I think that it would be most consistent if all arguments are treated equally.
Also, if we get this, I think people will start to ask that f.(x, y) do broadcasting (probably myself included).

@nalimilan
Member Author

I can see three solutions:

  1. Vectorization only over the first argument: f.(x, y, z) = map(container -> f(container, y, z), x)

  2. Vectorization over all arguments: f.(x, y, z) = map((a, b, c) -> f(a, b, c), x, y, z)

  3. Vectorization over all arguments like 2), but scalars are accepted and passed as-is: generalization of @vectorize_2arg

Solution 3 has the advantage that it would allow replacing @vectorize_2arg with the new syntax everywhere, and that it makes it possible to pass parameters which are the same for all elements of the array (for example, log.(5, [1:10]) would automatically vectorize the call using base 5). But it may be confusing if you pass the wrong input and get a weird error.

But we need real use cases to decide what's the most useful definition. FWIW, there's a list of currently vectorized functions in the manual: http://docs.julialang.org/en/release-0.3/manual/arrays/#vectorized-operators-and-functions
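The three candidate meanings listed above can be spelled out with functions that already exist. In this sketch f is a hypothetical three-argument function, and broadcast stands in for solution 3 on the assumption that "scalars are passed as-is" matches what broadcast does with scalar arguments.

```julia
f(a, b, c) = a + b * c   # hypothetical function, for illustration only

x = [1, 2, 3]
y = [10, 20, 30]
z = [2, 2, 2]

# Solution 1: map over the first argument only; the other arguments are passed unchanged.
opt1 = map(a -> f(a, 10, 2), x)

# Solution 2: map over all arguments, element by element.
opt2 = map(f, x, y, z)

# Solution 3: iterate over arrays, pass scalars as-is.
opt3 = broadcast(f, x, 10, 2)

# The log example from the comment: base 5 is shared by every element.
logs = broadcast(log, 5, 1:10)
```

Solutions 1 and 3 agree when only the first argument is an array; they differ once a later argument is an array too.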

@StefanKarpinski
Member

Another possible syntax: f[v]. Could be implemented right now, actually.

@tonyhffong

Wait, that reads like a type constructor on a vector, e.g. Int[1,2,3], so we already have

a = {1,2,3}
Int[ a... ]

So do we still keep the ...?

@toivoh
Contributor

toivoh commented Sep 19, 2014

@nalimilan: Using broadcast would be a natural generalization of 3), where you extend the dimensions which are absent (or size 1) in each argument.
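A short sketch of how the broadcast generalization differs from a plain map: arguments of matching shape behave the same way, a scalar is reused for every element, and dimensions that are absent or of size 1 are extended. The variable names are illustrative.

```julia
col = [1, 2, 3]   # 3-element column vector
row = [10 20]     # 1x2 row matrix

# Same shapes: map and broadcast agree.
same_shape = map(+, col, [10, 20, 30])

# Scalar argument: reused for each element of col.
with_scalar = broadcast(+, col, 10)

# Mismatched shapes: singleton dimensions are extended, giving a 3x2 result.
outer = broadcast(+, col, row)
```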

@tonyhffong: Yes, that is a problematic ambiguity, since they have different semantics. We would have to deprecate that syntax, which I'm not so sure is a good idea.

@rfourquet
Member

I find the f[v] syntax nice precisely because it's almost what is already in Julia. When f is a type constructor (and suppose convert calls it), Int[1, 2, 3] is already a kind of map (map(Int, [1,2,3]) if Int was callable), and could be seen as sugar for Int[[1, 2, 3]] or v={1,2,3}; Int[v]. There is an ambiguity in cases like Vector{Int}[[1,2,3]] which currently is not equivalent to map(Vector{Int}, [1, 2, 3]) but Vector{Int} is not callable, and if it was, finding out the "good" version (map or current meaning) seems doable. There is hopefully a good general solution for dealing with this ambiguity.

@Jutho
Contributor

Jutho commented Sep 19, 2014

To resolve the issue raised by @tonyhffong, what if f[v] were equivalent to map(f,collect(v)), such that one could write f[1,2,3]? In that case, the syntax is completely equivalent to functions and type constructors. If a is already a collection, one should write f[a...], which then maps to map(f,collect(a...)). I am not sure if that would cause a massive overhead or whether it would be inlined away.
