Dict syntax is getting me down #6739

Closed
JeffBezanson opened this issue May 4, 2014 · 57 comments · Fixed by #8521
Labels
needs decision A decision on this change is needed

Comments

@JeffBezanson
Member

I have not been enjoying that => is not first-class. It works well for simple dict literals ([a=>b, c=>d]), but the other forms cause trouble. For example I defined

typealias AbstractValue Dict{Symbol,LatticeElement}

but one cannot make a Dict of this type using this definition. You have to write (Symbol=>LatticeElement)[...] every time.

In the near future it will be possible to write Dict( (a,b) for i in x ), which makes a pretty good dict comprehension syntax without any special code in the front end. For literals this would be Dict([(a,b), (c,d), ...]), which starts to suffer from too much punctuation. One way to solve this is to make => a type, so you could write Dict(a=>b, c=>d, ...). => would not be iterable, so this would not be ambiguous. This would eliminate all special dict syntax. It would also make it more obvious how to write a typed empty Dict: you just pass 0 arguments.
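A minimal sketch of the constructor-based style proposed here, written in the syntax of later Julia releases (`const` alias instead of `typealias`, `struct` instead of `immutable`). The `LatticeElement` definition is a hypothetical stand-in, since the real type is not shown in this thread.

```julia
# Hypothetical stand-in for the LatticeElement type mentioned above.
struct LatticeElement
    rank::Int
end

# Later-Julia spelling of `typealias AbstractValue Dict{Symbol,LatticeElement}`.
const AbstractValue = Dict{Symbol,LatticeElement}

# The alias is callable, so a typed empty Dict is just a zero-argument call.
empty_env = AbstractValue()

# Literal construction from pairs, with no bracket-based dict syntax.
env = AbstractValue(:x => LatticeElement(0), :y => LatticeElement(1))

# Comprehension-style construction from a generator of (key, value) tuples.
squares = Dict((i, i^2) for i in 1:3)
```

Because the alias is an ordinary type, the same name works for both the empty and the populated case, which is the property the `(Symbol=>LatticeElement)[...]` form lacked.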

@StefanKarpinski
Member

I've often wanted this too. What would the type of => be? Pair? Having a tuple of pairs as a lightweight Dict-like type would be really handy too. Or something like that.

@quinnj
Member

quinnj commented Jul 2, 2014

+1, it'd be great to have a=>b be =>(a,b) --> Pair.

@quinnj quinnj added this to the 0.4-projects milestone Aug 14, 2014
@quinnj
Member

quinnj commented Aug 14, 2014

It'd be great to address this with #7941.

@JeffBezanson
Member Author

Absolutely. Personally I want a syntax that generalizes, as described above, but all the better if it plays along well with array syntax.

@JeffBezanson
Member Author

I propose

  1. Make a=>b syntax for (a,b).
  2. Deprecate the syntax (a=>b)[...] and [ a=>b, ... ] and { a=>b, ... }.
  3. Continue to allow { a=>b for a in ... } a bit longer, but it should be deprecated too.
  4. Use Dict(a=>b, ...) and Dict{A,B}( ... ).

Dict(a=>b) is sketchy since Dict can also accept a single iterable argument, and tuples are iterable. It may be that the single best way to resolve this is to make Dict(xs...) fast. Otherwise you could run into problems if you have function f(pairs...) and you call Dict(pairs) inside. However we already use Dict(pairs) and Set(elts), so that might be too much change.

Another option is to make a=>b give a different type, but that could cause issues elsewhere, like iterating with for (k,v) in dict.

@vtjnash
Member

vtjnash commented Sep 27, 2014

Shouldn't that be Dict(a=b, ...)? Otherwise, what would the syntax in 4 mean generally?

@quinnj
Member

quinnj commented Sep 27, 2014

A couple other things to think about in accordance with other proposed changes:

  • Generalized comprehension syntax #4470: Implementing generators could also provide a Dict(i=>v for i in ...) type syntax
  • what should we use { } for? #8470: Using { } for tuple types could open up Dict constructions/initialization to Dict({K,V}) or some other tricks
  • Actually, what about {K=>V} be shorthand for Dict{K,V}? It would kind of go hand in hand with the { } for tuple types change and writing {K=>V}[] seems pretty convenient.
    I kind of like the idea of a=>b returning a Pair(a,b) object that kind of represents a single association. I think it could be useful in a slew of other places apart from associatives. I don't understand how that would mess with iterating with for (k,v) in dict, @JeffBezanson?

@JeffBezanson
Member Author

Dict(a=b) could be allowed, but only works for symbol keys.

I really don't want special syntax using various combinations of brackets
and =>. Those don't generalize and are invariably hard to remember.

@toivoh
Contributor

toivoh commented Sep 28, 2014

I really think that a=>b should produce instances of a type distinct from Tuple, such as Pair. This would make it possible to distinguish uses such as Dict(a=>b) and Dict(iterable) by dispatch on the argument type. Pair could still be iterable, or iteration over dicts could still return tuples.

@eschnett
Contributor

If a=>b produces a special type, then why not make it DictPair? Then the
distinction would be unambiguous. DictPair would not be iterable.

-erik

@JeffBezanson
Member Author

I think it could be OK for a=>b to give a Pair for the purpose of writing Dict(a=>b). In all other cases --- Dict(iter), iteration --- we can still use tuples. This is needed for composability with zip etc.

Another advantage @jakebolewski (IIRC) pointed out is that a Pair can be printed as a=>b.
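The distinction drawn here can be sketched with two methods, one taking `Pair` varargs and one taking any iterable. The function name `build` is hypothetical and only demonstrates the dispatch; it is not the actual `Dict` implementation.

```julia
# `build` is a hypothetical constructor used only to show the dispatch.
# One or more Pair arguments select this method.
build(ps::Pair...) = "pairs: " * join(ps, ", ")

# Any other single argument is treated as an iterable of entries.
build(itr) = "iterable of " * string(length(collect(itr))) * " entries"

from_pairs = build(1 => 2, 3 => 4)      # Pair... method
from_single = build(1 => 2)             # still the Pair... method, not the iterable one
from_zip = build(zip(1:3, 4:6))         # iterable method, entries are tuples

# A Pair also prints in the same form it was written in.
shown = repr(:a => 1)
```

A single pair resolves to the `Pair...` method because that signature is more specific than the untyped one, so `build(a => b)` and `build(iterable)` do not collide.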

JeffBezanson added a commit that referenced this issue Sep 28, 2014
use Dict(iter) or Dict(::Pair...) as Dict constructors

ref #6739
@JeffBezanson
Member Author

Under development here: https://github.com/JuliaLang/julia/tree/jb/dictsyntax

After trying this briefly, I am instantly in love with it.

@jakebolewski
Member

Another use for the Pair type which was discussed was using pair syntax for keyword arguments:

func(a,b,c=>1) = ....

A breaking change which offhand doesn't bring real benefits, but does seem more consistent.

@johnmyleswhite
Member

I'm excited about the pair idea.

@timholy
Member

timholy commented Sep 28, 2014

julia/base/collections.jl

Lines 110 to 113 in 8c60aac

immutable Pair{A,B}
    a::A
    b::B
end
could be plunked in Base.

@timholy
Member

timholy commented Sep 28, 2014

Oh, I see, you already introduced another Pair type. Probably best to have only one Pair, though.

@JeffBezanson
Member Author

@timholy Ha, yes, of course. Should have thought of that.

Keyword arguments are not just pairs, but passed differently, by name instead of position. #7704 is related: in a=>b both a and b are evaluated, unlike in a=b keyword arg syntax. So f(x; a=>b) makes sense, and after this would actually work if the syntax were simply permitted (again, #7704).
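As a sketch of what `f(x; a=>b)` looks like in practice: later Julia releases accept `key => value` expressions after the semicolon in a call, with both sides evaluated. The function `scale` and its keyword `factor` are hypothetical names chosen for illustration.

```julia
# `scale` and its `factor` keyword are hypothetical names for illustration.
scale(x; factor = 1) = x * factor

# The keyword name is an ordinary value computed at run time.
name = :factor

# Both sides of => are evaluated, so this is the same call as scale(3; factor = 2).
result = scale(3; name => 2)
```

This differs from `scale(3; factor = 2)` only in that the left-hand side is an expression yielding a symbol rather than a literal identifier.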

@vtjnash
Member

vtjnash commented Sep 28, 2014

@JeffBezanson is it possible for => to be just another normal binary operator, so that it doesn't require adding Pair to Base.Operators?

@JeffBezanson
Member Author

Yes that is possible. Doing that would be a bit easier if we got rid of the
old dict syntax entirely, but we can still do it. There's also a chance you
wouldn't want => redefined or overloaded. In fact you do not want to add
definitions to this. Do we want => to be the name of the type instead of
Pair?

@Jutho
Contributor

Jutho commented Sep 29, 2014

I am sure there would be many uses of a Pair type in Base, several of which would rely on kind of an equivalent status of / reflexive relationship between both members, rather than the kind of (label,value) status / implied relationship that is used in Dict entries and is contained in the => notation. Not saying that this is incompatible, just something to consider.

@johnmyleswhite
Member

A Pair type might be nice for specifying graphs in terms of their edges.

@Jutho
Contributor

Jutho commented Sep 29, 2014

Indeed, or the nodes of a binary tree, where every pair contains new pairs, up to the nodes at the lowest level, which contain the leaves.

@JeffBezanson
Member Author

This Pair is like a tuple in that the types of both elements are tracked --- you wouldn't want to use it to represent a dynamic tree structure.
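A short sketch of why this matters for trees, using the `Pair` type as it exists in later Julia releases: every level of nesting shows up in the static type, so differently shaped trees have different types.

```julia
leaf = 1 => 2                 # Pair{Int, Int}
tree = leaf => (3 => "x")     # the type records the whole shape of the tree

leaf_type = typeof(leaf)
tree_type = typeof(tree)

# A deeper tree has a different, larger type.
deeper_type = typeof(tree => tree)
```

Since the type grows with the structure, a tree whose shape is only known at run time would be better served by a node type with abstractly typed children.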

@johnmyleswhite
Member

I don't think people are suggesting this be the final data structure, just that it's convenient sugar for specifying literal graphs.
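A minimal sketch of that idea, assuming the `Pair` type of later Julia releases: edges are written as `src => dst` literals and then converted into whatever structure the graph code actually uses. The adjacency-list representation below is only one possible choice.

```julia
# Edges written as a literal list of pairs.
edges = [1 => 2, 1 => 3, 2 => 3]

# Convert the literal into an adjacency list; the pairs are only input sugar.
adjacency = Dict{Int,Vector{Int}}()
for (src, dst) in edges
    # get! returns the existing neighbor list or stores and returns a new one.
    push!(get!(adjacency, src, Int[]), dst)
end
```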

@StefanKarpinski
Member

Yes, that's a fair point. It could well be useful to use the syntax for other data types.

@eschnett
Contributor

The Pair for a Dict should not be iterable. If you want a reflexive data structure, then a tuple would be best, no need to invent a new type Pair for this.

What we want here for Dict (and other, similar cases) is a Pair with a bit more structure, a type that implies a "from--to" relation or somesuch. Maybe we could use => as syntax for this, i.e. use => as the name of the type? If function names can be symbols, then why not types? If we want an English name for this type, then Arrow comes to mind.

=>{Key, Value} (shorthand: Key => Value)
longhand: Arrow{Key, Value}

@IainNZ
Member

IainNZ commented Jul 9, 2015

@elcritch I think your macro may actually work. Also the typed version doesn't really seem a relevant comparison.

As for a written guide for porting v0.3 to v0.4 code, https://github.com/JuliaLang/Compat.jl serves both as a way to make code run deprecation-warning-free on both versions and, at this point, as the de facto guide. I think once we have a 0.4 prerelease there will be interest in a transition guide. End-users shouldn't be playing with 0.4, it's too unstable, so it's kind of a low priority.

@JeffBezanson
Member Author

Yes I think such a @json macro would work fine. Also NEWS.md is supposed to be a quick guide to what changed from 0.3 to 0.4.

@ScottPJones
Contributor

@JeffBezanson you mean the @json { ... } style macro? Would that get the "..." strings already parsed in Julia fashion, instead of JSON fashion? (that's partly why I was thinking a string macro instead might work better)

@elcritch

elcritch commented Jul 9, 2015

@ScottPJones , a json""" ... """ macro is nice for larger copy-and-paste situations. For generic code I prefer the native Julia as it is syntax highlighted in most editors and you don't need to prepend $ to variables. It could be added to the json.jl package easily. PS What do you mean when you ask if the "..." strings would be parsed in Julian fashion? AFAIK, they'd be parsed as normal strings after the macro expansion.

@elcritch

elcritch commented Jul 9, 2015

@IainNZ , Thanks, I'll look into it. Does that module also show how to allow a macro to suppress syntax warnings? @JeffBezanson , the macro is really straightforward, but I'm unsure whether it would be preferable to use traditional JSON notation, e.g. { "a": 1, "b": { "bb": 2} }, by overriding the : operator (I tried it, works) or to stick with the => operator. Also, would it be appropriate to create an issue with example code and/or a pull request? I'll look into the relevant community docs if so.
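One way such a macro could look is sketched below. This is not the macro from the gist discussed in this thread; the names `@json` and `json_expr` are hypothetical, only the `=>` form is handled, and the sketch assumes a Julia version in which `{ ... }` still parses (as a `:braces` expression) even though it no longer has a meaning of its own.

```julia
# Hypothetical helper: rewrite a brace/bracket expression tree into constructor calls.
json_expr(ex) = ex                      # literals and symbols pass through unchanged
function json_expr(ex::Expr)
    if ex.head == :braces
        # { k => v, ... }  becomes  Dict{String,Any}(k => v, ...)
        return Expr(:call, :(Dict{String,Any}), map(json_expr, ex.args)...)
    elseif ex.head == :vect
        # [ a, b, ... ]  becomes  Any[a, b, ...]
        return Expr(:ref, :Any, map(json_expr, ex.args)...)
    elseif ex.head == :call && ex.args[1] == :(=>)
        # Recurse into both sides of a pair so nested objects are rewritten too.
        return Expr(:call, :(=>), json_expr(ex.args[2]), json_expr(ex.args[3]))
    end
    return ex
end

# Hypothetical macro: strings inside are ordinary Julia strings, not JSON strings.
macro json(ex)
    esc(json_expr(ex))
end

doc = @json {"a" => 1, "b" => {"bb" => 2}, "c" => [1, "two", {"d" => 3}]}
```

Because the rewriting happens on the parsed expression, the string literals follow Julia's escaping and interpolation rules rather than JSON's.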

@elcritch

elcritch commented Jul 9, 2015

@JeffBezanson , thanks the NEWS.md pointed me to the syntax change. Unfortunately it was unclear if the { "a" => 1, ... } syntax was being deprecated in addition to the Dict list comprehension syntax.

@JeffBezanson
Member Author

I'm confused --- Dict comprehension syntax ([ a=>b for x in y ]) is not deprecated. NEWS says:

  * `Dict` literal syntax `[a=>b,c=>d]` is replaced by `Dict(a=>b,c=>d)`,
    `{a=>b}` is replaced by `Dict{Any,Any}(a=>b)`, and

@elcritch

Ooops! My bad for skimming, I must have interpreted it backwards. Here's a gist to an example macro json_macro.

@kmsquire
Member

@elcritch, it would be great if you submitted a PR to JSON.jl.

@ScottPJones
Contributor

@elcritch The problem is that Julia's string syntax is not really compatible with JSON string syntax. (I had to deal with this myself a year ago, writing a JSON parser for another language).
JSON requires exactly 4 hex digits after \u, and characters > 0xffff must be represented by surrogate pairs, i.e. \uXXXX\uYYYY. String interpolation will eat $ characters (I don't know if there is any way of turning that off for string literals that are going to be interpreted by a macro). There can't be any characters < 0x20 (control characters) either in a conforming JSON string. Numbers should be preserved (not converted to a floating point representation - this is a common mistake with JSON parsers, because people confuse what JavaScript does with the JSON standard). It is up to the consumer of the JSON string to convert numbers into whatever numeric format is desired, which could be an arbitrary precision decimal floating point number, for example. There is no "Inf" or "NaN" either.
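The \u rule can be illustrated with a small sketch. The function name `json_escape` is hypothetical and is not part of JSON.jl; it only shows the standard UTF-16 surrogate-pair arithmetic for code points above 0xffff.

```julia
# Hypothetical helper: render one character as JSON \u escapes.
function json_escape(c::Char)
    cp = UInt32(c)
    if cp <= 0xffff
        # Code points up to 0xffff fit in exactly four hex digits.
        return "\\u" * string(cp, base = 16, pad = 4)
    end
    # Larger code points are split into a high and a low surrogate.
    v = cp - 0x10000
    hi = 0xd800 + (v >> 10)       # top 10 bits
    lo = 0xdc00 + (v & 0x3ff)     # bottom 10 bits
    return "\\u" * string(hi, base = 16, pad = 4) *
           "\\u" * string(lo, base = 16, pad = 4)
end

small = json_escape('é')     # U+00E9
large = json_escape('😀')    # U+1F600, needs a surrogate pair
```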

@kmsquire
Member

That has little to do with @elcritch's macro or request. AFAIK, we don't validate many of the strings/inputs we currently convert to JSON either, and in either case, that could (and should) be handled by a separate (optional) validation step in JSON.jl.

@ScottPJones
Contributor

@kmsquire It all depends on how similar to JSON syntax @elcritch wants to get. If all he wants is just the {, }, [, ], and : to act like JSON, then that's fine. If he expected the stuff following the @json macro to really be JSON syntax, then the issues I raised would be important.

@kmsquire
Member

What I was suggesting is that both cases would (and should) be handled by adding validation to JSON.jl (assuming that's where his macro would live).

@ScottPJones
Contributor

OK, yes, I fully agree with adding this all to JSON.jl. I'll start looking at what can be done to handle JSON string constants (possibly just within a json"""...""" string macro, if that's not already done)

@ivarne
Member

ivarne commented Jul 10, 2015

@ScottPJones JSON.jl doesn't have any macro.

@elcritch

@ScottPJones , that'd be great! Having a JSON string macro would be useful for the cases you mention (e.g. direct compatibility). It sounds like you have experience with parsing the JSON spec; would it be possible for you to check the JSON.jl package and see if they're handling some of the more obscure cases correctly? I'll work on creating a PR for them.
@kmsquire , agreed. My main intent is just to embed json-esque data into my code so normal Julia string handling is what I'd prefer for that macro. A proper string_JSON macro would be more appropriate for the other cases.

@StefanKarpinski
Member

StefanKarpinski commented Jul 10, 2015 via email

@ScottPJones
Contributor

@StefanKarpinski I agree, people tried to do the same thing to the language I worked on. You don't see a problem with having a json"""...""" string macro in JSON.jl though for convenience, do you?

@StefanKarpinski
Member

StefanKarpinski commented Jul 10, 2015 via email

@ScottPJones
Contributor

At least the @json macro that @elcritch wants is doable in Julia, without (AFAIK) even touching the base language, unlike the nasty language ambiguities it would have added to ObjectScript, 👏 👏 👏 to the people who made things like that possible in Julia!

@elcritch

While not having a special dictionary syntax is annoying at first (primarily since I've been using Python for so long), I completely agree that it actually ends up being a very annoying problem. For example, try adding a first class OrderedDict to Python. :/ @ScottPJones I could work on adding a macro string in a PR using the standard parsing method if you wanted to focus on the validation parts. I'd like to be able to contribute something, even if it's small. :)

@elcritch

And fantastic point @ScottPJones regarding how Julia makes it possible. Great work @StefanKarpinski and @JeffBezanson and the rest of the team!

@ScottPJones
Contributor

https://github.com/JuliaLang/JSON.jl seems to be mostly the work of @WestleyArgentum and @aviks, very nice people I met at JuliaCon, so hopefully they will help us out! We should probably move this discussion to a PR under JSON.jl (can you open one there, @elcritch?) Macros are not (yet 😀) my forté, so I'd be happy just to deal with the validation parts.

@elcritch

Great, will do!

@axsk
Contributor

axsk commented Dec 27, 2015

What was the reason for deprecating the {k=>v} syntax? I liked it and having to write Dict(..) every time seems kind of clunky.
Being such a basic structure justifies the extra syntax in my eyes, we also wouldn't want to drop []...

@Ismael-VC
Contributor

@axsk that ship sailed long ago; your question is better suited to julia-users. Long story short: consistency. Julia is not Python and it uses { } for other things. Please let us continue this at julia-users or in the Gitter chat room, where you can link to this page and keep discussing this if you want.

@quinnj
Member

quinnj commented Jan 7, 2016

@axsk, note that you can still do

julia> [a=>b for (a,b) in zip(1:10,11:20)]
Dict{Int64,Int64} with 10 entries:
  7  => 17
  4  => 14
  9  => 19
  10 => 20
  2  => 12
  3  => 13
  5  => 15
  8  => 18
  6  => 16
  1  => 11
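The output above reflects the Julia version current at the time of this comment. As a sketch for later releases, where the bracketed form is an ordinary array comprehension, the generator form passed to `Dict` is the way to build a dictionary:

```julia
# In later Julia releases the bracketed form yields a Vector of Pair values.
pairs_vec = [a => b for (a, b) in zip(1:3, 11:13)]

# Passing a generator to Dict builds the dictionary directly.
d = Dict(a => b for (a, b) in zip(1:3, 11:13))
```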
