clarify set API #3272

Closed
JeffBezanson opened this issue Jun 2, 2013 · 23 comments
Labels
needs decision A decision on this change is needed

Comments

@JeffBezanson
Member

I suggest we remove "cute" definitions such as

|(s::Set...) = union(s...)

and add a subset or issubset function to go with contains. This makes it possible for Arrays to fully implement the set API without confusion.

ref #1963
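For illustration, a minimal sketch of what the proposal amounts to: named set functions that behave the same on a `Set` and on a plain `Array`. It assumes current Julia, where membership is spelled `in` and the subset test is `issubset`; the names available in 2013 may have differed, and the variable names are made up for the example.

```julia
# Sketch only: named set operations applied to plain arrays and to Sets.
a = [1, 2, 3]
b = [3, 4]

u = union(a, b)             # elements of a, then new elements of b
i = intersect(a, b)         # elements common to both
sub = issubset([1, 2], a)   # subset test without any operator punning
member = 3 in a             # membership test

# The same calls work when the arguments are Sets.
s = union(Set(a), Set(b))
```

Because every operation has a plain function name, nothing here depends on overloading `|` with a second meaning.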

@StefanKarpinski
Member

I concur.

@johnmyleswhite
Member

This seems like a clear win. If we were going to have anything cute, we could use unicode characters for set theoretic operations.

@StefanKarpinski
Member

Oh, that would be nice. I like that generic functions force us to be so careful about not mixing different meanings wantonly into the same operator/function. It's so tempting to pun on these things because other languages have. I feel a bit remorseful about some of our uses of the pipe – function chaining and command pipelines. But at least for chaining pipelines, it seems like the syntax is worth it.

@johnmyleswhite
Member

I personally would prefer that | were replaced with something less ambiguous, like ==>, but agree that the pipes are probably worth keeping despite being internally inconsistent.

@StefanKarpinski
Member

In the command case, we could get rid of them by pushing them inside of the backticks. That's a big step though because it turns backticks into a real language instead of just a way to express arrays of strings with interpolated expressions (which is already pretty complicated).

@johnmyleswhite
Member

That case seems particularly worth allowing since the shell tradition encourages thinking about pipes.

@JeffBezanson
Member Author

We could use |> as a pipe operator.

@pao
Member

pao commented Jun 4, 2013

> We could use |> as a pipe operator.

Elixir is using this notation, so it's not entirely crazy. (That might be why you brought it up.)
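For readers unfamiliar with the notation, a small sketch of `|>` used for function chaining. It assumes current Julia, where `x |> f` applies `f` to `x`; how the operator would apply to command pipelines is not shown here, and the variable names are made up for the example.

```julia
# Sketch only: `x |> f` is another way of writing f(x).
total = [1, 2, 3] |> sum

# Chaining two steps: square each element, then add the results.
# The anonymous function is parenthesized so the chain reads left to right.
sum_of_squares = [1, 2, 3] |> (xs -> map(x -> x^2, xs)) |> sum
```

Read left to right, each stage receives the value produced by the stage before it, which is the shell-style intuition mentioned earlier in the thread.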

@StefanKarpinski
Member

Yep, I sent Joe Armstrong's post about Elixir to Jeff yesterday, so I suspect that might be the impetus. I would be ok with that, although I wouldn't mind making => a more general operator or using ==>.

@toivoh
Contributor

toivoh commented Jun 4, 2013

I still would like to see \/ and /\ operators for set union and intersection.

@StefanKarpinski
Member

I'd rather just add support for ⋃ and ⋂ as Unicode operators with appropriate precedence and everything.

@toivoh
Contributor

toivoh commented Jun 4, 2013

That could be ok too. I'll have to figure out how to type them.

@ViralBShah
Member

I will register my vote against unicode operators once more here, as I have done before.

@StefanKarpinski
Member

You wouldn't be required to use them as there would always be longer names for the same operations. I just want them as an option. I'd like to be able to write C = A ⋃ B instead of C = union(A,B).
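A minimal sketch of the "operator as an optional alias" idea, assuming current Julia, where `∪` (U+222A), `∩` (U+2229) and `⊆` (U+2286) are infix spellings of `union`, `intersect` and `issubset`. Note that these are not the same code points as the larger `⋃` glyph typed in the comment above; the set contents are made up for the example.

```julia
# Sketch only: Unicode infix operators alongside their long names.
A = Set([1, 2, 3])
B = Set([3, 4])

C = A ∪ B                  # same result as union(A, B)
D = A ∩ B                  # same result as intersect(A, B)
contained = [1, 2] ⊆ A     # same result as issubset([1, 2], A)

# The long names remain available, so the operators are never required.
same_union = C == union(A, B)
```

In the REPL and most Julia editor modes these symbols can be entered by tab-completing `\cup`, `\cap` and `\subseteq`.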

@staticfloat
Member

Let's just throw a LaTeX parser into Julia, so we can type C = A $\union$ B, and have simultaneously the fastest and most beautiful code on the planet.

On a more serious note, the only caution I have about going with unicode is that many environments have subtly broken support for unicode. The only reason I'm cognizant of these issues is because I love mosh, and mosh fixes a lot of this for me. I work in a terminal a lot, (as, I'm sure, many of you do) and if printing gets broken in a terminal, it's an annoyance to have to use a tool like mosh just to read source code.

A concrete example: without mosh, in an Ubuntu 13.04 terminal, typing "ab⋃" and deleting the "ab" will leave lines behind after the union symbol, because the ⋃ extends farther horizontally than the rest of the fixed-width font. Not a huge problem, but an annoyance to be sure.

@JeffBezanson
Member Author

We would never require unicode for core functionality. But it should be available as an option.

@staticfloat
Member

> We would never require unicode for core functionality. But it should be available as an option.

I understand that C = A ⋃ B and C = union(A,B) would mean the same thing. The only reason I think this feature could cause disruption is that if Stefan writes a bunch of code that uses those unicode operators and I try to edit it in a broken environment, it can be annoying. Perhaps some basic tool to translate a unicodey source to its explicit equivalent would suffice to allow this kind of editing, but that also seems like a bad solution, as translating code from one standard to another is a recipe for sad version control systems.

@staticfloat
Member

And perhaps this is overthinking it and these kinds of problems won't surface too frequently. I just think that, unfortunately, the state of unicode in terminal applications is really lacking, and allowing source code to contain it is a great plan to test all sorts of areas in the console pipeline that don't normally see unicode.

@JeffBezanson
Member Author

I have to say I don't encounter unicode-related problems in my toolchain very often. And some people are already writing Julia in non-English languages, so unicode source will be out there.

@johnmyleswhite
Member

Honestly, I'd kind of like to see that kind of failure happen since it will provide a definitive conclusion to our ongoing debates about the wisdom of allowing Unicode operators.

@staticfloat
Member

> Honestly, I'd kind of like to see that kind of failure happen since it will provide a definitive conclusion to our ongoing debates about the wisdom of allowing Unicode operators.

Good call, always a good idea to come back to the concrete problems in a discussion like this. I can play around with the set unicode symbols discussed here. Give me some others that you're interested in seeing, and if I don't run into any major problems using them (I don't think the printing error I showed above should be considered major) I'll rest my case.

@johnmyleswhite
Member

We've already added Unicode variable bindings for π and for the Euler-Mascheroni γ. Those are obvious ones to check.

@staticfloat
Member

I retract my complaints. I haven't been able to get anything to break other than some simple fixed-width issues. My link to the mosh page is mostly only useful when using control characters such as circumflexes, etc., not immediately relevant to this discussion.
