[WIP] new brackets: angle, Brack, Brace #8892

Closed
tonyhffong wants to merge 1 commit into JuliaLang:master from tonyhffong:master


tonyhffong

Relevant issue: #7128

I propose adding a number of unicode and composite brackets. In the table below, I include the new LaTeX shortcuts for these new brackets and how they will be parsed. As it stands now, these brackets can be used inside macros, but they have no meaning in proper Julia (yet). I have also given myself the creative license to equate [| |] to ⟦ ⟧ and {| |} to ⦃ ⦄. I think they make input more pleasant, and visually their similarity is compelling.

The motivation is that we seem to struggle with using only three bracket types (), [], {} for a large number of language features.

| LaTeX | ASCII | Unicode | Parse Example1 | Parse Example2 |
|---|---|---|---|---|
| \langle, \rangle, \lAngle, \rAngle | - | ⟨⟩ ⟪⟫ is canonical, ❮❯, ❰❱, acceptable | ❮a,b❯ -> Expr(:angle,:a,:b) @enclose_Angle(:a,:b) | foo❮a❯ -> Expr(:anglecall,:foo,:a) @call_Angle(:foo, :a) |
| \lBrack, \rBrack | `[\| \|]` | ⟦⟧ | | |
| \lBrace, \rBrace | `{\| \|}` | ⦃⦄ | | |
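
The parse examples in the table can be mimicked without the patched parser, because `Expr` accepts any symbol as its head. The sketch below builds the proposed nodes by hand and compares them with what the existing `[]` and `{}` forms parse to. It assumes a current Julia (1.x) with `Meta.parse`; the `:angle` and `:anglecall` heads are the proposal's names, not something the stock parser emits.

```julia
# Hypothetical AST shapes from the table; built by hand, no new syntax needed.
enclosed = Expr(:angle, :a, :b)        # stand-in for ❮a,b❯
called   = Expr(:anglecall, :foo, :a)  # stand-in for foo❮a❯

# For comparison, the existing bracket forms and their heads.
ref_ex   = Meta.parse("foo[x]")   # indexing
curly_ex = Meta.parse("foo{x}")   # type parameters

println(enclosed.head, " with args ", enclosed.args)
println(called.head, " with args ", called.args)
println(ref_ex.head, " and ", curly_ex.head)
```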

Notes and discussion points:

  • How should Julia incorporate these new brackets in the language proper? I'd defer to the community to decide the use of these "scarce resources".
  • I couldn't find a good way to 'ascii-fy' the angular brackets without stepping into dangerous edge cases.
  • The dingbat angle brackets ❮❯ and the canonical LaTeX angle brackets ⟨⟩ are interchangeable and generate the same AST, although an opening bracket must be closed by its exact counterpart. The parser would not match ❮ with ⟩. By the same token, the parser would not match [| with ⟧, etc.
  • I have considered |: :| as a bracket possibility as well. However, the colon is used in multiple contexts (range, symbol, quote, indexing wildcard), so there is also a risk of surprise.

@JeffBezanson
Sponsor Member

Nice! I wouldn't get greedy trying to add |:.

@tonyhffong
Author

I guess not, considering this PR would already more than double the brackets available...

I notice that Travis CI is failing on osx, though on my local mac it passed the test.

@staticfloat
Sponsor Member

I'm working on the OSX issue. It's because there has been some path breakage due to a new GCC version being released.

@jakebolewski
Member

Cool, I'm not a fan of the concatenated *call names though. It doesn't really fit in with how foo[x] and foo{x} are parsed at the moment.

@JeffBezanson
Sponsor Member

Fair enough, though the way foo[x] is parsed now wasn't designed to be generalized.

@tonyhffong
Author

Thank you @staticfloat for the heads up. I'm a bit nervous since this is my first time touching such a core component of Julia.

A little more color on the current "implementation": I'm kind of ok with using the existing parse-arglist logic around the :*call, so an expression like foo[| a, b; c=d |] gets translated into Expr( :Brackcall, :foo, Expr( :parameters,...), a, b ). However, I'm also using the same parse-arglist for head-less parsing, which may not be how we would like them to behave.

@jakebolewski I'm okay with renaming them to whatever, presumably after we have some good idea how we are going to use them.
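
The `foo[| a, b; c=d |]` translation described above reuses the argument-list layout of an ordinary call. A minimal sketch, assuming a current Julia (1.x): parse a normal call with a keyword section, then rebuild the same arguments under the hypothetical `:Brackcall` head. Only the head name comes from this thread; the stock parser never produces it.

```julia
# What today's parser produces for an ordinary call with a keyword section.
call_ex = Meta.parse("foo(a, b; c=d)")

# Hypothetical node for foo[| a, b; c=d |]: same arguments, different head.
brack_ex = Expr(:Brackcall, call_ex.args...)

# The :parameters block sits right after the callee, before the positional args.
params = brack_ex.args[2]
println(brack_ex.head)
println(params.head, " holding ", params.args)
println("positional: ", brack_ex.args[3:end])
```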

@tonyhffong
Author

On the sub-topic of scarce resources: presumably there'd be a little land-grab on using those brackets, so I may as well start this first!

So, I'd like to reserve foo{| args... |} to translate into bracecall( :foo, args... ). The first argument may be a symbol but it could be something else e.g. foo.bar{| args... |} or foo{T}{| args... |}. With multiple-dispatch I think it can be quite useful for many applications.

@JeffBezanson
Sponsor Member

Most likely all of these should be lowered to calls of certain standard names. The first argument (foo in your example) doesn't need to be a special case.

@tonyhffong
Author

Wait until algebraists and differential geometrists get their hands on them... I'm not taking chances.

@elextr

elextr commented Nov 3, 2014

Good idea, but...

\langle and \rangle are not very visually distinguishable from parens. I had to read your table several times before realising that ⟨a,b⟩ was angle brackets, not parens. And that's in a table where I was expecting it; imagine it in the middle of a complex expression. Dibs on using it in the first annual obfuscated Julia competition :)

@tonyhffong
Author

@elextr, of course, none of these are set in stone. Here are some alternatives:

  • Use different unicodes that render better/more distinctly in today's machine. From http://shapecatcher.com/ and this nice pdf table I see '\U2039' ‹ › (kind of small), '\U276e' ❮❯ (now they look thicker than Angle), '\U29fc' ⧼⧽ (curvy angle, looks like straight angle in small font, more curvy in large font)
  • use compound brackets for angles, such as <! !> (<||> are spoken for), <* a,b *> or <<* a *>>, and forget about unicodes.
  • Pick a better fixed width font for our editors / terminal / browser.

What do you think?

@elextr

elextr commented Nov 4, 2014

> Use different unicodes that render better/more distinctly in today's machine. From http://shapecatcher.com/ and this nice pdf table I see '\U2039' ‹ › (kind of small), '\U276e' ❮❯ (now they look thicker than Angle), '\U29fc' ⧼⧽ (curvy angle, looks like straight angle in small font, more curvy in large font)

Agree 2039 is kinda small, 276e is distinguishable from parens, but 29fc didn't render (on current release Linux Mint default).

> use compound brackets for angles, such as <! !> (<||> are spoken for), <* a,b *> or <<* a *>>, and forget about unicodes.

Note that C++ is just about to drop trigraphs to support all of ASCII IIUC :)

As a next generation language Julia probably should just go Unicode without ASCII alternatives, only beware of characters that are hard to distinguish, that's all. Even in your pdf the \lparen and \langle need more than a quick glance to distinguish.

> Pick a better fixed width font for our editors / terminal / browser.

As has been suggested elsewhere, Julia font anyone? It would be guaranteed to contain good renderings for all codepoints used in Julia and packages (well packages that contribute glyphs anyway).

@tonyhffong
Author

What about dingbats '\U276e' ❮a,b❯ for angle brackets and '\U2770' ❰a,b❱ for Angle brackets. They look much more distinguishable from (<>) on a mac.

@tonyhffong
Author

Updated above. Looks better to me.

@ghost

ghost commented Nov 4, 2014

> As has been suggested elsewhere, Julia font anyone?

What are we? APL? At least we don't need to ship a keyboard.

@nalimilan
Member

The rendering may be very different depending on the font, so there's little point in discussing this in general. I'd say a font in which you cannot easily distinguish parentheses from angle brackets has a bug and should be improved. The good news is that it can be done for open source fonts.

Using dingbats instead of the mathematically correct Unicode character isn't great: it means the result will be different from what you'd get e.g. in LaTeX, and that will lead to confusion.

@tonyhffong
Author

I'm taking a more pragmatic view on this. There isn't a "mathematically correct unicode character". If the poor first-time user, with a suboptimal font set, encounters an unrenderable or confusable bracket, that is not a very good first impression.

@tonyhffong
Author

Actually, this starts to feel like something for which we should enlist broader input from the julia-users mailing list. wdyt?

@tonyhffong
Author

Playing with mock code with the brackets a bit, here is a draft of their potential use

| Code snippet | Proposals |
|---|---|
| ❮a,b,..❯ | wide open. convert to getobject(args...)? Complex inner product in QM `❮a |
| foo❮args...❯ | convert to anglecall( :foo, args... ) |
| ❰a,b,..❱ | remove as a separate bracket type (see below) |
| foo❰args...❱ | remove as a separate bracket type (see below) |
| ⟦a,b;c,d⟧ | matrix construction? (with its own rule on space, commas, newlines and semi colons inside ) |
| foo⟦a⟧ | lifting a function to work on a matrix / iterable? Typed matrix construction? brackcall( :foo, args... ) |
| ⦃a,b⦄ | wide open (Alternative to Dict construction)? |
| foo⦃args...⦄ | convert to bracecall( :foo, args... ) |

I'm already annoyed with the Angle being confused with the small angle brackets. I think we should just collapse them.

@jiahao
Member

jiahao commented Nov 4, 2014

> There isn't a "mathematically correct unicode character".

Actually, there are. In principle, Unicode distinguishes between characters (with semantic meanings) and glyphs (their visual representation), and so differentiates things like ⋅ U+22C5 (dot operator) and · U+00B7 (middle dot). The standard glyph chart (pdf) explicitly states "(22C5) • preferred to 00B7 · for denotation of multiplication". (The chart also lists many more examples.)

@JeffBezanson
Sponsor Member

@jiahao is quite right. Unicode includes characters for their meaning, not their appearance. There are many pairs of characters that are likely to look the same in most fonts.

"\\rangle" => "⟩", #U27e9
"\\lAngle" => "⟪", #U27ea
"\\rAngle" => "⟫", #U27eb
"\\ldangle" => "❮", #U276e
Sponsor Member

Is \ldangle actual LaTeX?

Author

no. it's just a shorthand for "dingbat-angle"

Sponsor Member

Ok I think we should not have those.

@tonyhffong
Author

> Actually, there are. In principle, Unicode distinguishes between characters (with semantic meanings) and glyphs (their visual representation), ...

I probably wasn't very precise in presenting my view. The closest analogy is that π is the mathematically correct unicode for pi but we don't penalize anyone using pi because they have trouble typing or displaying π in their browser / terminal. And we build that liberal view right into Base. In that vein, I'm taking a similar approach to accepting multiple unicode for angular brackets.

@elextr's concern about the appearance of ⟨⟩ being very close to () is very real. If you cannot tell them apart visually, who cares what the Unicode standard says they mean?

Besides, what should an angular bracket correctly mean anyway, mathematically? I don't know. We are giving them meaning here.

I think we should focus on enriching the expressive power of Julia while keeping the code readable (by humans). Pretty-printing can be monkey-patched when non-standard angular bracket characters are encountered.

@jiahao
Member

jiahao commented Nov 4, 2014

What you're describing sounds like defining your own Unicode canonical equivalence for angle brackets (see UAX 15). You can have multiple Unicode code points all be treated the same as angle brackets because of visual ambiguity. Nonetheless, Unicode does have very specific code points for angle brackets (U+3008 and U+3009), and we should respect that, even if fonts don't. "Get a better font" is a perfectly valid solution.

@JeffBezanson
Sponsor Member

Granted, it's not too helpful that Unicode has U+2329, U+3008, and U+27E8 just for left angle bracket, never mind the extra dingbats and whatnot.

For years, programmers have been picking fonts based on the distinguishability of 0 and O, and i, I, l, and 1.
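
The point about several code points for "left angle bracket" can be checked with the `Unicode` standard library. The sketch below, assuming a current Julia (1.x), normalizes the three code points named above with NFC. To my knowledge the Unicode data gives U+2329 a canonical decomposition to U+3008, while U+27E8 has none, so only the first pair collapses. The helper names `codepoint_label` and `same_after_nfc` are made up for this illustration.

```julia
using Unicode  # standard library, provides Unicode.normalize

# Format a character as U+XXXX (hypothetical helper).
codepoint_label(c::Char) = "U+" * uppercase(string(UInt32(c), base = 16, pad = 4))

# The three "left angle bracket" code points mentioned in the thread.
candidates = ['\u2329', '\u3008', '\u27e8']

for c in candidates
    normalized = first(Unicode.normalize(string(c), :NFC))
    println(codepoint_label(c), " normalizes (NFC) to ", codepoint_label(normalized))
end

# Hypothetical parser-side check: do two brackets agree after normalization?
same_after_nfc(a::Char, b::Char) =
    Unicode.normalize(string(a), :NFC) == Unicode.normalize(string(b), :NFC)
```

A parser that wanted to treat visually similar brackets as one would need its own equivalence table on top of this, since standard normalization leaves the mathematical and dingbat brackets distinct.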

@tonyhffong
Author

⁽ᵒᵏ⁾

@jiahao
Member

jiahao commented Nov 4, 2014

ref: JuliaStrings/utf8proc#11 - possible use case for canonicalizing brackets

@tonyhffong
Author

@jiahao thanks for the pointer. It's nicer to deal with unicode normalization issues for julia code in one place.

@tonyhffong changed the title from "[WIP] new brackets: angle, Angle, Brack, Brace" to "[WIP] new brackets: angle, Brack, Brace" on Nov 6, 2014
@tonyhffong
Author

@JeffBezanson, @jakebolewski I wonder if it's better just to transform all the new brackets into macrocall and let base (and module developers) determine the subsequent transformations, like @r_str. It's cleaner on the parser side.
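
For reference, this is how the `@r_str` precedent looks in the AST, plus a hand-built node of the kind the parser might emit for an enclosing bracket. It is a sketch assuming a current Julia (1.x); the macro `@enclose_Brace` is defined here only for illustration, and its tuple-building body is a placeholder meaning, not the PR's behavior.

```julia
# Existing precedent: r"..." is parsed as a call to the macro @r_str.
ex = Meta.parse("r\"ab+\"")
println(ex.head, " of ", ex.args[1])

# Placeholder macro: a package would supply its own meaning here.
macro enclose_Brace(args...)
    esc(Expr(:tuple, args...))
end

# Hand-built node standing in for what the parser could emit for {| 1, 2, 3 |}.
node = Expr(:macrocall, Symbol("@enclose_Brace"), LineNumberNode(1, :none), 1, 2, 3)
println(eval(node))
```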

@tonyhffong
Author

Ok, we are further along where the code can actually be used.

import Base: Brace_call, Brace_enclose
Base.Brace_call( s::Any, x::Int ) = (s,x)
Base.Brace_enclose( x::Int ) = (x,)

@assert sin{| 1 |} == (:sin,1)
@assert f{T}{| 1 |} == ( :(f{T}), 1 )
sin{| 1.0 |} # ERROR

@assert {| 1 |} == (1,)
{| 1.0 |} # ERROR

So a module can specialize these brackets for its own use by extending the call or enclose functions of the desired bracket types (angle,Brack,Brace) with its own signatures.
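
The same idea can be tried in stock Julia by calling the target functions directly. In this sketch `Brace_call` and `Brace_enclose` are defined locally (they do not exist in Base), and the `Meters` type is a made-up example of a package adding its own method.

```julia
# Local stand-ins for the functions the proposed syntax would lower to.
Brace_call(prefix, x::Int) = (prefix, x)
Brace_enclose(x::Int) = (x,)

# A package could add methods for its own types (hypothetical example type).
struct Meters
    value::Float64
end
Brace_enclose(m::Meters) = (m.value, "m")

# What `sin{| 1 |}` and `{| 1 |}` would reduce to under the proposal.
println(Brace_call(:sin, 1))
println(Brace_enclose(1))
println(Brace_enclose(Meters(2.5)))

# No method accepts a Float64, which corresponds to the ERROR lines above.
println(applicable(Brace_enclose, 1.0))
println(applicable(Brace_call, :sin, 1.0))
```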

@JeffBezanson
Sponsor Member

What's with the extra quoting in the parser? If these are going to be macros (which I'm not sure they should be) then that's totally redundant, since macros always see the argument expressions anyway. If they're not going to be macros, then the extra quoting is just crazy.

@tonyhffong
Author

@JeffBezanson That's fair. I'm learning the ropes as I go. How does it look?

@tonyhffong
Author

Another related issue: #8599

@tonyhffong
Author

Here is a status update on this PR

  • all unnecessary quotes have been removed in the parser.
  • test/parser.jl added and passed
  • non-LaTeX backslash shortcuts have been removed
  • @doc documentation
  • Expressions enclosed by these new brackets currently must be comma-separated. This restriction can be relaxed later on when we need to. (Unless we decide to kill the commas completely, future changes should be harmless.)

@IainNZ
Member

IainNZ commented Nov 15, 2014

I've been waiting to see where this goes, and it seems like it's converging to something.
My 2c: ⟨ is a bad, bad idea, and I'd ban it from any codebase I could ban it from. It's too visually ambiguous.
I don't see myself ever using the others, but at least they are relatively visually unambiguous and have an ASCII equivalent. Maybe a good example would change that view?

Also, does anyone else read [|a|] as [abs(a)]?

@elextr

elextr commented Nov 15, 2014

> My 2c: ⟨ is a bad, bad idea, and I'd ban it from any codebase I could ban it from. It's too visually ambiguous.

Yes, and the "use a better font" argument is a furphy that might apply to those developing Julia or using it intensively, but it doesn't apply to casual code readers on github or elsewhere. Julia should not require a special font to read the code.

> Also, does anyone else read [|a|] as [abs(a)]?

Now that you mention it, yes.

@tonyhffong
Author

I think in the likely use case it'd be [| a,b... |] so it doesn't exactly read like [abs(a,b...)].

@IainNZ, @elextr We could remove/ban \langle and \rangle ⟨⟩. We have \lAngle and \rAngle ⟪⟫ which are visually more distinct. Or, come to think of it, if angle brackets elicit such controversy, we should just take them out altogether. Two new bracket sets should be enough for everybody 😉.

@IainNZ
Member

IainNZ commented Nov 16, 2014

The thing is, and I think this is getting at a more fundamental objection I have: do we have anything like it in this language or in the 95% of popular languages?

I'm trying to think of things that are double-punctuation:

  • ::, but that has never felt too odd because it's the same character twice
  • A'*B, which is a barely-tolerable level of magic IMO (because it provides so much benefit).
  • <:, again hard to read any other way
  • ...?

I just feel like [| |] is always going to look somehow out-of-place and hard to mentally parse, and will be how most people use this feature. | just has two meanings for me, one in computing and one in math, and this use of it is unrelated to either.

@tonyhffong
Author

Yes, let's have some data points. From Haskell, there are

  • [: :] in data parallel Haskell
  • [| |] in template Haskell
  • (# #) for unboxed tuple

Ok, Haskell may not be even in the 95% of popular languages, but it's certainly not a "bad idea language". Just saying.

@elextr

elextr commented Nov 16, 2014

@tonyhffong the ⟪⟫ is fine, it's visually distinct from everything else. Only the single ⟨ depends on the font to be distinguishable from ( and therefore should not be used by a sensible language like Julia. BTW to emphasise, other than ⟨, more brackets is a good thing :)

Haskell is trying to take Perl's mantle of "every sequence of punctuation is valid code" :)

But [| |] is not really that bad. Sure, it reads as abs() sometimes, but then you remember that this is code, not math. And as you say, if it's got a comma in it, the "autoreader" is interrupted anyway.

@tonyhffong
Author

@IainNZ, @elextr, I took out the skinny ⟨⟩. Only the meaty ❮❯ (\lAngle, \rAngle) are canonical angular brackets.

@IainNZ
Member

IainNZ commented Nov 16, 2014

@tonyhffong I still don't like it, or the whole idea in general, and I don't think anything other than an amazing example that you can't do any other way is going to sway me. Luckily, it's not me you need to sway :D

@tonyhffong
Author

@IainNZ lol. okay. Well, let's see. I guess the train of thought I followed is like this:

  • So we have (), [], {} in Julia as in many programming languages.
  • Are these three brackets the Goldilocks number? Are 2 too few? Are 4 just too many? Why? (1 is enough if you are happy with lisp 😉)
  • We already see some languages really struggle with introducing more brackets. C++/Java's C<T> is a good example. Julia rescued {} from their unfortunate fate in C, so we can use them for parametric types. That helps.
  • However, tension is apparent in []. See syntax: separate array concatenation from array construction #7128 and pointers therein. It seems hard to keep both "list/range camp" and "vector/matrix camp" happy. With new brackets, we could solve this.
  • I don't have an "amazing example" of why we couldn't do this any other way. We can look at the list vs matrix discussion again. It's just as well someone decides to make one camp use a more cumbersome form e.g. buildmatrix( buildvector(), ... ) or flatten( list1, list2, ... ). Whatever the final solution looks like, I'm sure it must involve a careful consideration of taste, terseness and readability. But let's not restrict ourselves prematurely by being unwilling to take on new brackets.
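
The tension in `[]` mentioned in the list above comes from one bracket serving both construction and concatenation. The sketch below shows the three readings as they behave in a current Julia (1.x), which is an assumption about the reader's version and not necessarily how the 2014 code base behaved.

```julia
# One bracket, three jobs (behavior of current Julia 1.x).
nested = [[1, 2], [3, 4]]   # commas: construct a vector of two vectors
flat   = [[1, 2]; [3, 4]]   # semicolon: concatenate into one vector
matrix = [1 2; 3 4]         # spaces and semicolon: build a 2x2 matrix

println(length(nested), " elements, each a ", typeof(nested[1]))
println(flat)
println(size(matrix))
```

A dedicated bracket for one of these roles is the kind of use the proposal has in mind; which role would move is left open in the thread.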

Their LaTeX symbols are \lAngle, \lBrack, \lBrace
(the closing ones should be obvious).
Brack and Brace have the ASCII equivalents [| |] and {| |}.

They map to:
  * With prefix expression: `@call_Angle, @call_Brack, @call_Brace`
  * Without prefix: `@enclose_Angle, @enclose_Brack, @enclose_Brace`

The further lowering of @call_Brace differs from the others
so that the prefix can be examined symbolically, instead of
through its dereferenced value.

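
One way to read "look at the prefix symbolically" is a macro that quotes its first argument and evaluates the rest. This is a sketch under that assumption, not the PR's actual lowering; `Brace_call` is defined locally and the macro body is illustrative only.

```julia
# Local stand-in for the function the macro forwards to; not part of Base.
Brace_call(prefix, args...) = (prefix, args...)

# The prefix is passed through as an expression (quoted), while the remaining
# arguments are evaluated in the caller's scope.
macro call_Brace(prefix, args...)
    quoted = QuoteNode(prefix)
    :(Brace_call($quoted, $(map(esc, args)...)))
end

x = 41
result = @call_Brace(sin, x + 1)   # prefix stays the symbol :sin
println(result)

# A compound prefix stays an expression too, as in the f{T}{| 1 |} example.
println(@call_Brace(f{T}, 1))
```

Because the prefix is never evaluated, `f` and `T` do not need to be defined for the second call to work.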
@tonyhffong
Author

Another update on this PR:

  • all commits squashed
  • As far as I know, the code issues have been addressed.
  • I avoided the more visually ambiguous skinny angle brackets in favor of double angle brackets, to resolve the debate on it above.
  • Some extra docs

The remaining issues are:

  • Is it a good idea? So far, some like it, and some don't. More people should weigh in on this.
  • If so, do we wait until Base has a more solid use for them before rolling them out? Or should we set them loose to see how packages use them?

@ihnorton
Member

FWIW, I'm not a fan of this. I think it adds too much visual and functional ambiguity without - as yet - a clearly-articulated motivation or usage guidance. "wait, which bracket is implemented for this type; ok now what does it do?!?" -- multiplied by 4,000+ packages in a few years.
(also, I thought minimalism was at least an aspiration for Julia, if not a dogmatic goal)

@tonyhffong
Author

okay, let's shelve this PR. Carry on.

@coveralls

coveralls commented May 17, 2016

Coverage Status

Changes Unknown when pulling 0c72c01 on tonyhffong:master into JuliaLang:master.

10 participants