Allow Unicode symbols to be used as operators #1079

Open
5 tasks done
voronoipotato opened this issue Sep 21, 2021 · 13 comments


@voronoipotato

voronoipotato commented Sep 21, 2021

I propose we revisit the proposal of allowing unicode symbols to be used as operators with a simplified precedence model.

The existing way of approaching this problem in F# is to create an inline function with backticks

let ``∫`` xs = xs |> Seq.sum
//and..
let ``∪`` a b =  Set.union a b
let x = a |> ``∪`` <| b

the proposed way of writing this is

let (∫) xs = xs |> Seq.sum
let l = ∫[1..10]
//and..
let (∪) a b = Set.union a b
let x = a ∪ b

(yes my examples are silly :P , you know the real ones)

The proposal is novel in that unicode operators would have no precedence relative to one another and would simply be evaluated left to right. If that's undesirable, perhaps an attribute could set the precedence, with level 1 being the lowest and the default being whatever ~ has when unspecified. We could then document each of the levels in https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/symbol-and-operator-reference/ so that the creators of operators could set up precedence in intuitive ways.

[<OperatorPrecedence(1)>]
let (∫) xs = xs |> Seq.sum

[<OperatorPrecedence(3)>]
let (∪) a b = Set.union a b

Pros and Cons

The advantages of making this adjustment to F#

  • It would help allow mathematically inclined users to more naturally express things in code.
  • Mathematical symbols are easier to type on Windows now that Win + ; exists.
  • People who have difficulty reading mathematical operators would also have difficulty reading the math
  • People who want mathematical notation are already writing code like this using ``∫``.
  • Removing backticks makes it easier to read

The disadvantages of making this adjustment to F# are ...

  • Operator precedence would basically be left to right for unicode operators
  • More operators.
  • Math
  • A more APL style programming experience (left to right) or a new attribute
  • let x = 🧨7

Extra information

Full disclosure: this has been more or less proposed before, please review. I'm hoping to revisit it now that data science and scientific computing are a target audience, and because I'm proposing a simplified approach to operator precedence.

#224

As discussed in the comments in the original submission, this is a minefield.... Deciding the precedence for such operators is really hard. It would also create swathes of unreadable F# code. I'll close this since we've previously decided not to do this in F# 2.0, and there has not yet been a major change of circumstance to warrant altering this which addresses the concerns. - Don Syme

Estimated cost (XS, S, M, L, XL, XXL): S

Related suggestions: (put links to related suggestions here)
#224

Affidavit (please submit!)

Please tick this by placing a cross in the box:

  • This is not a question (e.g. like one you might ask on stackoverflow) and I have searched stackoverflow for discussions of this issue
  • I have searched both open and closed suggestions on this site and believe this is not a duplicate
  • This is not something which has obviously "already been decided" in previous versions of F#. If you're questioning a fundamental design decision that has obviously already been taken (e.g. "Make F# untyped") then please don't submit it.

Please tick all that apply:

  • This is not a breaking change to the F# language design
  • I or my company would be willing to help implement and/or test this

For Readers

If you would like to see this issue implemented, please click the 👍 emoji on this issue. These counts are used to generally order the suggestions by engagement.

@charlesroddie

Notes

  • Ordinary identifiers in F#/C#/CLI use https://www.unicode.org/reports/tr15/tr15-18.html#Programming%20Language%20Identifiers . That allows a few mathematical functions let Σ = ..., where Σ is capital sigma, but only because they happen to be letters.
  • Operators are syntactic sugar designed for conciseness and similarity to existing notation, particularly mathematical notation. Allowing unicode mathematical symbols would extend this and result in F# code that is extremely readable in mathematical domains.
  • I believe F# users will generally not know precedence rules for user-defined operators, and will just disambiguate any possible "ambiguities" by bracketing. For this reason the precedence rule that is adopted is not very important.
  • Implementation questions:
    • Some syntax like op_UnicodeAAAA looks like the natural extension of existing rules.
    • Would we allow multiple unicode characters in an operator or just a single unicode character?

@uxsoft

uxsoft commented Sep 24, 2021

Hmm, even with [ Win + ; ] these sound very annoying to type to the point that I reckon I'd never use this.

Also, my experience with math is that every branch/theory has its own crazy notation which makes it very hard for newcomers to read. So this isn't a feature I'd like to have in my code.

Not opposed to the feature existing just probably not the type of person who would use this.

@bisen2

bisen2 commented Sep 24, 2021

Hmm, even with [ Win + ; ] these sound very annoying to type to the point that I reckon I'd never use this.

This really just comes down to tooling. Many editors (or editor plugins) provide user friendly ways of inserting unicode characters.

@chillitom

Hmm, even with [ Win + ; ] these sound very annoying to type to the point that I reckon I'd never use this.

If you haven't seen it before, check out WinCompose. I just discovered it and can't work out why such a thing isn't built in. macOS has been doing something similar for years.

@johncj-improving

What are the limits on this proposal? Would I be able to use   as an operator? That's an en space (U+2002). There are times when I would like to alias |> that way...

@voronoipotato
Author

I think to start we would pick a unicode plane like the Basic Multilingual Plane and exclude non-visible glyphs. There's nothing stopping you from using a prettyprint extension in your editor to make |> render as whitespace, though. You could also modify a font with ligatures so that |> is represented as whitespace. I don't think it would be particularly difficult to do, and it would work regardless of the editor you use.
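
A minimal sketch of what such a restriction might look like. It assumes, and this is not stated in the proposal, that "visible symbol" is approximated by the Unicode general categories Sm (math symbol) and So (other symbol), and that staying inside the Basic Multilingual Plane means a single non-surrogate UTF-16 code unit. The name isCandidateOperatorChar is hypothetical.

```fsharp
open System
open System.Globalization

// Hypothetical filter: not part of F# or of the proposal itself.
// Assumption: a candidate operator character is a single BMP code unit
// whose Unicode general category is MathSymbol (Sm) or OtherSymbol (So).
let isCandidateOperatorChar (c: char) =
    not (Char.IsSurrogate c)
    && (match Char.GetUnicodeCategory c with
        | UnicodeCategory.MathSymbol
        | UnicodeCategory.OtherSymbol -> true
        | _ -> false)

let unionOk = isCandidateOperatorChar '∪'        // U+222A, category Sm
let enSpaceOk = isCandidateOperatorChar '\u2002' // en space, category Zs
let letterOk = isCandidateOperatorChar 'Σ'       // a letter (Lu), already usable as an identifier
```

Under this assumption the en space is rejected because it is a space separator. ASCII characters such as + are also in category Sm, so a real rule would still have to defer to the existing operator rules for them.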

@johncj-improving
Copy link

I've already done the font ligature trick. I agree that if this is done, it needs some common sense restrictions.

@sv158

sv158 commented Apr 14, 2023

Recently I saw another related issue closed (#1104, which asked for F# to allow APL symbols) and labeled as probably not, which feels a bit regrettable. I have used APL, J and K (all array languages). If I currently want to use overloading to implement APL-like semantics without Unicode operators, the final readability may be worse than J's, because J's operators (verbs/adverbs) are limited to the ASCII character set.

I can understand the decision of the development team, but out of curiosity I looked at the situation in other programming languages. The first one is Julia: I found a relevant discussion in a 2021 post on the official Nim forum, which mentioned that Julia allows some Unicode characters as operators. Then I went to look at the state of Julia. The proposal in the Julia community was first posted in 2012 (maybe people engaged in scientific research really feel that the available symbols are not enough), and the feature was implemented in 2014.

Looking back at Nim, it implemented this feature in 2021. It is only an experimental feature in the current stable version (v1.6), but it is already available by default in the development branch. Nim's strategy is more conservative than Julia's: the available Unicode characters are limited, with only two priority levels. I also looked at other programming languages along the way. Some support it (such as Raku), some don't (Rust, Zig), and not many languages support this feature overall. In addition, I read some related blog posts and felt that this requirement is indeed very practical for some people (including myself).

If F# supported this feature (introducing some Unicode symbols as operators, like Nim), the readability issue might be handled by a linter and formatter, or a warning could be added in a 'strict mode' (hypothetical).

P.S. Some additional off-topic information about operator overloading: the Python community has also discussed this area, such as using an overloadable symbol to improve the readability of matrix multiplication. The Elixir community followed the same idea and implemented a ** operator similar to Python's, so that n**3 can be written instead of n*n*n. It's a little interesting that more and more of these discussions and implementations appear around 2021. This may mean that programming languages are beginning to become more of a general tool (not for programmers only)? After all, the more users from different fields, the richer the ecosystem (like {Elixir-**-Nx} ~ {Python-@/__matmul__-Numpy}).

@dark-valkyrix

Yes please, it would be great to be able to define operators like ⊨, ⊕, ∧ (replacement for and?), and so on... Cartesian product (tuples) should also be replaced (or support-added) with ⨯ to match the mathematical notation, I don't like * which means something else. And support for combining diacriticals such as 20D7 (right arrow) to define vectors could be awesome.

@dsyme
Collaborator

dsyme commented May 15, 2023

Re my earlier comment:

As discussed in the comments in the original submission, this is a minefield.... Deciding the precedence for such operators is really hard. It would also create swathes of unreadable F# code. I'll close this since we've previously decided not to do this in F# 2.0, and there has not yet been a major change of circumstance to warrant altering this which addresses the concerns. -

Is there any concrete proposal for precedence for such operators? That's the key missing ingredient, and it's really impossible to proceed without a proposal.

@charlesroddie

Is there any concrete proposal for precedence for such operators? That's the key missing ingredient, and it's really impossible to proceed without a proposal.

Current operator precedence seems to be based on the starting character(s). https://learn.microsoft.com/en-us/dotnet/fsharp/language-reference/symbol-and-operator-reference/

I propose that operators starting with a currently-disallowed unicode symbol are all given the same precedence, placed at one end of the existing table: next to the prefix operators and the . operator. Operators that contain currently-disallowed symbols but start with currently-allowed characters can fit into the existing rules.
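
As an illustration of the current rule, here is a small sketch using two made-up ASCII operators, +! and *!. They are hypothetical names chosen for this example; each one only records how it was applied, so the grouping chosen by the compiler becomes visible in the resulting string.

```fsharp
// Two user-defined ASCII operators; each just records how it was applied.
// Note the spaces in ( *! ): without them, (* would start a comment.
let (+!) (a: string) (b: string) = sprintf "(%s +! %s)" a b
let ( *! ) (a: string) (b: string) = sprintf "(%s *! %s)" a b

// No parentheses: *! binds tighter because it starts with '*'.
let implicitGrouping = "1" +! "2" *! "3"

// Explicit bracketing overrides the inherited precedence.
let explicitGrouping = ("1" +! "2") *! "3"
```

Here implicitGrouping is "(1 +! (2 *! 3))" and explicitGrouping is "((1 +! 2) *! 3)". A precedence proposal for Unicode operators has to say which of these two shapes an unbracketed expression produces.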

@sv158

sv158 commented Sep 12, 2023

Is there any concrete proposal for precedence for such operators? That's the key missing ingredient, and it's really impossible to proceed without a proposal.

A few days ago, I saw that Nim officially released version 2.0, and the '--experimental:unicodeOperators' flag had already been removed. They eventually accepted 21 Unicode operators, with 13 having the same precedence as multiplication (∙ ∘ × ★ ⊗ ⊘ ⊙ ⊛ ⊠ ⊡ ∩ ∧ ⊓) and the remaining 8 having the same precedence as addition (± ⊕ ⊖ ⊞ ⊟ ∪ ∨ ⊔).

I think this approach (i.e., adopting a subset of Unicode operators first) might be worth considering. In this way, the main problem shifts from complex precedence rule settings to a relatively simpler process of filtering out the most needed Unicode operators and grouping them accordingly.
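
The subset approach can be modelled as a small lookup table. The sketch below is only an illustration in F#: the Tier type and the tierOf function are hypothetical names, F# has no such feature today, and the two symbol lists are copied from the Nim 2.0 grouping quoted above.

```fsharp
// Hypothetical model of a two-tier scheme; F# has no such feature today.
type Tier =
    | Multiplicative // would share the precedence of *
    | Additive       // would share the precedence of +

// Symbol lists copied from the Nim 2.0 grouping quoted above.
let multiplicativeSymbols = "∙∘×★⊗⊘⊙⊛⊠⊡∩∧⊓"
let additiveSymbols = "±⊕⊖⊞⊟∪∨⊔"

// Returns None for any character outside the accepted subset.
let tierOf (c: char) : Tier option =
    if Seq.contains c multiplicativeSymbols then Some Multiplicative
    elif Seq.contains c additiveSymbols then Some Additive
    else None
```

With this table, tierOf '∩' is Some Multiplicative and tierOf '∪' is Some Additive, so an unbracketed a ∪ b ∩ c would group as a ∪ (b ∩ c). A symbol outside the subset, such as '∫', yields None and would simply not be accepted as an operator.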

@MaxWilson

MaxWilson commented Oct 27, 2023

Hmm, even with [ Win + ; ] these sound very annoying to type to the point that I reckon I'd never use this.

This really just comes down to tooling. Many editors (or editor plugins) provide user friendly ways of inserting unicode characters.

Could you have a tooling plugin that simply hides backticks around Unicode characters? Then you wouldn't need a language change.
