use abstract|primitive type, struct, mutable struct for type definitions #20418

Merged
JeffBezanson merged 3 commits into master from jb/structkeywords
Feb 10, 2017


@JeffBezanson
Member

JeffBezanson commented Feb 2, 2017

Fixes #19157.

This regularizes the type-defining keywords and makes them more accurately descriptive. The proposed syntax is

abstract type Integer <: Real end
primitive type Int32 <: Signed 32 end
struct Complex ... end
mutable struct Ref ... end

This is nicely extensible for the future, avoids stealing the words abstract, primitive, and (im)mutable as keywords, and as a bonus supports Compat.

  1. We should use the word struct; previously there was no keyword that described what the most common kind of user-defined type actually was.
  2. We should steer people to immutable types by default. Immutable is generally better and faster, has better default behavior with ===, and it's really pretty rare to need to update object fields. Base has many more immutable types than mutable, and in making this change it seemed to me that even more types could be made immutable. But type Foo looks more natural than immutable Foo, so type tends to be the default choice. This change reverses that, with struct Foo being more natural, and immutable. Writing mutable struct Foo makes you ask "does this really need to be mutable?", which is a good thing!
  3. Immutable struct types are the closest thing we have to value type structs in other languages. We inline them in arrays in at least some cases, and being immutable makes it much harder to tell whether they are value or reference types.
  4. FWIW, Rust also uses struct and makes them immutable. Not that we need to copy Rust, but it adds confidence that this is a reasonable thing to do.
  5. We could eventually phase out mutable structs, and instead use Ref fields where needed in immutable structs.
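Points 2 and 3 can be made concrete with a minimal sketch written in the syntax from the PR title. The type names `PointI` and `PointM` are hypothetical and do not exist in Base; the sketch assumes a Julia version that accepts `struct` and `mutable struct`.

```julia
# Hypothetical types for illustration only; neither exists in Base.
struct PointI            # immutable by default
    x::Int
    y::Int
end

mutable struct PointM    # mutability has to be requested explicitly
    x::Int
    y::Int
end

a, b = PointI(1, 2), PointI(1, 2)
c, d = PointM(1, 2), PointM(1, 2)

# Immutable values with equal fields are indistinguishable to ===.
same_immutable = a === b

# Mutable objects are compared by identity, so two separate allocations differ.
same_mutable = c === d

# Field assignment works only on the mutable struct.
c.x = 10

# Assigning to a field of the immutable struct raises an error.
assignment_failed = try
    a.x = 10
    false
catch
    true
end
```

Here `same_immutable` is `true` and `same_mutable` is `false`, which is the `===` behavior point 2 refers to.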

So far I really like the way this change looks. immutable is kind of a long and ugly word.

The deprecation warning gives line numbers.

I haven't updated the manual yet. I think we'll want to reorganize it a bit to introduce immutable structs first.

@StefanKarpinski
Member

As @simonbyrne pointed out during a discussion of this yesterday:

From what I can tell, struct originates in ALGOL, where it was a "structured value":

A “structured value” is composed of a sequence of other values, its “fields”, each of which is “selected” {b} by a specific 'TAG' {9.4.2.1.A}.

So its defining property is that it is made up of fields, not its mutability or lack thereof.

@StefanKarpinski
Member

If we go the way of annotating individual fields as mutable, then mutable struct could be shorthand for a struct in which all the fields are marked mutable (when do you even need that?). This could, of course, just be a gussied up version of having those fields be Refs.

@kmsquire
Member

kmsquire commented Feb 2, 2017

Is this for v0.6?

@JeffBezanson
Member Author

JeffBezanson commented Feb 2, 2017

Yes --- otherwise it's unlikely to happen.

@kmsquire kmsquire added this to the 0.6.0 milestone Feb 2, 2017
- immutable LinearFast <: LinearIndexing end
- immutable LinearSlow <: LinearIndexing end
+ struct LinearFast <: LinearIndexing end
+ struct LinearSlow <: LinearIndexing end
Member

@StefanKarpinski said:

So its defining property is that it is made up of fields, not its mutability or lack thereof.

If one agrees with this perspective (I think it makes sense), then structs with no fields feels a little odd.

(I actually thought these lines seemed odd when I ran across them, and post-justified it with Stefan's comment because it fits.)

Member Author

Zero is a valid number of fields :) Most (all?) languages with structs allow empty structs, and they're especially useful in languages that support various type system tricks.
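As a small sketch of the kind of type-system trick meant here, a field-less struct can serve purely as a dispatch tag. The names `Strategy`, `Fast`, `Slow`, and `describe` are hypothetical, loosely modelled on the `LinearFast`/`LinearSlow` lines in the diff above.

```julia
# Hypothetical names for illustration; they are not part of Base.
abstract type Strategy end
struct Fast <: Strategy end   # zero fields
struct Slow <: Strategy end

# Dispatch selects a method from the type alone; the instances carry no data.
describe(::Fast) = "fast path"
describe(::Slow) = "slow path"

fast_label = describe(Fast())
slow_label = describe(Slow())

# A field-less immutable struct occupies no storage, and all instances are identical.
empty_size = sizeof(Fast)
is_singleton = Fast() === Fast()
```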

Member

C/C++ don't, but I would hope that language is the exception

Contributor

Rust calls these "Unit Structs"

Member

Yes, empty structs are fine in C++ and commonly used for unique tags with which traits can be associated, etc. In an amusing quirk of the C++ standard, sizeof(boost::blank) != 0 :-(

@ararslan ararslan added the breaking (This change will break code) and design (Design of APIs or of the language itself) labels Feb 2, 2017
@vtjnash
Member

vtjnash commented Feb 3, 2017

FWIW, Rust also uses struct and makes them immutable

Immutability is orthogonal to structs in Rust (it's instead a property of the variable pointing to it, which is enforced by its linear type system, so probably not applicable here).

@JeffBezanson
Member Author

The Rust documentation says "The values in structs are immutable by default" and the mut keyword for making bindings mutable cannot be applied to struct fields: "Rust does not support field mutability at the language level." Given a binding that points to a struct, in theory all 4 combinations of binding mutability and struct mutability are possible. In Rust the binding can be either, but the struct itself is immutable. So I think it's quite fair to say Rust structs are immutable.

@maleadt
Member

maleadt commented Feb 3, 2017

Is a bitstype rename off the table then? Ref @vtjnash @ #19157 (comment)

@JeffBezanson
Member Author

No, I'm willing to rename bitstype too.

@KristofferC
Member

KristofferC commented Feb 3, 2017

I bet this could be done with sed but I made a small script that can update all the .jl files in a folder (recursively). It uses https://github.com/KristofferC/Tokenize.jl so you need to get that.

The script is here: https://gist.github.com/KristofferC/24c269727d914ffb88bfc21624dad891

update_folder!(path) just shows all files that would be modified, does not do any modifications
update_folder!(path, false) actually does the modifications.

update_file!(filepath) updates a single file.

Standard caveats about backups, only run in a vm, read the code, don't use it at all etc. etc. apply.

I ran it on base with the following diff as result: https://gist.github.com/KristofferC/b73561725c74501c4ebf06ec400a3907

@timholy
Member

timholy commented Feb 3, 2017

👍 to the change.

The deprecation warning gives line numbers.

But to clarify, is the deprecation warning (which seems to be implemented in the scheme code) more sophisticated than @deprecate_binding? The latter almost never gives a useful line number (or even hint about which package the offending line occurs in), and to find the source of the problem I usually end up recursive-grepping my entire .julia/v0.x folder. See also #20057. For something that causes this many deprecations, I'd say having good line-reporting is blocking for this change.

@martinholters
Member

For cross-version compatibility, would I need to wrap my type definitions in an include_string or the like, or is there a better way? Can we do something useful in Compat?

@carlobaldassi
Member

We could eventually phase out mutable structs, and instead use Ref fields where needed in immutable structs.

mutable struct could be shorthand for a struct in which all the fields are marked mutable (when do you even need that?)

I often use a pattern where I have tons of keyword arguments to set the parameters of an algorithm (example), and then build a struct to keep them and pass them around (example). Most of the time, all or most of the parameters need to be mutable, as they can change during the algorithm execution.

(I also have other patterns where I heavily use mutable fields, in iterative algorithms that solve complicated systems of equations, although I don't have a publicly available example of that at the moment.)

If the implication of using Refs for mutable fields is that we'd end up sprinkling []'s around to access them all the time, that would be rather annoying.

@StefanKarpinski
Member

If the implication of using Refs for mutable fields is that we'd end up sprinkling []'s around to access them all the time, that would be rather annoying.

I don't think that's a danger. There would have to be enough magic to make this transparent, which means that it's largely an implementation detail. Also totally irrelevant to the rest of this discussion :)
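For reference, this is roughly what the Ref-field pattern looks like when written by hand, without any of the transparency mentioned above; the `[]` accesses are the ones @carlobaldassi is concerned about. `Counter` is a hypothetical type, and this is only a sketch of the pattern, not a proposed design.

```julia
# Hypothetical type for illustration; the struct itself is immutable,
# but one field holds a mutable reference cell.
struct Counter
    name::String
    count::Base.RefValue{Int}   # concrete type of Ref(0)
end

# Convenience constructor that starts the count at zero.
Counter(name::AbstractString) = Counter(name, Ref(0))

c = Counter("hits")

# The binding c.count never changes; only the contents of the Ref do.
c.count[] += 1
c.count[] += 1
total = c.count[]
```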

@JeffBezanson
Member Author

@timholy Yes it gives decent line numbers, I really mean it :) Deprecations that can be implemented in the parser give the best line numbers, because only the parser knows exactly where it is in the original file. After that point information starts to get lost.

@KristofferC Thanks, that is awesome!

@timholy
Member

timholy commented Feb 3, 2017

I wonder if we could implement @deprecate_binding via the parser, too?

  • the macro triggers a ccall that inserts the name into a table the parser has access to and affects future parser behavior
  • when the parser encounters a line with a deprecated symbol, it insert!s (right before that line) a suitable depwarn call
  • that warning only gets triggered when the function is executed. It has all the backtrace goodness available to it so that it only fires once from that line.

@JeffBezanson
Member Author

JeffBezanson commented Feb 3, 2017

Update: as I hoped, this PR stimulated a lot of discussion. @StefanKarpinski proposed using compound keywords, and many of us now favor:

abstract type Integer ... end
primitive type Int32 ... end
struct type Complex ... end
mutable type Ref ... end

This has a lot of advantages:

  1. It reads nicely (at least in English).
  2. It's highly consistent.
  3. It lets us use complete words without stealing them as identifier names. type, mutable etc. by themselves would be plain identifiers, and only keywords in combination with each other. We can also add more kinds of types without stealing more words in the future, making such changes non-breaking.
  4. It can be fully supported by Compat; @compat in front of all of those can be parsed in <=0.5.
  5. They are all block forms with end, allowing us to add features by putting stuff in the blocks.
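Point 3 can be sketched as follows. Note that the spelling in the PR title, and the one this sketch uses, is `struct` and `mutable struct` rather than `struct type` and `mutable type`; `abstract type` and `primitive type` are written as listed above. The names `Shape`, `Byte8`, and `Circle` are hypothetical.

```julia
# Hypothetical type names for illustration only.
abstract type Shape end            # two-word keyword, block form with `end`
primitive type Byte8 8 end         # 8 is the size in bits

struct Circle <: Shape
    r::Float64
end

# On their own, the individual words are ordinary identifiers.
type = "still a plain identifier"
mutable = true

shape_is_abstract = isabstracttype(Shape)
byte_size = sizeof(Byte8)          # size in bytes
```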

@ararslan
Member

ararslan commented Feb 3, 2017

If I may, I'd like to explain my opposition to the multi-word keyword definitions proposed.

  • They go against the powerful conciseness that Julia currently employs. Currently everything that requires a keyword can be expressed in exactly one keyword, all of which are perfectly clear. Having the type declarations be struct, mutable, abstract, and primitive seems entirely unambiguous and consistent without adding the extra line noise.

  • They add more special cases in the parser in that type and mutable can be used separately but mean something different when used together. Maybe that's not a big deal in terms of the parser, but it's conceptually odd.

  • We're pre-1.0; we should go with what makes more sense for the language going forward rather than what parses with @compat. Once we hit 1.0, it's a much bigger deal to make breaking changes.

@StefanKarpinski
Member

To rebut these points...

  • Conciseness: defining new types is not that common – it does not need to be especially concise. What it needs to be is clear and consistent. Also, "struct type" is only two letters longer than "immutable" and considerably easier to type on a qwerty keyboard. Yes, "mutable type" is eight letters longer than "type" but we actively want to discourage its use, so that's ok by me. Declaring abstract and primitive types is so rare that it's ludicrous to worry about brevity.

  • Special cases: this actually makes fewer special cases, since keywords by definition are special cases. In fact, that's precisely what makes it possible to add new types in the future without breaking any code. Identifiers can't contain spaces so they can't possibly conflict with these two-word keywords. The only place you need to worry about a possible conflict is in places where two bare words next to each other are allowed: inside of [ ] and macro calls – and type declarations can't go inside of [ ] so we only have to worry about the macro case. Fortunately, macros are precisely where we have the most flexibility with upgrade paths – a package can rewrite any valid syntax arbitrarily based on what Julia version is running.

  • Upgradability. It's true that Julia is pre 1.0. However, having a viable upgrade path is crucial. It's fine to say that we can make arbitrary changes, but where does that leave people with code bases already written in Julia? Part of what's good about this proposal is that it makes adding new features in the future – beyond 1.0 – smooth. We could add enum type, protocol type, trait type, interface type, etc. without the slightest fear of breaking code.

@nalimilan nalimilan mentioned this pull request Feb 3, 2017
@stevengj
Member

stevengj commented Feb 4, 2017

I see at least three possibilities for Compat with one-word names like struct Foo ... end:

  • In Julia 0.6, support both syntaxes without any deprecation warning. Only issue the deprecation warning in the next Julia release, at which point people can switch over and retain 0.6 compatibility.

  • Use @compat type ... end and @compat immutable ... end, and have the @compat macro rewrite these into struct and mutable struct in 0.6. (This is the reverse of the usual @compat pattern, in which normally you use the newest syntax.) However, if the syntax deprecation warning is in the parser, it may still issue a warning?

  • Use some completely different macro, like @struct begin ... end.

I would prefer the first option (no deprecation warning in this release.)

Update: I didn't read far enough in this thread... I see that @JeffBezanson has an alternate proposal that would work with @compat. The two-word names are fine with me too, though I'm not thrilled. (Ultimately, I guess I just don't care that much one way or the other about this.)

@swissr
Contributor

swissr commented Feb 4, 2017

A language is also about aesthetics.

When trying out const struct using adapted syntax highlighting in the base/dict.jl code, I was surprised how much, for me, the readability suffered. It's not a real argument, but I would find it quite sad if such 'one-word-keyword-anchors' went away. (I even wanted to propose mstruct, but that is not good.)

Regarding upgradability I can't really assess, just thinking:

  • is there fear of a Python 2/3 situation?
  • such possible new keywords don't seem to be such a big danger, easy enough to grep/replace?
  • would [protocol|trait|interface] struct provide some upgrade flexibility if a 'one-word-keyword' is not possible immediately? And/or could enum, protocol, trait and interface be reserved pre 1.0 such that they are available?

Special cases:

  • it seems less important here to avoid special cases for the compiler than to provide a 'good experience' for the user (subjective of course - in general it's great that the language is so stringent in many ways!).
  • (the compiler e.g. also has to work more in order to support string interpolation and enable the user to be 'lazy')

@stevengj
Member

stevengj commented Feb 4, 2017

I have to say that with the two-word variants, struct type and mutable type seem like a weird pair ... it seems like we would want them to be closer to antonyms, like const type vs mutable type maybe?

You could also have alias type to replace typealias, though it would be weird to use end with this.

@JeffBezanson
Member Author

It would probably be mutable struct type, except 3 words just seems excessive.

@JeffBezanson
Member Author

@swissr Very much agree aesthetics count. I think struct type Complex reads pretty well by that standard. const struct is not being proposed here. protocol struct doesn't make any sense to me.

@mweastwood
Contributor

If you want to encourage people to default to using struct type over mutable type, maybe it makes sense to have a special case where you can omit the 2nd word and go with struct Complex instead of struct type Complex.

@tknopp
Contributor

tknopp commented Feb 4, 2017

Is the proposal at the top of the PR obsolete?

  • if we have the pairs (struct, mutable struct) and (const struct, struct) I would like to question if the words const and mutable will be used in other contexts. If possible I would try to have only one of them in the language. That one of immutable and type is used slightly more often does not seem to me a pressing argument for introducing the new mutable keyword. Base is also not the most representative code base to judge this (because it contains much low-level code).

  • I do not see the gain in adding type to anything. Julia is concise and this does not make things more readable. In practice it's not the keyword that we look at but the actual type name, and going from

type ThisIsAMoreComplexTypeThanTheComplexTypeAbove

to

mutable struct type ThisIsAMoreComplexTypeThanTheComplexTypeAbove

is not an improvement in readability for me.

@JeffBezanson
Member Author

Did some more renaming; should be in fairly good shape now. In updating docs I found the new keywords quite helpful; we can now easily distinguish immutable types (meaning anything that's immutable, e.g. tuple, struct, or primitive type), bits types ("plain old data"), primitive types, and structs. Probably even more improvements in doc, error message, and internal function names are now possible.
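The distinctions listed here can be checked with reflection functions that Base exports in recent Julia versions (`isprimitivetype`, `isstructtype`, `isbitstype`); the following is only a sketch, and whether these exact functions were available at the time of this PR is not something the thread states.

```julia
# Int32 is declared with `primitive type`, so it is primitive and not a struct.
int_is_primitive = isprimitivetype(Int32)
int_is_struct = isstructtype(Int32)

# Complex{Float64} is an immutable struct whose fields are plain data.
complex_is_struct = isstructtype(Complex{Float64})
complex_is_bits = isbitstype(Complex{Float64})

# A tuple of Ints is immutable plain data as well.
tuple_is_bits = isbitstype(Tuple{Int,Int})

# Base.RefValue{Int} (what Ref(0) returns) is a mutable struct:
# a struct, but not a bits type.
ref_is_struct = isstructtype(Base.RefValue{Int})
ref_is_bits = isbitstype(Base.RefValue{Int})
```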

The notepad++ mode seems to use a space-separated list of keywords, so I don't know how to handle the 2-word keywords there. For now I just threw in all the words.

@StefanKarpinski
Member

Agree – this is going to help with explaining things a lot.

msg::AbstractString
end

#type UnboundError <: Exception
Contributor

@tkelman tkelman Feb 9, 2017

legacy cruft?

edit: yeah wow that's old, added in 6571654 and deleted in 54e40e3, was it ever used?

Member Author

I don't think it was ever used. Surprising how long those 3 lines managed to hang around.

@JeffBezanson
Member Author

Anybody understand the os x failure here?

@StefanKarpinski
Member

Looks like a timeout – it got hit with some signal and stopped. I restarted it.

@tkelman
Contributor

tkelman commented Feb 10, 2017

and now the log of the failure is gone. save to a gist before restarting on travis, please! copy paste the raw log link and run curl -L url | gist, it takes two seconds.

@StefanKarpinski
Member

The log wasn't that interesting. We have tons of examples over at #17626.

@tkelman
Contributor

tkelman commented Feb 10, 2017

This case looked a little different to me, it was in the parallel test instead of spawn.

@JeffBezanson JeffBezanson merged commit 5151271 into master Feb 10, 2017
@JeffBezanson JeffBezanson deleted the jb/structkeywords branch February 10, 2017 17:59
carlobaldassi added a commit that referenced this pull request Feb 10, 2017
introduced while #20418 was pending
@StefanKarpinski
Member

Ah well, I apologize. Otoh, this PR got merged!!

@ViralBShah
Member

I think it would be great to put this change out as a blog post, perhaps even before the 0.6 release announcement.

maleadt added a commit to maleadt/LLVM.jl that referenced this pull request Feb 13, 2017
maleadt added a commit to JuliaGPU/CUDAnative.jl that referenced this pull request Feb 13, 2017
maleadt added a commit to JuliaAttic/CUDArt.jl that referenced this pull request Feb 16, 2017
vchuravy pushed a commit to JuliaAttic/CUDArt.jl that referenced this pull request Feb 17, 2017
@s-celles
Contributor

I noticed that the latest doc (0.6.0-dev from December 07, 2016) https://media.readthedocs.org/pdf/julia/latest/julia.pdf
doesn't reflect these changes.

@tkelman
Contributor

tkelman commented Mar 11, 2017

we don't use readthedocs any more

@nalimilan
Member

Is there a way to remove those files? It would be misleading to have them appear as "stable" or "latest" forever.

@s-celles
Contributor

So where can I download a PDF file of latest doc?

omus added a commit to JuliaTime/TimeZones.jl that referenced this pull request Mar 16, 2017
omus added a commit to JuliaTime/TimeZones.jl that referenced this pull request Mar 16, 2017
@fredrikekre
Member

Just want to say that this change is really convenient when searching for type definitions, e.g. when you don't know if the type is mutable or immutable; searching for struct Foo matches both mutable struct Foo and struct Foo 🎉
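The search convenience can be illustrated with a small sketch. The `lines` vector is hypothetical and stands in for the contents of a package's source files; the word-boundary pattern is one possible way to write the search.

```julia
# Hypothetical source lines standing in for the contents of a package.
lines = [
    "mutable struct Foo",
    "struct Foo",
    "struct Foobar",
    "foo = Foo(1)",
]

# One pattern covers both the mutable and the immutable definition;
# the trailing \b keeps longer names such as Foobar from matching.
pattern = r"\bstruct Foo\b"
hits = filter(line -> occursin(pattern, line), lines)
```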

Bachibouzouk added a commit to Bachibouzouk/Mbo.jl that referenced this pull request Jul 22, 2024
-Range renamed AbstractRange
JuliaLang/julia#23570

-squeeze renamed dropdims

-immutable renamed struct
JuliaLang/julia#20418
Labels: breaking (This change will break code), design (Design of APIs or of the language itself)

Successfully merging this pull request may close these issues.

crazy idea: change the type keyword