Evaluating Design Choices for Records #3231

gafter · 2020-02-24T16:43:44Z

Evaluating our Choices

What are the goals of the records feature?

Simplify the declaration of data-oriented classes.
Encourage sound programming practices (and discourage unsound practices).

Our choices and (small o) opinions about language features affect the (capital O) Opinion of the language on preferred programming practices.

We should want the language to encourage sound programming practices.

It should guide users into the pit of success rather than the pit of failure.

We had proposed a criterion for evaluating our design choices based on this principle.
It looks like this:

Assume feature choice X causes the language to encourage programming practice Y.
Making feature choice X,
results in a language that has the Opinion that Y is a sound programming practice.

Then:

If we prefer that programmers exhibit Y behavior
(i.e. Y is a sound programming practice),
then we should prefer choice X.
If we prefer that programmers not exhibit Y behavior
(i.e. Y is a poor programming practice),
then we should not prefer choice X.

Conversely, if feature choice X discourages behavior Y, we should prefer choice X iff we consider Y a poor programming practice.

All other things being equal,
we should prefer that our choices (opinions about X) drive language
Opinions (about Y) that we agree with.

Sometimes a language feature choice X can have more than one effect on programmer behavior.
It might encourage behavior Y1 and Y2, one of which is "good" and one of which is "bad".
For example, consider an abstract data type

class PointA
{
    public int X;
    public int Y;
}
// and client code
PointA p = new PointA();
p.X = 3;
p.Y = 4;

We might bemoan the ceremony during initialization, and provide a more declarative way to write the client code:

PointA p = new PointA { X = 3, Y = 4 };

In this case feature choice X is to add object initializers.
Beneficial software engineering effect Y1 is that X encourages more declarative (clear) code.

The original programmer in this example could also have written

class PointB
{
    public readonly int X;
    public readonly int Y;
    public PointB(int x, int y)
    {
        X = x;
        Y = y;
    }
}
// and client code
PointB p = new PointB(x: 3, y: 4);

This lacks the ceremony during initialization (feature X doesn't affect this programmer), but
at the expense of ceremony during the declaration of the abstract type.

Before the addition of object initializers, programmers were encouraged by the language
to write PointB to reduce the ceremony on the consumers of the type.
Once object initializers are in the language, this incentive no longer exists.
Because of the reduced boilerplate at declaration time, the language with the addition
of object initializers causes programmers to prefer PointA. So

Detrimental software engineering effect Y2 is that the addition of (X) object initializers encourages the declaration of mutable types (i.e. results in a language whose Opinion is that the declaration of mutable types is a preferred practice). This is an unintended consequence that records are designed to correct.

If the programmer for this type wanted to provide support for validating the
object's invariants after construction, it would have to be through the use of a method that
is intended, by convention, to be called by the client after the object is built.
Unfortunately, it is all too easy to forget to call it, or to continue to mutate the object
after calling it. So

Detrimental software engineering effect Y3 is that X discourages validation of the object's state before exposure (i.e. results in a language whose Opinion is that validation before exposure is an unsound programming practice). This is also an unintended consequence that records are designed to correct.
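The contrast between effects Y2 and Y3 can be sketched in today's C#. This is a minimal illustration, not part of the proposal: the `Validate` method name and the "not at origin" invariant are assumptions chosen for the example. It shows that nothing forces a client of the mutable type to call the validation method, while the constructor-based type cannot be observed in an invalid state.

```csharp
using System;

// Mutable type: the invariant can only be checked by a method
// that the client must remember to call (hypothetical convention).
class PointA
{
    public int X;
    public int Y;

    public void Validate()
    {
        if (X == 0 && Y == 0)
            throw new InvalidOperationException("(X, Y) at origin");
    }
}

// Immutable type: the constructor validates before the object is exposed.
class PointB
{
    public readonly int X;
    public readonly int Y;

    public PointB(int x, int y)
    {
        if (x == 0 && y == 0)
            throw new ArgumentException("(X, Y) at origin");
        X = x;
        Y = y;
    }
}

static class ValidationDemo
{
    public static void Main()
    {
        // The client never calls Validate, so an invalid object escapes.
        var a = new PointA { X = 0, Y = 0 };
        Console.WriteLine($"Unvalidated PointA in use: ({a.X}, {a.Y})");

        try
        {
            var b = new PointB(x: 0, y: 0);
            Console.WriteLine($"PointB: ({b.X}, {b.Y})");
        }
        catch (ArgumentException e)
        {
            Console.WriteLine($"PointB rejected: {e.Message}");
        }
    }
}
```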

To clarify our terminology, an "opinion" that some language feature should behave in some particular way (X) is a small-o opinion.

That opinion is only relevant until we make a language design decision.

On the other hand, a programming practice that is encouraged (or discouraged) by the language as a result of language design choices is a big-O Opinion that (Y) the practice is sound (or unsound).

This language Opinion affects every programmer that uses the language.

One goal of adding records to the language is to move the language's Opinion toward

Discouraging, rather than encouraging, the use of mutable types for programmatic APIs.
Encouraging, rather than discouraging, validation of an object's state before exposure.
Continuing to encourage sound programming practices currently promoted by the language.
Continuing to discourage unsound programming practices currently discouraged by the language.

Language Opinions come in four varieties:

Should: a given practice is to be encouraged. We would like the language to have this kind of Opinion when the practice is "better" than the alternatives.
May: a given practice is to be permitted. We would like the language to have this kind of Opinion when the practice is as sound as the alternatives or where the benefits are situation-dependent. This is, in a sense, a non-Opinion.
Should Not: a given practice is to be discouraged. We would like the language to have this kind of Opinion when the practice is not as sound as the alternatives or is generally bad.
May Not: a given practice is to be forbidden. We would like the language to have this kind of Opinion when the practice is among the worst. In general we accomplish this simply by not supporting any way to engage with the practice.

The "should" and "should not" kinds of Opinions are the ones we are most concerned about.

Let us look at some possible choices:

(Possible choice): Define Equals to be symmetric (vs non-symmetric)

I don't think this is controversial. Types such as Dictionary have undefined behavior if the key's Equals is not symmetric. Implementing a type by hand with its equality contract being symmetric is notoriously difficult. Fortunately, this is a problem we've solved.

Our choice (X) to provide a symmetric Equals in records drives the language Opinion that (Y) an equality contract should be symmetric. We'll see that other choices may undermine this Opinion.

(Possible choice): Forbid (vs permit) the declaration of "behavior"

C# is a fundamentally object-oriented language. That means, in part, that it promotes the bundling together of data and behavior. The declaration of a record is much more concise than equivalent declarations written longhand, so any restriction on what a record may contain results in a language that discourages the restricted practice. In short, deciding that (X) records should forbid the declaration of behavior results in the language Opinion that (Y) bundling behavior with data (i.e. object-oriented programming) is an unsound programming practice. Similarly, if domain modeling requires bundling additional behavior, the programmer does not get the benefit of a compiler-provided symmetric equality contract. So this restriction would also result in the language Opinion that (Y) a type that bundles behavior should not have a symmetric equality contract.

I believe that bundling behavior with data and symmetric equality are sound programming practices, so I prefer that we choose to permit the declaration of behavior in a record. Object-oriented programming is so central to the experience of programming with C# that forbidding it would make the language feel inconsistent and balkanize the language into different feature "camps".

(Possible choice): Members declared in a record are `public` by default

When declaring a concrete class, the default accessibility is private. Unless the programmer explicitly selects some other accessibility, a declared member is not part of the API. This language choice (X) creates the language Opinion that (Y) elements of APIs between program components should be explicit, intentional choices. I suspect that this is a software engineering principle that we all agree with.

If we make the opposite choice in a record, making public the default (or only) accessibility, it results in the opposite Opinion. Forbidding private members would also complicate the declaration of helper methods to implement behavior. If (X) elements of a record's API are by default public (or may only be public), this creates the language Opinion that (Y) elements of APIs between program components may be implicit, unintentional choices.

(Possible choice): Define Equals based on all fields (vs on explicitly selected members)

The semantics of the equality contract are a part of the programmatic API of a concrete type. As such, equality semantics affect the programmatic API. If we decide that (X) the equality contract is defined by the set of all fields in a type, then this implies that (Y) elements of APIs between program components should be implicit. This choice conflicts with the language Opinion driven by the default accessibility in a concrete type above. For the same reason that we should prefer that members be implicitly private, we should prefer that fields not be included in the equality contract by default. A hypothetical explicit mechanism to "opt-out" a member would not change the language Opinion, because it is easier not to use that mechanism.

Including all fields also causes a serious practical problem for programmers, and this problem is largely not shared by the author of a struct type that uses the default equality implementation. When you use a struct as a key in a dictionary, there is not much risk of it mutating. If a caller gets a key out of the dictionary that is of the struct type and calls a mutating method, only the caller's copy is mutated, not the copy used as a key in the dictionary. On the other hand, if we take the same approach (including all fields, even private mutable data) for classes, it is much more dangerous, as the caller can easily (even accidentally) mutate the same data that is held as a key in the dictionary, thus invalidating the dictionary’s invariants.
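The difference between struct keys and class keys can be seen in a small sketch. The type names `ClassKey` and `StructKey` are hypothetical, and the class's `Equals`/`GetHashCode` are written by hand to stand in for an "all fields" equality; this is not what any records proposal would generate.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Class whose equality includes a mutable field.
class ClassKey
{
    public int Id;
    public ClassKey(int id) { Id = id; }
    public override bool Equals(object obj) => obj is ClassKey other && other.Id == Id;
    public override int GetHashCode() => Id;
}

// Struct with the same mutable field; it uses the default struct equality.
struct StructKey
{
    public int Id;
    public StructKey(int id) { Id = id; }
}

static class KeyMutationDemo
{
    // Returns whether the entry can still be found after the key object is mutated.
    public static bool ClassEntryStillReachable()
    {
        var key = new ClassKey(1);
        var map = new Dictionary<ClassKey, string> { [key] = "one" };
        key.Id = 2; // mutates the very object the dictionary holds
        return map.ContainsKey(key) || map.ContainsKey(new ClassKey(1));
    }

    // Returns whether the entry can still be found after a fetched key copy is mutated.
    public static bool StructEntryStillReachable()
    {
        var map = new Dictionary<StructKey, string> { [new StructKey(1)] = "one" };
        StructKey fetched = map.Keys.First(); // a copy of the stored key
        fetched.Id = 2;                       // only the copy changes
        return map.ContainsKey(new StructKey(1));
    }

    public static void Main()
    {
        Console.WriteLine($"Class entry reachable:  {ClassEntryStillReachable()}");
        Console.WriteLine($"Struct entry reachable: {StructEntryStillReachable()}");
    }
}
```

In the class case the entry is lost under both the old and the new key value: the stored hash no longer matches the mutated key, and the stored key no longer equals a fresh key with the original value.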

There is also the problem that the full set of fields in a derived type may not be known to the compiler because the compiler cannot know the set of fields in the base type. Even if it did (e.g. delegate to a base ValueEquals method), it interferes with implementation inheritance (see next item).

In short, because I believe that (Y) elements of APIs between program components should be explicit, intentional choices, that drives my preference for (X) basing the equality contract on the user-declared set of primary members.

(Possible choice): Forbid (vs permit) inheritance

Inheritance among classes is frequently used for the inheritance of implementation details. In many situations such inheritance is better modeled using containment rather than inheritance, but in many other cases implementation inheritance is quite useful. Unless we have some reason to question the wisdom of implementation inheritance in general, or there are particular reasons why the technique which is applicable in general is not appropriate for records, there is no reason to restrict one record from inheriting from another record (or record from class, or class from record). Moreover, the current proposals do not permit the compiler to distinguish a record from a non-record class outside its own compilation, so enforcing the restriction would require either changing that fact or tying the restriction to something that is not record-specific.

I can think of two reasons that we might forbid such inheritance.

Scala, whose "case classes" are the model for records, eventually changed the language to forbid case-class-to-case-class inheritance because its scheme for generating the equality implementation did not ensure symmetry. Fortunately, this is a problem that we have solved.
If one record Vector(double R, double Theta) inherits from another Cartesian(double X, double Y), it is not clear how to automatically implement the base Cartesian.With method in Vector. It seems as though this is a situation in which programmer intervention is required.

The language's current Opinion is that (Y) A concrete type may inherit from another concrete type. This latter issue notwithstanding, we should have some compelling software engineering principle to drive any decision that changes this Opinion.

(Possible choice): Forbid (vs permit) mutable state

One intent of records is to promote programming with immutable data. Could we promote it even further by forbidding mutable state? Unfortunately, that feature choice would result in the opposite. There are two interesting cases to consider to make this point: public data and private data.

In the case of public data, there are situations in which computing the data would necessarily be more complicated beforehand. For example, consider a situation in which you read a set of city descriptions that include city population, and want to have the city's internal representation include the population rank. You cannot compute the rank until all the cities have been read. While the programmer could address this using immutable data by copying the data into a new object that fills in the piece computed later, that can result in boilerplate both in the implementation of the code to copy all the relevant data, and in the implementation of the client that needs to be prepared for replacing one set of data with another.

A straightforward way to handle this situation is to have the rank be a mutable field that is set after the ranks have been computed. The programmer can implement that so that it is permitted to be set once and is frozen once it is set.

Implementing the second approach requires mutable data. If records forbid mutable data, then the programmer must write out the type longhand. In the process the programmer loses all the benefits conveyed by records. For example, a programmer-written Equals implementation is far less likely to be symmetric than that provided by the compiler. This restriction (X) means that the language's Opinion is that (Y) a type that contains mutable data should not have a symmetric Equals (whether or not the mutable data is part of the equality contract).
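A set-once field of the kind described above might look like the following longhand sketch. The `City` type, its member names, and the sample populations are all made up for illustration; the point is only that `Rank` is mutable state that freezes after its first assignment.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class City
{
    private int? _rank; // mutable backing state; frozen after the first assignment

    public string Name { get; }
    public int Population { get; }

    public City(string name, int population)
    {
        Name = name;
        Population = population;
    }

    public int Rank
    {
        get => _rank ?? throw new InvalidOperationException("Rank has not been computed yet");
        set
        {
            if (_rank.HasValue)
                throw new InvalidOperationException("Rank is already set");
            _rank = value;
        }
    }
}

static class RankDemo
{
    // Assigns ranks by descending population once every city has been read.
    public static void AssignRanks(IEnumerable<City> cities)
    {
        int rank = 1;
        foreach (var city in cities.OrderByDescending(c => c.Population))
            city.Rank = rank++;
    }

    public static void Main()
    {
        // Made-up sample data, for illustration only.
        var cities = new List<City> { new City("A", 100), new City("B", 300), new City("C", 200) };
        AssignRanks(cities);
        foreach (var c in cities)
            Console.WriteLine($"{c.Name}: rank {c.Rank}");
    }
}
```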

Similarly, private mutable data is often used to implement behavior. For example, a mutable field is useful to cache a lazily-computed property (that may appear to be immutable from the outside). That is not possible in a record if mutable state is forbidden. Consequently the choice to forbid mutable state implies the language Opinion that (Y) lazily computed properties are a poor programming practice, and that (Y) a type containing lazily computed properties should not have a symmetric Equals (or any of the other benefits of Records).
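As a rough longhand illustration of the lazily computed property case: the `Point` type and its `Length` property below are hypothetical, and the equality members are hand-written stand-ins for what a record might provide. The private cache field is mutable but takes no part in equality, so the type still looks immutable from the outside.

```csharp
using System;

class Point : IEquatable<Point>
{
    private double? _length; // private cache; not part of the equality contract

    public double X { get; }
    public double Y { get; }

    public Point(double x, double y)
    {
        X = x;
        Y = y;
    }

    // Computed on first access, then served from the cache.
    public double Length
    {
        get
        {
            if (!_length.HasValue)
                _length = Math.Sqrt(X * X + Y * Y);
            return _length.Value;
        }
    }

    // Equality depends only on X and Y (and the run-time type), never on the cache.
    public bool Equals(Point other) =>
        !(other is null) && other.GetType() == GetType() && X == other.X && Y == other.Y;

    public override bool Equals(object obj) => Equals(obj as Point);

    public override int GetHashCode() => HashCode.Combine(X, Y);
}

static class LazyDemo
{
    public static void Main()
    {
        var p = new Point(3, 4);
        var q = new Point(3, 4);
        Console.WriteLine(p.Length);    // fills p's cache; q's cache stays empty
        Console.WriteLine(p.Equals(q)); // the cache state does not affect equality
    }
}
```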

I believe we should (X) permit mutable data in a record because the language's Opinion should be that (Y1) Equality should be symmetric and (Y2) lazily computed properties are a sound programming practice.

(Possible choice): Do not support final object validation by the constructor

There is a part of the records proposals that we have not discussed recently. That is the possibility of having a "floating" constructor body in the record declaration:

class Point(double X, double Y)
{
    public Point
    {
        if (X == 0 && Y == 0)
            throw new ArgumentException("(X, Y) at origin");
    }
}

Relevant to this is the language's current constructor feature that leads to the current language Opinion that a data type may perform validation of its initial state before the object is exposed. I believe this Opinion is just as important for records as it is for other types. The alternative for the programmer is to forgo using records, or to provide a public method that the caller can use to validate the object state. Unfortunately, it is easier for the caller to neglect to call such a method.

If we decide (X) that we will not support the constructor body, then that leads to the language Opinion that (Y) a type that requires validation of its initial state should not have a symmetric equality contract (or any of the other benefits of record declarations).

(Possible choice): Define Equals to depend on (vs independent of) the base type's Equals contract

In C#, inheritance may be used for different purposes. Sometimes it is used to inherit a contract (API and behavior) from the base type to be exposed as part of the programmatic interface. Sometimes it is used to provide easy access to implementation artifacts in the base type.

Either way, we either inherit the base type's equality contract or the derived type defines its own equality contract. It cannot be both. Whether inheritance is used to extend a contract or to replace it, the language's current Opinion is that (Y) elements of APIs between program components should be explicit, intentional choices.

I believe this should hold for a record class as well. The equality contract in the derived type should either be inherited from the base type, or defined by the derived type. Since a record nominates a set of primary members to define the record's semantics, I believe those members should define the equality contract. If a nominated member is defined by the base type and inherited into the record, then the record's own equality contract would directly depend on the member and there is no need to delegate any part of the equality implementation to the base. If no member from the base type is nominated in the record's declaration, then no member from the base type is part of the record's provided API surface area and none should participate in the record's equality contract.

In addition, changes in the base type (e.g. the addition of a ValueEquals method) should not affect the meaning of the derived type's equality contract unless explicitly invoked.

Varying from these choices implies that we intend to vary from the language's current Opinion that (Y) elements of APIs between program components should be explicit, intentional choices.

orthoxerox · 2020-03-02T16:36:28Z

Re: implicit/explicit equality.

Why not provide a simple method a la HashCode.Combine() for equality, something like EqualityComparer.Compare<T1, T2, ...>(T1 l1, T1 r1, T2 l2, T2 r2, ..., bool theResultOfAnotherComparerIfThereAreTooManyFields)? This way equality is always explicit, unless you agree to generate the most straightforward method for the simplest records.

gafter · 2020-03-02T18:23:12Z · Author

@orthoxerox That technique doesn't produce a symmetric equality relation in the face of inheritance.

orthoxerox · 2020-03-03T22:13:29Z

@gafter Not even when that T1 is the equality contract?

agocke · 2020-03-03T23:49:13Z · Collaborator

@orthoxerox The problem with equality across inheritance is that the compile-time type is not necessarily the run-time type. So if you have two objects, A and B where B : A, if you statically ask A.Equals(B), you may get true, while if you reverse it the B type may say, no, only instances of B can be equal to B. That's the violation of symmetry.

The only way to resolve this is to ask the objects themselves to compare, including the run-time type information, effectively saying that by default only As are equal to As, not subtypes of A.
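The asymmetry, and the run-time-type fix, can be reproduced with a small sketch. All four type names are hypothetical, and the second pair is only one way to get symmetry by comparing run-time types; it is not claimed to be the scheme the compiler would actually generate.

```csharp
using System;

// Naive pattern: each type only asks "is the other object at least my type?".
class NaiveA
{
    public int X;
    public override bool Equals(object obj) => obj is NaiveA a && a.X == X;
    public override int GetHashCode() => X;
}

class NaiveB : NaiveA
{
    public int Y;
    public override bool Equals(object obj) => obj is NaiveB b && b.X == X && b.Y == Y;
    public override int GetHashCode() => HashCode.Combine(X, Y);
}

// Run-time-type pattern: two objects are equal only if their run-time types match.
class SymA
{
    public int X;
    public override bool Equals(object obj) =>
        obj != null && obj.GetType() == GetType() && ((SymA)obj).X == X;
    public override int GetHashCode() => X;
}

class SymB : SymA
{
    public int Y;
    // base.Equals has already established that obj is a SymB, so the cast is safe.
    public override bool Equals(object obj) => base.Equals(obj) && ((SymB)obj).Y == Y;
    public override int GetHashCode() => HashCode.Combine(X, Y);
}

static class SymmetryDemo
{
    public static void Main()
    {
        var na = new NaiveA { X = 1 };
        var nb = new NaiveB { X = 1, Y = 2 };
        Console.WriteLine($"Naive: a.Equals(b)={na.Equals(nb)}, b.Equals(a)={nb.Equals(na)}");

        var sa = new SymA { X = 1 };
        var sb = new SymB { X = 1, Y = 2 };
        Console.WriteLine($"Run-time type: a.Equals(b)={sa.Equals(sb)}, b.Equals(a)={sb.Equals(sa)}");
    }
}
```

With the naive pair, `a.Equals(b)` is true while `b.Equals(a)` is false. With the run-time-type pair, both directions agree.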

WolvenRA · 2020-03-04T03:34:47Z

Obviously, inheritance creates a lot of "issues" when it comes to "records". While I'm sure this isn't going to be a popular opinion (little o), it seems to me that basing "records" on structures would solve a lot of issues. A: Structures aren't inheritable, so all those inheritance issues go away. B: Elements of a structure are explicitly defined; that, combined with no inheritance, makes the Equality "symmetry" issue a non-issue. Making individual elements mutable or not mutable would require some new identifier for the element. That shouldn't be difficult to do, and it allows opt-in or opt-out for immutability of elements. Structures are auto-initialized, or you can add a parameterized constructor and set the values to whatever you want (you can also have a method to clear\set mutable values to whatever you want).

Basically, from my little o perspective... A structure that automatically created the Value(s) Equal\Not Equal function would seem to meet almost all of the stated goals of "records". It would be nice if you could set the size of an array when you define it in a structure.

WolvenRA · 2020-03-04T21:35:24Z

After a little more thinking about it... There are a few other things I'd like to be able to do with records (as structures).

Years ago I was an RPG programmer on the AS\400, I Series, whatever they call it now. One thing IBM knew how to do was make structures easy and useful. I could define a structure using the keyword "LikeRec" and specifying a data table. The system would automatically create the structure with the column names and data type schema from the data table. I could also use LikeDS(SomeOtherDS) and create a clone of "SomeOtherDS". The beauty of those two examples is that if the data table schema or "SomeOtherDS" changed, the structure being created would automatically be modified to match.

A feature I'd like from records would be the ability to automatically convert a record (or any structure) to a datarow, and vice versa. I've written my own functions for doing it using reflection, but it's slow... and I shouldn't have to. Actually, it would be nice to have builtin functions to convert structures to collections, dictionaries, etc. (and vice versa).

Finally, record.ToString SHOULD return a string of the element VALUES of the record... not "System.Record".

popcatalin81 · 2020-03-10T10:54:01Z

I think equality should only include read-only members which implement IEquatable<T> and some new ISetEquatable<T> for collections.

Example 1

record PointA // Assuming some record syntax could as easily use `data class`
{
    public int X; // IEquatable<int>
    public int Y; // IEquatable<int>
}
// becomes
class PointA : IEquatable<PointA>
{
    public int X { get; } // Key member
    public int Y { get; } // Key member
    public bool Equals(PointA other)
    {
        return other != null && this.EqualsImplementation(other);
    }
    protected virtual bool EqualsImplementation(PointA other)
    {
        return this.X.Equals(other.X) && this.Y.Equals(other.Y);
    }
}

This delimitation would be good because you can now use arbitrary types, such as immutable collections, in your records without needing to explicitly opt members out of equality or opt them in.

Example 2

record PointB
{
    public int X; // IEquatable<int> // Key -> IEquatable<int>
    public int Y; // IEquatable<int> // Key -> IEquatable<int>
    public ImmutableList<Layer> Layers; // Not Key
    // Conveniently, this is how you'd implement equality between points as well. Belonging to a layer
    // or not should not be taken into account when testing for equality
}

Example 3

record PointC
{
    int X; // IEquatable<int> // Key -> IEquatable<int>
    int Y; // IEquatable<int> // Key -> IEquatable<int>
    RgbColor Color { get; set; } // Not Key -> Even though IEquatable<RgbColor>, it's mutable
    byte Alpha { get; set; } // Not Key -> Even though IEquatable<byte>, it's mutable
}

Example 4

record PointD
{
    int X; // IEquatable<int> // Key -> IEquatable<int>
    int Y; // IEquatable<int> // Key -> IEquatable<int>
    [NotKey] RgbColor Color; // Not Key -> Explicit opt-out even though IEquatable<RgbColor>
    [NotKey] byte Alpha; // Not Key -> Explicit opt-out even though IEquatable<byte>
}

Example 5

record Person 
{
       int Id;
       string FirstName;
       string LastName;
       [Key] string SSN; // Since a key is present, all other IEquatable<> members are ignored
       Address Address; // Address is also IEquatable<> but ignored due to explicit key members
}

One big problem with using just one interface is that you can't add IEquatable<> to a collection like HashSet<T> because that breaks hashing (i.e., if you add an element to a HashSet<T> that is a dictionary key, you won't be able to retrieve it). IEquatable<T> could, however, be added to ImmutableList<T> without breaking hashing, so for collections a new ISetEquatable<T> interface might be needed.

333fred · 2020-03-10T17:16:46Z · Maintainer

I think equality should only include read-only members which implement IEquatable<T> and some new ISetEquatable<T> for collections.

This won't really work for reasons that you brought up in this proposal, actually. If we were to do this, then it would mean that updating a dependency could add or remove fields from record equality based on whether said library added IEquatable<T> to a type you were using in your record. The equality of a record type shouldn't be able to add/remove members from equality without explicit, intentional user action, regardless of what solution we settle on.

popcatalin81 · 2020-03-10T17:39:14Z

The equality of a record type shouldn't be able to add/remove members from equality without explicit, intentional user action, regardless of what solution we settle on.

That means a default Equality implementation is either all members or no members. Both of these are subpar choices in multiple circumstances involving dictionaries, leading to unrecoverable objects stored in dictionaries or the inability to recover previously stored objects based on a previous version of an immutable record.

333fred · 2020-03-10T21:06:54Z · Maintainer

I'm not seeing how these are related. If you have a record that holds onto mutable data, and you're using that as a key in a dictionary, and you change the mutable data, it's now unrecoverable. It doesn't really change anything whether that mutable data implements IEquatable<T> or not.

popcatalin81 · 2020-03-12T11:54:32Z

It doesn't really change anything whether that mutable data implements IEquatable or not.

That's not the point. The point was which members should be chosen for the Equality implementation (Object.Equals), and correspondingly for Object.GetHashCode, irrespective of whether records implement IEquatable<T> or not.

In the end, I guess a solution similar to structs' automatic implementation of GetHashCode and Equals could be implemented for records as well.
