Unqualified instance variable references #18

Closed
zenparsing opened this issue Mar 4, 2018 · 54 comments

@zenparsing
Owner

zenparsing commented Mar 4, 2018

As an alternative to #17, I would like to explore the possibility of making instance variables more "variable-like" by allowing unqualified access to instance variable names within class methods.

For example:

class C {
  var x, y;
  
  constructor() {
    x = 1;
    y = 2;
  }

  m() {
    // Within nested functions, unqualified instance var references
    // refer to the `this` value captured by the class method
    function f() { return x + y }
    f(); // 3
  }

  static {
    // Within the class initializer (and within class computed property 
    // name evaluations), unqualified instance var references result in a 
    // ReferenceError.

    // This, for example, would throw:
    // console.log(x);
  }
}

Unqualified access would apply to instance variables only, and not hidden methods.

Now that we've formalized the concept of "hidden member descriptors" (HMDs), I think we can do it like this:

For all methods defined in a class definition that has instance variable declarations, we store a list of those instance var HMDs in an internal slot, say [[InstanceVariableParams]].

On function declaration instantiation, for each HMD in [[InstanceVariableParams]], we create an immutable binding in the function environment record from the instance var name to the instance var HMD.

We modify GetHiddenValue and SetHiddenValue such that if the reference's base value is an environment record (i.e. an unqualified reference), we use the result of base.GetThisBinding() as the hidden reference receiver.
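The three steps above can be roughly approximated in today's JavaScript. This is only a model under stated assumptions, not the proposal's semantics: standard private fields (`#x`, `#y`) stand in for `var x, y`, and a hypothetical local named `self` stands in for the receiver that the method's environment record would supply through GetThisBinding().

```javascript
// Rough model only: `#x`/`#y` stand in for `var x, y`, and `self` stands in
// for the `this` value captured when the class method is entered.
class C {
  #x = 1;
  #y = 2;

  m() {
    const self = this; // hypothetical: the receiver captured by the method
    function f() {
      // An unqualified `x + y` would resolve against the captured receiver,
      // not against f's own `this`.
      return self.#x + self.#y;
    }
    return f.call({}); // f's own `this` does not matter
  }
}

const sum = new C().m();

// Calling the method with a receiver that lacks the hidden state fails.
let wrongReceiverError = null;
try {
  C.prototype.m.call({});
} catch (err) {
  wrongReceiverError = err;
}

console.log(sum, wrongReceiverError instanceof TypeError); // 3 true
```

The point of the model is that the nested function keeps working regardless of its own `this`, while the method itself still depends on being called with a correctly constructed receiver.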

Thoughts, @littledan @erights @BrendanEich ?

@littledan
Collaborator

Similarly to the concern I raised in #17 (comment) , I would worry that programmers would misinterpret these variable references as being "lexically scoped" (and therefore, that the method is somehow bound to the instance), rather than based on this. Probably skilled educators would be better than me in assessing this risk; if we can get their opinion, it would be helpful.

@zenparsing
Owner Author

So, if I understand correctly, the concern is that users might expect the following to work, because there's no this token present:

class C {
  var x;
  m() { return x }
}

let { m } = new C();
m();

I think this is yet another case of JS wanting proper support for method extraction/binding.

@littledan
Collaborator

@zenparsing Yes, that's exactly the concern.

Even if we add a method extraction/binding feature, I'm not sure if this would really be enough to change the intuition people might have about what lexical scoping means (though maybe it'd reduce the frequency of this coming up as an issue).

@zenparsing
Owner Author

Even if we add a method extraction/binding feature, I'm not sure if this would really be enough to change the intuition people might have about what lexical scoping means

I see what you're saying, but I'm tempted to disagree here. The lack of a method extraction feature is causing developers to reach for things that wouldn't make sense otherwise. Another example would be the anti-pattern of using public fields and arrow functions to create per-instance "methods".

But I agree that a broader range of opinions would help us.
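For readers unfamiliar with the anti-pattern mentioned above, here is a minimal sketch in standard JavaScript. The class names are hypothetical and chosen only for illustration.

```javascript
class WithArrowField {
  count = 0;
  // A new closure is created for every instance and stored as an own property.
  increment = () => ++this.count;
}

class WithPrototypeMethod {
  count = 0;
  // One function object, shared by all instances through the prototype.
  increment() {
    return ++this.count;
  }
}

const a1 = new WithArrowField();
const a2 = new WithArrowField();
const p1 = new WithPrototypeMethod();
const p2 = new WithPrototypeMethod();

const perInstance = a1.increment !== a2.increment; // distinct closures
const shared = p1.increment === p2.increment;      // same prototype method

// The arrow-function version survives extraction because it closes over `this`.
const detached = a1.increment;
const detachedResult = detached();

console.log(perInstance, shared, detachedResult); // true true 1
```

The arrow-function version buys extraction safety at the cost of one function object per instance, which is the trade-off a dedicated method extraction feature would avoid.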

@allenwb
Collaborator

allenwb commented Mar 4, 2018

As I think we've discussed in the context of an earlier proposal, this would seem to mean that refactoring

constructor() {
  x = 0;
  y = 0;
}

into

constructor(x, y) {
  x = x; // intended: instance variable x = parameter x
  y = y;
}

becomes problematic. More generally, it opens up the possibility for all sorts of unintended confusion between lexical variables and instance variables. For example, consider a method in a large class that makes a single use of some imported binding. Later, somebody (maybe the same dev) decides to add a new instance variable to that class and just happens to choose the same name as the imported binding because they are unaware of (or don't remember) that single use buried in the method.
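A sketch of the constructor hazard in runnable form. It assumes that a parameter would shadow an instance variable of the same name, as ordinary lexical scoping suggests; the thread does not settle this rule. Standard private fields stand in for the hidden state, and the `Point` class is hypothetical.

```javascript
class Point {
  #x = 0;
  #y = 0;

  constructor(x, y) {
    // Under the assumed shadowing rule, both sides name the parameter,
    // so these are self-assignments and the hidden state is untouched.
    x = x;
    y = y;
  }

  get coords() {
    return [this.#x, this.#y];
  }
}

const coords = new Point(3, 4).coords;
console.log(coords); // [ 0, 0 ]
```

No error is raised anywhere, which is what makes this kind of refactoring mistake easy to miss without tooling.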

More generally, this seems very unfriendly to people reading code. The reason for assigning to an instance variable is usually very different from the reason for assigning to an up-level lexical variable. When reading code, it is very helpful to be able to directly see this distinction rather than have to look somewhere else to see what kind of binding is being assigned to.

I've argued in another thread that it is very important that we emphasize the difference between properties and instance variables, and that using -> instead of . as the access operator is an important part of making that distinction. I think it is equally important to emphasize the distinction between block-allocated lexical variables and instance variables. Making them indistinguishable at the point of use seems counterproductive from that perspective.

This approach also seems counter to our minimizing complexity goals. It adds specification (and possibly runtime) complexity and it also seems to add complexity to the user's conceptual model. I think we are near a sweet spot with what we have currently spec'ed. It's good to consider this alternative but I don't think it's a path we should take.

@zenparsing
Owner Author

You're correct that unqualified instance vars open up the possibility of unintentional (or perhaps just annoying) lexical shadowing. But aren't these issues familiar in any language that allows unqualified member access (e.g. C++, C#, and Java)? Furthermore, I think our choice of var as the declarative keyword reduces the potential for confusion by matching our existing intuitions about variable declarations and scope boundaries.

When reading code, it is very helpful to be able to directly see this distinction rather than have to look somewhere else to see what kind of binding is being assigned to.

I agree that, in some cases, readability is better when one always uses the this qualifier. I did a quick search for style guidelines regarding the this keyword in C++, Java, and C#, and found the same debate. From what I can tell, Java and C++ programmers tend to leave out the this, the internal MS C# codebase uses this, and in general there is no agreement on which style is better.

It should be pointed out that, since unqualified instance variable references are statically resolvable, syntax highlighters can choose to use a different color for those identifiers.

This approach also seems counter to our minimizing complexity goals. It adds specification (and possibly runtime) complexity and it also seems to add complexity to the user's conceptual model. I think we are near a sweet spot with what we have currently spec'ed.

I agree with all of this. We are currently at a sweet spot. But I think that this will be our only opportunity to give users the this shortcut that a certain segment have been wanting for a long time. We should continue to explore this possibility for now.

@allenwb
Collaborator

allenwb commented Mar 5, 2018

I think that this will be our only opportunity to give users the this shortcut that a certain segment have been wanting for a long time. We should continue to explore this possibility for now.

I agree. And I think that #17 is a better fit, even with the ASI hazard.

@zenparsing
Owner Author

@littledan It seems like over in #17 you reject unqualified instance vars based on "the problems of complicating lexical scope." Clearly it does complicate lexical scope, but is there an example or further arguments that might clarify the resulting problems?

@littledan
Collaborator

@zenparsing My main concern is #18 (comment) . This is why we removed shorthand support from the current class fields proposal.

An unprefixed variable seems "even worse" since it's even harder to see the implicit this--you can't make a rule like, "if you see a #x, it's short for this.#x", you have to actually find the surrounding var declaration to see whether it implicitly refers to this or not.

@zenparsing
Owner Author

you have to actually find the surrounding var declaration to see whether it implicitly refers to this or not.

Is that substantially different than the situation in Java, C++, or C#?

@littledan
Collaborator

Is that substantially different than the situation in Java, C++, or C#?

JavaScript scoping is very different from those languages. They are more complicated in some ways, and simpler in other ways. JS has single-namespace lexical scoping, but then there is complexity from eval, this/Function.prototype.call, and the lack of an always-on static scope checking system. Those other languages have a bunch of namespaces, and disambiguation that's based on an ordering of them, in addition to the order based on nested lexical scope--more complicated at a basic level, but at least you don't have to worry about those three issues that JS has.

@allenwb
Collaborator

allenwb commented Mar 5, 2018

One reason the situation is different for (original) Java is that it lacked global variables, standalone functions, or really any common usage of deep lexical nesting. If you see a syntactic uncapitalized identifier you only have to check the immediately enclosing function to determine it is not a class field.

Also see https://docs.oracle.com/javase/specs/jls/se7/html/jls-6.html#jls-6.4 which places restrictions on some declarations that would shadow field names.

@zenparsing
Owner Author

Certainly the languages are different in how they handle scoping, and perhaps a lack of deep lexical nesting patterns in those languages is significant. I'm not sure yet.

I do know that the closure-class pattern was popular before ES2015 classes.

function OldSchoolClass() {
  var a = 1, b = 2;
  this.m = function() {
    return a + b;
  };
}

new OldSchoolClass().m();

And if users find that pattern intuitive (and in general they do), it seems to me that they will find this version intuitive as well:

class NewSchoolClass {
  var a, b;
  constructor() {
    a = 1;
    b = 2;
  }
  m() {
    return a + b;
  }
}

And so I'm still struggling to see the problem here.

The main behavioral difference between NewSchoolClass and OldSchoolClass is that newSchoolClass.m will throw if not provided with a correctly-constructed this value. Is that difference enough to throw out unqualified instance vars?

@littledan
Collaborator

@zenparsing The problem is, once they are so happy with the parallelism there, they may use let fn = new NewSchoolClass().m; return fn();. They may be surprised that this worked with the closure-class pattern but not with instance variables.
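The difference described here can be observed today. In this sketch, standard private fields (`#a`, `#b`) are an assumed stand-in for the proposal's `var a, b`, since both resolve hidden state through `this`.

```javascript
// Closure-class pattern: state lives in the constructor's closure.
function OldSchoolClass() {
  var a = 1, b = 2;
  this.m = function () {
    return a + b;
  };
}

// Class with hidden state looked up on `this` (private fields as a stand-in).
class NewSchoolClass {
  #a = 1;
  #b = 2;
  m() {
    return this.#a + this.#b;
  }
}

const oldFn = new OldSchoolClass().m;
const newFn = new NewSchoolClass().m;

const oldResult = oldFn(); // works: no `this` involved

let newError = null;
try {
  newFn(); // `this` is undefined inside class code
} catch (err) {
  newError = err;
}

console.log(oldResult, newError instanceof TypeError); // 3 true
```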

@zenparsing
Owner Author

zenparsing commented Mar 5, 2018

@littledan And you think that users will continue to fall into that trap if we provide them with a suitable method extraction syntax?

Let's say that we have:

let fn = new NewSchoolClass()::m;

If this becomes the canonical way to extract a method from a class instance, then does your objection still apply?
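The `::` form is not part of JavaScript. As a sketch of what such an extraction operator would do, the hypothetical helper `extractMethod` below binds a method to its receiver using `Function.prototype.bind`. The `Adder` class is also hypothetical, with private fields standing in for instance variables.

```javascript
// Hypothetical helper: approximates `obj::name` by binding the method to obj.
function extractMethod(obj, name) {
  const method = obj[name];
  if (typeof method !== "function") {
    throw new TypeError(`${String(name)} is not a method`);
  }
  return method.bind(obj);
}

class Adder {
  #a = 1;
  #b = 2;
  m() {
    return this.#a + this.#b;
  }
}

const fn = extractMethod(new Adder(), "m");
const extractedResult = fn();

console.log(extractedResult); // 3
```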

@littledan
Collaborator

@zenparsing It's definitely a mitigation, but I think people will continue to use the syntax instance.method and develop intuitions about its semantics.

@zenparsing
Owner Author

Sorry, I'm still not quite understanding the strength of the objection.

A typical class definition will contain many this references that are not elided (e.g. invocations of hidden or public methods and accessors). It's certainly not the case that all this references would be elided. Given the need for those occurrences of this, why would unqualified instance vars cause users to develop additional intuitions about being able to call the method unbound (beyond the mistaken intuitions that they already have)?

@littledan
Collaborator

littledan commented Mar 5, 2018

Sorry, I'm still not quite understanding the strength of the objection.

I'm not going to stand up in committee and veto prefix-less shorthand if everyone else likes the idea. However, I worked on the removal of this feature from the class fields proposal because the incorrect mental model seemed extremely prevalent (and most prevalent among the strongest supporters of the shorthand).

@zenparsing
Owner Author

@littledan Oh, I was more referring to the strength of the argument rather than level of personal objection, but that's good to know, nonetheless 😄

I think I understand now, though. It might be a matter of reaching out to more people (particularly those "strongest supporters") to find out their reactions.

@BrendanEich
Collaborator

BrendanEich commented Mar 5, 2018

Catching up here (thanks for pointer in #17). I mostly agree with @zenparsing that we can separate concerns and address method extraction independently, and with no greater or lesser urgency (which is to say "not much" judging by TC39 sentiment!). Two things rankle here:

  1. The constructor example @allenwb gave, which may bug-burn some people without tool-time help. CoffeeScript made a sweet shorthand: constructor(@x, @y) to take params that initialize the ivars of those names. I can't "size" the problem here but it seems big enough in downside, and missed upside, to talk about.

  2. The beauty of this issue's extension to the proposal is the quasi-lexical connections among uses and declarations (including nice revival of var as @zenparsing noted). However, the ivars are still in this, so the unbound method hazard may seem more urgent (I don't agree), but what's more: tools and humans reading the code will have a harder time finding any bugs of the wrong-this or (point 1 immediately above) x = x in constructor kind. Is not-quite-lexical worse than verbose-dynamic? It could be, worst case!

@allenwb
Collaborator

allenwb commented Mar 5, 2018

I toyed with suggesting constructor(->x, ->y) earlier and decided not to because of my general concern about keeping things "max-min".

Consider: constructor(->x, ->y) {/*...*/ super()}
The assignment to instance variables has to occur as part of or after the super() call.
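The ordering constraint already exists in standard JavaScript: in a derived class, `this` is uninitialized until `super()` returns, so any parameter-to-instance-variable shorthand would have to take effect at or after that point. A sketch, with hypothetical class names and private fields standing in for instance variables:

```javascript
class Base {}

class TooEarly extends Base {
  #x;
  constructor(x) {
    this.#x = x; // `this` is not initialized yet
    super();
  }
}

class AfterSuper extends Base {
  #x;
  constructor(x) {
    super();
    this.#x = x; // fine: the instance now exists
  }
  get x() {
    return this.#x;
  }
}

let earlyError = null;
try {
  new TooEarly(1);
} catch (err) {
  earlyError = err;
}

const ok = new AfterSuper(1).x;
console.log(earlyError instanceof ReferenceError, ok); // true 1
```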

@littledan
Collaborator

Not sure if there's an issue, but I think we've chatted at some point about creating constructor(#x, #y) syntax. It would definitely make a common case shorter!

But it seemed to sort of confuse the distinction between destructuring bind and assignment in a way that seemed hard to generalize fully. For example, should we support constructor(this.x, this.y)? That would definitely be useful too.

@zenparsing
Owner Author

Another syntactic option for shortening constructors might be something like:

class Point {
  var x, y;
  constructor(var x = 0, var y = 0) {}
}

Or some variation thereof.

As @allenwb says.

The assignment to instance variables has to occur as part of or after the super() call.

@littledan asks:

should we support constructor(this.x, this.y)?

I don't think so. Here is another instance where using the .# syntax tends to conflate object properties with instance vars, leading to dilemmas that otherwise wouldn't trouble us.

My feeling here is that we should leave off auto-assignment to instance vars from this proposal for the sake of minimalism. The lack of support for this kind of feature in C# and Java (please correct me if I'm wrong about that!) makes me feel confident that programmers will get along without it just fine.

@zenparsing
Owner Author

@BrendanEich speculates:

tools and humans reading the code will have a harder time finding any bugs of the wrong-this or (point 1 immediately above) x = x in constructor kind

How can we get a better understanding of the extent to which this will be a problem?

It seems to me that linters should be able to detect unsafe shadowing of instance variables, as in the obvious case of x = x.

For wrong-this runtime errors, the engine can generate an appropriately worded error message, but maybe that's not enough help to users?

@zenparsing
Owner Author

Closing in preparation for public review. Feel free to continue discussion here; we may choose to re-open at a later time.

@zenparsing
Owner Author

@hax

I can live without shortcuts, it could be separate proposal.

Unqualified access would actually have to be a part of this proposal, since it would be a backwards-incompatible change to add it later.

@littledan

If they do think in those terms, I'd be worried, as it would lead them to false conclusions about the semantics

What do you think would lead them to "false conclusions": the choice of var as the declarative keyword, or unqualified access?

@littledan
Collaborator

littledan commented Mar 14, 2018

Well, I was referring to unqualified access (no matter what the keyword was), but with what @hax has written on several threads, it seems that at least they are coming to this false conclusion due to other things in this proposal; maybe the choice of var as a keyword is a factor.

@zenparsing
Owner Author

@littledan By "false conclusions", do you mean the expectation of per-instance method binding? Or something more?

FWIW, I think that @hax has a pretty good understanding of both the proposal and the extension described in this thread.

@littledan
Collaborator

@zenparsing Yes, the expectation of per-instance method binding.

@hax
Contributor

hax commented Mar 15, 2018

@littledan What I am trying to express in all my comments is:

When programmers use the old closure private pattern, they do NOT expect per-instance method binding for its own sake; this is why they get trapped. So from the perspective of the instance-variable expectation that the closure private pattern tries to provide, per-instance method binding is a bug in the pattern. And this proposal fixes it.

@hax
Contributor

hax commented Mar 15, 2018

@zenparsing

Unqualified access would actually have to be a part of this proposal, since it would be a backwards-incompatible change to add it later.

After rethinking this, I believe that even if we do not support the shortcut, we should throw a SyntaxError for it. Then it would still be a backwards-compatible change to add it later.

let x = 0
// ... many lines of code
class A {
  var x
  constructor(x = 0) {
    this->x = x // ok here
  }

  get x() {
    return x // I really mean this->x here
  }
  set x(value) {
    x = value // I really mean this->x here
  }
}

const a1 = new A()
const a2 = new A()
a1.x // 0, ok!
a2.x // 0, ok!
++a1.x // 1, ok!
++a2.x // 2, WTF??

It's OK to teach programmers that you need this->x for var x in a class. They will just think: I wish I could use x as a shortcut, but how hard can this->x be? Unfortunately, they may forget it at first, and it's very painful to have to learn things from a bug. It could even cause a big incident in production. So I think we'd better throw a SyntaxError there, which protects programmers.
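The same trap can be reproduced in today's JavaScript, which shows why silently resolving a forgotten qualifier to an outer binding is dangerous. In this sketch a standard private field (`#x`) is an assumed stand-in for `var x` and `this->x`.

```javascript
let x = 0; // unrelated outer binding, far away from the class

class A {
  #x;
  constructor(x = 0) {
    this.#x = x; // qualified access: fine
  }
  get x() {
    return x; // bug: resolves to the outer `x`, not this.#x
  }
  set x(value) {
    x = value; // bug: writes the outer `x`
  }
}

const a1 = new A();
const a2 = new A();
const first = ++a1.x;
const second = ++a2.x; // both instances share the outer binding

console.log(first, second); // 1 2
```

Neither accessor ever touches the per-instance state, yet nothing fails until the two instances start interfering with each other.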

@littledan
Collaborator

When programmers use the old closure private pattern, they do NOT expect per-instance method binding for its own sake; this is why they get trapped. So from the perspective of the instance-variable expectation that the closure private pattern tries to provide, per-instance method binding is a bug in the pattern. And this proposal fixes it.

This is a very specific claim about programmers and their intuitions.

I would've guessed the opposite--some group of programmers seem to always hope/assume that methods are bound (including from classes, where they never are); I'd expect that some of these are using the closure pattern and happy that they do, in fact, have these bound methods (with respect to closed-over values). If they use the closure desugaring as a mental model, they will be disappointed; @getify has pointed out that mental models that make incorrect predictions can be detrimental to learning.

Maybe we should talk to a larger sample of JS programmers about what their intuitions would be. It seems like we have the two hypotheses pretty clearly laid out here; let's get some data.

@getify

getify commented Mar 15, 2018

I have several concerns to share, related to the idea of instance variables (aka, IIUC, "private members"), and shorthand/elision of the qualifier. I'll try to organize these concerns into separate comments here to make them more digestible.

To start, a clarifying question: is "instance variable" indeed a "private member", or is there some other distinction I've missed?

If so, why then the distinction between "instance variables" (which are private) and "hidden methods" (which are also, basically, private)? Seems like it would be semantically cleaner to refer to both of these either as "hidden" or "private". What am I missing here?

@getify

getify commented Mar 15, 2018

I wish shortcut syntax can also apply to hidden methods. It's a little inconsistent if we allow this->x shorten to x but not allow this->f() shorten to f().

I understand the desire for consistency and generally am very sympathetic to that argument.

But I find this particular case to be deeply troubling because it significantly obscures how I (and others) teach people to look at the call-site of a function to determine what that call's this is going to be.

f();

Is that call-site including an implicit this-> in front of it (symmetric with the this.f() binding)? Who knows!? It could just be a lexical reference to an f(), or it might be a "hidden method" this->f(). You'll have to look at the surrounding class definition to find out.

The single biggest confusion I find that both experienced and new-learners-of-JS have around this is that they're used to (from other languages) this being locked to a method -- that is, that a method belongs to a class and thus instance -- but in JS, methods aren't locked to classes/instances, and thus their this is always determined by the call-site.

It's already very difficult -- and by that, I mean only "uncommon" -- for people to fully grok the call-site this-binding rules, and there are only 4 of them. I'm extremely loath to support any proposal that complicates these rules further by introducing yet another variation of call-site where there's a hidden/elided this binding.

It's also a slippery slope, because as soon as you do it for hidden methods, people will clamor for regular public instance methods to also have the same shorthand. And while we're at it, let's go ahead and shorthand instance properties!
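For reference, the four call-site rules mentioned above are commonly taught as default, implicit, explicit, and `new` binding. The comment does not enumerate them, so that naming is an assumption here. A minimal sketch, with hypothetical names:

```javascript
function whoAmI() {
  "use strict";
  return this;
}

const obj = { whoAmI };
const other = { name: "other" };

const byDefault = whoAmI();          // plain call: undefined in strict mode
const implicit = obj.whoAmI();       // method call: `this` is obj
const explicit = whoAmI.call(other); // call/apply/bind: `this` is other
const constructed = new whoAmI();    // new: `this` is a fresh object

console.log(
  byDefault === undefined,
  implicit === obj,
  explicit === other,
  constructed instanceof whoAmI
); // true true true true
```

In every case the call-site alone determines `this`, which is the property an elided `this->` at the call-site would obscure.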

@zenparsing
Owner Author

Hi @getify !

To start, a clarifying question: is "instance variable" indeed a "private member", or is there some other distinction I've missed?

We intentionally avoid using the term "private" for both instance variables and hidden methods, because it tends to cause confusion in users that expect a statically-typed language version of "private", or TypeScript's version of "private".

Otherwise, the lookup semantics for instance variables match the lookup semantics for the current "private fields" proposal.

If so, why then the distinction between "instance variables" (which are private) and "hidden methods" (which are also, basically, private)?

Instance variables are attached to the instance. You can think of them either like "internal slots" or WeakMap entries, depending on your point of view. If the instance doesn't "have" the instance variable slot, then you get a TypeError.

Hidden methods are not attached to the instance at all. They are lexically scoped inside of the class body and can be used as if they were methods on the object by using the -> operator. They allow refactoring of method code within the class body without affecting the external interface of the class or its instances.
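The distinction can be modeled with existing language features. In this sketch, which is an approximation rather than the proposal's semantics, a `WeakMap` entry plays the role of an instance variable and a function scoped to the class's enclosing closure plays the role of a hidden method invoked with an explicit receiver. All names are hypothetical.

```javascript
const Counter = (() => {
  // Instance variable: one WeakMap entry per instance.
  const count = new WeakMap();

  function getCount(obj) {
    if (!count.has(obj)) {
      throw new TypeError("receiver does not have the instance variable");
    }
    return count.get(obj);
  }

  // Hidden method: a single shared function, not attached to any object.
  function bump() {
    count.set(this, getCount(this) + 1);
  }

  return class Counter {
    constructor() {
      count.set(this, 0);
    }
    increment() {
      bump.call(this); // models `this->bump()`
      return getCount(this);
    }
  };
})();

const counter = new Counter();
const afterOne = counter.increment();

let foreignError = null;
try {
  Counter.prototype.increment.call({});
} catch (err) {
  foreignError = err;
}

console.log(afterOne, foreignError instanceof TypeError); // 1 true
```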

@getify

getify commented Mar 15, 2018

@zenparsing I understand everything you just said, I believe, but it all sounds like spec-speak reasoning and not about naming that keeps things clean and clear for end-developers. IOW, those differences seem more to be related to implementation or internal details of JS than to the ergonomics of how an end-developer uses them and observes their behavior.

If you don't want to use "private", fine. But I still think that the end-developer will think about both of these as "hidden", and not pay that much attention to the nuance that one of them is on the prototype object and the other is on the this object.

From the perspective of a teacher, I would be more in favor of naming things in a way that makes sense to the end-developer than to the spec implementor. But I certainly understand why TC39's general inclination might be the opposite.

@getify

getify commented Mar 15, 2018

With respect to shorthanding of "instance variables":

class A {
   var x;
   foo() {
      x = 2;
   }
}

I agree with @hax that this particular case of shorthanding is probably not that difficult for people to grok, if they're already familiar with the "closure-class" pattern. Of course, as time goes on, and more people either switch to ES6+ classes, or learn them exclusively, the number of people who will be familiar with that pattern will go down, relatively speaking.

But that's not what troubles me here. What troubles me is the idea that we're mixing concepts of closure and this binding, and that's especially acute because of the use of var as the "instance variable" declarator.

That confusion becomes more obvious when we introduce an inner function:

var x;

class A {
   var y;

   foo() {
      var z;

      x = 1;
      y = 2;
      this.y = 3;
      this->y = 4;
      z = 5;

      return function inner() {
         x = 10;
         y = 20;
         this.y = 30;
         this->y = 40;
         z = 50;
      };
   }

   output() {
      console.log( x, y, this.y, this->y, z );
   }
}

var o = new A();
var f = o.foo();

var m = {};
f.call( m );

o.output();     // ??
m;    // ??

Imagine having to explain the nuances of this example to a new learner of JS, specifically the differences between x,y and z here, and what that means for the outputs and the contents of m.

There's an open question... would the implicit this-> (like in the y = .. assignments) be static across the inner function boundary, or would it be dynamic according to the call-site like this is? In other words, is this-> going to be dynamic like this is, or is it going to be static more like super is?

I have experienced strong pushback from my students when they learn the this vs super differences, so I'm imagining the future nightmare of having to explain an implicit this-> as being like one or the other of those two, and why.

What I'm getting at is that conceptually, we really need to preserve a strong barrier between this oriented member access and lexical closure. Why? Because a core principle of JS's mechanics has always been that closure is always static and this is always dynamic.

Attempting to make a dynamic this variable look like it's a lexical variable, purely for convenience in saving keystrokes, means that the more a person understands about both this and lexical closure, the more they tend to reason about that usage as a static lexical closure. And that means they'll be more likely to get confused when the thing that looks like a static lexical closure is actually dynamic.

Unfortunately, the less they know about the differences, the easier it is to squint and pretend that one is the other. The more they learn, the more troubling the conflation will be.

In my courses, I spend significant effort helping create a strong semantic boundary between static lexical closure and dynamic this context. They're parallel systems that don't overlap. I think we should disfavor any change to the language that muddies these waters even further.

@getify

getify commented Mar 15, 2018

Another open question: is the suggested shorthanding mandatory or optional? IOW, can x and this->x appear in the same code and both refer to the same instance variable?

I would find that optionality troubling as it creates a larger surface area for learners. If you're going to "allow" shorthanding (which I'm not generally in favor of), I'd suggest it should be mandatory and all the notions of this-> references should be dropped from the proposal.

@zenparsing
Owner Author

There's an open question... would the implicit this-> (like in the y = .. assignments) be static across the inner function boundary, or would it be dynamic according to the call-site like this is?

If we went this direction the binding would have to close over the "this" value of the method definition directly inside of the class body (and not be simple sugar for this->). The OP has an example.

is the suggested shorthanding mandatory or optional?

The shorthand would have to be optional so that we can access the instance variables of other instances. That would be similar to other languages that allow eliding this references, e.g. C#, C++, Java, etc. Is it problematic in those languages to have both?

Regarding naming, early on we briefly considered hidden x, but the trouble there is that it leads to questions like: "why not static hidden y"?

Thanks for the long example! Strangely enough, I find it pretty easy to follow. The f.call(m) expression would throw because the this value doesn't have a y instance variable. And output would throw because there's no z in scope. What do you think @hax ?
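That reading can be checked against a runnable model. The sketch below makes several assumptions: a private field `#y` stands in for `var y`, a hypothetical local `self` stands in for the receiver captured by the method (so an unqualified `y` becomes `self.#y`), and `this->y` becomes `this.#y`. The reference to `z` in `output` is left out, since reading it there would be a ReferenceError.

```javascript
let x;

class A {
  #y; // stands in for `var y`

  foo() {
    const self = this; // models the receiver captured by the method
    let z;

    x = 1;
    self.#y = 2; // unqualified `y = 2`
    this.y = 3;  // ordinary property
    this.#y = 4; // `this->y = 4`
    z = 5;

    return function inner() {
      x = 10;
      self.#y = 20; // unqualified `y = 20`: uses the captured receiver
      this.y = 30;  // property on inner's own `this`
      this.#y = 40; // `this->y = 40`: TypeError if `this` has no y
      z = 50;       // not reached when the line above throws
    };
  }

  output() {
    // `z` is omitted: it is local to foo() and not in scope here.
    return [x, this.#y, this.y];
  }
}

const o = new A();
const f = o.foo();
const m = {};

let callError = null;
try {
  f.call(m);
} catch (err) {
  callError = err;
}

const out = o.output();
console.log(callError instanceof TypeError, m.y, out);
```

Under these assumptions the call throws a TypeError only after `x`, the captured instance's `y`, and `m.y` have been updated, so `out` holds 10, 20, and 3, and `m.y` is 30.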

Just to be clear, I'm not necessarily trying to advocate for this "C++-style" shorthand, but I do want to probe what really is wrong with it, rather than just assume it is wrong because it does something that hasn't been done before.

@getify Regarding the method extraction issue, do you think that we could solve that issue (not just for this shorthand, but for JS in general) by having a method-extraction operator?

For example:

let fn = &obj.method;

Could we then just tell developers that & is how you do method extraction, and leave it at that?

@js-choi

js-choi commented Mar 15, 2018

@zenparsing @getify With regard to method extraction: For what it’s worth, smart pipelines with Additional Feature PF would address method extraction (along with function composition and partial application). It will be presented to TC39 next week as an option along with an alternative pipeline proposal.

let fn = +> obj.method;

@allenwb
Collaborator

allenwb commented Mar 15, 2018

@getify

I understand everything you just said, I believe, but it all sounds like spec-speak reasoning and not about naming that keeps things clean and clear for end-developers.

Let me take a crack at it. First, we haven't been using the term "members"; instead we say "class body elements" or just "class elements". But for the rest of this message I'll use "member" as a synonym for those terms.

Both instance variables and hidden method definitions are "hidden members" and are accessed using the -> operator, typically, but not always, with this as the left operand.

Instance variables (as the name hopefully suggests) are per-instance hidden members. Each instance of the class has its own distinct replication of each declared instance variable. Hidden methods are shared members. Each instance of the class shares access to a single distinct function corresponding to each hidden method.

We could have (and might reconsider) substituted hidden for var as the keyword for instance variable member definitions. It's basically a difference in emphasis. We chose var because we thought it was more important to emphasize that these members had the per-instance characteristic of instance VARiables. Hidden is implicit, based upon the assumption that people would be taught that instance variables are always hidden. We used hidden explicitly for hidden method members because we already have concise method members that define functions that are shared (although using a different mechanism) by all instances. The hidden keyword emphasizes what we thought was the most important distinction: that these are shared concise methods that are hidden from code outside of the class.

Thoughts?

BTW, I hope you've read the rationale document.

@allenwb
Collaborator

allenwb commented Mar 15, 2018

@getify The implicit access is not part of the current Classes 1.1 proposal, largely because of the sort of issues you have been mentioning. I don't think I could support a proposal that included them.

Yes, some (perhaps many) users think they would like to have them, but IMO the conceptual complexity they introduce is not worth the minor textual brevity they provide.

@hax
Contributor

hax commented Mar 16, 2018

@littledan

This is a very specific claim about programmers and their intuitions.

Sorry, I didn't make it clear: this is not a claim about programmers' intuitions, but an explanation of why they get trapped in the old closure private pattern, and how this proposal fixes it.

I just realized we have been using the ambiguous term "per-instance method" here. When I talk about the old closure private pattern, a "per-instance method" is not bound, because there were no arrow functions in the ES3 era. Of course you can make such methods bound, but that is not part of the closure private pattern. It's clear to programmers (I suppose) that normal functions (in the old closure private pattern) and class methods (whether they access an instance variable with this->x or the shortcut form x) are not bound; only arrow functions are bound.

@zenparsing For "using public fields and arrow functions to create per-instance methods", I think "per-instance bound method" would be a much clearer term.

The problem with "per-instance method binding" in the closure private pattern is that, even though the method is not bound, it behaves as if half of it were bound, so let's call it semi-bound 😜.

This proposal fixes that: you no longer need the buggy semi-bound "per-instance method binding", but you keep a good mapping to the mental model of the closure private pattern.
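One reading of "semi-bound" can be sketched in plain JavaScript. The `Counter` constructor and its `label` property are illustrative assumptions: the closure variable always belongs to the instance that created the function, while `this` still follows the call site.

```javascript
// ES3-style closure private pattern; all names here are illustrative.
function Counter(label) {
  var count = 0;                 // private state, reachable only through the closure
  this.label = label;            // public state, reached through `this`
  this.increment = function () {
    count += 1;                  // fixed to the instance that created this function
    return this.label + ":" + count; // `this` depends on the call site
  };
}

const a = new Counter("a");
const b = new Counter("b");

const first = a.increment();        // "a:1"
const mixed = a.increment.call(b);  // "b:2": a's count combined with b's label
const own = b.increment();          // "b:1": b's own count was never touched

console.log(first, mixed, own);
```

The middle call is the awkward case: half of the method (the closure state) behaves as if bound to `a`, and the other half (`this`) does not.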

I would've guessed the opposite--some group of programmers seem to always hope/assume that methods are bound (including from classes, where they never are);

I understand. If someone wanted to fix the "semi-bound" problem before, the only choice was to make the methods "fully bound", which was clumsy in the ES3 days but is much easier with arrow functions now. Unfortunately, that can't resolve the inconsistency between "per-instance bound methods" and prototype-based/class methods, which are never bound. So they may hope or assume that class methods are bound. I sympathize with them, but if this is their "intuition", then that "intuition" is doomed and has to be overcome sooner or later.

I think most programmers who use the closure private pattern don't fully realize the "semi-bound" problem, which is why they get trapped. And the "per-instance bound method" pattern in React is not used for privacy, but for other reasons.

So I believe this proposal's echo of the closure private pattern does not imply "bound" semantics.

mental models that make incorrect predictions can be detrimental to learning.

I think the incorrect predictions come from other places, such as event listeners. Anyway, the key point is that wherever such predictions/intuitions come from, they should be overcome, because you should never expect bound semantics from class methods.

Maybe we should talk to a larger sample of JS programmers about what their intuitions would be. It seems like we have the two hypotheses pretty clearly laid out here; let's get some data.

As I explained (I hope my words were clear), even if many programmers have such intuitions (which I very much doubt), that is not a problem for this proposal; it is a problem for other proposals, such as method extraction, to solve.

@hax
Contributor

hax commented Mar 16, 2018

If you don't want to use "private", fine. But I still think that the end-developer will think about both of these as "hidden"

Some words about terms. As a non-native English speaker, I'm fine with any keyword. But I'll try to give my understanding of the technical terms for your consideration.

When I see private, I strongly expect there to be public and protected as well. Although we have always used "private" in JS as an analogy, I don't think it's a good idea to keep using it if we end up with a "JS private" that is very different from other languages, for example one with no corresponding public and protected.

hidden is OK, but I still feel it strongly conveys "visibility" (CSS visibility: hidden, and "hidden" in the Page Visibility API), which brings me back to public/private again.

I tried to find a word in my poor English vocabulary; here is my list:

  • state, which is a common word in component frameworks; most agree that state in a component should be private and that state is different from properties. This keeps the advantage var has of differentiating from properties, without the risky lure of the shortcut, but you still need another keyword for methods.
  • internal, which conveys internal API (implementation details) vs. external API. It looks more straightforward/essential than hidden.
  • my, which conveys ownership but won't lead us to properties the way own does. my is not very common in programming languages; I got it from Perl and I just love it. It's a simple word that everyone understands (even a non-English-speaking pupil who starts learning programming with only a basic vocabulary). And it's very ergonomic to read and write. This even made me consider my var, though I rejected it soon after.

With these words, there could be several options:

  1. Use var for instance variables with shortcut, internal for hidden methods without shortcut.
  2. Use var for vars with shortcut, prefix token like ~ for methods without shortcut. (I mentioned it before in other comments)
  3. Use my for both. No shortcut.
  4. Use state + internal. No shortcut.
  5. Use state + prefix token. No shortcut.

Here are my two rationales:

  • We should not support the shortcut for hidden methods. (I've been convinced by @getify's comments, and I also find that if we want to add support for limited initializers, var x = f() where f is a shortcut to a hidden method would be a hole.)
  • If we use var, we should support the shortcut for instance vars; if we don't use var, we'd better not support the shortcut at all.

@allenwb mentioned that with is his alternative to var. I think if we use with, we also need to support the shortcut, because that was the original purpose of the unlucky with syntax. (BTW, @getify, it seems you didn't mention with and onevent HTML attributes, which have implicit context semantics, in your "there's only 4 of them" 😜.)

I agree with @getify that people don't pay much attention to whether something is on the prototype or not, and many (like me) who never use "per-instance methods" just use an easier rule to differentiate: all methods/accessors are on the prototype and the rest is on the instance. So using two keywords doesn't seem helpful for this purpose. But I understand that var was chosen to keep instance variables far from properties, which I very much appreciate. It also has the bonus of recalling the old closure private pattern, though that may not be important.

I have to admit I'm deeply attracted to the shortcut form of var, so my first choice is always var + token. If we finally rule the shortcut out, I would suggest the single keyword my for everything. And my third choice is state + internal.

Thank you for reading 🤓

@hax
Contributor

hax commented Mar 16, 2018

What troubles me is the idea that we're mixing concepts of closure and this binding, and that's especially acute because of the use of var as the "instance variable" declarator.

@getify I would expect most JS programmers to have already dropped the old var and switched to let/const in ES6+ class code. If they haven't, we must educate them before this proposal lands. (Of course, we also have the option of not using var and the shortcut; see my previous comment.)

I agree that distinguishing instanceVar from lexicalBinding will be a little troublesome, since you always need to check the outer scopes (though an IDE can help via highlighting/hinting), but most of the time you don't really need to distinguish them. When you do real coding (not an interview question deliberately designed to confuse you), you refer to a name for a purpose, and it doesn't matter whether it's an instance var or a lexical binding.

In most cases we write a class as a self-contained module, and there are just instance variables and a few const bindings, which are easy to handle IMO. This is very close to class practice in other languages, Java/C# for example. Omitting this. in those languages seems to be OK for programmers.

While we have the great power (with the risk of making the code harder to understand) of nesting classes in closures and vice versa, programmers use it for good purposes. If they use it wrongly, nothing can save them. I don't think nesting closures and classes is necessarily harder to understand than the corresponding deep nesting of closures alone.

The burden of this->x vs this.x is inevitable, but at least -> or :: makes it clear to programmers that it's totally different from . and [], while the doomed this.#x vs this.x is just more confusing.

@getify

getify commented Mar 19, 2018

@allenwb I still think it would be more clear to users to use "hidden instance variables" and "hidden methods". "Instance variables" is not going to sound different enough from "instance properties" to make it seem like they are hidden.

When the inevitable question comes up: "then what are non-hidden instance variables?", the simple answer is either "There aren't" or... "They're called 'instance properties'."

@getify

getify commented Mar 19, 2018

@zenparsing I don't think "method extraction" mechanisms are going to do anything to address the inner function use-case I showed. Or am I missing something?

@getify

getify commented Mar 19, 2018

Another thing I think is perhaps a confused concept in this whole class thing -- especially as it relates to mixing the semantics with lexical scope -- is whether, mentally, people should think of a class body itself as being "executed" each time the class is instantiated, or only once at class definition time.

Let me illustrate with the function-based module pattern, first:

var X = (function outer(){
   var a = 1;

   return {
      foo() { return ++a; }
   };
})();

X.foo();  // 2
X.foo();  // 3

Here, the var a = 1 is running each time the outer() IIFE runs, which in this case happens to run only once. The execution part is important, because it's the execution that actually causes the closure to come into existence and thus be observable.

Now consider:

function outer() {
   var a = 1;

   return {
      foo() { return ++a; }
   };
}

var X = outer();
var Y = outer();

X.foo();  // 2
Y.foo();  // 2

Point I'm making is, that var a in the outer() body is thought of as running each time the outer() runs, and that execution is critical to link it mentally to each "instance" of the module having its own closure. If you're thinking about how this code works, you have to be thinking about the entire outer() body running each time, otherwise the observable closure wouldn't make any sense.

Now, consider this snippet:

var inner = (function outer(){
   var a = 1;

   return function inner(){
      return {
         foo() { return ++a; }
      };
   };
})();


var X = inner();
var Y = inner();

X.foo();  // 2
Y.foo();  // 3!!

Here, we have a var a that only runs once, sort of at "definition time", but then the inner() body is the thing that runs each time. Again, if you're thinking about why a is observably shared across those two closures, the only way to properly reason about that is to think about when the execution of the var a happens.


So what does all of that have to do with the discussion of class?

class whatever {
   var a;

   constructor() { this->a = 1; }
   
   foo() { return ++this->a; }
}

var X = new whatever();
var Y = new whatever();

X.foo();  // 2
Y.foo();  // 2

For the same reasons as outlined above, to properly reason about the result there, you have to be thinking about the class body as executing each time the instantiation occurs. Otherwise, mentally, you're going to be expecting the var a to be static and shared across all instances.

But... thinking about the class body as executing each time seems quite foreign not only to traditional class thinking (the constructor is the only thing that runs each time) but also to JS's prototypal flavor, where the class body is all done at definition time and attached to the prototype.

I think it's troublesome and likely to lead to mental errors in new learners, to have to explain that the var .. parts of a class body run "each time", but only those parts. And I think that's the only way one could sanely explain the var .. and its access inside of class methods (especially with shorthand) as having to do at all with closure.

My point? Don't mix the idea of what's going on inside a class body with the idea of "closure". You're asking for people to trip over confusion if they try to think or reason about what's really going on. Moreover, don't use var as the keyword, in an attempt to sort of borrow familiarity from lexical closure thinking. Just stay entirely away from it.

I like using my a;, or maybe hidden a;, far better than var a.


Related: the "class initializer" in this proposal, which "only runs once at class definition time", is further reinforcement of the underlying assumed idea of this proposal, that everything in the class body is run "each" time, except for the stuff put inside that block. That's more confusing than it should be, and doesn't fit at all with my intuition.

I think it should be opposite: by default, everything in the class body happens once, at definition time, and for anything you want to happen each instantiation, that's either the job of the constructor, or if we really need a special declarative initializer block, then have an initializer block like instance { .. } or something.
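The "once at definition time" behavior of class bodies in current JavaScript can be observed directly. A small sketch, with made-up counter names, that uses a computed method name as a probe for when the class body is evaluated:

```javascript
let bodyEvaluations = 0;  // bumped whenever the class body's computed key is evaluated
let constructorRuns = 0;  // bumped on every instantiation

class Widget {
  // A computed property name is evaluated once, when the class is defined.
  [(bodyEvaluations += 1, "render")]() {
    return "rendered";
  }

  constructor() {
    constructorRuns += 1; // the constructor is what runs per instance
  }
}

new Widget();
new Widget();
new Widget();

console.log(bodyEvaluations); // 1
console.log(constructorRuns); // 3
```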

@hax
Contributor

hax commented Mar 21, 2018

@getify I think the examples in your last comment are interesting, but I draw a very different conclusion from them.

Here is a modified example of a class with an instance var.

var foo = 0;

class Whatever {
   var bar;

   constructor(v) {
      bar = v;
   }
   
   test() {
      ++foo;
      ++bar;
      return `${foo}:${bar}`;
   }
}

var x = new Whatever(10);
x.test();  // 1:11
var y = new Whatever(20);
y.test();  // 2:21

x.test();  // 3:12
y.test();  // 4:22

I think it's not difficult for programmers to determine the values in this example (I created a test to see how programmers use their intuition; if you like, you can share the link with others), because:

  1. The new keyword in new Whatever tells them that x and y are two different instances of Whatever.
  2. var bar in the class Whatever will be understood as an instance var bar, for which each instance has its own value. If programmers want a var shared across instances, they will move var bar outside the class, like var foo.

Note that I intentionally use the shortcut syntax, which omits this->. I guess that if a programmer saw this example without knowledge of this proposal, he/she would very likely treat var bar as an instance var even though there is no this-> to indicate instance semantics. Because we already have a way to share a var across instances (write let/const outside the class), it would seem very strange to add syntax that provides no new functionality. Even if there were a syntax for it, it should be static var, which would be consistent with the current static methods.
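The values claimed in the example can be checked with a rough emulation in current JavaScript. This is a sketch under an assumption: a `WeakMap` (here called `barOf`, a made-up name) stands in for the proposed instance var `bar`, while `foo` remains an ordinary shared binding outside the class.

```javascript
let foo = 0;                  // shared lexical binding, as in the example
const barOf = new WeakMap();  // stand-in for the per-instance `bar`

class Whatever {
  constructor(v) {
    barOf.set(this, v);
  }

  test() {
    foo += 1;                           // mutates the shared binding
    const bar = barOf.get(this) + 1;    // mutates this instance's own value
    barOf.set(this, bar);
    return `${foo}:${bar}`;
  }
}

const x = new Whatever(10);
const results = [x.test()];
const y = new Whatever(20);
results.push(y.test(), x.test(), y.test());

console.log(results.join(" ")); // 1:11 2:21 3:12 4:22
```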


On the contrary, the examples using nested closures have semantics that are much harder to determine. But we need to figure out the factors. I'll try to reconstruct the mental process of reading them; see my comments in the code.

var X = (function outer(){  // <- what is this function for? Not clear at this line
   var a = 1; // <- local var, but its life cycle is not yet known:
              // is it one-off or persistent?
   return {
      foo() { return ++a; } // <- OK, var a is captured, so it's a persistent private state
   };
})(); // <- OK, this is the function module pattern, but what does it return?
      // let's move back to the previous lines...
      // OK, it returns an object with a closure private state referenced by the foo method

X.foo();  // 2
X.foo();  // 3

function outer() { // <- quite good, it's a function declaration,
                   // so no need to consider the function module pattern
   var a = 1; // <- local var, but its life cycle is not yet known

   return { // <- OK, each call of this function gives a new object
      foo() { return ++a; } // <- var a is captured, it's a closure private state
   };
} // <- OK, outer is an object maker

var X = outer();
var Y = outer();

X.foo();  // 2
Y.foo();  // 2

var inner = (function outer(){ // <- what is this function for? Not clear at this line
   var a = 1; // <- local var, but its life cycle is not yet known

   return function inner(){ // <- what is this inner function for? Not clear at this line
      return { // <- OK, each call of this function gives a new object
         foo() { return ++a; } // <- var a is captured, it's a closure private state,
                       // but from which function? Need to look backward...
                       // OK, it comes from outer, but we still don't know what outer is
      };
   }; // <- OK, inner is an object maker
})(); // <- OK, outer is the function module pattern, but what does it return?
      // let's move back to the previous lines...
      // yes, it returns the inner function, an object maker that returns an object with the state a
      // but where does the state a come from? Need to read again...

var X = inner();
var Y = inner();

X.foo();  // 2
Y.foo();  // 3!!

So I think these examples show at least 2 problems with nested closures:

  1. function is overloaded for too many things (the great power of closures!), and in many cases you don't know what a function is really for at the beginning; you only get it when you reach its last line, and if you meet other functions along the way, your brain is easily overloaded.
  2. The life cycle of a var is not clear when it's declared.

Let's rewrite examples 2 and 3 by introducing a class with an instance var (example 1 does not need a class):

class outer { // <- quite clear, it's a class declaration
   var a = 1; // <- var in class, per-instance state
   foo() { return ++a; } // <- mutates the per-instance a
}

var X = new outer();
var Y = new outer();

X.foo();  // 2
Y.foo();  // 2

var inner = (function outer(){ // IMO we do not need the function module pattern because we
                           // already have real ES modules, but let's keep it in this example
   let a = 1; // <- I changed `var` to `let`, which reflects current practice;
             // `let` means it's very likely to be mutated somewhere in the scope of `outer`

   return class inner { // <- OK, returns a class
         foo() { return ++a; } // <- a is captured and mutated;
                       // it's not an instance var, so all instances refer to the same a
   };
})();

var X = new inner();
var Y = new inner();

X.foo();  // 2
Y.foo();  // 3!!

IMHO, they are much clearer than the closure versions, and unlikely to introduce confusion.

What do you think?

@hax
Contributor

hax commented Mar 21, 2018

A report on the survey I created: hax/hax.github.com#44

Up to now, 132 people think the instance variable semantics (using only the shortcut form) match their intuition, and 20 people think a class-level var should behave the same as a function var. The poll also asks people to use 👎 to indicate that they would never think of the code that way. I intentionally added this to check how strong the negative feeling would be. So although 20 people feel that instance var does not match their first intuition, only 6 indicated that they may think it's "wrong". On the other hand, I'm not very surprised that 42 people think a class-level var should never have the same semantics as a function var.

Note that most voters so far are Chinese programmers. But I keep the whole thread bilingual, so you can spread this survey to others via your social media. And I guess/hope there will not be a big difference between Chinese programmers and non-Chinese programmers 😅

@dou4cc

dou4cc commented Mar 22, 2018

I come from the Chinese thread. (hax/hax.github.com#44 (comment))

Regardless of symbol selection or psychology, instance variables go against not only prototypes but also the way existing native methods are implemented.

Consider:

var foo = 0;

class Whatever {
   var bar;

   constructor(v) {
      bar = v;
   }
   
   test() {
      ++foo;
      ++bar;
      return `${foo}:${bar}`;
   }
}

var x = new Whatever(10);
x.test(); // "1:11"
var y = new Whatever(20);
y.test(); // "2:21"

Whatever.prototype.test.call(x); // ????
y.test.call(x); // ????
Whatever.prototype.test.call(class{var bar = 0}); // ????
Whatever.prototype.test.call(class{}); // ????
Whatever.prototype.test.call(class{var foo = 0}); // ????
(function(){return foo}).call(class{var foo = 0}); // new version of `with`?
(() => foo).call(class{var foo = 0}); // even in lambda?

However, usual code looks like:

[].map.call(["1", "2", "3"], a => +a); // [1, 2, 3]
[].map.call("123", a => +a); // [1, 2, 3]
[].map.call(null, a => +a); // TypeError

When a bound context is needed, hax/hax.github.com#44 (comment) is just fine. To support switching to a different context, what we need is not syntax.

@hax
Contributor

hax commented Mar 22, 2018

@dou4cc
I think you may misunderstand instance vars in this proposal. They are hidden/private per-instance state, so they have no relation to the prototype. If a method that accesses the hidden state is called on another context that is not an instance of the same class, it will throw a runtime error.
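The behavior described here can be approximated in current JavaScript for the first few `.call` cases in the question. This is a sketch under assumptions: a `WeakMap` named `barOf` stands in for the hidden `bar`, and throwing a `TypeError` is an assumed choice of error type, since the comment only says "runtime error".

```javascript
const barOf = new WeakMap(); // hypothetical stand-in for the hidden `bar` slot

class Whatever {
  constructor(v) {
    barOf.set(this, v);
  }

  test() {
    // Assumed behavior: a receiver without the hidden slot is rejected.
    if (!barOf.has(this)) {
      throw new TypeError("receiver has no hidden `bar`");
    }
    barOf.set(this, barOf.get(this) + 1);
    return barOf.get(this);
  }
}

const x = new Whatever(10);
const y = new Whatever(20);

const viaPrototype = Whatever.prototype.test.call(x); // 11: x is a real instance
const viaOther = y.test.call(x);                      // 12: still x's state, not y's

let rejected = false;
try {
  Whatever.prototype.test.call(class {});             // not an instance of Whatever
} catch (e) {
  rejected = e instanceof TypeError;
}

console.log(viaPrototype, viaOther, rejected); // 11 12 true
```

The receiver decides which hidden state is used, and a receiver that was never constructed by the class is rejected rather than silently falling back to some other scope.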

If you still have other questions, I'm glad to answer them in my original thread hax/hax.github.com#44, and you can also contact me via Telegram.


8 participants