Spec Preview: Union types #805

Closed
RyanCavanaugh opened this issue Oct 2, 2014 · 66 comments
Labels
In Discussion Not yet reached consensus Needs More Info The issue still hasn't been fully clarified Suggestion An idea for TypeScript

Comments

@RyanCavanaugh
Member

Updated 10/3 (see comment for changelog)

| operator for types

This is a "spec preview" for a feature we're referring to as "union types". @ahejlsberg thought this up; I am merely providing the summary 😄

Use Cases

Many JavaScript libraries support taking in values of more than one type. For example, jQuery's AJAX settings object's jsonp property can either be false or a string. TypeScript definition files have to represent this property as type any, losing type safety.

Similarly, Angular's http service configuration (https://docs.angularjs.org/api/ng/service/$http#usage) has properties of "either" types such as "boolean or Cache", or "number or Promise".

Current Workarounds

This shortcoming can often be worked around in function overloads, but there is no equivalent for object properties, type constraints, or other type positions.

Introduction

Syntax

The new | operator, when used to separate two types, produces a union type representing a value which is of one of the input types.

Example:

interface Settings {
    foo: number|string;
}
function setSettings(s: Settings) { /* ... */ }
setSettings({ foo: 42 }); // OK
setSettings({ foo: '100' }); // OK
setSettings({ foo: false }); // Error, false is not assignable to number|string

Multiple types can be combined this way:

function process(n: string|HTMLElement|JQuery) { /* ... */ }
process('foo'); // OK
process($('div')); // OK
process(42); // Error

Any type is a valid operand to the | operator. Some examples and how they would be parsed:

var x: number|string[]; // x is a number or a string[]
var y: number[]|string[]; // y is a number[] or a string[]
var z: Array<number|string>; // z is an array of number|string
var t: F|typeof G|H; // t is an F, typeof G, or H
var u: () => string|number; // u is a function that returns a string|number
var v: { (): string; }|number; // v is a function that returns a string, or a number

Note that parentheses are not needed to disambiguate, so they are not supported.

Interpretation

The meaning of A|B is a type that is either an A or a B. Notably, this is
different from a type that combines all the members of A and B. We'll explore
this more in samples later on.

Semantics

Basics

Some simple rules:

  • Identity: A|A is equivalent to A
  • Commutativity: A|B is equivalent to B|A
  • Associativity: (A|B)|C is equivalent to A|(B|C)
  • Subtype collapsing: A|B is equivalent to A if B is a subtype of A
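
These equivalences can be checked at the type level. The sketch below is written against present-day TypeScript rather than the preview described here, so it uses parentheses in type positions and conditional types. The `Equal` and `check` helpers, and the `Animal`/`Dog` interfaces, are hypothetical names introduced only for illustration.

```typescript
// Hypothetical helper: resolves to true only when X and Y are mutually assignable.
// Wrapping each side in a tuple stops the conditional type from distributing over unions.
type Equal<X, Y> = [X] extends [Y] ? ([Y] extends [X] ? true : false) : false;

// Hypothetical helper: a call compiles only if the type argument is `true`.
function check<T extends true>(label: string, _witness?: T): string {
    return label;
}

interface Animal { run(): void; }
interface Dog extends Animal { woof(): void; }

const verified: string[] = [
    check<Equal<string | string, string>>("identity"),
    check<Equal<string | number, number | string>>("commutativity"),
    check<Equal<(string | number) | boolean, string | (number | boolean)>>("associativity"),
    // Dog is a subtype of Animal, so Animal | Dog accepts exactly what Animal accepts.
    check<Equal<Animal | Dog, Animal>>("subtype collapsing"),
];

console.log(verified.join(", "));
```

If any of the four rules did not hold, the corresponding `check` call would fail to compile, because its type argument would be `false`.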

Properties

The type A|B has a property P of type X|Y if A has a property P of type X and B has a property P of type Y. These properties must either both be public, or must come from the same declaration site (as specified in the rules for private/protected). If either property is optional, the resulting property is also optional.

Example:

interface Car {
    weight: number;
    gears: number;
    type: string;
}
interface Bicycle {
    weight: number;
    gears: boolean;
    size: string;
}
var transport: Car|Bicycle = /* ... */;
var w: number = transport.weight; // OK
var g = transport.gears; // OK, g is of type number|boolean

console.log(transport.type); // Error, transport does not have 'type' property
console.log((<Car>transport).type); // OK

Call and Construct Signatures

The type A|B has a call signature F if A has a call signature
F and B has a call signature F.

Example:

var t: string|boolean = /* ... */;
console.log(t.toString()); // OK (both string and boolean have a toString method)

The same rule is applied to construct signatures.
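
A minimal sketch of the call signature rule, assuming two hypothetical callable interfaces (`Formatter` and `Labeller`) that share one signature. It is written against present-day TypeScript and is only meant to illustrate the rule as stated.

```typescript
// Two hypothetical callable types with the same call signature.
interface Formatter { (value: number): string; }
interface Labeller { (value: number): string; }

const asFixed: Formatter = (value) => value.toFixed(2);
const asLabel: Labeller = (value) => "n=" + value;

// Either member may be selected at run time; the union is still callable
// because both members have the signature (value: number): string.
function pick(useFixed: boolean): Formatter | Labeller {
    return useFixed ? asFixed : asLabel;
}

const fixedText: string = pick(true)(1.5);
const labelText: string = pick(false)(1.5);
console.log(fixedText, labelText);
```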

Index Signatures

The type A|B has an index signature [x: number]: T or [x: string]: T if both A and B have an index signature with that type.
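
As an illustration, the sketch below indexes a union whose members both declare `[key: string]: number`. The `Scores` and `Counts` interfaces and the `sumAll` function are hypothetical, and the code targets present-day TypeScript with default strict settings.

```typescript
// Hypothetical shapes that both declare a string index signature of type number.
interface Scores { [key: string]: number; }
interface Counts { [key: string]: number; total: number; }

function sumAll(table: Scores | Counts): number {
    let sum = 0;
    for (const key of Object.keys(table)) {
        // Indexing is allowed because every member of the union has [key: string]: number.
        sum += table[key];
    }
    return sum;
}

console.log(sumAll({ a: 1, b: 2 }), sumAll({ total: 5, extra: 1 }));
```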

Assignability and Subtyping

Here we describe assignability; subtyping is the same except that "is assignable to" is replaced with "is a subtype of".

The type S is assignable to the type T1|T2 if S is assignable to T1 or if S is assignable to T2.

Example:

var x: string|number;
x = 'hello'; // OK, can assign a string to string|number
x = 42; // OK
x = { }; // Error, { } is not assignable to string or to number

The type S1|S2 is assignable to the type T if both S1 and S2 are assignable to T.

Example:

var x: string|number = /* ... */;
var y: string = x; // Error, number is not assignable to string
var z: number = x; // Error, string is not assignable to number

Combining the rules, the type S1|S2 is assignable to the type T1|T2 if S1 is assignable to T1 or T2 and S2 is assignable to T1 or T2. More generally, every type on the right hand side of the assignment must be assignable to at least one type on the left.

Example:

var x: string|number;
var y: string|number|boolean;
x = y; // Error, boolean is not assignable to string or number
y = x; // OK (both string and number are assignable to string|number)

Best Common Type

The current Best Common Type algorithm (spec section 3.10) is only capable of producing a type that was among the candidates, or {}. For example, the array [1, 2, "hello"] is of type {}[]. With the ability to represent union types, we can change the Best Common Type algorithm to produce a union type when presented with a set of candidates with no supertype.

Example:

class Animal { run() { } }
class Dog extends Animal { woof() { } }
class Cat extends Animal { meow() { } }
class Rhino extends Animal { charge() { } }
var x = [new Dog(), new Cat()];
// Current behavior: x is of type {}[]
// Proposed: x is of type Array<Dog|Cat>

Note that in this case, the type Dog|Cat is structurally equivalent to Animal in terms of its members, but it would still be an error to try to assign a Rhino to x[0] because Rhino is not assignable to Cat or Dog.

Best Common Type is used for several inferences in the language. In the cases of x || y, z ? x : y, and [x, y], the resulting type will be X | Y (where X is the type of x and Y is the type of y). For function return statements and generic type inference, we will require that a supertype exist among the candidates.

Example

// Error, no best common type among 'string' and 'number'
function fn() {
    if(Math.random() > 0.5) {
        return 'hello';
    } else { 
        return 42;
    }
}
// OK with type annotation
function fn(): string|number {
    /* ... same as above ... */
}
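
The three union-producing positions named above (`x || y`, `z ? x : y`, and `[x, y]`) can be sketched as follows. The `demo` function and its parameter names are hypothetical, and the annotations state the type each expression is expected to have. Note that present-day TypeScript also infers a union for unannotated return statements, which differs from the rule proposed here.

```typescript
function demo(maybeCount: number | undefined, label: string, count: number, flag: boolean) {
    // x || y: the left operand when truthy, otherwise the right operand.
    const fromOr: string | number = maybeCount || label;
    // z ? x : y: one of the two branches.
    const fromConditional: string | number = flag ? label : count;
    // [x, y]: an array whose element type is the union of the element types.
    const fromArray: Array<string | number> = [label, count];
    return { fromOr, fromConditional, fromArray };
}

// Return statements with an explicit annotation, as in the example above.
function stringOrNumber(useString: boolean): string | number {
    if (useString) {
        return "hello";
    } else {
        return 42;
    }
}

console.log(demo(undefined, "hello", 7, true), stringOrNumber(false));
```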

Possible Next Steps

Combining Types' Members

Other scenarios require a type constructed from A and B that has all members present in either type, rather than in both. Instead of adding new type syntax, we can represent this easily by removing the restriction that extends clauses may not reference their declaration's type parameters.

Example:

interface HasFoo<T> extends T {
    foo: string;
}
interface Point {
    x: number;
    y: number;
}
var p: HasFoo<Point> = /* ... */;
console.log(p.foo); // OK
console.log(p.x.toString()); // OK

Local Meanings of Union Types

For union types where an operand is a primitive, we could detect certain syntactic patterns and adjust the type of an identifier in conditional blocks.

Example:

var p: string|Point = /* ... */;
if(typeof p === 'string') {
    console.log(p); // OK, 'p' has type string in this block
} else {
    console.log(p.x.toString()); // OK, 'p' has type Point in this block
}

This might also extend to membership checks:

interface Animal { run(); }
interface Dog extends Animal { woof(); }
interface Cat extends Animal { meow(); }
var x: Cat|Dog = /* ... */;
if(x.woof) {
   // x is 'Dog' here
}
if(typeof x.meow !== 'undefined') {
   // x is 'Cat' here
}
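
For comparison, here is a sketch of the same two ideas in present-day TypeScript, which narrows on `typeof` checks and, for membership, on the `in` operator rather than on a bare property access such as `x.woof`. The `describePoint` and `speak` functions are hypothetical, and the interfaces are trimmed to what the example needs.

```typescript
interface Point { x: number; y: number; }
interface Dog { woof(): string; }
interface Cat { meow(): string; }

function describePoint(p: string | Point): string {
    if (typeof p === "string") {
        return p.toUpperCase();   // p has type string in this block
    }
    return p.x.toString();        // p has type Point here
}

function speak(pet: Cat | Dog): string {
    if ("woof" in pet) {
        return pet.woof();        // pet is Dog here
    }
    return pet.meow();            // pet is Cat here
}

console.log(describePoint("ab"), describePoint({ x: 3, y: 4 }));
```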
@ivogabe
Contributor

ivogabe commented Oct 2, 2014

If a property X exists in A or B but not both, the type A|B has an optional property X of type {} for the purposes of property access.

Why {}? It might be more logical to give it the type of the property X in A or B.

@johnnyreilly

Yay!!!!!! Been waiting for this!

I was a bit confused that it says:

Note that parentheses are not needed to disambiguate, so they are not supported.

And then subsequently lists a rule which features parentheses:

Associativity: (A|B)|C is equivalent to A|(B|C)

@RyanCavanaugh
Member Author

Why {}? It might be more logical to give it the type of the property X in A or B.

The intent is that you don't use foo as an A or a B until you've used a type assertion or other mechanism to actually "decide" which thing foo is. If we jammed on all the properties of A and B, you'd have a sort of nonsense object -- imagine code like this:

var x: Cat|Dog = /* ... */;
// One of these lines is guaranteed to fail
x.meow();
x.woof();

The other option on the table is to not have those properties at all, but there's concern that this makes code like if(x.meow) { /* x is Cat */ } too annoying to write.

And then subsequently lists a rule which features parentheses:

I couldn't come up with a more clear way to write this rule; the parens here are just for explanatory purposes. Consider code like this:

var x: string|number;
var y: number|boolean;
// a and b have the *identical* types string|number|boolean; the order of merging does not matter
var a: typeof x|boolean;
var b: string|typeof y;

@johnnyreilly

Thanks for the clarification @RyanCavanaugh. I'm trying to think of scenarios where lack of parens would be a problem - instinctively I'm assuming there must be some! But it's first thing in the morning and I haven't had coffee yet... - I'm sure you guys covered that off.

I really like the "Local Meanings of Union Types" possible next step, which adjusts the type of an identifier in conditional blocks. I think this would be really useful. That said, I think the rules that govern how this works need to be very clear. I'm also curious about the IDE experience: would hovering over the identifier in a conditional block reveal it as, for example, a Dog or a Cat|Dog? I'm hoping for the specific type rather than the union in this scenario.

@vvakame
Contributor

vvakame commented Oct 2, 2014

Best Common Type
Combining Types' Members

Cool!!

Local Meanings of Union Types

Please add instanceof to the rules. 😉

Also, I have one question.

How can I make a type synonym for a union type?

I came up with one hack.

// Create a synonym; this variable does not exist in the actual library code.
declare var fooCommonReturnType: string | number;

interface IFoo {
    bar(): typeof fooCommonReturnType;
    buzz(): typeof fooCommonReturnType;
}
}

But it is not elegant.
I want to use union types with #229.

I would like to write code along these lines:

interface IFooCommonReturn {
    &this: string | number;
}

interface IFoo {
    bar(): IFooCommonReturn;
    buzz(): IFooCommonReturn;
}

@DanielRosenwasser
Member

Do we have a special case for void | T? Should void | T end up being T, or is it helpful to maintain the void? I can see this as both useful as well as something that might turn out to be an anti-pattern.

@basarat
Contributor

basarat commented Oct 2, 2014

Local Meanings of Union Types

A brace-block ({ }) local meaning for any variable's type, in general, would be good:

var foo:number; 
if(true){
   // Do some magic here to make foo a string so we don't need casting below: 
   // I know someone said it was a number above ... but now I want to use it as a string
   var upper = foo.toUpperCase();
   var lower = foo.toLowerCase();
}

@samuelneff

👍

@ivogabe
Contributor

ivogabe commented Oct 2, 2014

The intent is that you don't use foo as an A or a B until you've used a type assertion or other mechanism to actually "decide" which thing foo is.

That sounds like a valid reason to me. But would this be allowed? My opinion would be yes, but following these rules it would be disallowed.

interface CanHaveXY {
    x?: number;
    y?: number;
}
interface HasX {
    x: number;
}
interface HasY {
    y: number;
}
var point1: HasX | HasY = ...;
var point2: CanHaveXY = point1;

@RyanCavanaugh
Member Author

[incorrect response/example removed]

@ahejlsberg
Member

@ivogabe It would be allowed. The proposed rule is that A|B is assignable to T if A and B are both assignable to T, and they would be in the given example.

@RyanCavanaugh The issue you call out really has nothing to do with union types. Consider:

var x: HasX = { x: 42, y: 'hello' };  // Forget about y
var point2: CanHaveXY = x;

This is already allowed today.

@danquirk
Member

danquirk commented Oct 2, 2014

Another issue we talked about is the effect on generic type argument inference if we change best common type to return unions rather than {}. Consider:

declare function choose<T>(x: T, y:T): T;
var result = choose(1, "hm"); // today result is {}, with this it would be number|string

We'd previously considered adding an option to make it an error if type argument inference returns {}, we may just do the same thing here and make it an error for type argument inference to infer a union type unless that type exactly matches one of the candidate types. If anyone has specific uses for type argument inference to generate a union type that could be interesting.

@ahejlsberg
Member

@DanielRosenwasser void is a subtype of any and no other types, so any|void becomes any but T|void never collapses to T otherwise.

@ivogabe
Contributor

ivogabe commented Oct 2, 2014

👍

@ahejlsberg
Member

@danquirk We definitely want to change best common type to be a union of the constituent types (which may collapse to a single type if one of the types is a supertype of all others). Type inference produces a type that is the best common type of the candidates, and that type will be a union type in cases where multiple unrelated types are candidates. I don't see how we can make it an error to infer a union type. Consider:

interface Animal {
    name: string;
}
interface Giraffe extends Animal {
    giraffeness: string;
}
interface Walrus extends Animal {
    walrusness: string;
}
function makePair<T>(a: T, b: T) {
    return [a, b];
}
var giraffe: Giraffe;
var walrus: Walrus;
var pair = makePair(giraffe, walrus);  // Array<Giraffe|Walrus>
var animals: Animal[] = pair;

This example fails without union types because pair is inferred to be {}[]. But with union types it succeeds because pair is an Array<Giraffe|Walrus> which is assignable to Array<Animal>. If we make it an error to infer a union type we'd be right back where we started (but for a different reason).
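
Whether or not inference produces the union, a caller can always state it. The sketch below supplies the type argument explicitly, so no inference between Giraffe and Walrus is involved. It targets present-day TypeScript, and the object values are made up for illustration.

```typescript
interface Animal { name: string; }
interface Giraffe extends Animal { giraffeness: string; }
interface Walrus extends Animal { walrusness: string; }

function makePair<T>(a: T, b: T): T[] {
    return [a, b];
}

// Made-up values, present only so the example runs.
const giraffe: Giraffe = { name: "Gina", giraffeness: "tall" };
const walrus: Walrus = { name: "Wally", walrusness: "tusked" };

// T is given explicitly as the union of the two argument types.
const pair = makePair<Giraffe | Walrus>(giraffe, walrus);

// Both members of the union are assignable to Animal, so the array is too.
const animals: Animal[] = pair;
console.log(animals.map((animal) => animal.name).join(", "));
```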

@danquirk
Member

danquirk commented Oct 3, 2014

The argument is that that example is an error. If I wanted to allow makePair to be invoked with 2 arguments of different types then I would write the signature as:

declare function makePair<T, U>(a: T, b: U): Array<T|U>;

The version with a single type parameter is a subtype that is explicitly designed to only handle the case where both parameters are the same type and otherwise is an error. I don't think people write the single type argument case expecting it to allow different types for each argument. I do think people frequently use/write functional style combinators expecting an error if the provided list and predicate/selector are of incompatible types:

declare function isEmpty(s: string): boolean;
declare function map<T,U>(a: T[], f: (n:T) => U): U[];
var result = map([1,2,3], isEmpty); // result is boolean[], no error

Today we give no error here which is pretty bad. It's trivial to hit this kind of issue using underscore.d.ts and similar utility libraries. Currently you can only get safety here by using contextually typed lambdas for your function typed arguments. I thought the frequency with which this has been an issue for us and users had led us to converging on a consensus that #360 was a good idea.

@ahejlsberg
Member

If I wanted to allow makePair to be invoked with 2 arguments of different types...

But they're not of different types. They're both Animals, it just so happens that neither is a subtype of the other. This is not just a made up scenario, I run into this with regularity with the ?: operator where each branch is a subtype of some common type and I'm forced to add a type assertion.

In our offline discussion about the map example yesterday I pointed out that the real problem is the co- and contra-variant parameter rule and that perhaps we should give less preference to inferences made in such positions. We should think about that some more.

I do understand that sometimes a union type inference is not what you want, but I don't think it is always the case. I think this proposed change just trades one problem for another.

@RyanCavanaugh
Member Author

I've logged #810 to track the suggestion from yesterday about only using function parameter types if there are no other inference candidates. I'll also be referencing #360 a bunch in this comment (Issue warning when generic type inference produces {}).

We've already split the baby, so to speak, by having different rules for x ? t : f than for [t, f]. Same for function return values and generic type inference - only one of those produces an error. Sometimes a failed BCT is an error, sometimes it isn't. As discussed earlier, it probably makes sense to continue enforcing that function return expressions produce a BCT that is in the candidate set, even in the presence of union types.

The question is simply which half of the baby generic type inference should end up in if we have union types - it's a heuristic question, and one that we should answer by looking at how generics are used in the wild.

I looked through JQuery and Knockout, minus Promises because they're too verbose, to look at where they used generics where the type parameter was consumed more than once somewhere in the parameter list. A rough scan of underscore shows approximately the same set of functions, though its definition file is a mess. I've grouped these functions according to their intended behavior from a programmer's perspective.

$.each<T>(collection: T[], callback: (index: number, elem: T) => any);
$.map<T, U>(array: T[], callback: (elementOfArray: T, indexInArray: number) => U): U[];
ko.utils.arrayForEach<T>(array: T[], action: (item: T) => void): void;
ko.utils.arrayMap<T, U>(array: T[], mapping: (item: T) => U): U[];

We've discussed this family at length. Obviously, $.each([1, 2, 3], (n: number, s: string) => { }) should not be a valid call. Today, the {} type does not flow out of the function, so it's impossible to produce a type error here from a mismatched function argument type. #360, #810, or a possible rule specifically disallowing the contravariance of union-typed parameters would fix these. Using union types here doesn't improve the situation without some separate fix.

$.inArray<T>(value: T, array: T[], fromIndex?: number): number;
ko.utils.arrayIndexOf<T>(array: T[], item: T): number;

A call of the form $.inArray('hello', [1, 2, 3]) should be rejected; it's nonsense to allow this. This doesn't even require parameter contravariance. Allowing T to form a type union here doesn't make this any better. This also extends to functions like areSame<T>(lhs: T, rhs: T): boolean. Only #360 would make these an error.

$.grep<T>(array: T[], func: (elementOfArray: T, indexInArray: number) => boolean, invert?: boolean): T[];
ko.utils.arrayFilter<T>(array: T[], predicate: (item: T) => boolean): T[];
ko.utils.arrayFirst<T>(array: T[], predicate: (item: T) => boolean, predicateOwner?: any): T;

These functions at least return {} today when the inputs do not match. I don't think it would be an improvement for arrayFilter([1, 2, 3], isUppercaseString) to return Array<number|string> (?!) instead of Array<{}>. These would be completely fixed by #360 or #810.

$.merge<T>(first: T[], second: T[]): T[];

It's possible that you want merge([1, 2], ['a', 'b']) to return string|number, though in that case you could write merge with two type arguments rather than one, as Dan mentioned above.

ko.utils.compareArrays<T>(a: T[], b: T[]): Array<KnockoutArrayChange<T>>;

You don't want compareArrays([1, 2], ['a', 'b']) to succeed and produce a plausible-looking type any more than you would want 1 !== 'x' to succeed. Again, #360.

ko.utils.arrayPushAll<T>(array: T[], valuesToPush: T[]): T[];

This function mutates the underlying array, so forming a union type in the case of ko.utils.arrayPushAll([1, 2, 3], ['x', 'y']) is simply the wrong thing to do. We would correctly reject this outright if it were a call like [1, 2].concat(["x"]);. Again, no function parameter contravariance is needed for this error to be missed. This is only fixed by #810, and is made worse by union types because the return type would no longer be the immediately, obviously wrong {}[].


The only function that I see as being improved by creating union types (merge) is one that could have its definition changed to explicitly produce Array<T|U> if that was the desired behavior. It's definitely a trade-off, but it seems like nearly every function in common use loses from this instead of gains.

@tomByrer

tomByrer commented Oct 3, 2014

Union types is a good idea.
I made a PR a few days ago where a parameter previously took only a single number; I expanded it to accept either a single number or a number array.

@RyanCavanaugh
Member Author

Updates from 10/3 design meeting:

  • Disjoint properties are not present for the purposes of property access. {} was a bad idea here because anything is assignable to it. We may be able to use void instead, but will try just not having the properties at all at first to see how it goes
  • Due to implementation, number|string may appear as string|number if both types appear in a program. This should be inconsequential
  • Merging of call signature sets occurs if their parameter lists are identical; return types of each signature get unioned together
  • The following BCT procedures will produce an error if the selected type is not in the candidate set:
    • Function return expressions
    • Generic type arguments
  • The remaining BCT procedures will union their types:
    • || operator
    • ? : operator
    • Array literal elements
    • Properties in an object literal contextually typed by a string indexer (probably unobservable, did not discuss)
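
The call signature merging rule in the list above (identical parameter lists, return types unioned) can be sketched like this. `ToText`, `ToDouble`, and `choose` are hypothetical names, and the code targets present-day TypeScript.

```typescript
// Hypothetical function types with identical parameter lists but different return types.
type ToText = (input: number) => string;
type ToDouble = (input: number) => number;

const toText: ToText = (input) => "#" + input;
const toDouble: ToDouble = (input) => input * 2;

function choose(useText: boolean): ToText | ToDouble {
    return useText ? toText : toDouble;
}

// The union is callable with a number, and the call's result is string | number.
const doubled: string | number = choose(false)(21);
const tagged: string | number = choose(true)(21);
console.log(doubled, tagged);
```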

@NoelAbrahams

Regarding Local Meanings of Union Types, it is fairly common to return either a type T or an array of type T.

function foo(): string[]|string {
// ...
}

var result = foo();

if( Array.isArray(result)){
    result.map(item => item.replace(...));
}
else {
    result.replace(...);
}

Another alternative is result instanceof Array.
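
A runnable sketch of this pattern, assuming present-day TypeScript, where `Array.isArray` acts as a type guard. The `toUpperAll` function is a hypothetical stand-in for the elided bodies above.

```typescript
// Hypothetical normaliser: accepts one string or many and always returns an array.
function toUpperAll(input: string[] | string): string[] {
    if (Array.isArray(input)) {
        // input is string[] in this branch
        return input.map((item) => item.toUpperCase());
    }
    // input is string in this branch
    return [input.toUpperCase()];
}

console.log(toUpperAll("a"), toUpperAll(["a", "b"]));
```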

@DouglasLivingstone

Perhaps this is too out-there, but if null was allowed as a type, it might be possible to redefine string as STRING|null, where STRING is a hypothetical non-nullable string, which could be used to check that, e.g., foo.bar.length has a STRING bar, never a null.

@NoelAbrahams

@DouglasLivingstone see #185

@DouglasLivingstone

@NoelAbrahams nice, thanks!

@jtheisen

@RyanCavanaugh "Disjoint properties are not present for the purposes of property access."

That's very good, I hope that's how it stays. I was quite disturbed when I read the quoted bit of the first comment here. The disjoint type shouldn't have anything that only one of the summands has.

This is also important for something like intellisense, where you really don't want to have those non-properties listed.

This is an awesome feature. It makes TypeScript the first type-safe real-world imperative language with that kind of power in a type system.

@basarat
Contributor

basarat commented Jan 2, 2015

What is the type guard syntax for arrays? I can't seem to find that in this thread. For example, I get an error with the latest compiler on the code below:

function saySize(message: number | number[]) {
  if (message instanceof Array) {
    return message.length; // Error 
  }
}

@Arnavion
Contributor

Arnavion commented Jan 2, 2015

@basarat You should also be getting "error TS2358: The left-hand side of an 'instanceof' expression must be of type 'any', an object type or a type parameter." on the previous line. Because of that it's not functioning as a type guard and the type isn't being narrowed inside the if block.

Using typeof instead of instanceof works: if (typeof message === "object") {

Edit: Note that this is only a problem because a primitive is one of the members of the union. If you had a class C and message was declared as having type C | number[] then both instanceof C and instanceof Array would work.

@basarat
Contributor

basarat commented Jan 4, 2015

@Arnavion thanks. I did try it with classes, the following does not work

class Message {
    value: string;
}

function saySize(message: Message | Message[]) {
    if (message instanceof Array) {
        return message.length; // test.ts(7,24): error TS2339: Property 'length' does not exist on type 'Message | Message[]'.
    }
}

Not sure if it's useful, but the following works:

class Message {
}

function saySize(message: Message | Message[]) {
    if (message instanceof Array) {
        return message.length; // Okay
    }
}

@Arnavion
Contributor

Arnavion commented Jan 4, 2015

Hmm, you're right. When I tested successfully with message: C | number[] and message instanceof Array, I did use an empty class for C. Adding a member to that class causes message instanceof Array to fail to narrow again.

Maybe open a new issue.

@eggers

eggers commented Oct 28, 2015

Is it possible to have an interface extend a union type? For example:

interface Foo {
  foo: any;
}
interface Bar {
  bar: any;
}
interface FooBar extends Foo|Bar {
  fooBar: any;
}

I came across an issue with a DefinitelyTyped interface that has all optional params, so something like number will satisfy the interface and won't result in a compile error. In reality, most properties are optional, but either uri or url must be specified. (The interface is request.Options, if anyone is curious.)

I'm aware that I could do the below, but then FooBar isn't an interface anymore, and you can't further extend it (like request-promise does):

interface IFooBar {
  fooBar: any;
}

type FooBar = (Foo | Bar) & IFooBar
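
Applied to the "either uri or url" case, the alias approach might look like the sketch below. Every name in it (`WithUri`, `WithUrl`, `CommonOptions`, `RequestOptions`, `targetOf`, `timeoutMs`) is hypothetical and not taken from any real definition file.

```typescript
// Hypothetical option shapes; none of these names come from a real definition file.
interface WithUri { uri: string; }
interface WithUrl { url: string; }
interface CommonOptions { timeoutMs?: number; }

// At least one of uri or url is required; everything in CommonOptions stays optional.
type RequestOptions = (WithUri | WithUrl) & CommonOptions;

function targetOf(options: RequestOptions): string {
    return "uri" in options ? options.uri : options.url;
}

const byUri: RequestOptions = { uri: "http://example.test/a", timeoutMs: 500 };
const byUrl: RequestOptions = { url: "http://example.test/b" };
// const neither: RequestOptions = { timeoutMs: 500 };  // rejected: neither uri nor url is present
console.log(targetOf(byUri), targetOf(byUrl));
```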

@mhegazy
Contributor

mhegazy commented Oct 28, 2015

@eggers you can only use an object type (interface or class) in an extends clause. Also in the future, I would file these as a new issue instead of commenting on an outdated issue.
