Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully support generics, improve alignment with PHP #5317

Closed
mindplay-dk opened this issue May 8, 2015 · 15 comments
Closed

Fully support generics, improve alignment with PHP #5317

mindplay-dk opened this issue May 8, 2015 · 15 comments

Comments

@mindplay-dk
Copy link

After some days of testing, our team consistently found that the generic type system seems to basic and not very useful.

To give a couple of quick examples of things we tried that, to our suprise, didn't work:

<?hh

class Repository<T> {
  // ...
}

class User {
  // ...
}

$user_repo = new Repository<User>(); // nope :-(

And:

<?hh

class Hook<T> {
  // ...
}

function hook<T>() : Hook<T> {
  return new Hook<T>(); // nope :-(
}

$some_hook = hook<LoginEvent>(); // nope :-(

It seems that generics work only in declarations (extends, implements, etc.) but never in expressions? That's not very useful and prevents many common and useful patterns with generics.

It appears to be this way because generics are only type-hints and don't really seem to have a run-time footprint? To explain this better, here's an example of something class-related in PHP:

<?php

class Foo {
    public static function hello() {
        var_dump(get_called_class());
    }
}

class Bar extends Foo {
}

Foo::hello(); // => "Foo"

Bar::hello(); // => "Bar"

Point being, type-names generated at call-time actually "exist" and can be obtained at run-time. The same is true for most type-related things in PHP - reflection will let you discover lots of type-related information at run-time.

In contrast, generic type arguments in Hack don't seem to "exist" at run-time - they are static type-hints only? This doesn't seem true to PHP as such - for example, it would seem natural to expect something like the following to work:

class Foo<T> {}

class Bar {}

$foo = new Foo<Bar>();

$class = new ReflectionClass($foo);

echo $class->getTypeArgument(0)->getClass()->getName(); // => "Bar"

We encountered other surprising behaviors, but I'm not going to get into them all - bottom, there were too many surprises and shortcomings for developers who are familiar with generics in other languages like C# and TypeScript.

@paulbiss
Copy link
Contributor

paulbiss commented May 8, 2015

Hack generics aren't reified (this is intentional), and are inferred rather than explicit on objects. I would encourage you to look more carefully at the docs to understand how they're used (we've found them to be quite powerful).

http://docs.hhvm.com/manual/en/hack.generics.php

@paulbiss paulbiss added the hack label May 8, 2015
@paulbiss
Copy link
Contributor

paulbiss commented May 8, 2015

cc @jwatzman

@jwatzman
Copy link
Contributor

jwatzman commented May 8, 2015

Can you give a realistic example of something you tried to do and couldn't? We've found them to cover about 95% of the use cases we have here. So while we love using them extensively at FB, there are places they don't cover, and I'm interested to see specifically what you are running into. (The examples above are just toy examples that don't tell me anything about what you're actually trying to do.)

As to why they aren't reified, it turns out to be an extremely complex, involved feature to add to the runtime. We hope to be able to do it eventually, but not in the short or medium term.

@mindplay-dk
Copy link
Author

Can you give a realistic example of something you tried to do and couldn't?

The two examples I just gave highlight my main point - the inability to explicitly pass type-arguments to functions or constructors. I could post more code, but that wouldn't highlight the problem.

I can deal with generics not getting reified - the same is true for e.g. TypeScript, and I found that to be just as useful as other languages with generics. Reified is not the main issue for me, and it sounds like it might be more difficult to implement than would be worthwhile, so I'm not going to push for that.

As far as type-hinting though, it seems really crippling, being forced to extend classes or implement interfaces just to be allowed to provide type arguments. It's extremely counter-intuitive - the lack of means to provide type-arguments in statements should not drive that kind of decision-making.

Here's an example of a pattern in TypeScript that I was trying to implement in Hack. As you can see, TS provides IDE support with auto-complete and type-checking for every line of code - and as the JS output demonstrates, basically nothing is reified.

If Hack isn't going to provide generics that work all the time, everywhere, then we're going to need something like php-doc for type-hints everywhere else, to get the same level of IDE support and static type-checking we have in Php Storm with plain PHP now. We settled on Atom and the steelbrain add-ons, which seems to be the best option right now (?) although that means no static checking or IDE support for e.g. Composer libraries and other PHP code annotated with php-doc, which, we concluded, is worse than the status quo. (don't take that the wrong way, please - we concluded the same thing about TypeScript + Node while evaluating that as an alternative to PHP; while we found that TS has a more complete type-system than Hack, other aspects of TS made it not suitable for what we're trying to do.)

@jwatzman
Copy link
Contributor

The two examples I just gave highlight my main point - the inability to explicitly pass type-arguments to functions or constructors. [...] As far as type-hinting though, it seems really crippling, being forced to extend classes or implement interfaces just to be allowed to provide type arguments.

I still don't understand why this is the case -- you haven't actually provided enough code in your examples for me to point to "here is what you're doing wrong". For example, here's my attempt at filling out a (still trivial) body for the Repostitory<T> example, and showing how type inference just Does The Right Thing:

<?hh // strict

class Repository<T> {
  private Vector<T> $repo = Vector {};

  public function add(T $x): void {
    $this->repo[] = $x;
  }

  public function get(): T {
    return $this->repo[0];
  }
}

class User {
  public function frob(): void {
    // ...
  }
}

function f(): void {
  $repo = new Repository(); // Inferred to be Repository<User>
  $repo->add(new User());
  $repo->get()->frob(); // No error
}

function g(): void {
  $repo = new Repository(); // Inferred to be Repository<int>
  $repo->add(42);
  // $repo->get()->frob(); // Error: can't call frob() on an int
}

function h(): Repository<User> {
  $repo = new Repository(); // Inferred to be Repository<int>
  $repo->add(42);
  // return $repo; // Error: int is incompatible with User
}

@mindplay-dk
Copy link
Author

It wasn't working for me at all because there's no type-checking in global scope...

But anyway.

So in f(), it's able to "infer" User for $repo = new Repository(); from the subsequent $repo->add(new User());? So this works just because those two statements are in the same block scope? That's odd. I've never seen type inference implemented that way in any language.

But that will only work as long as you're doing something to trigger the type-hinting.

So in the case of a repository, what happens if you only need to get from the repository within the same block scope?

function f(): void {
  $repo = new Repository();
  $repo->get()->frob();
}

There's no way that could work?

Your example seems contrived - in the case of a repository, needing to add and then fetch an object from the same repository, within the same block scope, is an unlikely use-case.

The much more likely use-cases are (1) you only need to write, in which case it just infers whatever you pass, whether that happens to be correct or not, and (2) you only need to read, in which case you can't infer a return type.

Inference is a nice supplement to type-hinting for some cases - it's not an alternative, there will always be cases where you need to explicitly type-hint.

Point me to any other language with generics where type-hinting via inference is the only option?

@jwatzman
Copy link
Contributor

But that will only work as long as you're doing something to trigger the type-hinting.

Well, yes, otherwise, how can the callee possibly know anything about the type parameter, i.e., have a value of type T, in a language with type-erasure semantics for generics? In your example for function f, I assume your Repository<T> has a get function which returns T. But how does it get a T at all in the first place, since T is only used in covariant positions? Can you sketch out an actual implementation of Repository?

@mindplay-dk
Copy link
Author

Let's say the back-end behind repository is a key-value store with serialized objects - and the primary keys are globally unique, e.g. GUIDs. The implementation doesn't need to know the type of T at all - but the consumer very much does need to know.

Also consider the other example, the TypeScript hook implementation - there is no alternative to explicit type-hinting in this case, short of passing a dummy model instance to the constructor or something, just as a means of type-hinting, which would be totally bizarre.

The point is, these were two of the three first things I attempted in Hack that I've been doing in other languages with generics, and they can't be done - at least not in any way that makes sense. I'm sure these wouldn't be the last cases I encounter. My colleagues were equally baffled.

Generics don't make sense to me in a gradually-typed language without the ability to type-hint explicitly - as demonstrated, there will be cases where it doesn't suffice. Having to rethink your entire problem domain in order to get around this shortcoming isn't acceptable. I bet that's why most other languages have this feature. TypeScript, Dart, Haxe and ActionScript to name a few...

@Arilas
Copy link

Arilas commented May 18, 2015

I've also investigate to current Generics implementation and have some example based on Doctrine ORM (sorry If I've have mistakes in "as" operator):

interface EntityManagerInterface {
    public function getRepository(string $className):?EntityRepositoryInterface;
}

and have some Repository interface:

interface EntityRepositoryInterface<T as EntityInterface> {
    public function find(int $id): ?T;
}

So in EntityManager Implementation we may have many Repositories, that are related to Entity:

class User implements EntityInterface {}
class AbstractRepository<T> implements EntityRepositoryInterface<T> {
    //Implementation
}
class EntityManager implements EntityManagerInterface {
    public function getRepository(string $entityClassName): ?EntityRepositoryInterface {
        $repository = new AbstractRepository<$entityClassName>($this);
        return $repository;
    }
    //..
}

Currently we may work-around for this:

class AbstractRepository<T> implements EntityRepositoryInterface<T> {
    public function __construct(EntityManagerInterface $em, T $obj) {
        $this->em = $em;
        //$obj is only memory waste
    }
    //....
}
$repository = new AbstractRepository($this, new $entityClassName());

So in current example you may see that:

  1. You may statically check all code that are related to working with Repositories and that output of Repository find have correct class
  2. Generics is implemented like on another languages (Java for example)
  3. It's more clear to understand.

@mindplay-dk
Copy link
Author

After giving this some thought, I do think that generic type-hints need to be more than static - the static-only approach is too far removed from the dynamic nature of PHP. Things have types in PHP, at run-time.

I understand that it's more work, but the current approach is inconsistent with PHP at large, and doesn't play with Reflection at all.

Just my $.02

@jwatzman
Copy link
Contributor

Let's say the back-end behind repository is a key-value store with serialized objects - and the primary keys are globally unique, e.g. GUIDs. The implementation doesn't need to know the type of T at all - but the consumer very much does need to know.

For this use case, you may be better off with an upcoming feature called "type constants", which we still need to finish up and document properly. They can't be arbitrarily specified by callers, but this constraint means that they can be reified, and I think they'll fit your use case here. Sorry we don't have great docs on it yet, look for them soonish.

I understand that it's more work

That's... an understatement :) Reified generics aren't just "more work", but "a very complicated, fundamental change to how the runtime works".

It's worth noting that Java has erasure for its generics as well, so while I agree I wish they were reified, there definitely is precedent for doing it this way in a major language, one with widespread use and generally working generics.

@mindplay-dk
Copy link
Author

It's worth noting that Java has erasure for its generics as well

Java has some type erasure, but it also has this - even if types aren't always fully reified, at least explicitly parametrized types can be type-checked and reflected at run-time.

Whether that's possible or not, explicit type-hinting is possible in Java and every other language I know of - including languages like TypeScript, where all type-checks are strictly annotations.

@jazzdan
Copy link
Contributor

jazzdan commented Dec 9, 2015

Just to update this is anyone is following along: type constants now are fully documented.

@lexidor
Copy link
Collaborator

lexidor commented May 23, 2020

I am going over old issues on this repository, to see which ones apply to the current versions of hhvm.

I have tried internalizing the core of this issue and it appears you want runtime generics.
When have runtime generics on an opt-in basis in hhvm 4.17.0 and above. These are reified generics. With the help of the <<__Newable>> and <<__Enforceable>> attributes, they behave just like "real" typenames, like User and Repository would.

About your mentions of how the type inference is weird. It was more confusing back in the day, when [unresolved] was a common sight. Since hhvm 4.7.0, this concept is modeled differently and in a sound way. You are also allowed to specify the generics yourself (helping the inference out), which may not have been possible in the hh_client version you are talking about.

<<__EntryPoint>>
function main(): void {
    $user_repo = new Repository<User>();
    $user_repo->t = 0;
    //    Typing[4110] Invalid assignment
    //   --> file.hack
    //  3 |     $user_repo = new Repository<User>();
    //    |                                 ^^^^ Expected User
    //  4 |     $user_repo->t = 0;
    //    |     ^^^^^^^^^^^^^^^^^
    //    |                     ^ But got int
    //  7 |     public ?T $t;
    //    |             ^   via this generic T
}
class Repository<T> {
    public ?T $t;
}
class User {}

This issue is complex and I might be closing this prematurely. If this is the case, please reopen the issue and give me guidance on how to do better.

Please understand, I am from a different time in hhvm's life. The earliest hhvm version I have used is hhvm 3.25.3, which was in making a project forwards compatible. I have not used a version below 3.30 extensively. These versions were released in May and December of 2018 respectively. I might very well be missing a vital reference frame which is just completely lost in today's Zeitgeist.

@lexidor lexidor closed this as completed May 23, 2020
@azjezz
Copy link
Contributor

azjezz commented May 23, 2020

This issue has been solved thanks to reified generics.

a basic example :

interface Entity {}

interface IEntityManager {
  public function find<<<__Enforceable>> reify T as Entity>(
    classname<T> $class,
    string $identifier,
  ): T;
}

final class DummyEntityManager implements IEntityManager {
  public function find<<<__Enforceable>> reify T as Entity>(
    classname<T> $class,
    string $identifier,
  ): T {
    if ($class === User::class) {
      return new User($identifier) as T;
    }

    exit('dummy');
  }
}

interface IRepository<<<__Enforceable>> T as Entity> {
  public function find(string $identifier): T;

  public function findAll(dict<string, string> $criteria = dict[]): vec<T>;
}

class Repository<<<__Enforceable>> reify T as Entity>
  implements IRepository<T> {
  public function __construct(protected IEntityManager $em) {}

  public function find(string $identifier): T {
    return $this->em->find<T>(T::class, $identifier);
  }

  public function findAll(dict<string, string> $criteria = dict[]): vec<T> {
    return vec[
      $this->find($criteria['id']),
    ];
  }
}

class User implements Entity {
  public function __construct(private string $id) {}
}

class UserRepository extends Repository<User> {}

<<__EntryPoint>>
function main(): void {
    $repo = new Repository<User>(new DummyEntityManager());
    $user = $repo->find('foo');

    $repo = new UserRepository(new DummyEntityManager());
    $user = $repo->find('bar');
}

However, the need for a base class for entities is still required as hack still doesn't have to object type.

this would be completely solved once #8461 is resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants