Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

REQUEST: Pattern matching/better type inferencing with generics. #5023

Closed
TonyValenti opened this issue Sep 4, 2015 · 34 comments
Closed

REQUEST: Pattern matching/better type inferencing with generics. #5023

TonyValenti opened this issue Sep 4, 2015 · 34 comments
Assignees
Labels
4 - In Review A fix for the issue is submitted for review. Area-Language Design Feature Request

Comments

@TonyValenti
Copy link

Hi Roslyn Team:
When I was reading one of the feature requests from earlier:
#2212 (comment)

It made me think of something I have really been wanting as of late:

Pattern matching/better type inferencing with generics.

Today I can write code like this:

        private static void Foo<T, X>(T Object) where T : IEnumerable<X>, IComparable<X> {
            //Do Something
        }

        private class MyClass : IComparable<String>, IEnumerable<String> {
            int IComparable<string>.CompareTo(string other) {
                throw new NotImplementedException();
            }

            IEnumerator IEnumerable.GetEnumerator() {
                throw new NotImplementedException();
            }

            IEnumerator<string> IEnumerable<string>.GetEnumerator() {
                throw new NotImplementedException();
            }
        }

        private static void CallMyFunction() {
            var C = new MyClass();

            Foo<MyClass, String>(C);
        }

There is nastiness is in how I have to call the function.

        private static void CallMyFunction() {
            var C = new MyClass();

            Foo<MyClass, String>(C);
        }

In this example, I feel as though Foo should automatically figure out the type parameters and I should be able to call the method as follows:

        private static void CallMyFunction() {
            var C = new MyClass();

            Foo(C);
        }
@gafter
Copy link
Member

gafter commented Sep 6, 2015

It sounds like you want the bounds of type variables to participate in type inference. I can see us taking such a well-specified and prototyped change.

@gafter gafter added the help wanted The issue is "up for grabs" - add a comment if you are interested in working on it label Sep 6, 2015
@TonyValenti
Copy link
Author

That sounds great! This will make my code so much cleaner. Thank you!

On Sunday, September 6, 2015, Neal Gafter [email protected] wrote:

It sounds like you want the bounds of type variables to participate in
type inference. I can see us taking such a well-specified and prototyped
change.


Reply to this email directly or view it on GitHub
#5023 (comment).

Tony Valenti

@gafter
Copy link
Member

gafter commented Sep 6, 2015

@TonyValenti I've labeled this "up for grabs" in case someone from the community has the time and interest to take it on.

@HellBrick
Copy link

I think I might be able to take a shot at this. I'm currently toying with a prototype of this that works by promoting generic constraints of inferred type parameters to fake arguments. The idea is to add a 4th step to the phase 2 of the inference that works this way:

  1. For every constraint type of every type parameter that has been successfully fixed a new fake argument is added. The fixed type is used as an argument type, its default value is passed as an argument expression and the formal argument type is set to the constraint type. This is done no more than once per type parameter.
  2. Phase 1 is invoked for the newly added arguments.
  3. Phase 2 body is considered to make some progress if any constraints were promoted, and is considered to make no progress otherwise.

Here's an example to illustrate the idea:

    class Item : IComparable<string>
    {
    }

    class Container : IEnumerable<Item>
    {
    }

    class Whatever
    {
        void DoSomething<TContainer, TItem, TComparand>( TContainer container )
            where TContainer : IEnumerable<TItem>
            where TItem : IComparable<TComparand>
        {
        }

        void Invoke()
        {
            DoSomething( new Container() );
        }
    }

The type inferrer that exists today is able to infer that TContainer is Container and then fails. Here's what happens if the idea I've described is implemented:

  • We have a type parameter TContainer that's fixed to Container. It has IEnumerable<TItem> constraint which is promoted to a fake argument like this:
    class Whatever
    {
        void DoSomething<TContainer, TItem, TComparand>( TContainer container, IEnumerable<TItem> x )
            where TContainer : IEnumerable<TItem>
            where TItem : IComparable<TComparand>
        {
        }

        void Invoke()
        {
            DoSomething( new Container(), default( Container ) );
        }
    }
  • Phase 1 is invoked for the argument x, providing us bounds for TItem type.
  • Phase 2 body is repeated like it typically does, fixing TItem to Item.
  • Old phase 2 logic can't progress it any further, but now that TItem is fixed, we can promote its constraint like this:
    class Whatever
    {
        void DoSomething<TContainer, TItem, TComparand>( TContainer container, IEnumerable<TItem> x, IComparable<TComparand> y )
            where TContainer : IEnumerable<TItem>
            where TItem : IComparable<TComparand>
        {
        }

        void Invoke()
        {
            DoSomething( new Container(), default( Container ), default( Item ) );
        }
    }
  • Phase 1 is invoked for the argument y
  • Phase 2 continues its loop and is now able to fix TComparand to string
  • The fake arguments obviously exist only in the type inferrer's imagination =)

So what do you guys think? This seems to be working for the simple scenarios it's designed for, but there's a lot of scary stuff in the type inference that I haven't even begun to understand, so it would be useful to get input from someone who has been familiar with this part of the compiler for more than a day ;)

@bbarry
Copy link

bbarry commented Nov 7, 2015

I presume your phase 2 would still fail if Item were this instead:

class Item : IComparable<string>, IComparable<IEnumerable<char>>
{
}

@HellBrick
Copy link

Yep, it fails for the same reason the following fails today (because in the end it boils down to exactly the same problem):

    class Item : IComparable<string>, IComparable<IEnumerable<char>> {}
    void Compare<T>( IComparable<T> comparable ) {}

    void Call()
    {
        Compare( default(Item) );
    }

@HellBrick
Copy link

I've submitted a PR with the prototype to provide something less abstract to discuss.

@MrJul
Copy link
Contributor

MrJul commented Nov 16, 2015

I know the Roslyn team is probably busy with Update 1, but it would be nice to get some feedback on @HellBrick's great PR :)

@gafter
Copy link
Member

gafter commented Nov 16, 2015

@HellBrick We'll take a look at this, thanks!

@gafter
Copy link
Member

gafter commented Nov 16, 2015

Removing "Up For Grabs" as @HellBrick has grabbed it!

@gafter gafter removed the help wanted The issue is "up for grabs" - add a comment if you are interested in working on it label Nov 16, 2015
@gafter gafter self-assigned this Nov 20, 2015
@gafter gafter added the 4 - In Review A fix for the issue is submitted for review. label Nov 20, 2015
@gafter
Copy link
Member

gafter commented Nov 20, 2015

@HellBrick Nice work! We're still discussing it (but mostly we have vacations that prevent us from meeting much until January).

@HellBrick
Copy link

@TonyValenti

How would I test your enhancements out myself?

When I was working on my prototype, I had to invoke the test cases manually to try out the changes I had made. But since then update 1 has happened, and according to the Getting started page it's now possible to debug a custom version of Roslyn via an experimental instance of Visual Studio. I haven't tried that myself yet, but I believe this would be the way to go today. So I would suggest following the document I've mentioned to set up the debugging experience, and after that you'll need to cherry pick my commits from https://github.com/dotnet/roslyn/pull/6644/commits to test the compiler with the improvements I've proposed.

Here is one of the cases I'm talking about:

//Templated Class Definition
public class TemplatedClass<TData> {
 public TData Parameter {get; private set;}

 public TemplatedClass(TData Parameter){
   this.Parameter = Parameter;
 }
}

//Creating a new instance of the class:
public static void Main(){
 //This should infer to TemplatedClass<String>("Hello!");
 var Test = new TemplatedClass("Hello!");
}

This doesn't really seem related to the the problem you originally described in this issue =) We've been talking about inferring generic types from the type constraints before (and it's precisely what my PR addresses); the constructor example you've just shown may deserve a separate issue.

@RedwoodForest
Copy link

I believe the constructor example is a duplicate of #35.

@gafter
Copy link
Member

gafter commented Jan 8, 2016

We just looked at the proposal. It appears to be a breaking language change (i.e. will change the behavior of existing code), and there doesn't appear to be any nice way to fix that.

Here is an example of how the break arises. Given the two methods

void M(object) {}
void M<T, U>(T t) where T : IEnumerable<U> {}

and existing invocation

M("foo");

the previous version of the language (without this change) would call the former method. With the new rules in place, it would call the latter method.

Our customers tar and feather us when we take breaking changes, so this, unfortunately, might kill the idea of this feature.

Having said that, don't lose all hope. We (the members of the language design meeting) are sad that we cannot take any breaking changes, even one such as this that seems like an improvement. We're wondering if it will be possible for us to make breaking changes if we also ship a tool that updates your code so that it has the same meaning as before. Having a mechanism for such a tool to exist would have made it easier for us to do many things in the past, and would have enabled us to avoid language-design contortions, for example due to new keywords (such as var, nameof and async) being identifiers in the previous version.

/cc @MadsTorgersen

@gafter
Copy link
Member

gafter commented Jan 8, 2016

/cc @jaredpar Please see the previous comment about a hypothetical "user-code-compatibility-updater".

@TonyValenti
Copy link
Author

I think having a tool that upgrades your code is an awesome idea and I've wanted something like that in the past for sure. I'm not the one building it, but it sounds like with Roslyn, it should not be an epic task to complete either. It also answers a question that I've had in my mind: "How messy will C# get?"

I love C# but I've seen how different languages add new things and I think, "Wow! What a great language feature but C# will never have that because it would be a breaking change to lots of existing stuff." Such a tool would enable C# to keep momentum moving forward and not be shackled by design decisions that were made 15 years ago.

@svick
Copy link
Contributor

svick commented Jan 9, 2016

@gafter I don't know much about it, but as far as I can tell, that's pretty much what Python 3 did with its 2to3 tool and the result wasn't very good.

Though probably the biggest difference between Python 3 and potential breaking C# 7 is that C# uses IL and so libraries don't need to be updated to be usable from the new language version.

(There really should be a separate issue for this.)

@Inverness
Copy link

A tool to upgrade code would certainly ease the burden and make the concept more attractive to code maintainers. In fact I'd say it's essential if you want to make backwards incompatible changes seem like a good idea to companies or organizations that have monolithic projects.

@svick I think the fact that this will not break binary compatibility and the ability for native libraries to work will allow C# to avoid the fire that started when Python went from 2 to 3.

As @TonyValenti mentioned, I too was worried about how new features could be added in ways that don't create code messes.

I'm guessing backwards incompatible changes will be tied to the C# language version you select at build time?

@iskiselev
Copy link

Probably generic type infer could be enabled with some explicit opt-in syntax, such as typing <> for it after class/function name. In this case original example would look like:

        private static void CallMyFunction() {
            var C = new MyClass();

            Foo<>(C);
        }

It looks worse, than fully automatic type reference, but it is backward compatible.

@TonyValenti
Copy link
Author

Hi All,
I wanted to post another scenario that would be awesome to support.

I've done some work in SMLNJ and Prolog one of the things I love about those languages is how well they can solve what variables/values go where.

Here is a somewhat nasty class series of class definitions.

    public class Revision {
    }

    public class Revision<TForInstance> : Revision {
    }

    public class Instance<TRevisionValue> {
    }

    public class VersionedObjectHelper<TObject, TValue>
        where TObject : Instance<TValue>, new()
        where TValue : Revision<TObject>, new()
        {
    }

   public partial class EntityHelper : VersionedObjectHelper<EntityObject, EntityValue> {
   }

    public class HelperController<THelper, TObject, TValue> : ApiController 
        where THelper : VersionedObjectHelper<TObject,TValue>, new()
        where TObject : Instance<TValue>, new()
        where TValue : Revision<TObject>, new()  {
    }

    public class EntityController : HelperController<EntityHelper, EntityObject, EntityValue> {

    }

I am using Generics as essentially a way of having compile-time "code gen" for identical operations across multiple types.

Here are two things that I find nasty about how I have to write my code:

  1. When it comes to constraints, I feel as though a constraint in a base class should be propagated back up into derived classes. For example, I should be able to have the following definitions:
public class Base<T> where T : new() {
}

public class Derived<T> : Base<T> {
}

In that example, Derived's T should "inherit" the new() constraint automatically because it is used in Base.

  1. Another thing that I've found that I have to do is create "dummy" generic parameters because those generic parameters might be used in a base class. In the example above, I have this definition that has a ton of unnecessary stuff in it:
  public class HelperController<THelper, TObject, TValue> : ApiController 
        where THelper : VersionedObjectHelper<TObject,TValue>, new()
        where TObject : Instance<TValue>, new()
        where TValue : Revision<TObject>, new()  {
    }

Instead, I would like to be able to write the definition as follows:

  public class HelperController<THelper> : ApiController 
        where THelper : VersionedObjectHelper<TObject,TValue>, new()
    }

and then use it by doing this:

    public class EntityController : HelperController<EntityHelper> {

    }

In my use of it, the EntityHelper is deconstructed/pattern matched and the TObject and TValue values are satisfied that way.

Things like that would certainly clean up a lot of Generics-based code.

@iam3yal
Copy link

iam3yal commented Jun 16, 2016

@gafter It's always possible to have something like M<>("foo"); to disambiguate, it was suggested in multiple posts before, what do you think about it?

Personally, I feel like this feature should be considered it takes away the verbosity of generics and hopefully you will come up with ways to do it! I'm really looking forward to this one.

p.s. A tool to modify the code is a great idea the question is whether people will want to run such a tool.

@gafter
Copy link
Member

gafter commented Jun 16, 2016

@eyalsk How does that make this not a breaking change?

@TonyValenti
Copy link
Author

Because today you have to specify type parameters when you call a
function. You can't simply do Foo<>(Param)

On Thu, Jun 16, 2016 at 10:47 AM, Neal Gafter [email protected]
wrote:

@eyalsk https://github.com/eyalsk How does that make this not a
breaking change?


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#5023 (comment), or mute
the thread
https://github.com/notifications/unsubscribe/AM-qVl43IPPrkwASXIUQwbYVYevUvADsks5qMXAogaJpZM4F4Szu
.

Tony Valenti

@iam3yal
Copy link

iam3yal commented Jun 16, 2016

@gafter Well, can you do M<>("foo") today? afaik it's an error but dunno maybe I'm unaware and it's actually a breaking change, can you please elaborate?

@HellBrick
Copy link

I think adding a new syntax for enabling new type inference features is not a good idea. First, it would be very confusing to have two subtly different type inference mechanisms for M("foo") and M<>("foo"). Second, it doesn't fix the long-term problem of handling the breaking changes: even if the new syntax ships, there can easily be future improvements that will introduce new breaking changes for this new syntax and we will once again have exactly the same problem we're discussing right now.

So I believe introducing an analyzer-like mechanism of detecting and fixing the code affected by the breaking changes is the way to go here, like mentioned in #7850 (comment). I'm really curious about the details of how this is supposed to work though. @gafter, is there an issue/document that tracks/describes this? And if there isn't one, can you tell us what the current thinking on this matter is? I guess it has been discussed by LDM quite a few times during the years of developing the language, so you guys should have some knowledge of what the main challenges are and some ideas on solving them.

@iam3yal
Copy link

iam3yal commented Jun 16, 2016

@HellBrick

I think adding a new syntax for enabling new type inference features is not a good idea. First, it would be very confusing to have two subtly different type inference mechanisms for M("foo") and M<>("foo")

I totally agree with you that having a tool to modify the code and accommodate the required changes is way better than introducing a new syntax, simply because if there's no need for a new syntax then no one needs to learn anything new, the old way was always to introduce new syntax to evolve a language if and only if it's necessary so my proposal came before I knew a tool is proposed.

However, in cases that a tool wouldn't be feasible for whatever reason (and in all honesty I don't think that there should be a reason not to change it but still), would you rather not have it at all?

@HellBrick
Copy link

HellBrick commented Jun 17, 2016

However, in cases that a tool wouldn't be feasible for whatever reason (and in all honesty I don't think that there should be a reason not to change it but still), would you rather not have it at all?

If we're talking about the breaking changes in general, then I'd say it depends on the discoverability of the breaks. If the code just stops compiling after an upgrade, that's okay: you have a clear error and a pretty good idea how to fix it. But some breaking changes are next to impossible to detect manually, even if they are properly documented and understood. The inference changes caused by the current issue is a good example of that. Combing though all the overload calls that could have changed their behavior after recompiling is very difficult and error-prone even if you know the codebase well. So I think it wouldn't be nice to ship such a feature without an automatic way to find the breaks.

I don't think the tool is too difficult to write, though. Even if designing a proper upgrade path that provides a nice user experience across all possible IDEs and console compilers could be tricky, writing a simple Roslyn analyzer + code fix per a breaking change seems quite possible.

@ThatRendle
Copy link

ThatRendle commented Jun 21, 2016

This could really optimize LINQ extension methods (and similar).

Example: given this slightly more involved version of IEnumerable

    public interface IExplicitEnumerable<out T, out TEnumerator>
        where TEnumerator : IEnumerator<T>
    {
        TEnumerator GetEnumerator();
    }

one could write generic forms of LINQ extension methods that look something like this:

    public static class InferredLinq
    {
        public static void Run<T, TAble, TAtor>(this TAble source, Action<T> func)
            where TAtor: IEnumerator<T>
            where TAble : IExplicitEnumerable<T, TAtor>
        {
            var enumerator = source.GetEnumerator();
            while (enumerator.MoveNext())
            {
                func(enumerator.Current);
            }
            enumerator.Dispose();
        } 
    }

The authors of types that implement IEnumerable<T> then have the option of implementing this extra interface with IExplicitEnumerable<T, CustomEnumerator<T>>.

Advantages:

  • For all types, method invocation against the type itself instead of indirect via the interface (i.e. faster)
  • For value types, no boxing allocations

It's possible at present if you supply the explicit type parameters:

    range.Run<int, Range, Range.RangeEnumerator>((n) => Console.WriteLine(n.ToString()));

but that's hideous, especially since you have to remember what the CustomEnumerator<T> class was for any implementing type. Compiler inference as proposed here would take care of that.

@masaeedu
Copy link

masaeedu commented Jul 15, 2016

+1 for letting the pressure for a backwards-compat analyzer continue to build, rather than relieving it by contorting language features to squeak past backwards-compat issues. @gafter would you folks be more open to proposals with backward compat issues if they came with Roslyn refactorings to address them? It shouldn't be too difficult to write one that addresses the problem you brought up, for example, although I haven't thought about all edge cases.

@Thaina
Copy link

Thaina commented Dec 24, 2016

Wish for this too

@gafter
Copy link
Member

gafter commented Mar 27, 2017

We are now taking language feature discussion in other repositories:

Features that are under active design or development, or which are "championed" by someone on the language design team, have already been moved either as issues or as checked-in design documents. For example, the proposal in this repo "Proposal: Partial interface implementation a.k.a. Traits" (issue 16139 and a few other issues that request the same thing) are now tracked by the language team at issue 52 in https://github.com/dotnet/csharplang/issues, and there is a draft spec at https://github.com/dotnet/csharplang/blob/master/proposals/default-interface-methods.md and further discussion at issue 288 in https://github.com/dotnet/csharplang/issues. Prototyping of the compiler portion of language features is still tracked here; see, for example, https://github.com/dotnet/roslyn/tree/features/DefaultInterfaceImplementation and issue 17952.

In order to facilitate that transition, we have started closing language design discussions from the roslyn repo with a note briefly explaining why. When we are aware of an existing discussion for the feature already in the new repo, we are adding a link to that. But we're not adding new issues to the new repos for existing discussions in this repo that the language design team does not currently envision taking on. Our intent is to eventually close the language design issues in the Roslyn repo and encourage discussion in one of the new repos instead.

Our intent is not to shut down discussion on language design - you can still continue discussion on the closed issues if you want - but rather we would like to encourage people to move discussion to where we are more likely to be paying attention (the new repo), or to abandon discussions that are no longer of interest to you.

If you happen to notice that one of the closed issues has a relevant issue in the new repo, and we have not added a link to the new issue, we would appreciate you providing a link from the old to the new discussion. That way people who are still interested in the discussion can start paying attention to the new issue.

Also, we'd welcome any ideas you might have on how we could better manage the transition. Comments and discussion about closing and/or moving issues should be directed to #18002. Comments and discussion about this issue can take place here or on an issue in the relevant repo.


As described in #5023 (comment), the LDM looked at this and found it to be a breaking change, and one that we would therefore be quite unlikely to ever take.

@declard
Copy link

declard commented Nov 9, 2019

I'm wondering why a syntax like Method<Type1, _, _, _, Type5<_>, _>(args) was not proposed. It seems to be backward compatible - just tell the compiler "well, i have some generic parameters there, could you infer them for me from the context?"

@ahmednfwela
Copy link

why was this ruled out as a breaking change ?
it appears the only case where this would be a breaking change is if there is ambiguity, so why not fallback to old behavior of needing to explicitly define all type arguments when there is ambiguity ?

so for this case:

void M(object) {}
void M<T, U>(T t) where T : IEnumerable<U> {}

M("foo")
  1. determine if there are overloads for M with ambiguous constraints
  2. if no: assign the type arguments based on input
  3. if yes: check for a flag (e.g. PreferMostStrictTypeArgumentOverload)
  4. if flag is set: use the M<T,U> overload
  5. if flag is not set (default): use the M(object) overload

@CyrusNajmabadi
Copy link
Member

@ahmednfwela language design happens over at dotnet/csharplang now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4 - In Review A fix for the issue is submitted for review. Area-Language Design Feature Request
Projects
None yet
Development

No branches or pull requests