Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regex collections should implement generic collection interfaces #13933

Closed
justinvp opened this issue Dec 16, 2014 · 51 comments
Closed

Regex collections should implement generic collection interfaces #13933

justinvp opened this issue Dec 16, 2014 · 51 comments
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.RegularExpressions
Milestone

Comments

@justinvp
Copy link
Contributor

CaptureCollection, GroupCollection, and MatchCollection currently only implement the non-generic ICollection interface. These collections should implement the generic collection interfaces to better interoperate with more modern APIs, such as LINQ. Since these collections are already indexable, they should implement IList<T> and IReadOnlyList<T>, as well as the non-generic IList (to be consistent with the generic interfaces).

Rationale and Usage

This is certainly a nice-to-have, but it is a long-standing request that developers still ask about. Implementing the generic interfaces will allow these collections to be used more easily with LINQ and interoperate better with more modern framework and library APIs.

For example, to use these collections with LINQ right now you have to know about and remember to use Enumerable.Cast<TSource>() to cast the non-generic IEnumerable into an IEnumerable<T>:

var captures =
    from capture in match.Groups.Cast<Group>().Last().Captures.Cast<Capture>()
    select capture.Value;

With these changes you'd no longer have to do that:

var captures =
    from capture in match.Groups.Last().Captures
    select capture.Value;

Plus, in the above example, you'd get a performance improvement when using Enumerable.Last<TSource>.() as its implementation has a fast-path for collections that implement IList<T>.

Proposed API

// DebuggerDisplay and DebuggerTypeProxy added
[DebuggerDisplay("Count = {Count}")]
[DebuggerTypeProxy(typeof(RegexCollectionDebuggerProxy<Capture>))]
// Previously only implemented ICollection
public class CaptureCollection : IList<Capture>, IReadOnlyList<Capture>, IList
{
    // Existing members
    public int Count { get; }
    public Capture this[int i] { get; }
    public IEnumerator GetEnumerator();
    object ICollection.SyncRoot { get; }
    bool ICollection.IsSynchronized { get; }
    void ICollection.CopyTo(Array array, int arrayIndex);

    // Proposed members
    public void CopyTo(Capture[] array, int arrayIndex);
    IEnumerator<Capture> IEnumerable<Capture>.GetEnumerator();
    int IList<Capture>.IndexOf(Capture item);
    void IList<Capture>.Insert(int index, Capture item);
    void IList<Capture>.RemoveAt(int index);
    Capture IList<Capture>.this[int index] { get; set; }
    void ICollection<Capture>.Add(Capture item);
    void ICollection<Capture>.Clear();
    bool ICollection<Capture>.Contains(Capture item);
    bool ICollection<Capture>.IsReadOnly { get; }
    bool ICollection<Capture>.Remove(Capture item);
    int IList.Add(object value);
    void IList.Clear();
    bool IList.Contains(object value);
    int IList.IndexOf(object value);
    IList.Insert(int index, object value);
    bool IList.IsFixedSize { get; }
    bool IList.IsReadOnly { get; }
    void IList.Remove(object value);
    void IList.RemoveAt(int index);
    object IList.this[int index] { get; set; }
}

// DebuggerDisplay and DebuggerTypeProxy added
[DebuggerDisplay("Count = {Count}")]
[DebuggerTypeProxy(typeof(RegexCollectionDebuggerProxy<Group>))]
// Previously only implemented ICollection
public class GroupCollection : IList<Group>, IReadOnlyList<Group>, IList
{
    // Existing members
    public int Count { get; }
    public Group this[int groupnum] { get; }
    public Group this[String groupname] { get; }
    public IEnumerator GetEnumerator();
    object ICollection.SyncRoot { get; }
    bool ICollection.IsSynchronized { get; }
    void ICollection.CopyTo(Array array, int arrayIndex);

    // Proposed members
    public void CopyTo(Group[] array, int arrayIndex);
    IEnumerator<Group> IEnumerable<Group>.GetEnumerator();
    int IList<Group>.IndexOf(Group item);
    void IList<Group>.Insert(int index, Group item);
    void IList<Group>.RemoveAt(int index);
    Group IList<Group>.this[int index] { get; set; }
    void ICollection<Group>.Add(Group item);
    void ICollection<Group>.Clear();
    bool ICollection<Group>.Contains(Group item);
    bool ICollection<Group>.IsReadOnly { get; }
    bool ICollection<Group>.Remove(Group item);
    int IList.Add(object value);
    void IList.Clear();
    bool IList.Contains(object value);
    int IList.IndexOf(object value);
    IList.Insert(int index, object value);
    bool IList.IsFixedSize { get; }
    bool IList.IsReadOnly { get; }
    void IList.Remove(object value);
    void IList.RemoveAt(int index);
    object IList.this[int index] { get; set; }
}

// DebuggerDisplay and DebuggerTypeProxy added
[DebuggerDisplay("Count = {Count}")]
[DebuggerTypeProxy(typeof(RegexCollectionDebuggerProxy<Match>))]
// Previously only implemented ICollection
public class MatchCollection : IList<Match>, IReadOnlyList<Match>, IList
{
    // Existing members
    public int Count { get; }
    public virtual Match this[int i] { get; }
    public IEnumerator GetEnumerator();
    object ICollection.SyncRoot { get; }
    bool ICollection.IsSynchronized { get; }
    void ICollection.CopyTo(Array array, int arrayIndex);

    // Proposed members
    public void CopyTo(Match[] array, int arrayIndex);
    IEnumerator<Match> IEnumerable<Match>.GetEnumerator();
    int IList<Match>.IndexOf(Match item);
    void IList<Match>.Insert(int index, Match item);
    void IList<Match>.RemoveAt(int index);
    Match IList<Match>.this[int index] { get; set; }
    void ICollection<Match>.Add(Match item);
    void ICollection<Match>.Clear();
    bool ICollection<Match>.Contains(Match item);
    bool ICollection<Match>.IsReadOnly { get; }
    bool ICollection<Match>.Remove(Match item);
    int IList.Add(object value);
    void IList.Clear();
    bool IList.Contains(object value);
    int IList.IndexOf(object value);
    IList.Insert(int index, object value);
    bool IList.IsFixedSize { get; }
    bool IList.IsReadOnly { get; }
    void IList.Remove(object value);
    void IList.RemoveAt(int index);
    object IList.this[int index] { get; set; }
}

Details

  • There was some discussion as to whether only the read-only interfaces should be implemented, or both the read-only and mutable interfaces. The consensus is to implement both the read-only and mutable interfaces. This is consistent with other collections in the framework. The mutable interfaces are implemented as read-only: mutable members are implemented explicitly and throw NotSupportedException (like ReadOnlyCollection<T>).
  • There was an open question as to whether the non-generic IList should be implemented as well. These collections are indexable and if IList<T> and IReadOnlyList<T> are being implemented, IList should be implemented as well. This does add several more members, but they are all implemented explicitly so they don't add any new public members to intellisense, and the implementations are very straightforward.
  • ICollection<T>.CopyTo is implemented implicitly (public).
  • All other new members are implemented explicitly (non-public):
    • Mutable members are implemented explicitly because these collections are read-only and the mutable members throw NotSupportedException (like ReadOnlyCollection<T>).
    • IList members are implemented explicitly to hide non-generic members from intellisense.
    • IList<T>.IndexOf and ICollection<T>.Contains are implemented explicitly because these methods aren't very useful for these collections and should not be visible in intellisense by default. They're not useful because an implementation using EqualityComparer<T>.Default (consistent with other collections) will search the collection using reference equality due to the fact that Capture, Group, and Match do not implement IEquatable<T> and do not override Equals() and GetHashCode(). Further, these types do not have public constructors -- they are created internally by the regex engine, making it very unlikely that you'd want to search for an item in a collection "A" that was obtained from collection "B".
    • IEnumerable<T>.GetEnumerator() must be implemented explicitly because the non-generic IEnumerable.GetEnumerator() is already implemented implicitly and we can't overload on return type. This also precludes returning a struct Enumerator (for better enumeration performance) because changing the return type of the existing method would be a binary breaking change. As a result, you'll still have to specify the type when using foreach (e.g. foreach (Capture capture in captures)); you won't be able to use var (e.g. foreach (var capture in captures)), unfortunately.

Open Questions

  • Should GroupCollection implement IDictionary<string, Group>, IReadOnlyDictionary<string, Group>, and IDictionary? GroupCollection already has a string indexer. Is it worth implementing the dictionary interfaces as part of this? Personally, I'm leaning toward "no" because there isn't a compelling scenario for the dictionary interfaces, and they can always be added in the future when needed.

Pull Request

A PR with the proposed changes is available: dotnet/corefx#1756

Updates

  • Edited this description to make it more of a speclet, based on the discussion below and the proposed API Review process.
  • Some improvements based on feedback from @sharwell.
  • Fixed existing members.
  • Added IList. These collections are indexable and it would be strange if IList<T> and IReadOnlyList<T> were implemented alongside ICollection but without IList.
  • Added DebuggerDisplay and DebuggerTypeProxy attributes.
  • Made ICollection<T>.CopyTo implicit (public).
@joshfree
Copy link
Member

Hey Justin!
We'd be happy to work with you on this API addition in January (things are slowing down right now with the upcoming holidays. 🎄 ).

/cc @terrajobst @KrzysztofCwalina

@sharwell
Copy link
Member

Love the idea. I'm not sure how strict the backwards compatibility policy is for this particular library, but I'm going to assume it's pretty high considering how widespread use of this library is. Based on that, I believe the path with the highest chance of success is adding the following interfaces:

Type New Interface(s)
CaptureCollection IReadOnlyList<Capture>
MatchCollection IReadOnlyList<Match>
GroupCollection IReadOnlyList<Group>, IReadOnlyDictionary<string, Group>

In particular, the above intentionally avoid adding ICollection<T>, specifically due to the fact that ICollection<T> does not extend ICollection and therefore runs a risk of causing problems with overload resolution for some users.

Also note that it won't be possible to use the struct-style enumerators like you see for List<T>, etc., because this would also be a breaking change.

@justinvp
Copy link
Contributor Author

@sharwell what overloads from ICollection<T> are you concerned about? All members from ICollection<T> would be implemented explicitly.

public class CaptureCollection : ICollection, ICollection<Capture>, IReadOnlyCollection<Capture>
{
    // Existing...
    public Capture this[int i] { get; }
    public int Count { get; }
    public IEnumerator GetEnumerator();
    Object ICollection.SyncRoot { get; }
    bool ICollection.IsSynchronized { get; }
    void ICollection.CopyTo(Array array, int arrayIndex);

    // Proposed...
    bool ICollection<Capture>.IsReadOnly { get; }
    void ICollection<Capture>.Add(Capture item);
    void ICollection<Capture>.Clear();
    bool ICollection<Capture>.Contains(Capture item);
    void ICollection<Capture>.CopyTo(Capture[] array, int arrayIndex);
    bool ICollection<Capture>.Remove(Capture item);
    IEnumerator<Capture> IEnumerable<Capture>.GetEnumerator();
}

@sharwell
Copy link
Member

@justinvp It's not the methods of ICollection<T> I'm worried about. Consider the following two extension methods:

public static void Foo(this ICollection collection)
{
}

public static void Foo<T>(this ICollection<T> collection)
{
}

Currently, if you call captureCollection.Foo(), you'll call the first overload. If you update CaptureCollection to implement ICollection<T>, the code will no longer compile because there is no best match. While a similar scenario is still possible, implementing IReadOnlyList<T> instead of ICollection<T> has two benefits over the above:

  1. These are read-only collections, so the new interfaces are a more accurate representation of the available behavior.
  2. IReadOnlyCollection<T>, which is the interface capable of causing a problem like the above, is less likely to result in an overload conflict even if the only reason is the ~10 year age difference between it and ICollection<T>. In addition, code designed to work with IReadOnlyCollection<T> is often focused on the read-only aspect of these interfaces so is less likely to appear side by side with non-generic, read-write interfaces.

@justinvp
Copy link
Contributor Author

@sharwell Fair enough :)

sharwell referenced this issue in sharwell/corefx Dec 17, 2014
…(fixes #271)

* CaptureCollection implements IReadOnlyList<Capture>
* GroupCollection implements IReadOnlyList<Group>
* MatchCollection implements IReadOnlyList<Match>
@justinvp
Copy link
Contributor Author

How common are such overloads in practice?

public static void Foo(this ICollection collection)
{
}

public static void Foo<T>(this ICollection<T> collection)
{
}

These would already be problematic as they are, because you would already run into the "ambiguous call" issue when used with the most commonly used collections (T[], List<T>, Collection<T>, ReadOnlyCollection<T>, etc.), forcing you to always disambiguate the calls.

I realize that for existing collections, such calls would already be disambiguated in order for the the code to compile, and that by implementing ICollection<T> on the regex collections, it would require someone to update their code to disambiguate the calls when upgrading to a new version of the API.

Would this be a binary breaking change or just a potential source breaking change?

By not implementing ICollection<T> (or IList<T>), we're sacrificing being able to easily make use of these collections in places that do operate on an ICollection<T> (or IList<T>) -- places that haven't been updated to use IReadOnlyList<T>.

@sharwell
Copy link
Member

That may be true, but you do get all the functionality that operates on IEnumerable<T>. I think that covers all the major cases I've run into personally.

Since I'm not on the team (just a random [hopeful] contributor), I wanted to start with what seemed to be a path of least resistance. Do you have a motivating example for why it should implement additional interfaces?

@justinvp
Copy link
Contributor Author

I agree with you on IEnumerable<T>. I think that alone is a huge improvement!

However, there are places that make use of ICollection<T> and IList<T>, that haven't been updated (yet) to take advantage of IReadOnlyCollection<T> and IReadOnlyList<T>.

For example, LINQ's Enumerable.Last<TSource>() has a fast path for IList<T>:

        public static TSource Last<TSource>(this IEnumerable<TSource> source) {
            if (source == null) throw Error.ArgumentNull("source");
            IList<TSource> list = source as IList<TSource>;
            if (list != null) {
                int count = list.Count;
                if (count > 0) return list[count - 1];
            }
            else {
                using (IEnumerator<TSource> e = source.GetEnumerator()) {
                    if (e.MoveNext()) {
                        TSource result;
                        do {
                            result = e.Current;
                        } while (e.MoveNext());
                        return result;
                    }
                }
            }
            throw Error.NoElements();
        }

Enumerable.Last<TSource>() should probably be updated to check for IReadOnlyList<T>, but I'm sure there are other places that do similar things with ICollection<T> and IList<T> that haven't been updated for the newer read-only collection interfaces.

@davkean
Copy link
Member

davkean commented Dec 17, 2014

@sharwell This is a compile-time break. However, unless it is proven to be prevalent, is not we consider a breaking change from our stand point (theoretically, any addition could be a breaking change.)

If we do this, I'd like make sure there's no serialization impact - I think there is none[1], but I'd like someone to make explicitly sure there isn't; XML serializer doesn't require an opt in by the type and we have we've broken a few consumers in the past (and backed them out before we shipped).

[1] In this case, I think the XML serializer needs a public "Add" before it will attempt to serialize this, but please confirm.

@sharwell
Copy link
Member

Even if IList<T> or ICollection<T> were allowed to be added, I lean against them here mostly because each of these is a read-only collection. Implementing these interfaces would be reminiscent of ReadOnlyCollection<T>, where a large number of members needed to be added only to throw NotSupportedException in the end.


Asterisks make bad footnotes in Markdown¹.

¹ Alt+0185 (number pad only) makes for a good one²
² Alt+0178 is another³
³ So is Alt+0179

😄

@davkean
Copy link
Member

davkean commented Dec 17, 2014

If we had a time machine and added IReadOnlyList<T> and IReadOnlyCollection<T> back in v1 (and had the mutable versions implement them) I would agree with you. But unfortunately, we've not yet developed such technology (or someone developed it, and then used it to prevent it from being developed), but I digress.

From our API review standpoint however, it is perfectly fine to opt into ICollection<T>/IList<T> for the sole reason of interoperability. Even today, a consumer of these interfaces needs to make a choice; use stronger semantics and clear intentions I get from IReadOnlyCollection<T> but accept that not everyone can interop with me, or accept the weaker semantics by taking ICollection<T> but be able to interop with more things.

@sharwell
Copy link
Member

So your vote is for implementing both IList<T> and IReadOnlyList<T> here?

Either way dotnet/corefx#277 should be a good starting point. 😉

@terrajobst
Copy link
Member

We treat IReadOnlyXxx as views of collections that could be mutable, read-only, or immutable. See this article for more details.

As far the serialization impact goes: I don't think implementing the read-only interfaces make a difference. AFAIK making a type implement IEnumerable or IEnumerable<T> would impact serailization but this change shouldn't.

EDIT

My point is that we should implement both, the read-only interface as well as mutable-interfaces. That's the same thing we did for immutable and it's important for interop with existing APIs.

@davkean
Copy link
Member

davkean commented Dec 19, 2014

@terrajobst Yep, but my comment was around implementing the mutable versions as well as the readonly versions.

@terrajobst
Copy link
Member

Agreed. I've clarified my comment above.

@sharwell
Copy link
Member

@terrajobst Your edit made a huge difference.

@davkean and @terrajobst : Since I haven't worked with you directly, it can be tough to understand from the wording up until that edit when you are simply pointing out a historical observation, and when you are for/against taking a particular action. I took a conservative viewpoint because I've found it to be generally more acceptable for arbitrary 3rd party projects I contribute to. Since you were both for addressing the issue with a more comprehensive solution, it helps to be more clear about it.

@davkean
Copy link
Member

davkean commented Dec 19, 2014

@sharwell Sorry, these sorts of things will come up in the API review/speclet review, where it will be a little clearer; feedback is usually given in the form of "you must do this", "you must not do that" and "you should consider this". Very similar style to the Framework Design Guidelines themselves.

@sharwell
Copy link
Member

If it makes you feel any better, it wasn't long ago (a few months maybe?) that I was about the worst at this aspect. 😯

@justinvp
Copy link
Contributor Author

I edited the description for this issue to make it more of a speclet, based on the discussion so far and the proposed API Review process.

@justinvp
Copy link
Contributor Author

@sharwell In the original description I had said I would work on the implementation for this. I was planning to submit a PR after the issue had been reviewed (per the Contributing guide), but you jumped in with an initial PR. I guess there needs to be a better way of claiming "dibs" on a contribution? Not sure how we'll proceed with this after the issue is tagged "Accepting PRs".

@sharwell
Copy link
Member

@justinvp I missed that part. I closed my pull request so it's all yours. I'll reopen it only if you find you aren't going to have time to get it done and post a message back here saying so.

@sharwell
Copy link
Member

@justinvp IMO your spec reads well and is comprehensive. The only thing I can even think to change is to clarify that the new explicitly-implemented members are due to overload conflicts (GetEnumerator()) or because they represent methods that modify a collection, but these collections are read-only.

You might also state that methods which change the collection throw NotSupportedException in the same manner as ReadOnlyCollection<T> (though this might be considered obvious).

@justinvp
Copy link
Contributor Author

@sharwell Thanks! I updated the description based on your feedback.

@justinvp
Copy link
Contributor Author

As I started working on an initial implementation, I decided to go ahead and implement IList. These collections are indexable and it would be strange if IList<T> and IReadOnlyList<T> were implemented alongside ICollection but without IList. This does add several more members, but they are all implemented explicitly so they don't add any new public members to intellisense, and the implementations are very straightforward (most throw NotSupportedException). I've updated the speclet.

@eatdrinksleepcode
Copy link

For what it's worth, I am very strongly against implementing mutable interfaces on immutable collections. I am well aware of the interoperability argument, but I believe that intentionally breaking the semantic contract of an interface carries with it a very high burden of proof, and I haven't seen any compelling evidence that such burden has been met, especially given that interoperability concerns have multiple possible mitigations. It is bad enough that we have to live with the existing semantic dissonance caused by years of not having a non-writable indexable interface. We should not be making the problem even worse.

@terrajobst
Copy link
Member

For what it's worth, I am very strongly against implementing mutable interfaces on immutable collections.

I'm afraid that ship has already sailed. Immutable collection is marked as stable and already implements the mutable interfaces.

believe that intentionally breaking the semantic contract of an interface carries with it a very high burden of proof

We don't break the semantic contract of an interface. The .NET collection interfaces don't guarantee that the consumer can mutate the instance. That's why both ICollection and ICollection<T> have an IsReadOnly property. The immutable collections implement all mutable interfaces in a read-only fashion so you can't mutate the collections, as this would indeed be a violation of the immutability property.

The Regex collections should follow this approach.

@eatdrinksleepcode
Copy link

I'm afraid that ship has already sailed. Immutable collection is marked as stable and already implements the mutable interfaces.

That ship may have sailed for Immutable Collections, but here we are talking about existing Regex collections. I think it is still a valid discussion to have, nor am the first to suggest it; I am throwing my (limited) weight behind @sharwell's point earlier in this thread.

We don't break the semantic contract of an interface. The .NET collection interfaces don't guarantee that the consumer can mutate the instance.

The .NET collection interfaces don't guarantee mutability according to their documentation; but when half of the methods on the type are for mutating the instance, I would argue that the interface does convey mutability simply by its contents. From an API design perspective the names and signatures of members are a far more important communication mechanism with the API consumer than the documentation is.

That's why both ICollection and ICollection have an IsReadOnly property.

I have never in my career seen the IsReadOnly property checked by non-Framework code, nor can I think of many cases where it would make sense. The number of use cases that can reasonably alter their behavior based on the IsReadOnly property is vanishingly small; the vast majority of implementations either write to the list, or they don't.

This is also the same argument that was used in 2004 to justify the lack of a read-only interface; yet the arrival of IReadOnlyList in .NET 4.5 is compelling evidence that it doesn't hold water.

I am afraid that we have become numb to the absurdity of implementing an interface and then throwing NotSupportedException from half of its members, because we were forced to do exactly that for so long. But we are not forced to do it any longer, and we shouldn't keep doing it just because it is what we have always done.

@terrajobst
Copy link
Member

We believe that consistency trumps most other aspects, which is why following existing patterns is virtually always good unless there are compelling reasons to abandon them. We re-evaluated whether abandoning would be worthwhile when we added immutable collections and decided against it. I don't see a reason why Regex would be different.

Checking IsReadOnly is probably a good practice in order to find contract breaks early on. Just because folks don't follow proper coding practices doesn't make them mood; I've also seen plenty of code that doesn't check for null either. Doesn't mean we should stop doing that.

It's also worth pointing out that read-only collections exist since .NET 1.0. So it's not that the property IsReadOnly only returns true for collections introduced in the last couple of years -- they are around since ever.

@davkean
Copy link
Member

davkean commented Jan 11, 2015

You need to o be able to build against this library and run on .NET Framework 4.6.

Sent from my Windows Phone


From: Justin Van Pattenmailto:[email protected]
Sent: ý1/ý10/ý2015 5:01 PM
To: dotnet/corefxmailto:[email protected]
Cc: David Keanmailto:[email protected]
Subject: Re: [corefx] Regex collections should implement generic collection interfaces (#271)

But is it possible for an assembly compiled against .NET 4.5 (or earlier) to run against .NET Core? If not, the argument of binary compatibility is moot, since the code will have to be recompiled anyway... Implementing the generic GetEnumerator() implicitly would be great, since it would make it possible to write foreach (var capture in captures).

That is an excellent point, @thomaslevesquehttps://github.com/thomaslevesque! Personally, I'd really like to be able to expose the generic GetEnumerator() to allow var in foreach. Since this is a new library in .NET Core, this may be the best opportunity to make such a change (before developers take a binary dependency on this library). It'd be a source compatible change, so any existing code ported to .NET Core would continue to work when recompiled, without the developer having to make any changes.

Further, if this change is possible, we may also want to take the opportunity to expose a struct enumerator to avoid the enumerator allocation.

I'd be happy to make such changes to the PR, if the owners agree the change can be made.


Reply to this email directly or view it on GitHubhttps://github.com/dotnet/corefx/issues/271#issuecomment-69479088.

@thomaslevesque
Copy link
Member

You need to o be able to build against this library and run on .NET Framework 4.6.

Ah, I knew I was missing something ;)

I guess it wouldn't matter for reference types, but it would break for value types...

EDIT: Actually my last comment doesn't make sense... we're talking about specific collections, and there are no value types involved here.

@justinvp
Copy link
Contributor Author

You need to o be able to build against this library and run on .NET Framework 4.6.

So if you accept this suggestion (adding the generic interfaces to these types), the changes will have to be added to .NET 4.6 as well? How much time is left to make changes to .NET 4.6?

@thomaslevesque
Copy link
Member

Actually my previous comment doesn't make sense... we're talking about specific collections, and there are no value types involved here. But then, couldn't it work in .NET 4.6 ? If the code is bound at compile time to a GetEnumerator method that returns a non-generic IEnumerator, but the actual method at runtime returns a generic IEnumerator<Capture>, it should work, right? Since IEnumerator<Capture> inherits IEnumerator...

@weshaggard
Copy link
Member

Yes @justinvp any types that already exist on the full desktop framework will need to be updated there as well if they are updated in .NET Core. We have a high level goal of making code written for .NET work no matter what .NET platform you are targeting. As for how much time is left to add new public APIs to .NET 4.6 at this point we are not accepting any more unless they are critical.

With that said we fully expect .NET Core to evolve much more quickly then the full .NET framework that ships in Windows. We just have to be aware that any changes need to eventually make it there as well so while thinking about breaking changes you cannot just consider .NET Core you must consider the other platforms as well. If we don't do that then we risk diverging the platforms and making .NET development as a whole more difficult.

@justinvp
Copy link
Contributor Author

Thanks, @weshaggard. I think it'd be worth updating one of the wiki pages with this information as I don't believe it's explicitly stated anywhere currently. I opened dotnet/corefx#392 to track that.

@justinvp
Copy link
Contributor Author

@davkean & @weshaggard, this is getting off-topic, but one more related question about the full .NET framework: if I have some code that I compile against the .NET Core regex library, and I run my compiled assembly on the full .NET framework, does it use the regex types in the .NET Core regex library or the regex types in the full .NET framework at runtime?

@weshaggard
Copy link
Member

@justinvp it would run the code that ships in the .NET Framework. The assembly identity System.Text.RegularExpressions ships in the box and is in the GAC so unless there were other changes to the identity then you wouldn't be running the code from this .NET Core library.

@justinvp
Copy link
Contributor Author

From the API Review:

The implemented strongly typed, non-mutating methods should be implemented implicitly, i.e. publicly, not explicitly

These are the three methods:

  • CopyTo
  • Contains
  • IndexOf

Why? Is this just to match ReadOnlyCollection<T>?

I can see CopyTo, but I'm having a hard time coming up with how anyone would use Contains or IndexOf with these collections.

@terrajobst
Copy link
Member

Yes, it's basically consistency with ReadOnlyCollection<T>. It also seems weird to go out of ones way to exclude these APIs from being directly callable.

@justinvp
Copy link
Contributor Author

Can you show me sample code of how you would use Contains or IndexOf? How are you getting an instance of Capture, Group, or Match to pass into Contains or IndexOf?

@justinvp
Copy link
Contributor Author

@davkean, in the API Review, you call out that CopyTo, Contains, and IndexOf should be implicit... I can get on board with CopyTo, but I'm not sure about Contains and IndexOf. I made these explicit for a reason, mentioned in the speclet above:

IList.IndexOf and ICollection.Contains are implemented explicitly because these methods aren't very useful for these collections and should not be visible in intellisense by default. They're not useful because an implementation using EqualityComparer.Default (consistent with other collections) will search the collection using reference equality due to the fact that Capture, Group, and Match do not implement IEquatable and do not override Equals() and GetHashCode(). Further, these types do not have public constructors -- they are created internally by the regex engine, making it very unlikely that you'd want to search for an item in a collection "A" that was obtained from collection "B".

@justinvp
Copy link
Contributor Author

@davkean ?

@davkean
Copy link
Member

davkean commented Jan 26, 2015

Sorry, for the late reply. I agreed with your assessment - I can think of contrived examples, but nothing solid.

@terrajobst
Copy link
Member

This issue was reviewed today. It looks good as proposed.

justinvp referenced this issue in justinvp/corefx May 15, 2015
Implement IList<T>, IReadOnlyList<T>, and IList on the Regex collections.

Fixes #271
@justinvp
Copy link
Contributor Author

PR dotnet/corefx#1756 is ready for review (against the future branch).

Note: I just want to call out that the PR has CopyTo as implicit (public) and Contains and IndexOf as explicit (non-public). This is a slight deviation from what was discussed in the most recent API review.

I had originally proposed CopyTo, Contains, and IndexOf as all being explicit. The feedback from the first API review was to make these implicit to match ReadOnlyCollection<T>, but the reasoning for why these were proposed in the speclet as explicit was not discussed in the review.

  • CopyTo was proposed as explicit to match the non-generic CopyTo which was explicit.
  • Contains and IndexOf were proposed as explicit because these methods aren't very useful for these collections (see details in the speclet above).

I was actually on board with the feedback to make CopyTo implicit (my original reasoning for CopyTo was flawed), but still had some concerns about Contains and IndexOf that hadn't been discussed in the review. In the follow-up discussion in the comments (above) @davkean agreed with my assessment.

I was planning to make CopyTo implicit, but didn't update the speclet to reflect this, so when this was discussed again in the most recent API review, you just discussed having everything as explicit as originally proposed, and agreed that this was OK.

I just want to call out that I've deviated slightly from this, making CopyTo implicit, but keeping Contains and IndexOf explicit. I've updated the speclet above to show CopyTo as being implicit to match the PR.

Let me know if you'd like me to change CopyTo back to being explicit as originally proposed.

@davkean
Copy link
Member

davkean commented May 15, 2015

@justinvp Nope no need to change anything, agree with what you've done here. Can't review the code, but someone on the team will be able to.

@stephentoub
Copy link
Member

Fixed with dotnet/corefx#1756

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.