Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GroupCollection should implement IReadOnlyDictionary interface to align with its Dictionary-Type Usage #23186

Closed
michael-hawker opened this issue Aug 15, 2017 · 18 comments
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.RegularExpressions Hackathon Issues picked for Hackathon help wanted [up-for-grabs] Good issue for external contributors
Milestone

Comments

@michael-hawker
Copy link

GroupCollection acts like a dictionary, but doesn't provide all of the familiar methods found on Dictionary collections. This makes it harder to inspect contents to make informed decisions on its contents to drive logic. For instance, determining if a key was present in a named capture group before trying to perform logic with the result. This can be seen as a follow-on to issue #13933 (it was listed in the Open Questions section).

Rationale and Usage

This was something I stumbled upon when trying to make a RegEx expression in two forms, one with a lot of named parameters, but that were all optional. I wanted to check if the parameters existed before proceeding to perform logic with them but discovered that there was no method to check if a named capture group was actually captured. Currently, I can use an indexer to access a specific group name like

match.Groups["options"]

But unlike a normal dictionary, I can't call match.Groups.ContainsKey('options") to see if it exists first before retrieval if I have optional named capture groups. The current Contains method only checks on the object not the named key.

Currently, the group if not captured returns an empty string, but that is a bit misleading, as the group could have been captured but contained no data. So, there's an ambiguous state. If we had methods to check if the capture was performed, we could leave the group out and have KeyNotFoundExceptions or nulls to differentiate from existence vs. no data. This would let us then do:

if (match.Groups.ContainsKey("options")) {
    // do something with match.Groups["options"]
}

OR

Group options;
if (match.Groups.TryGetValue("options", out options)) {
    // do something with options group now.
}

Proposed API

public GroupCollection : IReadOnlyDictionary<string, Group>, ... {
    bool ContainsKey(string key);
    bool TryGetValue(string key, out Group group);
    ...
}

Details

Updates

  • 8/16/17 Updated request to align with review process format and changed to explicitly call out IReadOnlyDictionary interface.
@danmoseley
Copy link
Member

@jnm2
Copy link
Contributor

jnm2 commented Aug 16, 2017

match.Groups.ContainsKey("options") and match.Groups["options"] != null are equivalent, so this would be API sugar.

The sugar I'd prefer would be bool TryGetValue(out Group group). Ever since out declarations, this has become one of my favorite patterns.

  • It doesn't result in a double lookup like if (match.Groups.ContainsKey("options")) { match.Groups["options"] } does.
  • You know what to expect with TryGetValue, but most dictionaries throw KeyNotFoundException rather than returning null from indexers so the intent is clearer when reading code using TryGetValue.

@michael-hawker
Copy link
Author

@danmosemsft will do.

@jnm2 I'll add that to my request as well though I think it'd be bool TryGetValue(string key, out Group group), eh?

I did notice that currently the non-existence groups are empty strings and not null (at least in UWP), so it's a little ambiguous currently as to the result (i.e. if the group captured but contained nothing vs. didn't exist at all).

@michael-hawker michael-hawker changed the title GroupCollection should have ContainsKey method GroupCollection should have ContainsKey and TryGetValue methods to align with Dictionary-Type Usage Aug 16, 2017
@michael-hawker michael-hawker changed the title GroupCollection should have ContainsKey and TryGetValue methods to align with Dictionary-Type Usage GroupCollection should implement IReadOnlyDictionary interface to align with its Dictionary-Type Usage Aug 16, 2017
@ViktorHofer ViktorHofer self-assigned this Nov 30, 2017
@ViktorHofer
Copy link
Member

@michael-hawker thanks for the suggestion! I will look into and will comment as soon as I fully reviewed it.

@ViktorHofer ViktorHofer removed their assignment May 24, 2018
@ViktorHofer
Copy link
Member

Potential hackathon candidate if we review the API addition in time.

cc @karelz

@terrajobst
Copy link
Member

terrajobst commented May 29, 2018

Video

Looks good.

  • We should also implement IDictionary<K,V> as the BCL exposes the notion of read-onlyness as a mode (IsReadOnly returns true)
  • Given that we implemen IList<>, should we also implement IReadOnlyList<>?

@ViktorHofer
Copy link
Member

Given that we implemen IList<>, should we also implement IReadOnlyList<>?

For clarity, he meant "Given that we implement IDictionary<>, should we also implement IReadOnlyDictionary<>?".

@michael-hawker
Copy link
Author

I also wanted to clarify:

I did notice that currently the non-existence groups are empty strings and not null (at least in UWP), so it's a little ambiguous currently as to the result (i.e. if the group captured but contained nothing vs. didn't exist at all).

Is this a separate bug and should be filed elsewhere or expected?

@karelz
Copy link
Member

karelz commented May 29, 2018

@michael-hawker that sounds like separate issue to me.

@michael-hawker
Copy link
Author

@karelz I didn't realize there was a Success property on the Group as well, so I think that's where I was misled, as I was only looking at the Value and expecting different results based on Match vs. No Match.

        Regex r = new Regex("\\<(?<Type>(?:Button)|(?:TextBlock))?\\>");

        var match = r.Match("<TextBlock>");

        Console.WriteLine("Match: " + match.Success);
        Console.WriteLine("Type: " + match.Groups["Type"]);
        Console.WriteLine(" TypeType: " + match.Groups["Type"].GetType());
        Console.WriteLine(" TypeMatch: " + match.Groups["Type"].Success);
        Console.WriteLine(" IsNull: " + (match.Groups["Type"] == null));
        Console.WriteLine(" IsEmptyString: " + (match.Groups["Type"]?.Value == string.Empty));

        Console.WriteLine();
        match = r.Match("<>");

        Console.WriteLine("Match: " + match.Success);
        Console.WriteLine("Type: " + match.Groups["Type"]);
        Console.WriteLine(" TypeType: " + match.Groups["Type"].GetType());
        Console.WriteLine(" TypeMatch: " + match.Groups["Type"].Success);
        Console.WriteLine(" IsNull: " + (match.Groups["Type"] == null));
        Console.WriteLine(" IsEmptyString: " + (match.Groups["Type"]?.Value == string.Empty));

        r = new Regex("\\<(?<Type>(?:Button)|(?:TextBlock)|\\s*)?\\>");

        Console.WriteLine();
        match = r.Match("<>");

        Console.WriteLine("Match: " + match.Success);
        Console.WriteLine("Type: " + match.Groups["Type"]);
        Console.WriteLine(" TypeType: " + match.Groups["Type"].GetType());
        Console.WriteLine(" TypeMatch: " + match.Groups["Type"].Success);
        Console.WriteLine(" IsNull: " + (match.Groups["Type"] == null));
        Console.WriteLine(" IsEmptyString: " + (match.Groups["Type"]?.Value == string.Empty));

Output:

Match: True
Type: TextBlock
 TypeType: System.Text.RegularExpressions.Group
 TypeMatch: True
 IsNull: False
 IsEmptyString: False

Match: True
Type:
 TypeType: System.Text.RegularExpressions.Group
 TypeMatch: False
 IsNull: False
 IsEmptyString: True

Match: True
Type:
 TypeType: System.Text.RegularExpressions.Group
 TypeMatch: True
 IsNull: False
 IsEmptyString: True

For example in the 2nd case above, I was expecting the Value to be null as the capture was unsuccessful rather than also having to check the Success property first. (vs. the third case where the capture could be successful but has no length and was an empty string.)

It feels like this proposal would want to factor the Success value into the result of TryGetValue too, no? Otherwise, it'd be the same 'trap'.

Thoughts?

I'm happy to open another issue to discuss this behavior, but it definitely seems related as if you use the TryGetValue paradigm and they only look at Value still they'd both return empty strings, but I wouldn't have expected for it to go into my if block in the 2nd scenario.

@ViktorHofer
Copy link
Member

It feels like this proposal would want to factor the Success value into the result of TryGetValue too, no? Otherwise, it'd be the same 'trap'.

I don't think that TryGetValue should also consider the Success value. That's not the notion of the ContainsKey or TryGetValue pattern. I expect consumers first check if success and only then call TryGetValue.

@peltco
Copy link
Contributor

peltco commented May 31, 2018

i'll work on this one during the hackaton

@jnm2
Copy link
Contributor

jnm2 commented May 31, 2018

For some reason I'm finding it unintuitive. If I see a TryGetValue method in intellisense, I'm going to think it's primarily about .Success, and I'd be wrong.

@ViktorHofer
Copy link
Member

ViktorHofer commented May 31, 2018

TryGetValue operates on collections that contain a key and a value, i.e. Hashtable and Dictionary. In our case the key is the group name and the value is the group object. If the key exists then the value should be returned without taking its internal state into consideration (in our case the Success property).

You will get the Group object back, same as you would do GroupCollection["group1"] today. You still want to check the Success property in both scenarios.

@jnm2
Copy link
Contributor

jnm2 commented May 31, 2018

Oops, I was picturing TryGetValue on Group, lol! I'm sorry!

@karelz
Copy link
Member

karelz commented May 31, 2018

@karelz karelz self-assigned this May 31, 2018
@ViktorHofer ViktorHofer assigned peltco and unassigned karelz Jun 1, 2018
@michael-hawker
Copy link
Author

Yeah, I think the crux of the issue is that groups will appear regardless of if they're captured (as long as they're defined in the RegEx string), so it's a bit of understanding the API.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 3.0 milestone Jan 31, 2020
@ghost ghost locked as resolved and limited conversation to collaborators Dec 21, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Text.RegularExpressions Hackathon Issues picked for Hackathon help wanted [up-for-grabs] Good issue for external contributors
Projects
None yet
Development

No branches or pull requests

8 participants