Developers can use CollectionsMarshal ref accessors for Dictionary<TKey, TValue> #27062
Comments
Same comment https://github.com/dotnet/corefx/issues/31597#issuecomment-410452767 applies here. |
Updated |
Did you mean to give the dictionary a parameter name and have two different parameters? Like public ref TValue ValueRef<TKey, TValue>(Dictionary<TKey, TValue> dictionary, TKey key); |
@benaadams does it make sense to create 1 issue for: #27061 and this guy, to propose adding Marking as 3.0 for now to not lose attention and then we can decide if we bring them individually or together as 1 issue to API Review. |
Not required for 3.0 release |
@benaadams same question as on the other issue for |
struct LargeStruct
{
    // ...
    int Value;
}
1 hash lookup, no copies to update a struct:
dict.ValueRef(key).Value++; // hash lookup
vs. 2 hash lookups, 2 copies to update a struct:
var s = dict[key]; // hash lookup, struct copy
s.Value++;
dict[key] = s; // hash lookup, struct copy |
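For readers who want to try the comparison above, here is a minimal runnable sketch. It assumes .NET 6 or later and uses the `CollectionsMarshal.GetValueRefOrNullRef` shape that appears later in this thread in place of the `ValueRef` name used in the comment; the padding fields in `LargeStruct` are a hypothetical stand-in for the elided `// ...` members.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

var dict = new Dictionary<string, LargeStruct> { ["a"] = new LargeStruct { Value = 1 } };

// Copy-based update: two hash lookups and two struct copies.
var s = dict["a"];
s.Value++;
dict["a"] = s;

// Ref-based update: one hash lookup, no struct copies.
ref LargeStruct r = ref CollectionsMarshal.GetValueRefOrNullRef(dict, "a");
if (!Unsafe.IsNullRef(ref r))
{
    r.Value++;
}

Console.WriteLine(dict["a"].Value); // prints 3

struct LargeStruct
{
    // Hypothetical padding standing in for the elided "// ..." members.
    public long Padding0, Padding1, Padding2, Padding3;
    public int Value;
}
```

The ref is only safe to use while the dictionary is not modified; adding or removing entries can move the backing store out from under it.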
So if we were to expose an API that would allow you to cache the hash key, would this address your need? IOW, between struct copies and hash lookups, what hurts more? |
The hashing isn't particularly expensive (e.g. if it's an
The struct copies are expensive: if you want to modify one property of the struct, it's disproportionately expensive. I do accept it's inherently unsafe if you mix it with other operations on the list or dictionary and then try to keep using the |
That makes sense.
Other people might feel differently but the whole point is that this should be a good enough speed bump. Putting this on a type in
The larger question is: where do we draw the line with escape hatches like this? In a sense, we're chasing the 1% cases here, but I'm a bit scared that we're creating a mess the more APIs like this we expose throughout the BCL. And I don't have a good handle on where we hold the bar. Do you have any real workload where this would show up? Of course, one can always fabricate a micro benchmark; I don't doubt there are cases where this API would speed things up. The point is: how much does this matter in real code? |
Looking at these APIs more holistically: For the dictionary, you can achieve a similar but safer effect by wrapping the struct in a class and then using the class handle. It's safer because the reference will remain valid across dictionary operations, and a "live" reference can then be more easily passed around and stored. For the list
For the dictionary, using a class indirection doesn't cause too much of an issue because the data layout isn't important to the calling code (as how it's internally stored in the dictionary can't be consumed). For the list, the fact that it's a straight in-order array is useful, as a consumer with access to it can then do things like vectorize the processing of the data, which a class indirection would break. So in preference I'd drop these APIs and add the
e.g. using the other API to achieve the same:
AsSpan(list)[index].Value++; // though likely would cache the span if it was in a loop
vs
ItemRef(list, index).Value++ |
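As a rough illustration of the `AsSpan(list)` approach mentioned above, the following sketch uses `CollectionsMarshal.AsSpan` (available since .NET 5) and caches the span for the loop, as the comment suggests. The `Item` struct is a hypothetical example type.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

var list = new List<Item> { new Item { Value = 1 }, new Item { Value = 2 } };

// The span aliases the list's backing array, so it is only valid
// as long as no items are added to or removed from the list.
Span<Item> span = CollectionsMarshal.AsSpan(list);
for (int i = 0; i < span.Length; i++)
{
    span[i].Value++; // in-place update through a ref, no struct copy
}

Console.WriteLine($"{list[0].Value}, {list[1].Value}"); // prints "2, 3"

struct Item
{
    public int Value;
}
```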
@benaadams, could you update this proposal to just be for Seeing as we already reviewed/approved the |
Done |
Also update the proposed API list in the top post? |
Dropped List and ConcurrentDictionary (as getting a live ref for |
If we approve this API we'd also need to approve |
@benaadams What's the expected behavior when the key is not found? Return a null ref, throw an exception, insert default(T) into the collection, or something else? Would a caller need to distinguish between "not found" and "I just created a dummy placeholder for you" scenarios? |
How about a method similar to the specialized SpanAction string allocator?
Dictionary.AddOrUpdate(TKey key, TState is provided to help in no allocation scenarios. Then use it like:
var dict = new Dictionary<long, long>();
dict.AddOrUpdate(56, state, (value, s, b) => b ? ++value[0] : value[0] = state.Value); |
namespace System.Runtime.InteropServices
{
    public static class CollectionsMarshal
    {
        public static ref TValue GetValueRefOrNullRef<TKey, TValue>(Dictionary<TKey, TValue> dictionary, [NotNull] TKey key);
        public static ref TValue GetValueRefOrAddDefault<TKey, TValue>(Dictionary<TKey, TValue> dictionary, [NotNull] TKey key, out bool exists);
    }
}
EDIT by @stephentoub: Removed GetValueRef method after subsequent discussion. |
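A short usage sketch of the two methods listed above may help, assuming they behave as their names suggest in .NET 6 or later: `GetValueRefOrAddDefault` inserts `default(TValue)` for a missing key and reports that through `exists`, while `GetValueRefOrNullRef` returns a null ref that has to be checked with `Unsafe.IsNullRef` before use. The word-count scenario is an illustration only, not something taken from the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

var counts = new Dictionary<string, int>();

// "Get or add" with a single hash lookup per word:
// a missing key gets default(int) inserted and exists is set to false.
foreach (var word in new[] { "a", "b", "a" })
{
    ref int count = ref CollectionsMarshal.GetValueRefOrAddDefault(counts, word, out bool exists);
    count = exists ? count + 1 : 1;
}

// Lookup without insertion: a missing key yields a null ref,
// which must be checked before it is read or written.
ref int missing = ref CollectionsMarshal.GetValueRefOrNullRef(counts, "z");

Console.WriteLine(Unsafe.IsNullRef(ref missing)); // prints True
Console.WriteLine($"{counts["a"]}, {counts["b"]}, {counts.Count}"); // prints "2, 1, 2"
```

Neither ref should be held across an operation that adds or removes entries, since the dictionary may resize its backing store.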
Hey @benaadams, are you still currently working on this? I saw you've contributed #49388 which has already been merged, was wondering whether you also planned to do the other two or wanted some help in case you were busy? I have some spare time on my end so I'd be happy to help contribute with at least an implementation of
|
@Sergio0694 feel free; I don't really have the time to devote to the benchmarking needed to ensure it's a good change
What's this deadline? |
Alright, no worries! I'll give that one a shot myself then 😄
That's the deadline for all feature work for .NET 6, at which point they'll take a snapshot of |
To clarify, July 13 is our feature complete date, when we switch over entirely to bugs, Aug 17 (when we aim to get to zero bugs for 6.0) is when we branch 6.0 from main. As last year however, we would like to continue to merge community PRs after July 13, unless they'll need more stabilizing time in which case we might pause merging until after the branching. |
We should definitely add a paragraph for this feature in the preview 7 article. |
This is a niche unsafe API that 99% of .NET developers should never use. We do not want to encourage people to use it just because they can. |
Edited by @layomia. Original proposal by @benaadams:
They aren't "safe" operations, as the backing store can change if items are added, removed, etc. and the changes aren't tracked.
However, it is useful to get a reference to modify struct TValue entries without a full copy and replace update.
/cc @jkotas
/cc @stephentoub is it valid for ConcurrentDictionary?
CollectionsMarshal ref accessors for Dictionary<TKey, TValue>
Attempting to obtain a value in a dictionary, and adding it if not present, is a common scenario in .NET applications. The existing mechanism for achieving this today is by using Dictionary<TKey, TValue>.TryGetValue(TKey key, out TValue value), and adding the value to the dictionary if not present. This causes duplicate hash lookups:
Another scenario is updating struct values in dictionaries. The existing pattern for achieving this causes struct copies and duplicate hash lookups, which potentially have non-trivial performance costs for large structs:
Motivation
struct dictionary values.
API proposal
CollectionsMarshal is an unsafe class that provides a set of methods to access the underlying data representations of collections.
API usages
Updating struct value in dictionary:
KeyNotFoundException thrown when key not present in dictionary
This pattern is helpful when the caller wants to optimally update a struct in a scenario where the key being absent is an error state. Creating a value and adding it to the dictionary, if not already present, is not desired.
Unsafe.NullRef<TValue>() returned when key not present in dictionary
This pattern satisfies both the optimal struct value update and optimal "get or add" value scenarios.
default(TValue) returned when key not present in dictionary
This pattern also satisfies both the optimal struct value update and optimal "get or add" value scenarios. A struct default value always being instantiated may cause the TryGetValueRef to be preferred, depending on the perf scenario.
Alternative design
GetOrAdd methods, similar to those on ConcurrentDictionary<TKey, TValue>, were proposed in #15059.
Upsides
ConcurrentDictionary<TKey, TValue>.
Downsides
System.Private.CoreLib, generic expansion issues due to struct-based generics will affect all users of Dictionary<TKey, TValue>, not just those that use them.
public members.
struct values.
Open questions
Should the out bool exists parameter on TryGetValueRef be removed?
Since a call to Unsafe.IsNullRef<T>(ref value) can indicate whether the value exists in the dictionary, the second method could simply be:
Usage
Any concerns about API bloat in the CollectionsMarshal type?
The generic expansion highlighted in the GetOrAdd-based alternative doesn't apply much here since the new methods will live in CollectionsMarshal and will be pay-for-play. However, are there concerns about bloating the type?
One answer here states as follows: