-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: 'cache' keyword, lazy initialization, and immutable value caching #681
Comments
Under existing cases there's also Also, your example specification above is multiple levels of not-thread-safe. |
Thank you, I forgot to mention that. Updated the proposal. Using
Yes, I am aware of that (I pointed that out as the first thing in Notes). However, I think we should not add overhead for the 95% case where devs don't care about thread-safety. Devs who do care will be more prudent before they adopt this feature, and realize it's not thread-safe. @artcfa and @dstarkowski, I noticed you both downvoted the proposal. Please feel free to comment here on why you think this is not a good idea, or what I can do to improve it. Your feedback would be greatly appreciated. |
Where did this 5% come from? Can you back it with actual data or is it just your guess? Static field is shared between requests in web app and even single user can trigger multiple concurrent requests. It doesn't seem like such a borderline case to me. |
@jamesqo The main reason I downvoted this proposal is that I don't think there needs to be language support for this. Every feature is a liability and added baggage to long-term maintainability not only on the language dev side, but also for every single user of that language who will now be required to know its usage and implications - even if they don't use it themselves, they will have to deal with code that does. Looking at it this way, I think there needs to be a really, really good reason to introduce a new language feature, which I personally don't see here. Please don't read my downvote as a review of the proposal itself (structure, clarity, depth, style, etc.) but as a note on whether I would appreciate if something along those lines existed. |
I think you are slightly underestimating how many people value thread safety in our modern multi-threaded/multi-core/multi-CPU workloads. I do not see the value of this over |
Example: public class A {
private readonly int a;
private readonly int b;
public A(int a, int b)
{
this.a = a;
this,b = b;
}
private int Calculate()
{
return a + b;
}
// Compilation error!
private readonly Lazy<int> lc = Lazy<int>(Calculate); // Can't do that! Calculate should be static for C# to compile.
} so, you should assign lazy value inside constructor: public class A {
private readonly int a;
private readonly int b;
public A(int a, int b)
{
this.a = a;
this,b = b;
this.lc = new Lazy<int>(Calculate);
}
private int Calculate()
{
return a + b;
}
private readonly Lazy<int> lc;
} I think this feature (if implemented) should be syntactic sugar around |
What happens when the For example: int CachedFibonacci(int i) => cache Fibonacci(i); Is this allowed? Will it cache the value from the first evaluation? If that's the behavior, wouldn't it be confusing (as in the above example)? |
@svick I would expect the cache to hit if it had previously been evaluated with the same value for |
@bondsbw So, in such case, |
Here could the proposed public T Foo => LazyInitializer.EnsureInitialized(ref field, InitializeFoo); |
@svick To answer your question, the expression would not be able to access any local variables (no closures allowed), it would only be permitted to access static variables just as if it were a static readonly field, or static and instance variables/readonly field for instance-based caching. Anyway, given the mostly negative community feedback, perhaps this idea is not the best. (BTW: Despite what I said, I wasn't fundamentally opposed to making this proposal thread-safe, I just thought it would be adding unnecessary overhead.) I think @quinmars' one-liner above of using |
I don't think the cache is really meant to be a memoization of all possible outcomes. I think it was meant to be a stand in for compute-this-the-first-time-through and store-it-in-a-field-I-don't-have-to-declare. To be thread safe you need to be using a thread safe mechanism. The problem here is which one do you use. If you use Lazy, you have to be okay with the additional allocation. This might be an issue in some cases, not in others. You could use Interlocked.CompareExchange: public ExpensiveValueType ExpensiveValue =>
{
if (expensiveValue == default(ExpensiveValueType))
{
Interlocked.CompareExchange(ref expensiveValue, ComputeExpensiveValue(), default(ExpensiveValue));
}
return expensiveValue;
} This perfect if the type is a reference type, or a primitive supported by Interlocked, and you are more concerned with always returning the same instance (when a reference type), as you should be when using a property, but not so concerned if the computation happens more than once sometimes due to concurrency. If you require that the expensive computation only ever happen once, then you need to have some kind of locking. So you could write this: public ExpensiveValueType ExpensiveValue =>
{
if (expensiveValue == default(ExpensiveValueType))
{
lock (this)
{
if (expensiveValue == default(ExpensiveValueType))
{
expensiveValue = ComputeExpensiveValue();
}
}
}
return expensiveValue;
} This lock with a double check will work if the type is a reference type or an appropriately sized primitive (otherwise you'll likely get tearing and that would be bad.) Otherwise, you could write it without the unguarded check: public ExpensiveValueType ExpensiveValue =>
{
lock (this)
{
if (expensiveValue == default(ExpensiveValueType))
{
expensiveValue = ComputeExpensiveValue();
}
}
return expensiveValue;
} Simpler, but likely to cause scaling problems as you have lots of contentions around the lock. Also, the lock(this) pattern is frowned upon, because your users can also interact with this system supplied lock and cause you deadlocks. It's also really bad, because it is re-entrant, which is okay in situations where you know exactly what is happening inside ComputeExpensiveValue, but if its likely you are calling outside your codebase even to call system libraries, then you could be asking for trouble if this is ever run on a UI thread. You might be able to improve on the lock contention with a ReaderWriterLock or equivalent, because you only ever have one write, but this is hard to implement correctly, as you'll have to acquire the read lock to check the value and if its default/null you'll need to release the read lock and attempt to acquire the write lock, and then check again, etc. And you'll have to allocate this and store this lock somewhere. Is this going to be a per property per object instance lock? Are you going to share it with other cached items computed in the same object instance? Is it global? Which one of these implementations is the compiler going to choose for you? |
@mattwar if we're going to initialize something lazily, I'd think that However, as @quinmars suggested, a |
@yaakov-h There is too much variability to the kind of caching you may want for your expensively computed value for the compiler to pick a single encoding of caching. |
Background
Lazy initialization
There isn't really a concise way to do lazy initialization in .NET currently. The two popular patterns are 1)
or 2)
or 3)
In addition to being wordy, 2) is also quite expensive and (last I checked) involves several heap allocations. All three methods require you to declare backing fields.
Caching immutable values
(I got ahead of myself and wrote the Proposal part before the Background. It would be awkward to explain it here and then re-explain it below, so just keep reading.)
Proposal
Add a
cache
keyword to the language.cache
will accept a single expression and evaluate to an expression. This latter expression, when evaluated, will:Check a flag to see if the inner expression was run.
Will the value be cached for the current object instance or the current class?
cache <expr>
=> current instance.cache static <expr>
=> current class.Examples
Lazy initialization
ExpensiveValue
is=>
on purpose, sincecache
can be evaluated multiple times at little extra cost.{ get; } = cache <expr>
works too, but creates an unnecessary field.Better immutability support
=> Less caching stuff in
private static readonly s_
fields at the top of your file, then scrolling all the way back down.(Note: I didn't just come up with the above example. I have an actual place in a project I'm currently working on where it would be really, really nice to use
cache static
. Ask for it in the comments and I'll post further details.)This doesn't apply to just immutable collections, of course, or even just immutable types, but any place where the caller is known not to modify the parameter in question and a constant value is passed. For example, arrays are mutable, but
cache
is still perfect for use withstring.Split
:I seem to recall @NickCraver commenting how he had to cache the arrays for
string.Split
a lot. I couldn't find his comment, but I did find this.Specification
cache static
would be the same code, except the fields would be static.Notes
cache
andcache static
will not be thread-safe. It's possible the initializer expression could be called more than once. For the minority of cases where thread-safety is important, developers will be more prudent in adopting new language features and realize this. It's OK if we don't extend the benefits of this proposal to them, since they're the 5% case.I decided to make instance-based caching the default with
cache
. Otherwise, writingas above would cause
ComputeExpensiveValue()
to be run once for all instances, so whichever object hasExpensiveValue
called first populates that field for all objects globally. I feel it would be too easy to fall into that trap if you had to writecache this
orthis.cache
or something longer thancache
.is replaced with
cache new object()
. The compiler doesn't know that the initialization expression doesn't compute to null, so we can't just take null to mean uninitialized. We have to introduce a separate bool field to determine that, which takes up more space.null
to mean uninitialized.The text was updated successfully, but these errors were encountered: