C# 8 provides an opt-in feature that allows for the compiler to track reference type nullability in order to catch potential null dereferences. We are adopting that feature across .NET's libraries, working up from the bottom of the stack. We're doing this for three primary reasons, in order of importance:
- To annotate the .NET surface area with appropriate nullability annotations. While this could be done solely in the reference assemblies, we're doing it first in the implementation to help validate the selected annotations.
- To help validate the nullability feature itself. With millions of lines of C# code, we have a very large and robust codebase with which to try out the feature and find areas in which it shines and areas in which we can improve it. The work to annotate System.Private.CoreLib in .NET Core 3.0 helped to improve the feature as shipped in C# 8, and annotating the rest of the libraries will continue to be helpful in this regard.
- To find null-related bugs in .NET Runtime itself. We expect to find relatively few meaningful bugs, due to how relatively well-tested the codebases are and how long they've been around.
We are striving to get annotations correct the "first time" and are doing due-diligence in an attempt to do so. However, we acknowledge that we are likely to need to augment and change some annotations in the future:
- Mistakes. Given the sheer number of APIs being reviewed and annotated, we are likely to make some errors, and we'd like to be able to fix them so that long-term customers get the greatest benefit.
- Breadth. Annotation takes time, so annotations will roll out over time rather than being released all at once.
- Feedback. We may need to revisit some "gray area" decisions as to whether a parameter or return type should be nullable or non-nullable (more details later).
Any such additions or changes to annotations can impact the warnings consuming code receives if that code has opted in to nullability analysis and warnings. Even so, for at least the foreseeable future we may still do so. We will be very thoughtful about when and how we do.
Nullability annotations are considered to represent intent: they represent the nullability contract for the member. Any deviation from that intent on the part of an implementation should be considered an implementation bug, and the compiler will help to minimize the chances of such bugs via its flow analysis and nullability warnings. At the same time, it's important to recognize that the validation performed by the compiler isn't perfect; it can have both false positive warnings (suggesting that something may be null even when it isn't) and false negatives (not warning when something that may be null is dereferenced). The compiler cannot guarantee that an API declared as returning a non-nullable reference never returns null, just as it can't validate that an implementation declared as accepting nulls always behaves correctly when given them. When deciding how to annotate APIs, it's important then to consider the desired contract rather than the current implementation; in other words, prefer to first annotate the API surface area the way that's desired, and only then work to address any warnings in the codebase, rather than driving the API surface area annotations based on where those warnings lead.
- DO annotate all new APIs with the desired contract.
- CONSIDER changing that contract if overwhelming use suggests a different de facto contract. This is particularly relevant to virtual/abstract/interface methods defined in a library where all implementations may not be under your control, and derived implementations may not have adhered to the original intent.
- DO respect documented behaviors. Nullability annotations are about codifying a contract, and documentation is a description of that contract. If, for example, a base abstract or virtual method says that it may throw an exception if
null
is passed in, that dictates that the argument should be non-nullable. - DO continue to validate all arguments as you would have prior to nullability warnings. In particular, if you would have checked an argument for
null
and thrown anArgumentNullException
if it wasnull
, continue to do so, even if the parameter is defined as non-nullable. Many consumers will not have nullability checked enabled, either because they're using a language other than C#, they're using an older version of C#, they simply haven't opted-in to nullability analysis, or the compiler isn't able to detect the misuse. - DO NOT remove existing argument validation when annotating existing public/protected APIs. If through the course of annotating it becomes clear that internal/private surface area is unnecessarily checking for nulls when nulls aren't possible (i.e. you found dead code), such use can be removed or replaced by asserts.
- AVOID making any changes while annotating that substantively impact the generated IL for an implementation (e.g.
some.Method()
tosome?.Method()
). Any such changes should be thoroughly analyzed and reviewed as a bug fix. In some cases, such changes may be warranted, but they're a red flag that should be scrutinized heavily.
The majority of reference type usage in our APIs is fairly clear as to whether it should be nullable or not. For parameters, these general guidelines cover the majority of cases:
- DO define a parameter as non-nullable if the method checks for
null
and throws anArgument{Null}Exception
ifnull
is passed in for that parameter (whether explicitly in that same method or implicitly as part of some method it calls), such that there's no waynull
could be passed in and have the method return successfully. - DO define a parameter as non-nullable if the method fails to check for
null
but instead will always end up dereferencing thenull
and throwing aNullReferenceException
. - Do define a parameter as non-nullable if the documentation explicitly states that an exception may be thrown for a
null
argument. - DO define a parameter as nullable if the parameter is explicitly documented to accept
null
. - DO define a parameter as nullable if the method checks the parameter for
null
and does something other than throw. This may include normalizing the input, e.g. treatingnull
asstring.Empty
. - DO define a parameter as nullable if the parameter is optional and has a default value of
null
. - DO define
string message
andException innerException
arguments toException
-derived types as nullable. Additional reference type arguments toException
-derived types should, in general, also be nullable unless not doing so is required for compatibility. - DO prefer nullable over non-nullable if there's any disagreement between the previous guidelines. For example, if a non-virtual method has documentation that suggests
null
isn't accepted but the implementation explicitly checks for, normalizes, and accepts anull
input, the parameter should be defined nullable.
However, there are some gray areas that require case-by-case analysis to determine intent. In particular, if a parameter isn't validated nor sanitized nor documented regarding null
, but in some cases simply ignored such that a null
doesn't currently cause any problems, several factors should be considered when determining whether to annotate it as null
.
- Is
null
ever passed in our own code bases? If yes, it likely should be nullable. - Is
null
ever passed in prominent 3rd-party code bases? If yes, it likely should be nullable. - Is
null
likely to be interpreted as a default / nop placeholder by callers? If yes, it likely should be nullable. - Is
null
accepted by other methods in a similar area or that have a similar purposes in the same code base? If yes, it likely should be nullable. - If the method is largely oblivious to
null
and just happens to still work ifnull
is passed, and if the API's purpose wouldn't make sense ifnull
were used, the parameter likely should be non-nullable.
Things are generally easier when looking at return values (and out
parameters), as those can largely be driven by what the API's implementation is capable of:
- DO define a return value or out/ref parameter as nullable if
null
may be returned under any circumstance. - DO define all other return values or out parameters as non-nullable.
- DO define the type of events as nullable. (The only incredibly rare exception to this is if the implementation guarantees that the event will always have at least one delegate registered with it.)
- CONSIDER special-casing public properties and fields on structs. By default, reference type fields of structs are null, and so if those fields are public, or if a property exposes such a field without performing further validation or manipulation, they technically must be nullable, and when annotating, that is how such fields should be annotated initially. However, we must also consider expected usage, and whether the default value of such structs should ever be used. In limited circumstances, we may decide that, although technically
default(Struct)
could be used to create a default instance and thus its exposed fields could be null and thus they should be nullable, such a use is considered invalid and such we shouldn't use it for annotation purposes. Let's consider two examples on opposite ends of the spectrum. First,System.Threading.CancellationToken
.CancellationToken
has aCancellationTokenSource
field, anddefault(CancellationToken)
is considered a perfectly valid instance to use... itsCancellationTokenSource
field must be nullable. But second, let's considerSystem.Reflection.InterfaceMapping
.InterfaceMapping
is a simple wrapper around four public reference type fields, and so technically they should all be nullable. But it's considered invalid to just use adefault(InterfaceMapping)
; no one ever does it, as you only ever get anInterfaceMapping
back from some reflection method calls that produce proper instances, and those instances will have all of these fields initialized to non-null. Making them nullable, while technically correct, would harm the 100% correct use case in favor of the 0% invalid use case. - DO NOT be concerned with the validity of annotations after an object has been Dispose'd. In general in .NET, using an instance after it's been disposed is a violation of its contract, and so similarly we generally (if the object isn't supposed to be used after disposal) don't consider the nullable annotations to be meaningful after disposal. This means, for example, that if a property is initialized in a ctor and is always non-null until disposal but then may return null after disposal, it should still be annotated as non-nullable.
- DO NOT be concerned with the validity of an enumerator's Current annotations when used before MoveNext has been called or after MoveNext has returned false. As with Dispose, such use is considered invalid, and we don't want to harm the correct consumption of Current by catering to incorrect consumption.
Annotating one of our return types as non-nullable is equivalent to documenting a guarantee that it will never return null
. Violations of that guarantee are bugs to be fixed in the implementation. However, there is a huge gap here, in the form of overridable members…
For virtual members, annotating a return type as non-nullable places a requirement on all overrides to meet those same guarantees, just as any other documented behaviors of a virtuals apply to all overrides, whether those stated guarantees can be enforced by the compiler or not. An override that doesn't abide by these guarantees has a bug. For existing virtual APIs that have already documented a guarantee about a non-nullable return, it's expected that the return type will be annotated as non-nullable, and derived types must continue to respect that guarantee, albeit now with the compiler's assistance.
However, for existing virtual APIs that do not have any such strong guarantee documented but where the intent was for the return value to be non-null
, it is a grayer area. The most accurate return type would be T?
, whereas the intent-based return type would be T
. For T?
, the pros are that it accurately reflects that null
s may emerge, but at the expense of consumers that know a null
will never emerge having to use !
or some other suppression when dereferencing. For T
, the pros are that it accurately conveys the intent to overriders and allows consumers to avoid needing any form of suppression, but ironically at the expense of potential increases in occurrences of NullReferenceException
s due to consumers then not validating the return type to be non-null
and not being able to trust in the meaning of a method returning non-nullable. As such, there are several factors to consider when deciding which return type to use for an existing virtual/abstract/interface method:
- How common is it that an existing override written before the guarantee was put in effect would return
null
? - How widespread are overrides of the method in question? This contributes to (1).
- How common is it to invoke the method via the base vs via a derived type that may narrow the return type to
T
fromT?
? - How common is it in the case of (3) for such invocations to then dereference the result rather than passing it off to something else that accepts a
T?
?
Object.ToString
is arguably the most extreme case. Answering the above questions:
- It is fairly easy in any reasonably-sized code base to find cases, intentional or otherwise, where
ToString
returnsnull
in some cases (we've found examples in dotnet/corefx, dotnet/roslyn, NuGet/NuGet.Client, aspnet/AspNetCore, and so on). One of the most prevalent conditions for this are types that just return the value in a string field which may contain its default value ofnull
, and in particular for structs where a ctor may not have even had a chance to run and validate an input. Guidance in the docs suggests thatToString
shouldn't returnnull
orstring.Empty
, but even the docs don't follow its own guidance. - Thousands upon thousands of types we don't control override this method today.
- It's common for helper routines to invoke via the base
object.ToString
, but manyToString
uses are actually on derived types. This is particularly true when working in a code base that both defines a type and consumes itsToString
. - Based on examination of several large code bases, we believe it to be relatively rare that the result of an
Object.ToString
call (made on the base) to be directly dereferenced. It's much more common to pass it to another method that acceptsstring?
, such asString.Concat
,String.Format
,Console.WriteLine
, logging utilities, and so on. And while we advocate thatToString
results shouldn't be assumed to be in a particular machine-readable format and parsed, it's certainly the case that code bases do, such as usingSubstring
on the result, but in such cases, the caller needs to understand the format of what's being rendered, which generally means they're working with a derived type rather than calling through the baseObject.ToString
.
As such, for now, we've annotated Object.ToString
as returning string?
. We can re-evaluate this decision as we get more experience with consumers of the feature.
In contrast, we've annotated Exception.Message
(which is also virtual) as being non-nullable, even though technically a derived class could override it to return null
, because doing so is so rare that we couldn't find any meaningful examples where null
is returned.
- DO implement
IComparable<T?>
instead ofIComparable<T>
on a reference typeT
. - DO implement
IEquatable<T?>
instead ofIEquatable<T>
on a reference typeT
. e.g.
public sealed class Version : IComparable<Version?>, IEquatable<Version?>, ...
The C# compiler respects a set of attributes that impact its flow analysis. We make use of these attributes when simple annotation is insufficient to fully express the contract of a method.
- DO annotate any methods that always throw (e.g.
ThrowHelper.ThrowArgumentException
) or that always exit the process (e.g.Environment.Exit
) with[DoesNotReturn]
. - DO annotate generic
out
arguments onTry
methods with[MaybeNullWhen(false)]
. In general, such arguments may benull
when the method returnsfalse
because the method was unable to fetch/compute the desired data. If the consumer defined the generic type as being non-nullable, then such attribution will highlight that theout
argument may benull
when the method returnsfalse
but will be non-null
when the method returns true. If the consumer defined the generic type as being nullable, then the attribute simply ends up being a nop, because a value of that generic type could always benull
(e.g. ifnull
s are allowed as values in a dictionary, where a successfulTryGetValue
call returningtrue
could reasonably produce null). - DO annotate non-generic
out
reference type arguments onTry
methods as being nullable and with[NotNullWhen(true)]
(e.g.TryCreate([NotNullWhen(true)] out Semaphore? semaphore)
). For non-generic arguments, the argument should be nullable because a failed call will generally storenull
into the argument, and it should be[NotNullWhen(true)]
because a successful call will store non-null
. - DO annotate
ref
arguments that guarantee the argument will be non-null
upon exit (e.g. lazy initialization helpers) with[NotNull]
. - DO annotate properties where a getter will never return
null
but a setter allowsnull
as being non-nullable but also[AllowNull]
. - DO annotate properties where a getter may return
null
but a setter throws fornull
as being nullable but also[DisallowNull]
. - DO add
[NotNullWhen(true)]
to nullable arguments ofTry
methods that will definitively be non-null
if the method returnstrue
. For example, ifInt32.TryParse(string? s)
returnstrue
,s
is known to not benull
, and so the method should bepublic static bool TryParse([NotNullWhen(true)] string? s, out int result)
. - DO add
[NotNullIfNotNull(string)]
if nullable ref argument will be non-null
upon exit, when an other argument passed evaluated to non-null
, pass that argument name as string. Example:public void Exchange([NotNullIfNotNull("value")] ref object? location, object? value);
. - DO add
[return: NotNullIfNotNull(string)]
if a method would not returnnull
in case an argument passed evaluated to non-null
, pass that argument name as string. Example:[return: NotNullIfNotNull("name")] public string? FormatName(string? name);
- DO add
[MemberNotNull(string fieldName)]
to a helper method which initializes member field(s), passing in the field name. Example:[MemberNotNull("_buffer")] private void InitializeBuffer()
. This will help to avoid spurious warnings at call sites that call the initialization method and then proceed to use the specified field. Note that there are two constructors toMemberNotNull
; one that takes a singlestring
, and one that takes aparams string[]
. When the number of fields initialized is small (e.g. <= 3), it's preferable to use multiple[MemberNotNull(string)]
attributes on the method rather than one[MemberNotNull(string, string, string, ...)]
attribute, as the latter is not CLS compliant and will likely require#pragma warning disable
and#pragma warning restore
around the line to suppress warnings.
Code reviews for enabling the nullability warnings are particularly interesting in that they often differ significantly from general code reviews. Typically, a code reviewer focuses only on the code actually being modified (e.g. the lines highlighted by the code diffing tool); however, enabling the nullability feature has a much broader impact, in that it effectively inverts the meaning of every reference type use in the codebase (or, more specifically, in the scope at which the nullability warning context was applied). So, for example, if you turn on nullability for a whole file (#enable nullable
at the top of the file) and then touch no other lines in the file, every method that accepts a string
is now accepting a non-nullable string
; whereas previously passing in null
to that argument would be fine, now the compiler will warn about it, and to allow nulls, the argument must be changed to string?
. This means that enabling nullability checking requires reviewing all exposed APIs in that context, regardless of whether they were modified or not, as the contract exposed by the API may have been implicitly modified.
A code review for enabling nullability generally involves three passes:
-
Review all implementation changes made in the code. Except when explicitly fixing a bug (which should be rare), the annotations employed for nullability should have zero impact on the generated IL (other than potentially some added attributes in the metadata). The most common changes are:
-
Adding
?
to reference type parameters and local symbols. These inform the compiler that nulls are allowed. For locals, they evaporate entirely at compile time. For parameters, they impact the [Nullable(...)] attributes emitted into the metadata by the compiler, but have no effect on the implementation IL. -
Adding
!
to reference type usage. These essentially suppress the null warning, telling the compiler to treat the expression as if it's non-null. These evaporate at compile-time. -
Adding
Debug.Assert(reference != null);
statements. These inform the compiler that the mentioned reference is non-null
, which will cause the compiler to factor that in and have the effect of suppressing subsequent warnings on that reference (until the flow analysis suggests that could change). As with anyDebug.Assert
, these evaporate at compile-time in release builds (whereDEBUG
isn't defined). -
Most any other changes have the potential to change the IL, which should not be necessary for the feature. In particular, it's common for
?
s on dereferences to sneak in, e.g. changingsomeVar.SomeMethod()
tosomeVar?.SomeMethod()
; that is a change to the IL, and should only be employed when there's an actual known bug that's important to fix, as otherwise we're incurring unnecessary cost. Similarly, it's easy to accidentally add?
to value types, which has a significant impact, changing theT
to aNullable<T>
and should be avoided. -
Any
!
s added that should have been unnecessary and are required due to either a compiler issue or due to lack of expressibility about annotations should have a// TODO-NULLABLE: http://link/to/relevant/issue
comment added on the same line.
-
-
Review the API changes explicitly made. These are the ones that show up in the diff. They should be reviewed to validate that they make sense from a contract perspective. Do we expect/allow
null
s everywhere a parameter reference type was augmented with?
? Do we potentially returnnull
s everywhere a return type was augmented with?
? Was anything else changed that could be an accidental breaking change (e.g. a value type parameter getting annotated to become aNullable<T>
instead of aT
)? -
Review all other exported APIs (e.g. public and protected on public types) for all reference types in both return and parameter positions. Anything that wasn't changed to be
?
is now defined as non-nullable. For parameters, that means consuming code will now get a harsh warning if it tries to passnull
, and thus these should be changed ifnull
actually is allowed / expected. For returns, it means the API will never returnnull
; if it might returnnull
in some circumstance, the API should be changed to return?
. This is the most time consuming and tedious part of the review.