Skip to content

Conversation

@jasonmalinowski
Copy link
Member

@jasonmalinowski jasonmalinowski commented Dec 3, 2021

Our FatalError type had two "handlers", the "fatal handler" and "non fatal handler." If you called any ReportAndCatch method, that would be counted as "non fatal" because you're catching the exception and presumably dealing with it. If you called any ReportAndPropagate, that would call the "fatal" handler.

This was confusing from a few perspectives. In the IDE, both the "fatal" and "non-fatal" handlers actually did the same thing; they both reported a non-fatal report via telemetry and kept the process alive. At first, I was tempted to use fatal vs. non-fatal as way to report different severities via telemetry, but the concept doesn't jive well there. Sometimes we were calling ReportAndCatch when the feature absolutely broke and the user would have known -- in that case we want the severity to be high. Sometimes we're also catching an exception and gracefully dealing with it, so the severity should be lower. Similarly, propagating an exception usually means it's a severe problem, but if it's a background job that has a higher-up error handler, we may not want to treat it as high severity. In the end, the conclusion was the fatal vs. non-fatal split didn't make sense in relation to severity, and since it doesn't correlate to whether you want exceptions to flow or not, it's easier to just add severity as a separate concept when reporting, and unify the code so everything is going through a single reporting path.

The compiler story is a bit different and code reviewers should read the commit message of the last commit in this PR for the tactical fix we're doing for now.

#pragma warning restore CS0618 // ReportIfNonFatalAndCatchUnlessCanceled is obsolete
{
// Log a Non-fatal-watson and then ignore the crash in the attempt of getting flow graph.
Debug.Assert(false, "\n" + e.ToString());
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This Debug.Assert prevents the compiler from flagging that the "return null" below is actually disallowed by the nullable annotations.

{
return LanguageDebugInfo.GetLanguageID(pBuffer, iLine, iCol, out pguidLanguageID);
}
catch (Exception e) when (FatalError.ReportAndCatch(e) && false)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These were all doing ReportAndCatch because they wanted the "non fatal" behavior (which mind you was the case either way) but still wanted to propagate so that way the COM error handling still happens. We can call the regular method now.

Comment on lines +58 to 62
Action<Exception> fatalHandler = e => FaultReporter.ReportFault(e, VisualStudio.Telemetry.FaultSeverity.Critical, forceDump: false);
Action<Exception> nonFatalHandler = e => FaultReporter.ReportFault(e, VisualStudio.Telemetry.FaultSeverity.General, forceDump: false);

SetErrorHandlers(typeof(IInteractiveWindow).Assembly, fatalHandler, nonFatalHandler);
SetErrorHandlers(typeof(IVsInteractiveWindow).Assembly, fatalHandler, nonFatalHandler);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a separate piece of work (outside the scope of this PR) to remove the reflection against the interactive window package. The interactive window package should just be doing it's own thing directly.


internal enum ErrorSeverity
{
Uncategorized,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Uncategorized" is the default for now, but the VS APIs recommends not using that for anything new. My plan is to merge this PR, do another cycle or two getting everything else annotated, and then removing "Uncategorized" entirely along with the defaults of the parameters. I don't want to bloat this PR too much annotating everything, and also doing the switch to remove the defaults will break in-flight PRs trying to call the Report APIs so we need to do that quickly.

Copy link
Member

@333fred 333fred left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compiler changes LGTM (commit 7)

#else
public
#endif
static bool ReportWithDumpAndCatch(Exception exception, ErrorSeverity severity = ErrorSeverity.Uncategorized)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we use this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the one use got removed, so we could get rid of this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right: I wanted to keep it around until we have confidence that the dump telemetry requests are working again, as this is the backup until they are.

This hasn't directly used Watson in quite some time.
Our FatalError type had two "handlers", the "fatal handler" and "non
fatal handler." If you called any ReportAndCatch method, that would
be counted as "non fatal" because you're catching the exception and
presumably dealing with it. If you called any ReportAndPropagate,
that would call the "fatal" handler.

This was confusing from a few perspectives. In the IDE, both the "fatal"
and "non-fatal" handlers actually did the same thing; they both reported
a non-fatal report via telemetry and kept the process alive. At first,
I was tempted to use fatal vs. non-fatal as way to report different
severities via telemetry, but the concept doesn't jive well there.
Sometimes we were calling ReportAndCatch when the feature absolutely
broke and the user would have known -- in that case we want the severity
to be high. Sometimes we're also catching an exception and gracefully
dealing with it, so the severity should be lower. Similarly, propagating
an exception usually means it's a severe problem, but if it's a
background job that has a higher-up error handler, we may not want to
treat it as high severity. In the end, the conclusion was the
fatal vs. non-fatal split didn't make sense in relation to severity,
and since it doesn't correlate to whether you want exceptions to
flow or not, it's easier to just add severity as a separate concept
when reporting, and unify the code so everything is going through a
single reporting path.

The compiler story was a bit different: the compiler would only set
the "fatal" handler; the non-fatal handler was never set. This meant
that a call to ReportAndPropagate would crash the process,
ReportAndCatch would keep it going. This seems sensible at first glance,
except the ReportAndCatches in the compiler often were swallowing
exceptions and returning placeholder results from the SemanticModel;
in that case we're OK with the compiler crashing as well, since the
compiler is the source of those issues.
My logic for severities was this:

1. If it's anything in the core workspace that will break all features
   (like parsing, semantic model, etc.) it's Critical.
2. If the fault is going to prevent an explicit user gesture from
   working, it's Critical. So an exception breaking go to definition,
   a refactoring, etc., is Critical. We failed to do what the user told
   us.
3. Situations where we failed to do part of a feature are General.
   If we triggered completion but a single provider failed, we are
   missing content, but there's a good chance the user was still able to
   have happen what they wanted.
4. Errors while computing tags for features are General, since it's
   possible we'll catch up if the user had broken code.
5. Situations where we're updating background stuff and the user might
   never notice are Diagnostic.
When I started down the path of unifying our fatal/non-fatal handlers,
we didn't believe there was anywhere the compiler was catching
exceptions while reporting them, because in the command line compiler
case that would simply be equivalent to catching and silently swallowing
the issue. Upon further investigation, there were a handful of places
where the compiler was doing exactly that! Worse off, those are places
where if an exception is thrown we'd return invalid nulls from some APIs
which probably would make the caller crash.

We agreed that we need to remove those catches -- some of them seem
to be in service of our IOperation APIs, which at some point must have
been fairly unstable and we didn't want that causing stability issues.
But by now, any issues should be worked out, and if they're not they're
causing analyzers to not analyze code which may be a problem. However,
since we don't know if there's surprises lurking, we're going to keep
the existing behavior for now and then remove those try/catches at
the start of the 17.2 cycle where we'll have sufficient time to react.

This commit does the following then:

1. Makes all the ReportAndCatch* APIs private if we're compiling at
   the compiler layer, to prevent any future use of them.
2. Adds an [Obsolete] API for the current behavior under a new name.
3. Moves the callers over to it.

Once 17.2 cycle starts we will:

1. Delete these new APIs.
2. Clean up the scoping around ReportAndCatch to simply
   #if !COMPILERCORE around the whole thing.
3. Remove the flag added that's set by the IDE.

At that point, the command line compiler has no way to catch and report
non-fatal errors that wouldn't be fatal if it's the command line
compiler. If we decide we need that, we can change this.
@jasonmalinowski jasonmalinowski force-pushed the include-severities-in-faults branch from 9f0ea98 to 409f385 Compare January 4, 2022 19:28
@jasonmalinowski jasonmalinowski merged commit cb30fc7 into dotnet:main Jan 4, 2022
@ghost ghost added this to the Next milestone Jan 4, 2022
@jasonmalinowski jasonmalinowski deleted the include-severities-in-faults branch January 4, 2022 21:49
@Cosifne Cosifne modified the milestones: Next, 17.1.P3 Jan 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants