Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issues with event counters #23630

Closed
roji opened this issue Dec 8, 2020 · 6 comments
Closed

Issues with event counters #23630

roji opened this issue Dec 8, 2020 · 6 comments
Assignees
Labels
area-infrastructure closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Milestone

Comments

@roji
Copy link
Member

roji commented Dec 8, 2020

The following counters don't get updated when the async API is used (as opposed to the sync one):

In addition, the "Active DbContexts" counter is incorrect, since it's incremented only when the context's InternalServiceProvider is accessed, but decremented when disposing (regardless of whether InternalServiceProvider was accessed). Am moving the increment to occur in the constructor, /cc @ajcvickers

Note: I've spent some considerable time preparing automated tests for event counters, however:

  • Due to the nature of event counters (captured out of process over networking), they seem like they will may be the source of a lot of flakiness and headaches.
  • They're really slow (counters get emitted at most once per second).
  • Need to be totally isolated from any other EF Core test activity, as the counters are application-global.
  • The value we get from preventing regressions doesn't seem very high.
  • FWIW neither ASP.NET nor runtime have tests for their event counters.

So am dropping this for now, but we can rediscuss if we feel we need to.

@ajcvickers
Copy link
Contributor

@roji As I understand it, all we do in EF for event counters is to increment a value in EntityFrameworkEventSource. Leaving aside for the moment that there is a shared static instance of this, would it not be possible to write tests where, for example, the code calls SaveChanges 10 times and then we check that the value has been incremented 10 times? This should not involve trying to actually use event counters in the test, but would still have caught these issues had the tests been written.

Coming back to the static/global issue, it seems to me that some refactoring and changing of private code to pubternal could allow a test to be setup with an isolated EntityFrameworkEventSource that could be used in tests.

This doesn't tell us that event counters are working end-to-end, but as you say others are not testing that either because of the issues you outline. However, it would give us a pattern for testing new event counters that verify at least that the "EF part" is working as expected.

@roji
Copy link
Member Author

roji commented Dec 10, 2020

Leaving aside for the moment that there is a shared static instance of this, would it not be possible to write tests where, for example, the code calls SaveChanges 10 times and then we check that the value has been incremented 10 times?

We could do this, but these counters aren't exposed publicly (at least not at the moment) - the only way to observe them right now is via the perf counters. If you think we should expose them, let me know and I'll do it (that would be introducing public surface in a patch).

Coming back to the static/global issue, it seems to me that some refactoring and changing of private code to pubternal could allow a test to be setup with an isolated EntityFrameworkEventSource that could be used in tests.

I'm not sure the static/global issue is actually blocking testing - xunit does allow turning off parallelization for a specific test collection (with [CollectionDefinition(DisableParallelization = true)]), this should allow us to just test directly on the static/global instance. Other than that, I'm not sure exactly what refactoring you have in mind, but I wanted to avoid doing virtual dispatch etc. in the product just for the sake of testing etc.

Basically I'm fine with anything here (especially after seeing the errors I missed) - let me know what you prefer me to do. Since the window for 5.0.2 is closing very soon, we may also want to merge this and add test coverage later in main.

@ErikEJ
Copy link
Contributor

ErikEJ commented Dec 10, 2020

@roji And there is no way to inspect the perf counters programmatically?

@roji
Copy link
Member Author

roji commented Dec 10, 2020

It's possible to connect and consume the counters just like the dotnet counters tool does - but see my original comment above with reliability issues etc. It's probably doable, but not sure the effort is worth it (and in any case I wouldn't be comfortable with introducing potentially unreliable tests into a release branch...).

@roji
Copy link
Member Author

roji commented Dec 10, 2020

In case anyone's interested, here's the incomplete attempt at proper end-to-end coverage: roji@38305bb roji@de7a95b

@ajcvickers ajcvickers modified the milestones: 5.0.2, 5.0.3 Dec 10, 2020
@ajcvickers
Copy link
Contributor

Punting this out to 5.0.3; we don't need to rush this into 5.0.2.

ajcvickers added a commit that referenced this issue Jan 6, 2021
…ce testing

Fixes #23630

**Description**

Three event counters don't get updated when the async API is used (as opposed to the sync one). An additional event counter is incorrectly updated, and so show wrong results.

**Customer Impact**

Our newly-introduced event counters show incorrect data which does not take into account async calls.

**How found**

Customer reported an issue on 5.0.0 with one counter, the rest discovered via due diligence.

**Test coverage**

Test coverage added after investigation and discussion of the best way to do it. We settled on using reflection to access the counters and then configuring the tests to not run in parallel. These tests are both for detecting regressions, and so that going forward we can test new event counters to avoid mistakes like this in the future.

**Regression?**

No, event counters are new in 5.0.0.

**Risk**

Very low, add missing counter updates which are already in place and working for the sync versions, and move the location of another counter update. Only affects the new event counters feature.
@ajcvickers ajcvickers added the closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. label Jan 11, 2021
This was referenced Mar 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-infrastructure closed-fixed The issue has been fixed and is/will be included in the release indicated by the issue milestone. customer-reported Servicing-approved type-bug
Projects
None yet
Development

No branches or pull requests

3 participants