"compared to previous week" percentages are high even when absolute change is low #1178
This may actually be less a small-counts issue and more a batch-reporting issue, which we already know is common in this dataset. Here's a view of the raw death counts for the same region -- it seems the actual increase in incident deaths between May 23 and May 30 is not 1-2, but 15-20. Depending on the reason for the spike on May 25, this may be a good candidate for the anomalies spreadsheet that feeds annotations in the web visualizations. It is still worth discussing whether to censor certain information for small-population regions or for small-count signals. We should decide:
Added Roni so he can follow the discussion. @RoniRos
Thanks! Based on the raw counts you shared:
In any case, my point is that a percentage change starting from a 7-day total of 1 is uninformative, and arguably misleading or at least distracting. We can decide not to calculate the percentage change if the previous 7-day total is less than, say, 10. Note that the condition is only on the previous 7-day total (the denominator in the percentage calculation), not the current 7-day total.
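A minimal sketch of that rule, assuming the two 7-day totals are already available as plain numbers. The function name `percent_change` and the constant `MIN_PREVIOUS_TOTAL` are hypothetical and not taken from the COVIDcast code; the threshold of 10 is only the value floated in this thread.

```python
from typing import Optional

# Threshold suggested in the discussion ("say, 10"); an assumption, not a settled value.
MIN_PREVIOUS_TOTAL = 10


def percent_change(
    previous_total: float,
    current_total: float,
    min_previous_total: float = MIN_PREVIOUS_TOTAL,
) -> Optional[float]:
    """Percent change between two 7-day totals, or None when it is not worth showing.

    Only the previous total (the denominator) is checked against the threshold;
    the current total may be arbitrarily small or large.
    """
    if previous_total < min_previous_total:
        return None  # small denominator: the percentage would be uninformative
    return 100.0 * (current_total - previous_total) / previous_total
```

With this sketch, `percent_change(1, 21)` returns `None`, while `percent_change(20, 25)` returns `25.0`. A drop from 50 to 2 is still reported, because only the denominator is tested.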
True. But note that this is fairly orthogonal to my point. My point would have been the same if in the most recent 7 days, instead of (0,21,0,0,0,0,0), we had, say, (2,4,3,4,3,2,3).
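To make the arithmetic concrete: both sequences sum to the same 7-day total, so the week-over-week percentage is identical whether the deaths arrive in one batch or spread across the week. The previous-week total of 1 below is an illustrative value based on the "7-day count of 1" mentioned above, and `week_over_week_percent` is a hypothetical helper.

```python
batched = (0, 21, 0, 0, 0, 0, 0)  # one-day spike, as in the raw counts discussed
spread = (2, 4, 3, 4, 3, 2, 3)    # same weekly total, spread across days

previous_total = 1  # illustrative previous 7-day total (assumption)


def week_over_week_percent(previous: float, current: float) -> float:
    """Plain percent change between two 7-day totals (no small-count guard)."""
    return 100.0 * (current - previous) / previous


batched_change = week_over_week_percent(previous_total, sum(batched))
spread_change = week_over_week_percent(previous_total, sum(spread))

# Both weeks total 21, so both percentages come out the same.
print(sum(batched), sum(spread), batched_change, spread_change)
```

So the small denominator drives the inflated percentage regardless of how the current week's counts are distributed.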
Actually, I prefer not to censor counts, merely to avoid displaying percentages when they are based on a small-count denominator.
Definitely not censor the figure (top row). That figure is based on the current 7-day total, which may actually be quite large. But even if it's small, I wouldn't censor it.
Raw count, and I suggest <10. I don't think population size is very relevant to this issue, except that low-pop counties are more likely to have low raw counts.
I agree, and suggest something like "Small Counts", maybe in a two-line, tiny font like the one we use for "per 100k". This will hopefully become recognizable as an icon that means "not calculated because small counts make this value uninformative".
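As a rough sketch of the display side, assuming the percentage arrives as either a number or `None` when it was not calculated. `format_change` and `SMALL_COUNTS_LABEL` are hypothetical names; the two-line, tiny-font styling would be handled by the frontend and is not shown.

```python
from typing import Optional

SMALL_COUNTS_LABEL = "Small Counts"  # label text proposed in this thread


def format_change(percent: Optional[float]) -> str:
    """Render a percent change with an explicit sign, or the small-counts label."""
    if percent is None:
        return SMALL_COUNTS_LABEL
    return f"{percent:+.1f}%"
```

For example, `format_change(42.3)` gives `"+42.3%"` and `format_change(None)` gives `"Small Counts"`.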
I was talking about censoring any information, not just counts. I don't understand how avoiding displaying percentages is different from censoring those percentages. If the distinction is important to you, could you explain?
I've looked into what it would take to do this, and we have a few options. The change since last week display is based on the
The above is taken from the actual query performed by the frontend in determining the "change since last week" for deaths and results in "+42.3%" (
pinging @sgratzl to weigh in
Revisiting this issue. I understand and appreciate the overhead incurred by these solutions. I am not happy about it, but am also not happy about letting "+424.0%" stand; it doesn't reflect well on our system. Since the I understand this is not trivial to do right. Let's let this issue sleep until we have to revamp related code for other needs, too.
On the COVIDcast dashboard for Allegheny County, the current deaths figure (relative change from 7 days ago) is displayed as a very large percentage change (at the time we took the screenshot it was a +424.0% change in the number of deaths). @RoniRos suggested that seeing this large number may be confusing, since at first glance it appears deaths are dramatically increasing when the count only went from 0 to 1-2 deaths. It may be less confusing for viewers to see N/A for such small changes.
Go to https://delphi.cmu.edu/covidcast/?region=42003 for Allegheny County or use any other county dashboard.
A screenshot from June 1 for Allegheny County is included as an example. When deaths moved from 0 to 1-2, the viewer sees a huge jump, such as +424% in this example.
Rating scale 1-2 minor issue