Indicator coverage: visualisation in browser #103
Comments
@sabahfromlondon If you have time, happy to get your input on how to integrate this information. I can give more background on a call.
As in #102 (comment), have all the percentages mean the same thing. That is: a high % means high coverage. Here, the missing fields table has high % meaning low coverage.
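To make the convention concrete, a minimal sketch (the helper name is hypothetical, not part of the tool): any value reported as "% missing" can be flipped into a coverage percentage before display, so that a high % always means high coverage.

```python
def as_coverage(missing_pct):
    # Hypothetical helper: flip a "missing fields" percentage into a
    # coverage percentage, so high = good everywhere in the UI.
    return 100 - missing_pct

print(as_coverage(95))  # a row showing 95% missing reads as 5% coverage
```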
@jpmckinney yes let's have a call to discuss. Happy to provide input :)
I agree that consistency is good, but I'm not clear what the best avenue for consistency is here: whether higher should mean that the user can/should take action, or whether higher should mean 'better' on the calculation. Higher means action point:
Higher is better on calculation:
Would it help to rename the "missing fields coverage" so that we are not using the term 'coverage' in both?
I am not familiar with a mental model in which a big bar means "take action". Typically, in visualizations, the results that require action are indicated using color (red), icons (alert) or text (the action to take).

Having the bar match field coverage fits a mental model that works in the real world. If a basket is 5% full of apples, the apples will cover a small portion of the basket. 5% is small, the amount of apples is small: we have consistency. If a person's job is to fill the baskets to 100%, "small = action" comes very naturally. We could say the basket is 95% empty of apples, but we typically do not think in terms of the share of missingness, whether in real-world or data scenarios.

In many dashboards (including Pelican), higher = better, consistently, and users are comfortable and quick to identify "oh, that bar is short, I should work to fix that." Bar length doesn't have a strong association with a need for action, one way or another; it requires interpretation.

For indicator sorting, I think it is better to maintain the same order across presentations. People who use the tool more than once will start to remember which indicator appears where. Using the same order avoids user errors, like assuming that the first indicator is the same as on their last visit.
For this table, there are a few simplifications to make. When there are many columns and rows, the user has a harder time determining what to focus on. We can reduce the number of columns by:
The three-level hierarchy (topic, indicator, fields) with two expandable elements (topic, fields) is also an issue. Having expandable topics is good, since we have 7-8 topics and 37 indicators; a user might only be interested in specific topics, for example. Having a second expandable element, however, introduces UX issues (@sabahfromlondon can share from our user testing of the DRT error report).

Turning to the missing fields table: I am not clear on why the rows have multiple fields. method_1 for cost overruns depends on 5 fields: id, budget/amount/amount, budget/amount/currency, completion/finalValue/amount, completion/finalValue/currency. If my data has 95% coverage for amounts, but 2% coverage for currencies (maybe I only set the currency when it is a foreign currency), then reporting an amount-currency pair together will show 2%. As a user, I might have a hard time reconciling this with my prior knowledge, since I might know that my amount coverage is high. Furthermore, knowing that the pair is at 2% doesn't tell me whether I need to improve amount, currency, or both. So, I think each field needs its own row.

Of course, another dataset could have 50% amount coverage and 50% currency coverage, but 0% combined coverage (if they are never used together), in which case the user could be confused about why the overall coverage isn't 50%. That said, I would still report each field individually, because (1) I believe this scenario is extremely rare - most of the time, the overall coverage will just be a bit lower than the minimum coverage of the required fields - and (2) we end up swapping this (rarely encountered) confusing scenario for the (frequently encountered) confusing scenario in the previous paragraph.
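The amount/currency scenario above can be sketched numerically. This is a hedged illustration, not the tool's implementation: the field paths come from the cost overruns example, but the project records and the `coverage` helper are invented, and field presence is modelled as simple dictionary keys.

```python
def coverage(projects, fields):
    """Share of projects in which *all* of the given fields are present."""
    hits = sum(all(f in p for f in fields) for p in projects)
    return hits / len(projects)

# Invented dataset of 100 projects: amounts are nearly always set,
# currencies rarely (e.g. only recorded for foreign currencies).
projects = (
    [{"budget/amount/amount": 1, "budget/amount/currency": "USD"}] * 2
    + [{"budget/amount/amount": 1}] * 93
    + [{}] * 5
)

print(coverage(projects, ["budget/amount/amount"]))    # 0.95 per field
print(coverage(projects, ["budget/amount/currency"]))  # 0.02 per field
print(coverage(projects, ["budget/amount/amount",
                          "budget/amount/currency"]))  # 0.02 for the pair
```

Reporting only the 2% pair hides the fact that amount coverage is actually high; per-field rows make the cause visible.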
Also, like with field coverage, we should use a bar to show coverage, which makes it much faster to identify low coverage than numbers (especially if all numbers have the same number of digits, like 10-99).
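As a rough illustration of why bars scan faster than numbers, a minimal text-mode sketch (the `bar` helper and field percentages are hypothetical; the real UI would render graphical bars):

```python
def bar(pct, width=20):
    """Render a coverage percentage as a fixed-width bar; short bars
    stand out at a glance, unlike same-width numbers."""
    filled = round(pct / 100 * width)
    return "█" * filled + "░" * (width - filled) + f" {pct:3d}%"

for name, pct in [("id", 100), ("amount", 95), ("currency", 2)]:
    print(f"{name:10} {bar(pct)}")
```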
I'm new to OC4IDS, so will limit myself to UX comments.

With the testing that we did for the regular Data Review Tool, we found that it was easy for issues to get missed. I think that having the second expandable element will exacerbate this issue. A single dropdown, e.g. Efficiency, should expand all the issues within that category. I would also add the number of issue types found for that category, e.g. "Efficiency - Three issue types found".

Second, on removing columns that we don't need: these really need to be tested with users. We found that some of the regular DRT columns were not clear to users and others did not add value.

High being good for some contexts and bad for others won't work; it places too much cognitive load on the user. Number of issues keeps things simple, and, at a high level, for users to know if their dataset is perfect, good, bad or uncheckable, we used a meter visual. Happy to share screens. Hope this helps.
Thanks both. Given the sprint focus on field-level coverage, I'll defer to that for now and we can pick up this discussion in the future, if that suits you.
Suits me. Feel free to reach out directly any time :) |
We could:
e.g.
(image updated to show indicators ordered by indicator coverage desc)