
Initial design for displaying tests! #15

Closed
spectranaut opened this issue Oct 30, 2019 · 25 comments

@spectranaut
Contributor

Hey y'all!

Please provide feedback on the "displaying" of the following tests. To make it clear what text the test harness is providing and what is supplied via the API, I added <em> tags to all of the text that is specific to the tests (so provided by either the test author or anything AT specific).

Here are three different "tests":
reading checkbox
operating checkbox
reading checkbox group

A few things to consider:

  • @jfhector designs discussed here and what we should incorporate from them.
  • Exactly what information we want recorded in each case ("all pass", "all fail", "some fail")

Soon to come:

  • The buttons for "all pass", "all fail" and "some fail" will be moved to tabs or a radio group.
@ghost

ghost commented Nov 8, 2019

Hi all,

I went through the initial design @spectranaut had updated, pretending I'm a tester and summarized my thoughts as follows. For each test, I categorized my thoughts into four groups.

  • 1. Overall design: Presentational aspects of the test.
  • 2. Instruction: Thoughts on the Instructions and Success Criteria.
  • 3. Record Results: Thoughts on the Record Results section.
  • 4. Results: Thoughts on the screen after submitting the test.

Perhaps of help, I recorded my user testing sessions and put the recordings in the following folder.
https://uofi.box.com/s/3b0i23zsgiqx8bt1f46k7cl5rtxh9mgv

[Reading checkbox]

[1. Overall design]
1.1 The test window disappears every time a tester tries to type information into a form; I wish I could keep it open. (6:23)
1.2 A tester tried to understand the big picture of the test first but found it a little difficult to see the boundary between the instruction area and the Record Results area. Perhaps using fieldsets to create two groups (one for Instruction + Success Criteria, the other for Record Results) might help.
1.3 The [Submit] button is supposed to be disabled until a tester fills in all the required fields.
1.4 Clarify whether the [Speech output after command] field is required. If the field is optional and meant to capture output when the results were incomplete or incorrect, can we show it only when a tester selects Incomplete or Incorrect?

[2.Instruction]
2.1 In 1 of 2, an estimated time might help testers gauge the workload, e.g. "Estimated time for this test: 30m".
2.2 Testers with less experience may expect links to some reference guides.
2.3 In 2 of 2, it took me some time to understand the difference from test 1 of 2, since the change in the checkbox state is subtle. Clarify the difference by making the word "checked" bold. (23:00)
2.4 Related to 2.3, the Test Page checkbox state can be confusing, since the Lettuce checkbox is unchecked if a tester closes the test page and reopens it from the [Open Test Page] button on 2 of 2. I assume it is supposed to be checked. (25:43)

[3. Record Results]
3.1 I was slightly confused by the Other Details field: what kind of information should be recorded there? Perhaps add some guidance or an example?
3.2 The distinction between 'All Incomplete Output' and 'All Incorrect Output' seems difficult to comprehend without an example.

[4. Results for test (after submit)]
4.1 Once the test has been submitted, I wish I could see some feedback, such as "Test submitted".
4.2 The big [FAIL] line made me worry that the test submission had failed and all the result data for the test had been deleted. Clarify by rephrasing to [Test result: Failed]? (36:45)

[Operating checkbox]

[3. Record Results]
3.1 I wasn't quite sure whether the name of the sandwich condiment was supposed to be announced. My assumption in marking them as All Correct Output was that the name of the condiment is already announced when the cursor moves to it, so announcing it again wasn't necessary for this test. (3:00)
3.2 Do we need a separate speech output form for checked and unchecked?

[Reading checkbox group]

[2.Instruction]
2.1 Is there a clear distinction between 'announced' and 'communicated'? If 'announced' means a screen reader has to announce the specific information (e.g. group, sandwich condiments), and 'communicated' means some sort of information has to be covered (e.g. something like before/after the checkbox), I wish the distinction were clarified.

[3. Record Results]
3.1 In the first test, Results for command: 'Tab/Shift+Tab' seemed not to work with JAWS, NVDA, or VoiceOver, so I'm not sure we want to keep this test, if my understanding is right. (11:45)

*Numbers in parentheses correspond to times in the recording.

@spectranaut
Contributor Author

Response to @Yohta89's experience:

Clear changes

Bugs to fix:

  • (read 1.1) The window should not disappear every time you enter information into the form.
  • (read 2.4) The checkbox should be checked every time you reopen the window from the second behavior verification.

Easy things to change:

  • (read 1.4) Make it clear the speech output is required for every key command (add the word "required" somewhere)
  • (read 4.2) Make it clear that the test failed, not the submission of the test

Things I might not get to in the next two weeks, but should be implemented:

  • (read 1.3) Disable button until all required fields are filled
  • (read 4.1) Make "test submitted" text appear after submission. We might just handle this in the runner

Design issues I am not sure how to solve

  • (read 1.2) Clearer boundaries between "Instructions" area and "Record results" area
  • (read 2.1) Estimated time to complete test. Should the test authors provide this, or should we calculate based on number of key commands and assertions?
  • (read 2.2) Links to references for less experienced users. Which links should we add?
  • (read 2.3) When the instructions to verify a new behavior appear, make clear the difference between the two behaviors. We might need the test authors to supply this information somehow.
  • (operating 3.1) I don't understand Yohta's concern here: there is no assertion about the name of the checkbox being announced, so it's not necessary to pass. What exactly is unclear here?
  • (operating 3.2) I think we might need separate speech outputs for checked-to-unchecked and unchecked-to-checked. I'm not sure how to handle this case. Maybe the output of both should be copied into a single speech output.

Things to document:

  • (read 3.1) What kind of "other information" do we want users to supply? We should begin to document this
  • (read 3.2) The difference between "All Incomplete Output" and "All Incorrect Output" and "incomplete output" vs "incorrect output"
  • (read grouping 2.1) Difference between announced (soon to be "spoken") and communicated (soon to be "conveyed").

@jfhector

I've just reviewed the latest updates to the prototype. I think that it's great work, well done! And I also love that @Yohta89 has actually done a user test and videoed it. I want to work with more people like you Yohta!

I've added some thoughts on the issues highlighted by @Yohta89, and also on other issues that I've found.

Issues raised by Yohta

Regarding point 1.2 in Yohta's comment, about the separation between the two main sections on the page

Rewording the headings and nesting them slightly differently might help. For example:

  • Heading level 2: Test instructions
  • Heading level 3: Success criteria for the tests
  • Heading level 2: Record the test results

Regarding point 1.3 in Yohta's comment, about disabling the submit button

My understanding of the screen reader user experience is that it's best not to disable submit buttons, as they stop being discoverable. Instead, the submit button could stay enabled at all times; if pressed too early, the page should display and announce the errors that need to be fixed before the user is able to successfully submit the page.

You can see a live example of this pattern in the UK Government 'Register to Vote' form, or in TenonUI's form demo.

Regarding point 3.1 in Yohta's comment, about "Other details"

I was also confused by the label "Other details". Having followed this project, I have an idea of what we'd like testers to write here. But I doubt that testers would intuitively know what's expected of them.

We should find a more descriptive label. Maybe:

  • "Additional notes"
  • "Additional testing notes"

Regarding point 3.2 in Yohta's comment, about the distinction between 'All Incomplete Output' and 'All Incorrect Output'

I agree that this is a bit confusing at first.

Adding documentation could be part of the solution. But I don't think that adding more documentation will be enough (most humans don't read documentation most of the time). We need to try to make things simpler to grasp immediately and intuitively.

I believe that this is also linked to another issue about column headers (see 5 just below).

Other potential usability issues / suggestions

5: Column headers don't act as column headers

Note: I believe that this is potentially a big usability issue. (We'd have to do usability testing to confirm how big a deal it actually is). But it's one that we could fix now or later, I believe, without disrupting the structure of how the page works.

In a table, I'd expect the column headers to describe what's in each column. Instead, in this prototype, the column headers don't do that, but provide an affordance to tick all the radio buttons in that column.

I understand why we do that, and I think that it's quite clever, and has a lot of benefits.

So it is possible that we will choose to keep that design pattern as is. But first I'd like to highlight a usability issue that comes with this specific design, so that we can try to improve it.

I was confused looking at these tables at first, because they work differently than almost any other table I've found online. (See first paragraph just above).

Suggestion for simple, more familiar column headers

More usual, expected column headers would be "Correct output", "Incomplete output" and "Incorrect output".

Then, in the table cells under each of these headers, I imagine that we could simply have radio buttons. Each of these radio buttons would be doubly labelled (using aria-labelledby), first by its column header, then by its row header.

Suggestion for allowing users to tick all the "Correct", "Incomplete" or "Incorrect" options at once

The table structure I just described would give us a table that (I believe) would work in a more familiar way, but it doesn't yet give users the possibility to check all radios in a column at once.

I think that there are other ways that we could give users that affordance:

For example, there could be an additional row, just after the column headers (and before the three existing rows). It might be that that additional row doesn't have a row header, but its row cells would each contain a button. The buttons would read "All correct", "All incomplete" and "All incorrect". Clicking one of the buttons would tick all the radio buttons in the same column in the rows below it.

So, to recap, the buttons would be below the "Correct", "Incomplete" and "Incorrect" column headers, and would read "All correct", "All incomplete" and "All incorrect". I hope that that would make it clear what the buttons do (at least visually) while still allowing the existence of descriptive column headers.
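As a rough sketch of that behaviour over a plain data model (the function and field names here are mine, not the prototype's):

```javascript
// Hypothetical data model: one entry per assertion row; `value` holds the
// selected column ('correct', 'incomplete', 'incorrect') or null.
function tickAllInColumn(rows, column) {
  // Mimics clicking the "All correct" / "All incomplete" / "All incorrect"
  // button: select the radio in that column for every assertion row.
  return rows.map((row) => ({ ...row, value: column }));
}

const rows = [
  { assertion: "The role 'checkbox' is spoken", value: null },
  { assertion: "The name 'Lettuce' is spoken", value: null },
  { assertion: "The state of the checkbox (checked) is conveyed", value: "incorrect" },
];

const ticked = tickAllInColumn(rows, "correct");
console.log(ticked.every((r) => r.value === "correct")); // → true
```

In the real page the same function would run over the actual radio inputs in the column, but the selection rule is the same.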

For screen reader users, the buttons could be labelled and described like this:

  • "All correct" label with the following aria description:
    • "Click this if the role 'checkbox', the name 'Lettuce' and the state of the checkbox (checked) are all correctly communicated".
  • "All incomplete" label with the following aria description:
    • "Click this if the role 'checkbox', the name 'Lettuce' and the state of the checkbox (checked) are all partially not communicated".
  • "All incorrect" label with the following aria description:
    • "Click this if the role 'checkbox', the name 'Lettuce' and the state of the checkbox (checked) are all incorrectly communicated".

I don't know whether that solution would work as is. We'd need to prototype it. My intention here is to suggest that:

  • our current table structure design has the downside of being unusual and, I expect, disorienting, because it doesn't have descriptive column headers and has something unusual in their place
  • I believe that we can find alternative solutions, ideally now, or if not, later

6: I find the current main page headers confusing

I found the main page headers (i.e. 'Testing behaviour 1 of 2') unclear/confusing.

  • I'm not sure what the phrase 'Testing behaviour 1 of 2' means. Initially I read the header as "there is such a thing as a 'testing behaviour', and I'm looking at the first one of these 'testing behaviours'".
  • Now I think that the header might mean "Now we're testing behaviour 1 of 2, and later we will be testing behaviour 2 of 2". But I'm still unclear what a 'behaviour' is.
  • I didn't understand why there were two pages. It only hit me now that the first page is for tests to be executed in Reading Mode, and page two for tests to be executed in Forms mode. But it's likely that I'd go to page two and not read the 'Instructions' section again, if the page header looks similar. (Humans don't read web pages, they just scan them quickly and rely on sign posts).
Suggestions for clearer, more descriptive page titles

What other word could we use?

  • maybe "Test run 1 of 2"
  • or "Test page 1 of 2"
  • or "Tests to run in Virtual Cursor Mode" and "Tests to run in Forms Mode"

I imagine that the third option would work best. Or at least, I think that we should have something as descriptive as that third one – something that gives an insight into what's on the test page, and how that test page is different from the other one.

7: Pascal Case is harder to read

This one is a minor thing.

'Sentence case' is easier to read than 'Pascal Case', and more usual online. I think that column headers should be in sentence case (e.g. "Other details" or "All correct output").

@spectranaut
Contributor Author

Thanks @jfhector for the review! I'll get to it soon.

@Yohta89 I can't replicate the bug where the test page closes every time you enter data into the form. It looks like you are using Chrome; what is the exact version?

@mfairchild365
Contributor

I agree with all of your thoughts, @jfhector. We might be able to redesign the table as you have suggested. Additionally, it might be good to make the first column a row header, which would (hopefully) provide a better group label for the series of radio buttons in the row.

What do you think about moving the radio buttons that currently serve as column headers outside of the table entirely? Perhaps just before the table and after the 'Relevant speech output after command (required)' text field.

@jfhector

jfhector commented Nov 15, 2019

I've just done a very quick interaction prototype for the alternative table structure I had in mind. I just wanted to see whether it'd work, and how to best make it work.

I think that it does work, but let me know what you think, and how we might improve this (or the other table design) further. We might want to do some very quick usability testing to settle on a design.

Where you can see the interaction prototype

Here's a live version of the interaction prototype.

And here's the code for the interaction prototype.

The code is in a new branch I've created, called 'prototypes-exploration'. I don't expect that we would ever merge anything from this branch. I only intend to use it for throwaway interaction prototypes and code examples. Do say if you think it's better to not have this new branch.

Differences between this prototype and the description I gave in my previous comment above

  • I've ended up putting the 'Tick all correct/incomplete/incorrect' buttons in a row in the table footer (rather than above the other rows).

  • Note: @mfairchild365 thought about putting these buttons outside the table altogether, but I wouldn't know how to ensure that they stay visually lined up with each column – without giving each column a pre-determined width and doing something that feels hacky in CSS.

    Alternatively, we could keep the buttons inside the table (as they are now), but style the borders of the table body and table footer to make it look like the buttons are just outside the table, below it. Doing that, we might want to visually hide the associated row header text, maybe.

My views on the interaction prototype

Note: I've only created this very quick prototype to test the idea I had in mind. I don't mind what solution we end up going for.

What I like about the prototype
  • The table is structured more like a regular table, in so far as it has column headers that describe the cells of that column. I expect that this structure will be easier to understand for users.
What I don't like about the prototype
  • The accessible names of the radio buttons are very long. And the part of each radio button's name that is unique is at its end.

    This is because I tried to create a name that reads clearly. I use aria-labelledby to point to the row header first, then a hidden span that contains the word 'with', then the column header. This gives accessible names like this: "The role 'checkbox' is spoken with correct output".

    Testing the experience with VoiceOver, I find that it works quite well. But clearly, such long accessible names are unusual. And it could be that someone who isn't as intimate with the design of the prototype gets confused by it.

  • Although the last row is technically part of the table footer (rather than the table body), it still looks like it's another row of data – which it isn't. We can solve that visually with CSS, but not so well for screen reader users. I don't imagine it's a big problem.

Improvements that we'd need to make, if we went for a table structure like this
  • We'd need to make the radio buttons much bigger, because they're not associated with an HTML label element, so their click areas are currently too small.

  • My mini-prototype doesn't include a column for additional testing notes. I just forgot to include it. If we like the approach the prototype is taking, I'll add that column to the prototype to see how it changes it. I think that it should work reasonably well, with a bit of CSS.

@ghost

ghost commented Nov 15, 2019

Thanks @jfhector for the review! I'll get to it soon.

@Yohta89 I can't replicate the bug where the test page closes every time you enter data into the form. It looks like you are using Chrome; what is the exact version?

@spectranaut
So you're trying to replicate the bug I experienced from '1.1 Test window will disappear every time a tester tried to type in information on a form, and I wish I could keep this open. (6:23)', right?

I'm using Google Chrome Version 78.0.3904.97 (Official Build) (64-bit). And it seems like I'm experiencing the same issue in Firefox (70.0.1) as well.

  • (operating 3.1) I don't understand Yohta's concern here: there is no assertion about the name of the checkbox being announced, so it's not necessary to pass. What exactly is unclear here?

My confusion comes from the assumption that screen readers announce the name of the condiment that is checked, since VoiceOver (my first choice of screen reader) announces 'checked, Lettuce, checkbox' in this test.

However, you're right that this is a test about operating a checkbox, and we want to test the change in the state of the checkbox.
Perhaps I'm overthinking this because I use multiple screen readers, but this confusion would be less likely to happen if the test facilitator or test instructions could guide testers to focus on the assertion and not think too much beyond it, given that this is a test about operating a checkbox in JAWS (not 'reading' a checkbox).

@mcking65
Contributor

I feel there is confusion around the concept of having shortcuts for marking output as correct. I think the table header or footer concept does not reflect real life.

In real life, it will often be the case that all output is correct. But, if you are testing 3 things, e.g., role, name, and state, it is not likely that all 3 will fail. And, if they do, they could fail in different ways, i.e., one could be incomplete and the other incorrect. So, the shortcuts for all incomplete and all incorrect do not map to circumstances that are ever likely to occur in the real world. This creates a lot of confusing noise in the UI.

The only conditions when we want to test 2 or more discrete elements of output at once and to have a shortcut for the result are when:

  1. All the outputs are in the same utterance.
  2. All have the same priority, e.g., all are must-have output, not should-have or nice-to-have output.
  3. They are correct a very significant percentage of the time.

By combining tests in this way we can significantly streamline both writing and running tests, especially as more and more tests pass.

I think the shortcuts need to be completely independent of the table. The shortcut choices are:

  1. Correct
  2. Incomplete or incorrect

If correct is chosen, the table auto-populates; otherwise, nothing happens in the table and the user has to choose an option for each behavior for that command.
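A minimal sketch of that auto-populate rule, assuming a simple per-assertion data model (all names here are illustrative):

```javascript
// Hypothetical: apply the two-option shortcut ('correct' or
// 'incomplete-or-incorrect') to the per-assertion results for one command.
function applyShortcut(rows, shortcut) {
  if (shortcut === "correct") {
    // Auto-populate: every assertion for this command is marked correct.
    return rows.map((row) => ({ ...row, value: "correct" }));
  }
  // Otherwise leave the table untouched; the tester must pick an
  // option for each behavior for that command.
  return rows;
}

const assertions = [
  { assertion: "Name is conveyed", value: null },
  { assertion: "Role is conveyed", value: null },
];

const filled = applyShortcut(assertions, "correct");
console.log(filled.every((r) => r.value === "correct")); // → true

const untouched = applyShortcut(assertions, "incomplete-or-incorrect");
console.log(untouched === assertions); // → true (table left as-is)
```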

For example, here is what I would like to hear if I were to tab into a shortcut radio group for JAWS insert+up output results on checkbox in the role, name, and state test:

Insert+up arrow output, checked, correct: name, role, and state were conveyed, radio button 1 of 2.

Then, if I arrow to the other option:

checked, Incomplete or Incorrect: name, role, and state were not fully or accurately conveyed.

That is, a radio group labeled by a heading "Insert+Up output" that contains two buttons, one for correct and another for incomplete or incorrect. The descriptive part of the label makes the meaning of the option fully understood.

After the radio group, you can have the same table you have now, but it will not have controls in the column headers. That way, they will be read much more nicely by screen readers and be less confusing for everyone else.

If you choose the shortcut radio for "correct", we could dynamically add a link right after the shortcut radio group for "Record results for NEXT_COMMAND", where NEXT_COMMAND is the next heading text. However, that link would not be there if you chose "incomplete or incorrect".
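A sketch of that conditional link; the function name and the "Space output" heading are hypothetical:

```javascript
// Hypothetical: decide whether to render a "Record results for …" skip
// link after the shortcut radio group, based on the chosen shortcut.
function skipLinkText(shortcut, nextCommandHeading) {
  // The link only exists when "correct" was chosen and there is a
  // following command heading to jump to.
  if (shortcut !== "correct" || !nextCommandHeading) return null;
  return `Record results for ${nextCommandHeading}`;
}

console.log(skipLinkText("correct", "Space output")); // → "Record results for Space output"
console.log(skipLinkText("incomplete-or-incorrect", "Space output")); // → null
console.log(skipLinkText("correct", null)); // → null (last command on the page)
```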

@spectranaut
Contributor Author

So you're trying to replicate the bug I experienced from '1.1 Test window will disappear every time a tester tried to type in information on a form, and I wish I could keep this open. (6:23)', right?

@Yohta89 I watched your video recording (awesome, so detailed!) and now I understand what you mean. It's true the window "disappears", but it's just behind your other window (as you know, because you navigate back to it). I assume testers will have to find their own ergonomic way of navigating between the test page window and the test results window. It might be helpful for sighted testers to have them side by side, each taking up half the screen.

@spectranaut
Contributor Author

Thanks for the thoughtful explorations of alternative "shortcut" radio buttons, @mcking65 and @jfhector! I'll incorporate a few of your design choices, @jfhector, but I'll try putting the radio buttons outside of the table per Matt's suggestion. I'd appreciate more feedback on how to make this look good to sighted users once everything is in place.

@mcking65 -- what should be the accessible name for the radio buttons that are within the table? Should it include the assertion, like in JF's prototype?

@spectranaut
Contributor Author

another question for @mcking65: you asked for a link that will skip the table and send you to the next command to record results for. Can I implement this as a button that moves focus to the text input for the next key command, essentially skipping the table?

@mcking65
Contributor

@spectranaut asks:

what should be the accessible name for the radio buttons that are within the table? Should it include the assertion, like in JF's prototype?

As we discussed yesterday, the buttons we need for each assertion are:

  1. Correct output
  2. No output
  3. Incorrect output

It might be nice for screen reader users to hear the assertion in the accessible name. One way to include the assertion in the name might be like this:

For correct output:

  • Correct output for The role 'checkbox' is spoken
  • Correct output for The state of the checkbox (not checked) is conveyed

For no output:

  • No output for The role 'checkbox' is spoken
  • No output for The state of the checkbox (not checked) is conveyed

For incorrect output:

  • Incorrect output for The role 'checkbox' is spoken
  • Incorrect output for The state of the checkbox (not checked) is conveyed

I would love to have a pair of terms that sound less similar than "Correct" and "Incorrect". Maybe "Good output" and "Incorrect output"? Or, "acceptable" and "Incorrect". That could help reduce errors when recording results.

@mcking65
Contributor

@spectranaut asks:

another question for @mcking65: you asked for a link that will skip the table and send you to the next command to record results for. Can I implement this as a button that moves focus to the text input for the next key command, essentially skipping the table?

Yes, except that since it is moving focus to another place on the same page, it should be a link.

@spectranaut
Contributor Author

Hey @jfhector and @mcking65, I added form validation and put the "all correct/all incomplete" radio buttons above the table. @jfhector, in particular, can you take a look? I am DEFINITELY not a designer and could use any opinions you have on how to make it better :)

I'm only just seeing @mcking65's recent comments, so I'll get to them tomorrow.

@mcking65
Contributor

I'm discovering a lot as I attempt to craft tests for combobox.

Most relevant to this issue is the fact that there are other types of results that we need to capture. Some of which I am aware are:

  1. Did a mode switch occur that should have occurred?
  2. Did a mode switch occur that should not have occurred?
  3. Output was correct per the assertion, but other things were announced that were problematic, e.g., tons of duplication of output, random speech from other elements, etc.
  4. Unforeseen types of defects that impact the results, e.g., navigating causes an unexpected reading cursor jump, operating freezes or crashes the screen reader.

Here are my thoughts on incorporating these things.

Mode Switch Assertions

This is a type of assertion that is only relevant to specific screen readers that can switch between reading and interaction modes automatically, JAWS and NVDA. For example, when you Tab to a composite or text input widget in reading mode, the mode should switch to interaction. Actually, I guess VoiceOver automatically switches to interaction in some circumstances as well.

So, say my test is titled "Navigating to an empty, collapsed, editable combobox in reading mode conveys role, name, and editable property."

When conducting this test, the tester will navigate with arrow keys and tab as well as a quick key. When using JAWS or NVDA, tab should also switch modes from reading to interaction.

To capture this, some options could be:

  1. We could have a test file that is only for testing mode switching. So, that file would have only the navigation commands that trigger mode switching, and the assertions would be only for mode switching. Then, the user could perform the exact same navigation commands again (plus other navigation commands that do not switch modes) in a second test file and report on all the output-related assertions.
  2. We could have a separate test file for the screen reader navigation commands that do switch modes. That is, if testing with JAWS or NVDA, you could get a test file with navigation commands that switch modes and test all the navigation assertions (role announcement, state announcement, etc.) as well as mode switching. Then another test file for the commands that do not switch modes.
  3. We could figure out an architecture where assertions are both screen reader and command dependent. Then the tester can test all the reading mode navigation commands at one time, but there would be one additional assertion for the commands that trigger mode switching.
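Option 3 might look something like this as data; the `appliesTo` and `commands` fields are hypothetical names for the screen-reader and command conditions, not part of any existing test format:

```javascript
// Hypothetical assertion records: `appliesTo` restricts an assertion to
// particular screen readers, `commands` to particular key commands.
// A missing field means "applies everywhere".
const assertions = [
  { text: "The role 'combobox' is spoken" },
  {
    text: "Mode switches from reading to interaction",
    appliesTo: ["jaws", "nvda"],
    commands: ["tab"],
  },
];

// Compute the assertion list the tester sees for one AT + command pair.
function assertionsFor(screenReader, command) {
  return assertions.filter(
    (a) =>
      (!a.appliesTo || a.appliesTo.includes(screenReader)) &&
      (!a.commands || a.commands.includes(command))
  );
}

console.log(assertionsFor("jaws", "tab").length); // → 2
console.log(assertionsFor("voiceover", "tab").length); // → 1
console.log(assertionsFor("jaws", "down-arrow").length); // → 1
```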

From the perspective of running tests, I like option 3. I think it is most friendly for the tester. And, it might be more friendly for the test writer, depending on how we construct test writing. However, it does seem like it would be more complex to implement in the harness.

From the perspective of harness simplicity, option 1 might be best. It will create more work for testers. However, at least the purpose of the duplicate work is very clear.

Option 2 reduces the amount of work for testers but seems like it adds complexity when compared to option 1 without much benefit.

Perhaps there are still more approaches worth considering.

In any case, this affects our harness because the labels on buttons cannot be just about output.

Capturing defects unrelated to assertions

As described above, there can be a wide variety of things that go wrong that happen as a result of commands we are testing but are not described by saying one of our assertions failed. We have to capture these types of defects in our data because they can destroy screen reader interoperability even if some or all the output is correct.

I think the easiest way to capture this is to change our shortcut radio options. Here is what they might look like for checkbox for the JAWS insert+up command:

  1. All good: Insert+Up conveyed role, name, and state, and there were no unexpected or undesirable behaviors.
  2. Some failure: Insert+Up either failed to meet an assertion or an unexpected or undesirable behavior occurred.

Then in the results table, there is a row for each assertion as we have now with columns for Good Support, No Support, Incorrect Support. I do not think we need the additional tester notes field on each row. Naming the columns this way enables the assertions to be about behaviors other than spoken output. Although for some behaviors, good support and no support might be enough.

Then below the table is another radio group named "Other failures" with two options:

  1. No unexpected or undesirable behaviors were observed.
  2. Some other unexpected or undesirable behavior was observed.
    If option 2 is chosen, an edit field named "Description of other failure for insert+Up" appears.
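A sketch of how that conditional requirement could be validated (the field names and option values here are mine):

```javascript
// Hypothetical validation for the "Other failures" radio group: the
// description field only appears, and is only required, when option 2
// ("some other unexpected or undesirable behavior was observed") is chosen.
function validateOtherFailures(record) {
  if (record.otherFailures === "none") return { ok: true };
  if (record.otherFailures === "observed") {
    if (record.description && record.description.trim()) return { ok: true };
    return { ok: false, error: "Description of other failure is required" };
  }
  return { ok: false, error: "Choose one of the two options" };
}

console.log(validateOtherFailures({ otherFailures: "none" }).ok); // → true
console.log(validateOtherFailures({ otherFailures: "observed" }).ok); // → false
console.log(
  validateOtherFailures({ otherFailures: "observed", description: "Unexpected cursor jump" }).ok
); // → true
```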

@mfairchild365
Contributor

Good finds, @mcking65. Some thoughts.

  • For automatic mode switching, a 4th option could be to make it a conditional assertion so that it only applies to screen readers (or AT) that support automatic mode switching. It may even be possible to define programmatically which AT the assertion applies to, so that the tester doesn't have to guess.
  • Automatic mode switching makes me wonder if we should assign a severity level to failures. If automatic mode switching does not occur, does it result in an inconvenience or does it outright prevent understanding or operation?
  • For capturing defects unrelated to assertions: I'm tempted to enumerate the ways that this can happen with the possibility of an 'other' field. The programmer/analyst in me wants to be able to find all the commands that resulted in duplicate output vs random output, and screen reader devs might wish to do the same. This isn't possible (or very hard) if testers are using their own language to describe errors.

@spectranaut
Contributor Author

In today's CG meeting, we agreed with @mfairchild365 about having a list of failures.

The list we have so far is:

  • unexpected reading cursor jump
  • screen reader crashed
  • browser crashed
  • screen reader includes excessive and/or irrelevant information after key command

@mcking65 can you tell me all the key commands for checkbox (as they currently exist in resources/at-commands.mjs) that will trigger a change in the screen reader modes?

@spectranaut
Contributor Author

You can see an example of an additional assertion for the screen reader changing modes after "TAB" in the read-checkbox.html test file now.

@mcking65
Contributor

@mfairchild365 wrote:

Automatic mode switching makes me wonder if we should assign a severity level to failures. If automatic mode switching does not occur, does it result in an inconvenience or does it outright prevent understanding or operation?

Yep, actually assertions all need to have a must, should, or nice-to-have categorization. That is part of our original plan. I've added that to my combobox testing plan.

BTW, mode switching is not a blocker for people who know how to manually control mode. Unfortunately, that is not the typical user. So, for some people it can be a blocker when mode switches don't work right.

@mcking65
Contributor

@spectranaut wrote:

In today CG meeting we agreed with @mfairchild365 about having a list of failures.

The list we have so far is:

  • unexpected reading cursor jump
  • screen reader crashed
  • browser crashed
  • screen reader includes excessive and/or irrelevant information after key command

Here is how I would refine the wording and content of this list:

  • Output is excessively verbose, e.g., includes significant amounts of redundant or irrelevant speech
  • Reading cursor position changed in an unexpected manner
  • Screen reader became extremely sluggish
  • Screen reader crashed
  • Browser crashed
  • Other

@mcking65
Contributor

@spectranaut wrote:

You can see an example of an additional assertion for the screen reading changing modes after "TAB" in read-checkbox.html test file now.

Mode switching does not happen for checkboxes. However, it does for comboboxes, and I have included it in the test plan I am about to post.

@mcking65
Contributor

I have written 9 tests for combobox so far, but only in Excel. I will export each test to a separate CSV file, export a command list in CSV format as well, and put it all in the prototype branch in the combobox-autocomplete-both directory.

I have no idea how many tests there will be for combobox. Perhaps in the neighborhood of 25, but it could be many more.

Going through this process, I am realizing several things that we need to do beyond the prototype:

  1. We really do need an efficient way to write tests. This could support gathering everything needed for the test plan, including a description of necessary steps for the setup script. This is the piece that needs to be done by assistive technology/web accessibility experts.
  2. All test plans need to be reviewed, and we need an easy way to review the plans and manage those reviews. This needs to be a way that presents the plan in a very easy to read format. It can't be the HTML file, and it really ought not be the harness.

Now for changes to the harness and its inputs.

Specifying to which AT a test applies

I added a field in the header of each test to specify to which AT the test applies. For now, I creatively called it "Applies To". I imagine this field having values like:

  • Screen Readers: means any screen reader on any platform
  • Desktop Screen Reader: applies to any desktop screen reader, but not mobile or touch-based screen readers.
  • Touch Screen Reader: Applies to only touch-based screen readers
  • JAWS[, NVDA[, ... ]]: comma separated list of any supported screen readers

Obviously, the above means that we need to be able to assign tags or categories to each supported assistive technology to identify the buckets into which they fall.

Instructions based on mode

In some of the tests, I used different wording for the interaction mode instructions and the reading mode instructions. I think we need to be able to do this.

Command Listener

In the list of commands, I have 4 columns:

  • Task
  • Command Listener
  • Mode
  • Command

If the command is not related to the assistive technology, e.g., it is processed by the browser/web page, I specified the listener as the browser. Otherwise, it is a specific screen reader.

Other refinements

I put "navigate to the combobox" and "read the combobox" into separate tests. This is because the reading commands are not navigation commands.

In addition, I made separate commands for navigating with mode switching, because some navigation commands switch modes, and some do not. And, this can vary among screen readers. I think the specificity of our commands list is going to work to our advantage here.

For now, this is all in the attached Excel.
combobox-autocomplete-both.xlsx

@mcking65
Contributor

mcking65 commented Nov 26, 2019

@spectranaut, Here are some things that we need to address in the harness. I will start with some easy stuff to clean up wording and make it easier to understand.

It would help with accessibility to have a title tag in the tests, e.g.:

<title>Navigating to editable combobox switches mode from reading to interaction. | ARIA-AT Test Runner</title>

It feels like clutter to have an H1 with "ARIA-AT Test Runner". I am assuming this is a sort of substitute for branding to tell the user that this page is part of the aria-at project. I'm OK with leaving it for now if others think it is important. Otherwise, perhaps we nix it.

I think we should have only 1 H1, and it should only include the name of the test. So, for instance, instead of:

<h1>Current test (1 of 3): Navigating to editable combobox switches mode from reading to interaction.</h1>

Pull out the info about the test number and put it after the heading, e.g.:

<h1>Navigating to editable combobox switches mode from reading to interaction.</h1>
<p>Test 1 of 3</p>

I still find the "Testing behavior 1 of 1" or "Testing behavior 1 of 2" pretty confusing. This is a bit bigger issue to tackle, so saving that for later.

The accessible name on the list of commands is "AT Controls". It should simply be "Commands". Sometimes those commands will not be AT commands.

Under success criteria, Change the language:

"For each command listed above, the following assertions are met:"

That is more clear than "every possible command" and, I want to remove the word "must" because we will test more than must-have assertions.

Change the label on the output edit box from:

Relevant speech output after command (required):

to:

SCREEN_READER_NAME output after COMMAND_NAME (required):

e.g.:

JAWS output after Tab and Shift+Tab (required):

When I tab to those edit fields, I need to hear for which command the field pertains, and the word "Relevant" isn't really necessary.

The first shortcut radio for success is currently named:

 All assertions have been meet after Tab / Shift+Tab and there was no additional unexpected or undesirable behaviors.

It incorrectly uses the word "meet" instead of "met". But, we can make the wording simpler by changing tense:

 All assertions met after Tab / Shift+Tab without additional unexpected or undesirable behaviors.

Similarly, change:

Some assertions have not been met after Tab / Shift+Tab or there was an additional unexpected or undesirable behavior.

to:

Some assertions not met after Tab / Shift+Tab or additional unexpected or undesirable behavior occurred.

Now, something very important for screen reader users who tab through the form, we need the radios in the table to include the assertion in the name, e.g., "Good Support for ASSERTION_TEXT"

For the question: "Was there additional undesirable behavior?", we need a yes/no radio group, and then, if yes is chosen, present or enable the select. However, the select has to be multi-select. I don't know if HTML multi-selects are easily accessible. We'll need to test that out. And, finally, if "other" is selected, an additional edit field is necessary to catch a description of the other behavior.

I am really confused by the buttons:

  • Finish test
  • Redo Test
  • Skip Test
  • Submit Result

Skip and submit are easy to understand. I don't understand finish. And, re-do seems odd. Is that more like clear all answers?

I noticed a couple of bugs.

  1. Refreshing the browser does not refresh the current page; it returns you to the runner page where you choose tests to run.
  2. Checking the all good radio and then changing selection to the some failed radio leaves all the good radios in the table checked. I guess this could be a feature, but it seems dangerous. If there are failures, I think we want to force users to answer for each assertion.

@mcking65
Contributor

@spectranaut, Per our discussion about labeling radios in the table, I labeled them in the first two rows of the table in this file table.html.txt.

The two rows use two different ways of including the words "for assertion" in the middle of the label. Use which ever you prefer. The first relies on a class (off-screen) that I did not include in the markup. The second uses aria-label.

Note that in both cases I moved the assertion ID from the td to the div to avoid including the words in the element with the required class.

@spectranaut
Contributor Author

I believe everything has been addressed and incorporated into the test design, so closing this issue!
