Explainer: WebDriver Extension for Accessible Nodes, etc. (potential solution for #197) #203
Comments
Thanks for the detail here, @cookiecrook. This is a really solid start.
Would this return an axId or an accessible node snapshot?
Can you explain the desire to return multiple properties at once here? This is very different to elements, for example, where WebDriver only returns one thing at a time. For example, to get an element attribute, you use /session/{session id}/element/{element id}/attribute/{name}, which only gets a single attribute.
Obviously, returning multiple things in one call is better for performance. However, it does add some complexity in the spec; e.g. we have to work out what set of things to return as per your discussion section. If we return a single thing at a time, we can avoid some of that complexity. Having different methods for every single thing would be tedious for extensibility. But perhaps we could have a simple attribute getter with a defined set of attribute keys we can expand over time? For example, /session/{session id}/accessibility/node/{axId}/attribute/{name}, where {name} could be "label", "role", "pressed", etc.
Further down the line, some thought needs to be given to how axId is specified. Some engines have simple globally unique 32 bit numeric ids for accessible nodes. I think Chromium does? I'm not sure about WebKit. However, Gecko does not, instead having a 64 bit unique id which is only guaranteed to be unique within the document, not across documents. So, some care needs to be taken in terms of what assumptions are made. I see WebDriver specifies that a node id is created as "a new globally unique string" and it also specifies that there is a "node id map". We might need to do something similar for axId. Or perhaps we can just specify that the id is globally unique but opaque and implementation defined? I'm not sure if that's reasonable.
Bikeshedding: Maybe this is just me, but I don't love the name event or notification for things we perform on an accessible node. I tend to think of events and notifications as things that a node fires. I would have suggested "action", but that gets conflated with default or custom actions. Maybe "interaction"? |
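The single-attribute getter suggested in the comment above could be sketched roughly as follows. This is only an illustration: the route is the one floated in the comment, the key set contains just the three names it mentions, and the helper name `ax_attribute_path` is made up for this sketch. None of it is specified anywhere.

```python
# Hypothetical sketch: the route and the key set are draft ideas from the
# comment above, not part of any WebDriver specification.
ALLOWED_ATTRIBUTES = frozenset({"label", "role", "pressed"})  # could grow over time


def ax_attribute_path(session_id: str, ax_id: str, name: str) -> str:
    """Build the per-attribute route for one accessible node."""
    if name not in ALLOWED_ATTRIBUTES:
        raise ValueError(f"unknown accessibility attribute: {name!r}")
    return f"/session/{session_id}/accessibility/node/{ax_id}/attribute/{name}"


print(ax_attribute_path("s1", "ax-42", "role"))
# prints: /session/s1/accessibility/node/ax-42/attribute/role
```

Extending the vocabulary would then only mean adding a key to the allowed set rather than defining a new method.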
@jcsteh wrote:
That's an open question. Initially I thought either "ax node from element" or "ax node from id" should return the same snapshot object, but I don't have a strong preference for or against making the additional call...
Mainly to avoid tedium and perf hits... In the tree walker use case, for example, making each attribute/property separate calls could turn one call per element into dozens or hundreds per. But I acknowledge it could work either way.
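To put rough numbers on that concern, here is a back-of-the-envelope sketch. The node and attribute counts are illustrative assumptions, not measurements, and the helper ignores any extra calls needed to discover children.

```python
def round_trips(node_count: int, attributes_per_node: int, snapshot: bool) -> int:
    """Count WebDriver calls needed to read every attribute of every node."""
    # A snapshot returns all attributes of a node in one call;
    # individual getters need one call per attribute.
    calls_per_node = 1 if snapshot else attributes_per_node
    return node_count * calls_per_node


# Illustrative figures only: a 200-node tree with 30 attributes per node.
print(round_trips(200, 30, snapshot=True))   # prints: 200
print(round_trips(200, 30, snapshot=False))  # prints: 6000
```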
The Gecko GUID question seems worthy of researching sooner rather than later. Obviously the spec should be limited to features anticipated to be readily implementable in all engines. I agree with all your other points, and I acknowledge those are open questions too. |
It probably doesn't matter that much. If these calls return a snapshot, the snapshot should include the axId. The answer to this question will depend heavily on whether we go with snapshots or individual getters.
That's certainly true. This seems to be something that was considered acceptable for DOM elements in WebDriver and it'd be nice to have a similar interface for simplicity/consistency. On the flip side, we don't need to use WebDriver to walk the DOM tree, whereas we have no choice for the accessibility tree, so I realise the use case is quite different. |
If an opaque, implementation defined, globally unique id string is acceptable, I think this should be implementable in all engines. That said, when I first raised this, I didn't realise that a WebDriver session had a "current browsing context". As I understand it, a browsing context is associated with a document. If the accessibility methods use this browsing context, that means we only need to look at the document associated with the current browsing context, not all documents everywhere. That does make this a lot more feasible. I guess we probably still want the id string to be globally unique though, even across browsing contexts? |
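One way an engine whose native ids are only unique per document could still hand out opaque, globally unique strings is a map similar in spirit to WebDriver's node id map. The sketch below is an assumption about how that might look; the class name `AxIdMap` and its methods are hypothetical and not taken from any engine or spec.

```python
import uuid
from typing import Dict, Optional, Tuple

NativeKey = Tuple[str, int]  # (document identifier, engine-native node id)


class AxIdMap:
    """Hypothetical map from per-document native ids to opaque global ids."""

    def __init__(self) -> None:
        self._by_native: Dict[NativeKey, str] = {}
        self._by_opaque: Dict[str, NativeKey] = {}

    def opaque_id(self, document_id: str, native_id: int) -> str:
        """Return a stable opaque id, minting one on first use."""
        key = (document_id, native_id)
        if key not in self._by_native:
            opaque = str(uuid.uuid4())
            self._by_native[key] = opaque
            self._by_opaque[opaque] = key
        return self._by_native[key]

    def resolve(self, opaque: str) -> Optional[NativeKey]:
        """Look up the native key, or None if the id is unknown."""
        return self._by_opaque.get(opaque)


ids = AxIdMap()
a = ids.opaque_id("doc-A", 7)
b = ids.opaque_id("doc-B", 7)  # same native id, different document
print(a != b, ids.opaque_id("doc-A", 7) == a)  # prints: True True
```

With a map like this the client never sees the native id, so the spec would not need to assume anything about its width or scope.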
Some scattered (sorry, it's that kind of day) thoughts: It might be helpful to guide the discussion if we could document some of the types of things we'd like to be able to test in WPTs using these APIs. Specifically, I think questions like those @jcsteh is asking around returning a property bag vs. returning discrete properties (as is done for Element properties), and questions around including/excluding ignored nodes when tree walking, might be easier to answer with a solid understanding of what we're going to do with the output.
Some of this makes me wonder whether we'd want to require accessibility to be "enabled" before an accessiblenode can be retrieved, so that we can ensure that the accessibility IDs are consistent (AFAICT currently Chrome at least implements computedname/computedrole on top of the CDP). I guess all the property names will be based on the ARIA names, as the best platform-independent vocabulary we have available? |
Possibly need a way to register for outward notifications too… e.g. When a live region changes. |
Live region changes in particular might be tricky to standardise. Each API does them differently, which I suspect means core browser implementations vary wildly. Notably, IAccessible2 and ATK don't have specific live region events, but instead rely on generalised text inserted/removed events and the client checking live region properties on the object. This is not to suggest that registering for outgoing events isn't something we need. It very probably is. However, I think it might take longer to iron out the details there and it might not make sense to block this work on standardising live region events. Is there some other outward notification we can start with to get the core concept working? Focus or selection perhaps? |
Spoke with @OrKoN today who mentioned a related use case for accessibility in webdriver... possibly w3c/webdriver-bidi#443 |
TPAC-related updates summarized in #197 (comment) |
Most relevant from above linked notes:
So for the sake of near-term interop, the minimum viable product could focus on those that could ship near-term in all three engines:
But not an outgoing notification snarfer, for example. [Update Nov 7: As an example, this likely means that outgoing ARIA Live Region notifications would not be testable in WebDriver Classic.] |
Actually we could even remove (3. Trigger/Synthesize Accessibility Event/Notification) from the MVP, but it seems achievable and useful, so I'm keeping it in the short list for now. |
Potential error codes:
|
Rather than muddy the problem issue #197 with a specific proposed solution, I'm posting this as a standalone issue. Ideally we could turn this Issue into an Explainer and eventually a Spec, but the goal is to get wider approval of the idea first, during a few meetings at TPAC 2023 Sept 11–15 in Spain.
Note: I will be editing this problem description, so expect changes.
Note on WebDriver-BiDi
This explainer does not use BiDi examples, but we don't anticipate problems converting to the other format and welcome accessibility additions to Classic and/or BiDi. It's been suggested that this be added to the BiDi roadmap.
Current State of Cross-Browser Web Accessibility Testing
Existing WebDriver accessibility testing methods go through DOM Element, to AX Element, then to its label or role.
get a string value from the backing accessibility object (if it exists) of a given DOM element
In 2023, we added over 1000 automated accessibility tests to the WPT Interop 2023 Accessibility Investigation using the above two WebDriver methods, but there is so much more to test, and no way available to test it in WPT/WebDriver.
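For reference, the two existing commands are WebDriver's Get Computed Role and Get Computed Label, both GET requests scoped to a DOM element. A small sketch of their endpoint paths follows; the helper names are made up for illustration.

```python
def computed_role_path(session_id: str, element_id: str) -> str:
    """Path for WebDriver's Get Computed Role command (GET)."""
    return f"/session/{session_id}/element/{element_id}/computedrole"


def computed_label_path(session_id: str, element_id: str) -> str:
    """Path for WebDriver's Get Computed Label command (GET)."""
    return f"/session/{session_id}/element/{element_id}/computedlabel"


print(computed_role_path("s1", "e9"))   # prints: /session/s1/element/e9/computedrole
print(computed_label_path("s1", "e9"))  # prints: /session/s1/element/e9/computedlabel
```

Both start from an element id, which is why anything in the accessibility tree without a backing DOM element is currently out of reach.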
Potential Changes
See also: #197
…a new WebDriver accessibility extension might look something like this:
1. Way to access the backing "accessible node" of a DOM element (if one exists).
Note
Only one of the following two accessors 👇 is needed, not both
get accessible node from its mainstream DOM element (if one exists)
EITHER a new method in a new accessibility-specific webdriver extension.
OR a new method on the existing webdriver element interface.
Note
Only one of the preceding two accessors 👆 is needed, not both. Currently prototyping option 2.
2. Way to access an "accessible node" by its WebDriver ID directly (e.g. you may receive this ID from a parent/child cross-reference).
Regardless of whether an accessible node is associated with a DOM element (some are not), once you already have the accessible node id:
get accessible node by its WebDriver ID
3. Way to Trigger an Accessibility Event/Notification.
We also need a way to trigger a notification on the accessibility object.
Note
synthesizeevent is just a draft name. Very open to change on every aspect of this.
Common Events
Click/Press
where the minimum payload is the notification type (e.g. a screen reader “click” would fire):
Explanation: “AX Press” almost always results in a DOM “click” but the event object on a “press on AX object” event can end up very different from a “click on DOM element.” For example:
AT Focus (pulls keyboard focus if the element is focusable)
AT Focus should be verifiable, because it will pull standard keyboard focus along with it if the AT-focused element is keyboard focusable.
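As a rough illustration of the "minimum payload is the notification type" idea, a request body for the draft synthesizeevent command might be built like this. Everything here is hypothetical: the `type` field name, the set of type strings, and the helper name are assumptions for the sketch, not proposed names.

```python
import json

# Hypothetical type names, loosely based on the events discussed in this explainer.
KNOWN_TYPES = ("press", "focus", "action", "showMenu")


def synthesize_event_body(event_type: str, **extra: str) -> str:
    """Serialize a minimal request body: the notification type plus optional fields."""
    if event_type not in KNOWN_TYPES:
        raise ValueError(f"unknown notification type: {event_type!r}")
    return json.dumps({"type": event_type, **extra}, sort_keys=True)


print(synthesize_event_body("press"))
# prints: {"type": "press"}
print(synthesize_event_body("action", name="reply"))
# prints: {"name": "reply", "type": "action"}
```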
Other Events/Notifications
Trigger “Action” (lower priority for v1/MVP)
It could also be used for non-default “actions” (e.g. trigger the associated “reply” action):
This one 👆 has native precedent, but the proposed Web API hasn’t yet shipped, so it may be lower priority.
Scroll into view a.k.a. “scroll to visible” (lower priority for v1/MVP)
This might not be needed as it’s usually called downstream from focus, rather than directly from AT.
Show Menu (lower priority for v1/MVP)
Show menu (VO and other AT’s equivalent to show the “right-click” menu). This sometimes results in a different AT-vs-mainstream behavior when a web site has overridden the “right-click” mouse behavior.
I don’t know if “showMenu” would be interoperable on other systems, but it’s in WebKit because Mac VO and other AT support it. I assume Windows has something similar.
4. Test-Only (WebDriver-only for now?) Interface for accessible node.
Return value for the accessibleNode would be a static snapshot of the element at the time of the request:
Example return object for accessible node getter.
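A minimal sketch of what "static snapshot" could mean in practice, assuming hypothetical field names (`ax_id`, `role`, `label`, `parent`, `children`); the actual property set is an open question in this thread. The object is immutable, so it cannot drift from the state captured at request time.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AccessibleNodeSnapshot:
    """Hypothetical immutable snapshot of one accessible node."""

    ax_id: str                       # opaque id, included so the node can be re-queried
    role: str
    label: str
    parent: Optional[str] = None     # ax_id of the parent, if any
    children: Tuple[str, ...] = ()   # ax_ids of child nodes


snapshot = AccessibleNodeSnapshot(
    ax_id="ax-1", role="button", label="Reply", children=("ax-2",)
)
payload = asdict(snapshot)
print(payload["role"], payload["label"])  # prints: button Reply

try:
    snapshot.label = "Changed"  # type: ignore[misc]
except FrozenInstanceError:
    print("snapshot is read-only")  # prints: snapshot is read-only
```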
Discussion Points
Getter Interface for ~“accessibleNode”
There’s a balance between whether to return a “limited scope of known things to query” or to “over-expose” as much as possible about the backing accessible object… Some relationships or properties are costly or slow to return, so we’ll probably need to start with a subset of the things that all implementations can return reasonably quickly.
Perhaps multiple getters: a default set of the easy ones (role, label, required, checked, yadda, yadda) and then we don’t include the ones with a significant perf cost or other complications unless requested specifically.
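The default-versus-on-request split could be sketched as a simple key selector. The mode names and the "costly" keys below are placeholders chosen for illustration; only role, label, required and checked come from the text above.

```python
from typing import List, Optional, Sequence

DEFAULT_KEYS = ("role", "label", "required", "checked")  # the "easy ones"
COSTLY_KEYS = ("parent", "children")  # hypothetical examples of slower properties


def select_keys(mode: str = "default", requested: Optional[Sequence[str]] = None) -> List[str]:
    """Pick which properties a getter returns for a given mode."""
    all_keys = DEFAULT_KEYS + COSTLY_KEYS
    if mode == "default":
        return list(DEFAULT_KEYS)
    if mode == "full":
        return list(all_keys)
    if mode == "partial":
        wanted = list(requested or [])
        unknown = [key for key in wanted if key not in all_keys]
        if unknown:
            raise ValueError(f"unknown keys: {unknown}")
        return wanted
    raise ValueError(f"unknown mode: {mode!r}")


print(select_keys())                               # prints: ['role', 'label', 'required', 'checked']
print(select_keys("partial", ["role", "children"]))  # prints: ['role', 'children']
```

Costly properties are only computed when a caller asks for them by name or requests everything.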
/session/{sID}/accessibility/~ax_element/{axID} for defaults
/session/{sID}/accessibility/~ax_element/{axID}/~full for everything
/session/{sID}/accessibility/~ax_element/{axID}/~partial for a specific set, with an array of keys in the post payload
Note
Note that ~ above indicates TBD draft name proposals... Open to changes, of course.
Object/Node Persistence
API should be clear that Accessibility Objects/Nodes are not expected to persist once removed from the accessibility tree. Though this may be possible in some implementations, it is unlikely to be readily achievable in all implementations, so: