
fix(web): segmentation event + Promise resolution ordering 🐵 #7440

Closed
jahorton wants to merge 8 commits into feature-gestures from fix/web/segment-path-interfacing

Conversation

jahorton
Contributor

@jahorton jahorton commented Oct 12, 2022

A part of the epic #7324 line, and the first member PR of "phase 3" for gesture implementation... at least "in spirit."

It's important to make sure that the segmentation engine's events and Promises provide a usable interface with consistent ordering to the upcoming gesture-synthesis layer. While doing very early prototyping for said layer, I came to realize that there were a few issues that made their use - especially in unit tests - far more difficult than would be ideal.

To clarify the issues and ensure they're properly addressed, this changeset also adds a unit test heavily based on said "very early prototyping", the start of which may be seen as follows:

let segmentEventPromise = () => new Promise((resolve) => touchpath.once('segmentation', resolve));
let promiseTail = segmentEventPromise().then(async (segments) => {
  let [startSegment, secondSegment] = segments;
  // ...

Since almost all of the Segments are produced and resolve asynchronously, using a state-machine based model to detect gestures needs to process each segment as it arrives, one at a time. However, the start Segment and the first pending Segment are produced at the same time, synchronously; this meant that specialized handling would be needed to get the event for both Segments.

Rather than requiring significantly complex handling external to TrackedPath and the segmentation engine, I believe the simplest solution is simply to provide both within a single 'raise' of the 'segmentation' event, hence the change to the event's signature. This may slightly complicate the synchronous signatures at play, but it prevents much greater complexity for interfacing with the asynchronous aspects of the design.
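For illustration, a minimal sketch of the emitting side under the new signature; the names shown here (`startSegment`, `pendingSegment`, the internal `emit` call) are placeholders, not the engine's actual internals:

// On path start: the completed 'start' segment and the first pending segment
// arrive together, within a single raise of the event.
this.emit('segmentation', [startSegment, pendingSegment]);

// Each later raise delivers just the one newly-pending segment:
this.emit('segmentation', [nextSegment]);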

It is then possible to handle each segment in sequence via async / await as follows:

let segmentEventPromise = () => new Promise((resolve) => touchpath.once('segmentation', resolve));

// ...

// Use the segment at index 1 for the initial `segmentation` event; index 0 otherwise.
let segment: Segment; // corresponds to the current "link" in the Promise "chain"

// State-machine decisions first become possible once the segment type is 'recognized'.
let segmentType = await segment.whenRecognized;

// Must be set _before_ the await below; the resolution signal and the next segment arrive
// synchronously (in the same 'message' of the main message loop).
//
// Skip if `segmentType` == 'end'; the chain ends in that case.
let nextPromise = segmentEventPromise();

if(transitionOnlyDependsOnSegmentType) {
  // Do state-machine transitions based on the new segment's type if `segmentType` is all that matters.
} else { // if the state-machine transition depends on a property only determined at resolution time:
  // The current Segment will always fully resolve before its successor is emitted.
  await segment.whenResolved;

  // Do state-machine transition.
}

if(segmentType != 'end') {
  [segment] = await nextPromise;  // Shifts to the next link in the chain.
  // Conceptually - loop back to the `let segment: Segment;` declaration above.
}
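Expressed as an explicit loop instead of the conceptual goto, a minimal sketch follows; the wrapper function itself is hypothetical, while `segmentEventPromise` and the `Segment` semantics are as above:

async function walkSegmentChain() {
  let segments = await segmentEventPromise();
  // Index 1 on the initial event - skipping past the already-complete start segment.
  let segment = segments.length > 1 ? segments[1] : segments[0];

  while(true) {
    const segmentType = await segment.whenRecognized;

    // Subscribe before awaiting resolution, per the ordering note above.
    const nextPromise = segmentType != 'end' ? segmentEventPromise() : null;

    // (Awaiting resolution unconditionally here, for simplicity.)
    await segment.whenResolved;

    // ... state-machine transition here ...

    if(!nextPromise) {
      break;
    }
    [segment] = await nextPromise;
  }
}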

@keymanapp-test-bot skip

@jahorton jahorton added this to the A16S12 milestone Oct 12, 2022
this.fakeClock.restore();
});

it('synthetic sequence: start -> hold -> move -> end', async function() {
jahorton
Contributor Author

This (new) unit test is derived from a first-pass, extremely rough prototype of gesture-synthesis control-flow handling; its needs motivated most of the other changes in this PR.

 /**
  * Designed to facilitate creation of 'synthetic' touchpath sample sequences for use
  * in unit tests.
  */
-export class TouchpathTurtle {
+export class TouchpathTurtle extends EventEmitter<EventMap> {
@jahorton jahorton Oct 12, 2022
Contributor Author

This allows the class to serve as a source of synthetic touch events for non-segmentation unit tests. It'll definitely come in handy for easily-maintainable synthetic touchpath sequences for unit tests, as seen with the new unit test "synthetic sequence: start -> hold -> move -> end":

// START: TOUCHPATH SIMULATION
// Move NE 3 pixels over 66ms, sampling every 33ms.
turtle.move(45, 3, 66, 33);
// Wait for 20 ms
turtle.wait(20, 20);
// Move S 10 pixels over 100ms, sampling every 16ms.
turtle.move(90, 10, 100, 16);
// END: TOUCHPATH SIMULATION

With comments like these, it's very easy to see and intuit exactly what sort of touchpath this is, and it's also very easy to modify the touchpath if needed in the future.
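As a rough sketch of how a test might wire the turtle into a path under test - the 'sample' event name and the `extend` call here are illustrative assumptions, not necessarily the final API:

// Hypothetical wiring: assumes the turtle re-emits each synthesized point
// via a 'sample' event and that the tracked path accepts them one at a time.
turtle.on('sample', (sample) => touchpath.extend(sample));

turtle.move(45, 3, 66, 33);   // NE, 3px over 66ms, sampled every 33ms
turtle.wait(20, 20);
turtle.move(90, 10, 100, 16); // as in the simulation above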

@jahorton jahorton changed the title fix(web): segmentation event + Promise resolution ordering fix(web): segmentation event + Promise resolution ordering 🐵 Oct 12, 2022
@jahorton jahorton marked this pull request as ready for review October 12, 2022 05:38
@jahorton jahorton mentioned this pull request Oct 12, 2022
1 task
@jahorton
Contributor Author

... right, I didn't actually double-check for browser-based unit-test breakage; I only focused on the headless-mode stuff. The failed test run for Web is nigh-certainly a "mea culpa."

@mcdurdin mcdurdin left a comment
Member

I do wonder if we need to be using promises like this. Can't we just check the gesture state machine on (a) a touch event, or (b) a timeout?

The timeout is needed only for longpress, so a setTimeout would be run on touchstart. Any macro movements from the starting point would cancel the timeout, which then cancels the longpress, and a new timeout can be started for a potentially different longpress to cater for roaming touch.

If we abandon the idea of abstracting gestures, as nice as that may seem conceptually, I think the model becomes far, far simpler:

  1. Tap -- on touch release, if the end point is on the same key as the start point (and longpress hasn't fired) that's a tap. (multitap assessment, see point 3)

  2. Flick -- on touch release, assess if the simple path from start to end point (ignoring transit points!) matches a cardinal or intercardinal direction and is long enough -- for this we need only the two points in question. Now, if we decide we want to be fancier on this, we still only need a single function call on the array of captured points between touch start and touch release, which returns the direction of the flick. Speed of flick should not be considered -- I think we are going to run into usability issues if we make flicks dependent on speed. Again, a longpress timeout cancels any chance of flick.

  3. Multitap -- on touch release, if the same key is tapped, and if the duration between the start of the previous tap, and the end of the second tap is within the threshold, match. (Or perhaps duration between end of previous tap and start of second tap, tbd.)

  4. Longpress -- once the longpress timeout fires, we enter longpress mode, and all other gestures become irrelevant. Handle as we already do.

  5. Roaming -- in initial implementation, I suggest these are only applicable if the keyboard doesn't have flicks. We may be able to tweak this in the future with distance thresholds -- not speed thresholds. Otherwise, these should work in the same way as they currently do. Once a roam has started, flicks and multitaps will not meet their starting conditions, so will naturally be ignored.

Is there any reason why this very simple model won't work? I may not have covered 100% of the conditions, but there are no abstractions needed, and by removing the abstractions, we also remove most of the complexity.
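To make the shape of it concrete, a rough sketch of the release-time classification - all names and thresholds here are illustrative, and longpress is handled separately by its timeout:

interface Point { x: number; y: number; key: string; }

type SimpleGesture = 'tap' | 'flick' | 'multitap' | 'none';

// Illustrative thresholds only.
const FLICK_MIN_DIST = 15;    // px
const MULTITAP_WINDOW = 300;  // ms

function classifyOnRelease(
  start: Point, end: Point, longpressFired: boolean,
  prevTapKey: string | null, msSincePrevTapStart: number
): SimpleGesture {
  if(longpressFired) {
    return 'none'; // longpress mode has already consumed the touch
  }

  const dist = Math.hypot(end.x - start.x, end.y - start.y);
  if(dist >= FLICK_MIN_DIST) {
    // Snap the start->end vector (transit points ignored!) to the nearest
    // cardinal or intercardinal direction; that's the flick.
    return 'flick';
  }

  if(end.key == start.key) {
    if(prevTapKey == start.key && msSincePrevTapStart <= MULTITAP_WINDOW) {
      return 'multitap';
    }
    return 'tap';
  }

  return 'none'; // released off the starting key without a flick: a roam's end
}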


Note that visual feedback on flicks and multitaps would certainly be beneficial -- for example, see what happens on an iPad keyboard when you flick down to get digits and punctuation. This can then help give feedback to the user with roaming touch if the distance of the gesture exceeds the flick threshold, "snapping" off the flick.

This visual feedback can still be catered for in this simple model, on touchmove events.

@jahorton
Contributor Author

jahorton commented Oct 13, 2022

I do wonder if we need to be using promises like this. Can't we just check the gesture state machine on (a) a touch event, or (b) a timeout?

Even if the rest were simplified... (b) is one aspect of what the Promises are for. It's not difficult to wrap a timeout call with a Promise. At the most basic level:

export const timedPromise = (func, time) => {
  return new Promise<void>((resolve, reject) => {
    setTimeout(() => {
      try {
        func();
        resolve();
      } catch (err) {
        reject(err);
      }
    }, time);
  });
}

And really, there's no need for that func param, now that I look at it; that work can safely be placed in a .then clause. A bit more work on the above code would yield support for timeout cancellation as well.
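For instance, a minimal sketch of a cancellable variant - the class name and cancellation semantics here are just illustrative:

export class TimedPromise {
  public readonly promise: Promise<void>;
  private timerId?: ReturnType<typeof setTimeout>;
  private rejectFn?: (err: Error) => void;

  constructor(time: number) {
    this.promise = new Promise<void>((resolve, reject) => {
      this.rejectFn = reject;
      this.timerId = setTimeout(resolve, time);
    });
  }

  // Prevents resolution; any pending .then clauses will never run.
  cancel() {
    clearTimeout(this.timerId);
    this.rejectFn?.(new Error('timeout cancelled'));
  }
}

Usage would look like const lpTimer = new TimedPromise(500); lpTimer.promise.then(/* enter longpress mode */), with lpTimer.cancel() called upon any macro movement.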

@mcdurdin mcdurdin modified the milestones: A16S12, A16S13 Oct 16, 2022
@mcdurdin
Member

Given our current discussions, I hope you don't mind if I return this to draft to take it out of the review queue temporarily?

@mcdurdin mcdurdin marked this pull request as draft October 19, 2022 19:29
@jahorton jahorton modified the milestones: A16S13, 17.0 Oct 21, 2022
@jahorton
Contributor Author

Took some work, but I've successfully revived the branch; it's now based on the modularized form of the gesture engine.

This does nothing to address previous review concerns; I just figured that getting this up and running before making further changes would probably be wise; it's best to start new work from a stable position, after all.

Base automatically changed from chore/web/modularize-gesture-recognizer to feature-gestures June 21, 2023 01:27
@jahorton
Contributor Author

jahorton commented Jun 21, 2023

This (and the 🐵 / gesture work) has been dormant for quite a while, so it'll probably take a bit to reboot discussion and our understandings fully.

If memory serves, my takeaway was that I needed to look more at the "forest" of the prior discussion than its "trees"; the general concern as opposed to the details.

  1. The main concern, I believe, was the code/development complexity of the subsegmentation algorithms and such.

    • In particular... how it was costing a fair bit of time to develop.
    • We did have to delay the release of gestures past 16.0, after all, so it's clearly a concern with merit.
  2. I know that there were also concerns regarding the math.

    • I feel like this is more of a "tree" - a detail - than the real focus, though.
    • If we drop to assuming anything between 'touch' and 'release' is a single segment, then we get just linear regression, which is only undergrad-level stats / linear algebra.
      • The existing CumulativePathStats class is designed to do this live as new input samples arrive.
    • The stats do get more 'involved' when we drop the last point's assumption and utilize subsegmentation as I had been doing.
      • These 'fancier' bits are found within PathSegmentation and its related classes.
      • These 'more involved' stats are just a more advanced version of regression - "segmented regression" - along with related bog-standard statistical tests that anybody with a Master's in stats should find familiar.
      • The "least standard" parts would be some of the details unique to segmented regression, but at least those are pre-existing and pretty well documented.

Looking back, and looking at the code that exists, I'm fairly confident that I can shift gears and go with a "single segment" approach for now. Makes me a bit sad, as I'd actually gotten the more advanced style's foundations working, but I suppose the "more advanced style" would take more time to vet properly.

That said, there are still a few concerns and points of difference worth talking about.

  1. We generally aim for strong backward compatibility and cross-version consistency if at all possible.
    • I'd like to assert a corollary here: Anything new we add should not be too restrictive on what we do in the future.
    • We've generally been quite averse to "breaking changes"... and even to changes that require compiled checks in keyboards like "if in older version, do X instead of Y".

This was one of the big contention points, I think. Part of the original gesture design-work aimed to ensure that we can reasonably extend to multi-segment flicks and/or swipe-typing in the future. It may be a "while off", of course.

The point I'd like to make here: any expectation we set here for how certain gestures operate will continue to be an expectation if and when we do add the more complex ones.

  2. There's also the natural desire for good UX when it comes to gestures.
    1. Accessibility - how usable is it for people with less mobility / dexterity?
    2. Consistency - how reliable is it / how easy are the patterns to learn?
    3. ... potential for irritation - how much / little does it "get in the user's way" or require them to "go out of their way"?

And these points come to a head when we're talking about flicks.

My position:

  • We should be restrictive with what may be interpreted as a flick, keeping in mind potential future gesture types.
    • Allowing something to act as a flick now when it could not be interpreted that way later if a desired extra gesture type were added... that feels dangerous.
    • It's hard to get this 100% right at this point, of course.
  • Flicks are normally marked as an accessibility concern; if it can be typed via flick, it should also be typeable via a more "accessibility-friendly" pattern elsewhere within the keyboard.
    • Longpresses are also a concern, but they're likely "more accessible" than flicks and they have no "max time" threshold.
  • Accidentally-started flicks should be rare. (Keep 'false positives' low!)
    • People get bumped slightly, type while they're walking, etc.
  • The existing "roaming touch" behavior is itself an accessibility feature - it enlarges key caps, after all.

Your position, I think:

  • For accessibility reasons, we should be more forgiving with what can be interpreted as a flick.
  • The boundary between "roaming touch" vs "flick" is too blurry to allow both for the same keyboard.
  • ... flicks are more important than roaming touches?
    • This is just an inference, admittedly.

Now, if we decide we want to be fancier on this, we still only need a single function call on the array of captured points between touch start and touch release, which returns the direction of the flick.

I believe we need to be "fancy" in this regard. We should validate that flicks follow a relatively straight line. Suppose the touchpath bends significantly - say, 15 pixels straight North followed by another 15 pixels West.

  • if doing multi-segment flicks, that's straight-up a N->W flick.
  • if allowing swipe, those are probably two separate components of a swipe. (It could be 'f' -> 'r' -> 'e' on a US keyboard, for example.)

Even worse - 15 pixels straight North, then 20 pixels straight South... for a net of 5 pixels South. Is this reasonable as a "flick to the South"?

  • Fortunately, this is as far from single-segment straight as possible, so it'd be very easy to rule out as a legal flick.

Either way, the two sequences above are far better suited to either potential future gesture type; it'd be awkward to have to backtrack a decision to accept the first as a single-segment NW flick in the future due to backward-compatibility/consistency.

Fortunately... we can handle this reasonably with single-segment regression tests for now; "is this a straight line" is a pretty easy question for our segment stats-tracking mechanism to answer.
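As a sketch of the flavor of check I mean - this is a generic net-displacement ratio, not the actual CumulativePathStats implementation, though a regression-based check would be similar in spirit:

interface Sample { x: number; y: number; }

// Ratio of net displacement to total distance traveled: ~1 for a straight
// path, dropping toward 0 as the path bends or doubles back on itself.
function straightness(path: Sample[]): number {
  if(path.length < 2) {
    return 1;
  }

  let traveled = 0;
  for(let i = 1; i < path.length; i++) {
    traveled += Math.hypot(path[i].x - path[i-1].x, path[i].y - path[i-1].y);
  }

  const first = path[0];
  const last = path[path.length - 1];
  const net = Math.hypot(last.x - first.x, last.y - first.y);

  return traveled > 0 ? net / traveled : 1;
}

// 15px N then 15px W: net ≈ 21.2 over 30 traveled ≈ 0.71 - not flick-straight.
// 15px N then 20px S: net 5 over 35 traveled ≈ 0.14 - trivially ruled out.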

One possible point of concern:

Any macro movements from the starting point would cancel the timeout, which then cancels the longpress, and a new timeout can be started for a potentially different longpress to cater for roaming touch.

... and when does a 'macro movement' end? That's actually what subsegmentation is designed to detect and answer.

Our existing answer in KMW 'til now: when a new key ends up underneath the touchpoint, if we aren't yet locked in on a longpress, consider it a "macro movement".

  • Do note the "for roaming touch" bit - this clearly assumes that no flicks are expected; the same motion would be very reasonable for a flick.
  • It's not perfect, to be sure... but it is a decent approximation and can work for now.

@jahorton
Contributor Author

All of that last comment to say... I think it actually is possible to make the basic, earlier-targeted gestures work decently without the subsegmentation engine, and existing pieces are sufficient to fill in the gaps for now. In which case, I guess that means we continue to leave this PR iced 🧊 until we decide to revisit and extend to allow more complex gestures.

@jahorton
Contributor Author

I am still concerned about our differences regarding one notable point, though:

Speed of flick should not be considered -- I think we are going to run into usability issues if we make flicks dependent on speed.

It's a flick - there are going to be usability issues in one direction or another. The speed aspect is also part of the word's standard definition. If we don't consider speed at all, I believe it impossible to fully reconcile with roaming touches - even with a more complicated model.

Suppose a straight movement to the east for two keys' width on a touch device, then a 0.5 second pause.

  • Option 1: that's a flick to the East
    • possibly cancelled due to the pause after initial confirmation as a flick?
  • Option 2: that's a roaming touch into a longpress two keys over.

Would your answer (re: option 1 or option 2?) differ if I said...

  • the "straight movement" took 0.3 seconds?
  • the "straight movement" took a full second?
  • If the pause after the motion were 3 seconds instead of 0.5 seconds, would your answer change?

Once a roam has started, flicks and multitaps will not meet their starting conditions, so will naturally be ignored.

The problem being... if both are permitted, we probably can't confirm a roam until we've ruled out flicks. Speed would be a useful tool for detecting failed flicks in order to enter roaming mode.

Roaming -- in initial implementation, I suggest these are only applicable if the keyboard doesn't have flicks. We may be able to tweak this in the future with distance thresholds -- not speed thresholds.

If someone uses fast flicks - like, in the span of 0.2 seconds, a quick motion - I believe it would be harder for that person to consistently constrain the distance to meet such a "max distance" threshold. You'd end up penalizing the people who would otherwise get the most efficiency out of the feature, accidentally creating a "max speed" threshold. I'd see that going one of two ways:

  • Auto-cancels the input entirely -> "How did it miss my keystroke? Is this keyboard dropping inputs?"
  • "I guess it's a roaming touch now?" -> Selects the key under the would-be flick's endpoint: "I didn't ask for that key."

Now, if we auto-cancel on reaching an OSK boundary or device-screen boundary... that's more self-evident and provides an easier-to-recognize pattern for cancellation. They're boundaries that don't "move" based on the original touchpoint.

I do think we can move forward with dropping subsegmentation without arriving at a full answer to this concern, at least. The existing stats measurements do facilitate tracking speed, after all; it's just a matter of whether we decide to 'tap' that info.
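For reference, a sketch of the sort of speed criterion I have in mind; the threshold and function names are illustrative, with the real stats class already tracking the needed quantities incrementally:

// Mean speed over a candidate flick's span; timestamps in ms, distances in px.
function meanSpeed(netDistance: number, startT: number, endT: number): number {
  const duration = Math.max(endT - startT, 1); // guard against division by zero
  return netDistance / duration;               // px per ms
}

const FLICK_MIN_SPEED = 0.1; // px/ms - placeholder threshold

// A quick 30px motion over 150ms (0.2 px/ms) qualifies; the same 30px
// spread across a leisurely 1.5s roam (0.02 px/ms) does not.
const isFlickCandidate = (dist: number, t0: number, t1: number) =>
  meanSpeed(dist, t0, t1) >= FLICK_MIN_SPEED;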

@mcdurdin
Member

I believe it impossible to fully reconcile with roaming touches - even with a more complicated model.

Indeed, which is why I have been saying drop roaming touches ... for the last however long.

@jahorton
Contributor Author

jahorton commented Jun 21, 2023

If we don't consider speed at all, I believe it...

This is a critical part of the quoted statement.

Your position, I think:

  • [omitted]
  • ... flicks are more important than roaming touches?
    • This is just an inference, admittedly.

I'm concerned about the relationship of accessibility to roaming touch & its key previews; this seems to be dropping a "more accessible" feature for a "less accessible" one, with no plans to reconcile them at any point. I do not believe reconciliation impossible... unless we drop flick speed as a factor in determining "flick or not?". This is part of why I've been adamantly lobbying for it as a flick criterion... for the last however long.

To be clear... it does seem like we'll need to drop it for 17.0 anyway for the first wave of flick implementation - though we could offer an option of "disable all flicks" out of concern for accessibility for affected people, which would allow restoring it to those in need of key previews.


Succinct through-line:

  • key previews help with accessibility, especially on smaller phone form-factor devices
  • because key previews are good for accessibility, we should keep them when and where possible.
  • Therefore, if there's a reasonable way to reconcile key preview gesture requirements with flick gesture requirements, we should take that.
  • I do not believe such reconciliation is possible unless we consider motion speed as a criterion for flick gestures.
  • Given all points above within this list, we should thus consider motion speed as a criterion for flick gestures - to reconcile "shiny new feature" with "existing accessibility functionality".

@jahorton
Contributor Author

jahorton commented Jun 21, 2023

Took a bit of digging, but I found a few reference points that've been sticking in my mind:

https://github.com/srl295/cldr/blob/ecf728ab5a4ccd5678ae08c2a949b2bebbfe37df/docs/ldml/tr35-keyboards.md#accessibility

Some features, such as multiTap and flicks, have the potential to reduce accessibility and thus should be discouraged. For example, multiTap requires pressing keys at a certain speed, and flicks require a more complex movement (press-and-flick) beyond a simple tap. Alternatively, inclusion of accessible methods of generating the same outputs (for example, simple keys on an additional layer), should be considered.

That was added in unicode-org/cldr#2360.

Even longpress is discouraged by the LDML keyboards doc, in fact:

https://github.com/srl295/cldr/blob/ecf728ab5a4ccd5678ae08c2a949b2bebbfe37df/docs/ldml/tr35-keyboards.md#element-key

Attribute: longPress="a b c" (optional) (discouraged, see [Accessibility])

@jahorton
Contributor Author

We've triaged the segmentation aspects of this until way later if at all, opting for a different strategy in the meantime. Also, recent PRs have pulled in many of the other related aspects of this PR, such as the TouchpathTurtle improvements. We're good to go ahead and retire this PR.

@jahorton jahorton closed this Jul 25, 2023
@jahorton jahorton deleted the fix/web/segment-path-interfacing branch July 25, 2023 05:59
@jahorton jahorton restored the fix/web/segment-path-interfacing branch October 24, 2023 01:15
@jahorton jahorton deleted the fix/web/segment-path-interfacing branch October 24, 2023 01:15