Touch bindings #1904

Open
ddevault opened this issue May 3, 2018 · 21 comments
Labels
enhancement New feature or incremental improvement input/touch

Comments

@ddevault
Contributor

ddevault commented May 3, 2018

touch {
    set along_the_top "1.0 x 20px @ 0 x 0"
    set top_half "1.0 x .5 @ 0 x 0"
    set bottom_half "1.0 x .5 @ 0 x .5"
    set right_half ".5 x 1.0 @ .5 x 0"

    # five finger tap to spawn a terminal
    gesture exec $term {
        down 5
        delay < 10ms
        up all # optional, all gestures implicitly end with "up all"
        # equivalent shorter version:
        # tap 5
    }

    # five finger long press to lock screen
    gesture exec swaylock {
        down 5
        50ms < delay < 1s
    }

    # slide along the top to right to spawn dmenu
    gesture exec dmenu_run {
        down 1 $along_the_top
        horiz +100px < move
    }

    # two finger slide across bottom half of window to splith
    gesture splith {
        down --window 2 $bottom_half
        horiz +/-0.5 < move
    }
    gesture splitv {
        down --window 2 $top_half
        vert +/-0.5 < move
    }

    # rotate screen with 5 finger rotate
    gesture output LVDS-1 transform +90 {
        down 5
        45deg < rotation < 90deg
        up all
    }

    # drag from right/left to switch workspaces
    # --create to make a new one if you hit the end
    gesture workspace --create next_on_output {
        down 1 "100px x 1.0 @ 1.0-100px x 0"
        horiz -0.25 < move
    }

    gesture workspace --create prev_on_output {
        down 1 "100px x 1.0 @ 0 x 0"
        horiz +0.25 < move
    }

    # 3 finger drag up to float windows
    gesture floating enable {
        down 3 $bottom_half
        move $top_half
    }

    # 3 finger drag down to unfloat windows
    gesture floating disable {
        down 3 $top_half
        move $bottom_half
    }

    # 3 finger drag to move windows
    gesture {
        down --window 3
        # move at least 50px to start an interactive move
        # calculated from the average position of your fingers
        50px < move --attach
    }

    # 3 finger pinch to resize windows
    gesture {
        down --window 3
        50px < expand/shrink --attach
    }

    # 3 finger rotate to rotate floating windows
    gesture floating enable {
        down [floating] 3
        15deg < rotate --attach
    }

    # Except gedit which I want to resize with rotation for some reason
    gesture {
        down [appid="gedit"] 3
        15deg < rotate --attach
    }
}
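
A minimal sketch of how the region strings in the proposal might be resolved to pixels. It assumes the format is `width x height @ x x y`, that bare numbers are fractions of the output (or window) size, and that `px` values are absolute; this reading is inferred from `bottom_half` and `right_half` and is not confirmed anywhere in the thread. The names `Rect`, `resolve_length` and `parse_region` are hypothetical.

```python
import re
from typing import NamedTuple


class Rect(NamedTuple):
    x: float
    y: float
    width: float
    height: float


# One signed term: a bare number (fraction of the extent) or a pixel count.
_TERM = re.compile(r"\s*([+-]?)\s*(\d+\.?\d*|\.\d+)(px)?")


def resolve_length(expr: str, extent: float) -> float:
    """Resolve a length such as '.5', '20px' or '1.0-100px' to pixels."""
    expr = expr.strip()
    if not expr:
        raise ValueError("empty length expression")
    total, pos = 0.0, 0
    while pos < len(expr):
        match = _TERM.match(expr, pos)
        if match is None:
            raise ValueError(f"cannot parse {expr!r} at offset {pos}")
        sign, number, unit = match.groups()
        # Assumption: no unit means "fraction of the extent".
        value = float(number) if unit else float(number) * extent
        total += -value if sign == "-" else value
        pos = match.end()
    return total


def parse_region(spec: str, out_width: float, out_height: float) -> Rect:
    """Parse 'W x H @ X x Y' against an output of the given pixel size."""
    size, sep, origin = spec.partition("@")
    if not sep:
        raise ValueError(f"missing '@' in region {spec!r}")
    # Split on a whitespace-delimited 'x' so the 'x' in 'px' is left alone.
    width, height = re.split(r"\s+x\s+", size.strip())
    x, y = re.split(r"\s+x\s+", origin.strip())
    return Rect(
        resolve_length(x, out_width),
        resolve_length(y, out_height),
        resolve_length(width, out_width),
        resolve_length(height, out_height),
    )
```

For example, `parse_region("100px x 1.0 @ 1.0-100px x 0", 1920, 1080)` would describe a 100 px wide strip along the right edge of a 1920x1080 output.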
@ddevault
Contributor Author

ddevault commented May 3, 2018

Extended examples with some gestures for interactive move/resize/rotate of windows
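
The move, expand/shrink and rotate conditions in the extended examples all depend on a few numbers derived from the finger positions. A rough sketch of one way to compute them follows, assuming each finger is tracked from a start to an end position and using the average finger position as the proposal's comment suggests. The function names are hypothetical, and with screen coordinates (y pointing down) a positive angle means clockwise.

```python
import math


def centroid(points):
    """Average position of a list of (x, y) finger positions."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


def mean_spread(points, center):
    """Mean distance of the fingers from their centroid."""
    return sum(math.hypot(x - center[0], y - center[1]) for x, y in points) / len(points)


def gesture_metrics(start, end):
    """Return (move, spread_delta, rotation_deg) for fingers in matching order."""
    c0, c1 = centroid(start), centroid(end)
    move = (c1[0] - c0[0], c1[1] - c0[1])
    spread_delta = mean_spread(end, c1) - mean_spread(start, c0)

    deltas = []
    for (x0, y0), (x1, y1) in zip(start, end):
        a0 = math.atan2(y0 - c0[1], x0 - c0[0])
        a1 = math.atan2(y1 - c1[1], x1 - c1[0])
        # Wrap each per-finger change into [-pi, pi) before averaging.
        deltas.append((a1 - a0 + math.pi) % (2 * math.pi) - math.pi)
    rotation_deg = math.degrees(sum(deltas) / len(deltas))
    return move, spread_delta, rotation_deg
```

A condition such as `50px < move` would then compare the length of `move` against 50, and `15deg < rotate` would compare against `rotation_deg`.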

@ascent12
Member

ascent12 commented May 3, 2018

Having the command after 'gesture' seems odd.

@ddevault
Contributor Author

ddevault commented May 3, 2018

Where else would it go?

@emersion emersion added the enhancement New feature or incremental improvement label May 3, 2018
@grahnen

grahnen commented Jan 10, 2019

I have a suggestion for an API, built on the API sent to me by @ddevault over on IRC. (Thanks a lot for the guidance!) It's a little crude maybe, so if it could receive a second opinion from someone that'd be incredibly helpful! For now it's over on https://github.com/grahnen/libtouch but if you want to adopt it, it can just be transferred to swaywm/libtouch.

Seeing this is my first attempt at any actual FLOSS contribution, I feel like I need the approval (and constructive critique!) of at least one more before I proceed, hope that's not too much to hope for ^^

@ddevault
Contributor Author

This looks close... I wonder if you could write a little demo program?

@grahnen

grahnen commented Jan 10, 2019

There's the examples.c that has some example gestures (given a working API of course), if that's what you meant. One could add more there, should you want to. Otherwise, the demo program would need a (semi-)complete implementation, which I could make

@ddevault
Contributor Author

Duh, sorry. Yeah, I think it's time to start implementing some of this stuff so we can get an idea of how well the API works.

@grahnen

grahnen commented Jan 12, 2019

Alright, I've started. There's basically only the "register_touch" and "register_move" implementations (in which the actual tracking and progress tracking is done) left to be completed (they've been started).

Creating multitouch gestures consisting of taps/swipes should be doable now.

Edit: Most of it is done now. Except some (probably major) bugfixing. I don't really know the ins and outs of sway so I don't know where to begin to implement it.

@emersion
Member

Prior art: https://launchpad.net/geis

@progandy
Contributor

@whot

whot commented Oct 4, 2019

I don't think you should do tapping as a "down, delay, up" but treat it as a distinct "tap" item instead. tapping is complicated and splitting it up into the separate components is going to burn you out. Just define "tap", make the exact behaviour implementation defined and continue from there. There are all sorts of timeout issues you're dealing with in tapping, it's ... optimistic to think that a "down, delay 10ms, up" would be sufficient :)

And thinking about this, I'd say you're going to make your life easier by defining a simpler set of core gestures (swipe, pinch, tap, hold) and then using only those. And changing finger count is hard within a gesture, so you may want to do something like:

    # 3 finger drag down to float windows
    gesture floating disable {
        swipe 3 $bottom_half
    }

Where you need some more complicated gestures, those would still be a sequence of the core components, e.g. tap-and-swipe:

    tap 3 $bottom_half
    swipe 3 $bottom_half

alas, just as a heads-up, combined gestures have historically not been very successful due to the learning curves required.
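
To illustrate the "small set of core gestures" idea from the comment above, here is a deliberately naive classifier. All thresholds and names are made-up assumptions, and, as the comment warns, real tap detection involves many more timeout cases than this sketch covers.

```python
import math
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; a real implementation would keep these
# implementation-defined and tune them per device.
TAP_MAX_SECONDS = 0.18
HOLD_MIN_SECONDS = 0.5
MOVE_THRESHOLD_PX = 30.0


@dataclass
class TouchSequence:
    fingers: int          # number of fingers, constant for the whole gesture
    duration: float       # seconds between first down and last up
    dx: float             # centroid movement in pixels
    dy: float
    spread_delta: float   # change in mean finger distance from the centroid


def classify(seq: TouchSequence) -> Optional[str]:
    """Map a finished touch sequence to 'pinch', 'swipe', 'tap', 'hold' or None."""
    distance = math.hypot(seq.dx, seq.dy)
    spread = abs(seq.spread_delta)
    if seq.fingers >= 2 and spread > MOVE_THRESHOLD_PX and spread > distance:
        return "pinch"
    if distance > MOVE_THRESHOLD_PX:
        return "swipe"
    if seq.duration <= TAP_MAX_SECONDS:
        return "tap"
    if seq.duration >= HOLD_MIN_SECONDS:
        return "hold"
    return None  # too slow for a tap, too short for a hold
```

A config line such as `swipe 3 $bottom_half` would then only need to match on the gesture name, the finger count and the region.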

@VanLaser

Since this seems to be the 'touchscreen', 'multitouch' and 'gestures' design thread (if I'm not mistaken), please - if possible - take into account in this phase the following libinput + touchscreen issue:

Libinput has gestures enabled for touchpad, but not for touchscreen, and the ... quite reasonable reason for this is given here: https://wayland.freedesktop.org/libinput/doc/latest/gestures.html#gestures-touchscreens

Coming back to sway: for the exact same reason given above, the compositor would have to take into account gestures together with their location with respect to window positions or, in other words, provide a way for the user to know the association between each finger touch and the window it touched (either on the border or inside it), or alternatively separate touch gestures into 'per-window' gestures, or ... design something related to this issue.

I'm actually only saying this - if the compositor is the right place to implement multitouch touchscreen gestures, and the reason is the interaction between windows coordinates and touch locations, then interpreting gestures should take into account this interaction, somehow.
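
A small sketch of the touch-to-window association described above, assuming windows are axis-aligned rectangles with a uniform border width. The `Window` type and both functions are hypothetical and do not reflect Sway's actual data structures.

```python
from typing import NamedTuple, Optional


class Window(NamedTuple):
    name: str
    x: float
    y: float
    width: float
    height: float
    border: float = 2.0


def hit_test(win: Window, px: float, py: float) -> Optional[str]:
    """Return 'border', 'content' or None for a touch at (px, py)."""
    if not (win.x <= px < win.x + win.width and win.y <= py < win.y + win.height):
        return None
    inside = (
        win.x + win.border <= px < win.x + win.width - win.border
        and win.y + win.border <= py < win.y + win.height - win.border
    )
    return "content" if inside else "border"


def window_for_gesture(windows, touches) -> Optional[str]:
    """Name of the one window every touch landed in, else None."""
    owners = set()
    for px, py in touches:
        owner = next((w.name for w in windows if hit_test(w, px, py)), None)
        owners.add(owner)
    # A 'per-window' gesture requires all fingers to share a single window.
    return owners.pop() if len(owners) == 1 else None
```

Gestures whose touches span several windows would fall through to output-level handling in this scheme.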

@beatboxchad

I understand the motivation to not interpret touchscreen gestures from libinput. It could cause subtle issues in the wild as libinput and something like Firefox fight, or app-specific handlers don't get the events whereas no handler is configured in libinput unless the packager provides some default configuration, or other messes. Indeed, I do a lot of pro audio/live performance development, and my applications are the appropriate place to store logic about how to handle touch events. I can directly access each touchscreen from, say, SuperCollider's HID client, and make whatever noises I want. Architecturally, I think I'm on board so far.

But then in practice, I'm a Sway user with two touchscreens, one on my laptop and one external. It turns out to be sort of an onerous geek trap that I can make a quick three-finger swipe send a container to another workspace or output, or browse workspaces, or any of the other wonderful things you can tell Sway to do... but only on my trackpad. About the only thing I can do with my touchscreens is switch the focus, and the pointer doesn't seem to follow but stays tied to the trackpad (I think that's my bad, I just haven't looked up how to configure the seats or something).

I'm as keyboard-centric as the next one, but sometimes it's nice to be leaning over a cup of coffee and just reach out and smudge my screen to scroll some docs. I think it would be great to have the option to do the same to switch to the previous workspace. I'm not convinced there's no way to expose the gesture data without clobbering existing functionality, with the caveat that if the user defines a gesture handler that an application also defines, the behavior is undefined. In general, the libinput developers are right, but for an audience like Sway users, I think this is perfectly acceptable. We're devs and hackers, by and large.

Perhaps this comment belongs on libinput's tracker, because I think it would also be appropriate to use the existing gesture logic to parse gestures in libinput, and just send the raw global-coordinate data to the compositor, which could then decide what to do with the event based on what containers are in those raw coordinates. My gut says implementing the whole gesture logic in Sway could lead to inconsistent behavior between trackpad and touchscreen, as they're handled by different 'drivers'. This seems like a lot of technical debt to take on for one feature.

The earlier PR as well as https://github.com/xiamaz/libinput-touchscreen looks interesting, but as stated, I'm searching for an excuse not to re-implement raw gesture parsing.

My main question: am I missing something about the current architecture and implementation when I suggest trying to get libinput to go ahead and parse the gestures using existing logic and send the resultant events to the compositor (Sway) and then using Sway to forward them to the container whose bounds they fall within if no handler is defined (or optionally by a handler after work is complete)? My apologies, for I have not explored through libinput yet to answer my own questions.

@whot

whot commented Jan 17, 2023

@beatboxchad am I right in guessing that none of the applications you are currently using actually do real multi-touch? As in, your touch support is primarily a mouse pointer replacement?

Because once you have applications that truly support multitouch, gestures are no longer trivial. The same gesture that looks like a 3fg swipe now may be a valid interaction in the application. At least on the libinput level we have no way of knowing.

Pinch is even worse because two touchpoints moving in different directions? You don't want this to be "zoom window" because in your maps application it's supposed to zoom into the map, right? If you had a mt-capable RTS where you move your units around with fingers, you'd get a lot of fake gestures that are really just unit movements.

The only reason we can do touch gestures in libinput is because there is nothing really defined for "three fingers on touchpad" [1] so we can't really do much wrong by interpreting it as gesture. And for 2-finger pinch the same is true (although we're always fighting the more niche cases of 2-finger scrolling).

[1] not quite true: there are a multitude of cases where three fingers are on the touchpad and you may want a single click, double click or middle click, depending on which of these fingers actually count...

@beatboxchad

beatboxchad commented Jan 18, 2023

Right, I'm super clear on the reasons one would avoid writing (and committing to maintain) such code, and in full agreement. I do run applications that parse touch, and I can clearly see the mess I'd be making for myself if I did not proceed with care. I agree that exposing this mess to users at large would not be good.

Nonetheless, I think that safety goal can be accomplished without constraining user freedom or forcing desktop compositors and applications to re-implement raw event parsing (at the expense of behavioral consistency between my touch devices depending on whether libinput or sway or Firefox or some mt-capable RTS is doing the work). I am imagining something on the libinput side like a compiler flag or optional setting which applies existing gesture recognition logic to generate touchscreen events and optionally emits them.

Personally I think that architecturally, the parsing of raw device data belongs in the driver (libinput), and any incomplete information (like whether an event falls within the bounds of some window, along with the final decision as to whether to act on an event) should then be provided by the compositor and/or the application. I'm aware that this idealization will end up brittle under real-world conditions. I'd like to learn about those edge cases. For my own machine, I think I'll fork libinput, try to turn on those events, and report back. If you can predict a failure in my plan, please let me know! :)

@VanLaser

I fully agree!

Let's consider another type of input: the keyboard. Isn't that kind of input available to both Sway and the applications that run in Sway? There is no problem, no confusion there to decide who consumes the input, we just filter the keys, interpret some of them, and pass the rest to the apps. But for some reason, when touch gestures are involved, we get confused about who can, or should, interpret them.

In the same way, we could have some way of announcing that "these gestures are for Sway" (e.g. related to workspaces, windows etc.) or that "these gestures are for the focused app, pass them to the app" etc. One way to do it is use a finger to push some kind of button (similar to Super_L/Win key) while doing Sway-related gestures with the other fingers or the other hand. Or have some kind of toggle gesture, or touch button (like Vim's normal/insert/visual modes).

In other words, I think everybody involved in the touch input chain - starting with libinput - should try to provide as much functionality as possible, gestures included if possible (since already implemented for touchpads), with the options for the next links in the chain (the window manager / compositor, the applications) to ignore those and provide their own. I know it's easy for me to speak, but really, the argument that an application can provide a better implementation for a gesture is no excuse to avoid providing generic gestures for all the other applications that don't, isn't it?
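
The "hold a button with one finger" idea above could look roughly like the following. This is only a sketch under assumptions: the modifier zone is a fixed on-screen rectangle, and the function name and return values are invented for illustration.

```python
def split_by_modifier(touches, modifier_zone):
    """Decide who receives the current touches.

    touches: dict mapping touch id -> (x, y)
    modifier_zone: (x, y, width, height) of the hypothetical on-screen
        "Super" button, analogous to holding Super_L on a keyboard.
    Returns (target, touches_for_target).
    """
    zx, zy, zw, zh = modifier_zone
    held = {
        tid
        for tid, (x, y) in touches.items()
        if zx <= x < zx + zw and zy <= y < zy + zh
    }
    if held:
        # A finger rests on the modifier: the remaining fingers form a
        # compositor gesture and the client sees none of them.
        rest = {tid: pos for tid, pos in touches.items() if tid not in held}
        return "compositor", rest
    # No modifier held: pass everything through to the focused client.
    return "client", touches
```

A toggle-style mode (like Vim's normal/insert modes) would replace the `held` check with a persistent flag flipped by a dedicated gesture.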

@whot

whot commented Jan 19, 2023

> existing gesture recognition logic to generate touchscreen events

fwiw, the existing logic in libinput is very focused on touchpads, I reckon quite a fair bit of work might be needed to make this more generic.

> the parsing of raw device data belongs in the driver (libinput)

libinput does very little with touch screen input, it mostly converts it to sane numbers (mm or scaled to width/height) and passes them on. Any HW-specifics are mostly abstracted, either in the kernel or in libinput. The only special thing we do is touch arbitration but that only applies to tablets with touchscreens anyway. And I think we may do palm detection but only for those touches the HW designates as palm (MT_TOOL_PALM).

Other than that, you can pretty much put on top of libinput whatever you could put inside libinput, it's working off the same data.
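
As a rough illustration of the "converts it to sane numbers" step mentioned above: the sketch below assumes an evdev-style absolute axis with an inclusive `[minimum, maximum]` range and a resolution in units per millimetre. It is not libinput's actual code, and the function names are hypothetical.

```python
def scale_axis(value: int, minimum: int, maximum: int, extent: float) -> float:
    """Scale a raw axis value into [0, extent), e.g. an output width in pixels."""
    # +1 because the raw range is assumed to be inclusive at both ends.
    return (value - minimum) * extent / (maximum - minimum + 1)


def axis_to_mm(value: int, minimum: int, resolution: int) -> float:
    """Convert a raw axis value to millimetres from the axis origin."""
    if resolution <= 0:
        raise ValueError("device does not report a usable resolution")
    return (value - minimum) / resolution
```

Since anything layered on top of libinput sees the same numbers, a compositor-side gesture recognizer could start from values like these.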

@beatboxchad

beatboxchad commented Jan 26, 2023

I got sucked into the hyperfocus vortex and have been hacking on this all week. You're right, it's a bit of work to tease out the multitouch stuff from the touchpad stuff, but not the worst -- the code is pretty well-organized and legible, well-documented too. With modern dev tools (I'm on emacs but use LSP and all that) it only took an afternoon to end up with a custom dispatcher for all touchscreens, mostly copied from the touchpad source tree. It compiled and ran, but with no touches, and thus no pointer or gestures. I mucked up the state machine somewhere and my touches aren't getting unhovered. Touchscreens don't send pressure or proximity (or, neither of mine do). The fake_touches logic looks like it should be working fine, but it's not and I've been doubling down on debugging instead of taking a day off to freshen up my wits. Hhh, typical.

The dev tools show everything else working nicely AFAICT. Once I get the touches back, we'll see if I made any mistakes in the scaling and gesture. The tools' feedback along with my understanding of the code make me optimistic. If things aren't a total mess, I'll reach out as I clean up and refactor the code. I'd love to get your feedback on it.

I'm commenting here because if this works my changes would expose that same gesture config interface to Sway but it'd work on a touchscreen.

Edit: no BTN_TOOL_*TAP messages for touchscreens either. Go off the slots alone, it looks like. Both of my touchscreens handle palm and thumb detection on their own, and compliantly end all touches when they detect one. My external touchscreen sends an MSC_TIMESTAMP. I dunno if I care.

@beatboxchad

beatboxchad commented Feb 3, 2023

I've completed my initial prototype of porting libinput's touchpad gesture engine over to a generic interface for all touch devices. There is plenty of finish work left. I intend to continue reaching out to the community for help and feedback -- it includes softer decisions like default behaviors and whether this functionality should even be included, as well as more technical questions like refactoring the touchpad logic to use the generic interface, pending those other discussions.

But for now I am delighted to watch the gesture glyphs move around in libinput-debug-gui as I scroll, swipe, and so on. I have fine-tuning to do, and just a couple bugs left to squash. Yep, this is a lot of work! Yo @whot I watched your talk, and have gotten to know your work these past couple weeks. I can confirm your experience. My warm gratitude to you and the rest of the community. Another thing you were right about is just how chaotically undefined the behavior is when, say, Firefox attempts to interpret my maybe-pinch-maybe-scrolls as zooms, and so on. This brings me back to this design thread.

To avoid this unwanted conflict, Sway needs some way to exclude touchscreen-sensitive windows from gesture interpretation. Yes, doing so dynamically is a Hard Problem (ideas like the hotkey toggling @VanLaser shared above are useful here). But a user capable enough to enable and configure gesture actions should be capable enough to establish a workable policy for their touch apps. Or perhaps Sway could evdev-style "grab" any device with active gesture configuration, and you could configure an allow-list of touchscreen-sensitive applications to pass the raw events to instead of gesture. Over time, sensible defaults might become established and perhaps appear as commented examples in distributed config files.

It would be SUPER great if application gesture exclusion did not include the titlebar/decoration. With the right tuning, I could swipe/click/drag/pinch that little region to trigger WM actions on an application, but retain access to the application's multitouch implementation. I'm already using this to tap-and-drag, but you can't drag across monitors with an absolute coordinate system locked to the display. A swipe fits this niche perfectly IMHO.

I'm guessing that these ideas are feasible based on how, unlike libinput, Sway has access to all the information necessary to make these decisions. But sometimes implementations can get a little hairy, so I'm seeking the input of those more experienced with the Sway codebase and the development of this feature.
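
The allow-list and titlebar ideas from the comment above can be combined into a single routing decision. The sketch below is an assumption-laden illustration: the app ids, the titlebar height and the function name are invented, and it does not describe anything Sway implements.

```python
# Hypothetical allow-list of applications that handle multitouch themselves.
RAW_TOUCH_ALLOW_LIST = frozenset({"supercollider", "firefox"})
TITLEBAR_HEIGHT_PX = 24  # assumed decoration height


def route_touch(app_id, touch_y, window_top,
                allow_list=RAW_TOUCH_ALLOW_LIST,
                titlebar_height=TITLEBAR_HEIGHT_PX):
    """Return 'compositor' or 'client' for a touch that landed on a window.

    window_top is the y coordinate of the top of the window decoration.
    """
    in_titlebar = window_top <= touch_y < window_top + titlebar_height
    if in_titlebar:
        # Titlebar touches always drive WM gestures, even for allow-listed apps.
        return "compositor"
    if app_id in allow_list:
        # Touch-aware apps get the raw events inside their content area.
        return "client"
    return "compositor"
```

With this policy a swipe on the titlebar of an allow-listed app could still send the window to another output, while touches in its content area reach the application untouched.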

@beatboxchad

Hey colleagues, just checking in. I had a difficult spring in New Orleans -- party city, and I run with lots of working entertainers. There were two drug-related deaths in my circle, and I was too fussy to code. I'm also striving to resuscitate my freelancing hustle, so what energy I was able to muster went to getting my CI game together in case I encounter Wordpress. I deal with a couple chronic health issues that make me wary of working full-time, so I prioritize long games over short ones and am conservative with my commitments. For exhibit A, see the hyperfocus I displayed on this work followed by a months-long period of inactivity.

I'm gonna drop my ego and draft-PR what I have. I wanted to contribute something working to reduce the friction, but I think I was prematurely optimizing. As I mentioned above, there's just one dysfunction. Another pair of eyes will likely spot my error.

I will use an LLM to clean it up, too. :) Exciting times! I intend to get it done sometime this coming week. Thank you, please thrive.

@mecattaf

Some users have reported success with lisgd. However, sway chose to wrap libinput-gestures inside its config, which I think is a great thing given that we would otherwise need to add libinput-gestures to the input group etc.
I think a key difference between touchpads and touchscreens is the necessity for edge gestures for the latter.
Thank you all for the support!


10 participants