ConPTY Passthrough mode #1173

Closed
be5invis opened this issue Jun 8, 2019 · 85 comments · Fixed by #17510
Labels
Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) In-PR This issue has a related PR Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Needs-Tag-Fix Doesn't match tag requirements Product-Conhost For issues in the Console codebase

@be5invis

be5invis commented Jun 8, 2019

Modern apps won’t read the hidden character grid; they do everything in VT. So why not add an API/console mode that tells Console Host to throw that grid away completely?

@be5invis be5invis added the Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. label Jun 8, 2019
@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Jun 8, 2019
@DHowett-MSFT
Contributor

@zadjii-msft was looking into this. The main issue, if I recall correctly, is that we need to trash the entire buffer when something enters or exits “passthrough” mode. It’s also only truly applicable when there is a connected pseudoconsole session.

@DHowett-MSFT
Contributor

And the reason one console might enter and exit passthrough mode multiple times is that you may run coolNewThing.exe from CMD, and perhaps it might launch a further process that needs legacy support.

Additional concerns: if you have a tree of four processes, each of which wants passthrough to be different, should the ones that are doing ReadConsoleOutput be able to read the buffers from the other legacy/non-passthrough ones?

It’s complicated when you get into compatibility discussions. 😄

@DHowett-MSFT DHowett-MSFT added Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Product-Conhost For issues in the Console codebase labels Jun 8, 2019
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Jun 8, 2019
@zadjii-msft
Member

Yea, I tried getting this working for like a day last year, but it's something I've wanted to work on for a while.

As Dustin mentioned, there'd be real weirdness moving between passthrough mode and non-passthrough mode. However, I think it might still be something good to investigate.

@DHowett-MSFT DHowett-MSFT removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Jun 10, 2019
@zadjii-msft
Member

From #1985:

I've discussed this a couple times on different threads before, but I think I never made a real issue for it.

The idea of conpty passthrough mode is that a commandline client application
that knows it's only going to use VT sequences (and not the API) to modify
the console could set a special mode, ENABLE_PASSTHROUGH_MODE. If conpty is
active, the console would then stop rendering itself over conpty, and anything
that was written to the console would go straight to the terminal.

  • This would work especially well for something like wsl, where it's only ever going to be talking VT.
  • It would not work for cmd.exe, because of cmd.exe's heavy reliance on the API.
  • Windows Powershell again needs the API pretty heavily, but maybe Powershell
    Core, which is cross-platform, could make use of it.

This has some rough edges that need to be sorted out.

  • What happens when the app exits, and passthrough mode is turned off? The
    terminal and conpty's buffers would not be equivalent anymore!
    • I've been toying with the idea of having conpty both pass the sequences
      through, and also process them itself, so that the terminal and conpty stay
      in sync.
  • What happens when someone tries to call Console APIs in passthrough mode?
    • I'm thinking we just cause them to fail. All save for Get/SetConsoleMode and reading input / writing output.
    • We'll also probably need to be able to read resize events.

This needs a real spec written, but it probably needs prototyping done before that.
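The "pass the sequences through, and also process them" idea above can be modeled in a few lines. This is only a toy sketch, not conhost code: `PseudoConsole`, `ShadowBuffer`, and `set_passthrough` are hypothetical names, the shadow buffer merely records text instead of parsing VT, and the repaint on exit is assumed to be a clear-and-home followed by the recorded content.

```python
class ShadowBuffer:
    """Toy stand-in for the console's internal buffer (hypothetical)."""

    def __init__(self):
        self._chunks = []

    def process(self, data: str) -> None:
        # A real console would run a VT parser here; the sketch only records text.
        self._chunks.append(data)

    def snapshot(self) -> str:
        return "".join(self._chunks)


class PseudoConsole:
    """Hypothetical model of a conpty that can toggle passthrough mode."""

    def __init__(self, terminal_write):
        self.terminal_write = terminal_write  # callable receiving terminal output
        self.shadow = ShadowBuffer()
        self.passthrough = False
        self.render_calls = 0

    def set_passthrough(self, enabled: bool) -> None:
        leaving = self.passthrough and not enabled
        self.passthrough = enabled
        if leaving:
            # Re-sync: clear the screen, home the cursor, repaint from the shadow copy.
            self.terminal_write("\x1b[2J\x1b[H" + self.shadow.snapshot())

    def write(self, data: str) -> None:
        self.shadow.process(data)  # always keep the shadow copy up to date
        if not self.passthrough:
            self.render_calls += 1  # stands in for the normal render path
        self.terminal_write(data)  # in passthrough mode this is the only step
```

The point of the design is that the terminal sees the client's bytes untouched while passthrough is on, yet the console still has enough state to repaint when a legacy app takes over again.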

@zadjii-msft zadjii-msft added this to the Console Backlog milestone Jul 16, 2019
@zadjii-msft zadjii-msft changed the title from "Give ability to let console applications to claim that it will never use Console API and do everything with VT, so Console Host can do more optimizations" to "ConPTY Passthrough mode" Jul 16, 2019
@oising
Collaborator

oising commented Jul 17, 2019

We need a SIGWINCH asynchronous signal for resizing somehow.

@mintty

mintty commented Jul 19, 2019

This is an essential feature for opening MS text-mode software (esp. WSL) to 3rd-party environments.
The current approach seems to tightly connect the Windows terminal implementation with ConPTY, however that limits applications unnecessarily and makes them highly dependent on Windows terminal progress (which may still take years, honestly).

It would not work for cmd.exe, because of cmd.exe's heavy reliance on the API.

It should be a combined mode: Whenever a console-API-based application is run, e.g. as started from a pure terminal-based application, the console API calls should be transformed into terminal escape sequences. Note as there are not so many features in the console API, this is a much easier approach than the reverse mapping, trying to squeeze terminal features through the conhost bottleneck.
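As a rough illustration of what "transforming console API calls into escape sequences" could look like for one call, here is a sketch that maps a Win32 character attribute word (the argument to `SetConsoleTextAttribute`) to an SGR sequence. The attribute bit values are the documented Win32 constants; `attributes_to_sgr` is a hypothetical helper, and it assumes the terminal accepts the common 90-97 / 100-107 bright-color codes.

```python
# Win32 character attribute bits.
FOREGROUND_BLUE, FOREGROUND_GREEN, FOREGROUND_RED, FOREGROUND_INTENSITY = 0x1, 0x2, 0x4, 0x8
BACKGROUND_BLUE, BACKGROUND_GREEN, BACKGROUND_RED, BACKGROUND_INTENSITY = 0x10, 0x20, 0x40, 0x80


def _ansi_index(blue, green, red) -> int:
    # ANSI numbers colors with red=1, green=2, blue=4, the reverse of the Win32 bit order.
    return (1 if red else 0) | (2 if green else 0) | (4 if blue else 0)


def attributes_to_sgr(attr: int) -> str:
    """Translate a Win32 attribute word into an SGR escape sequence (hypothetical helper)."""
    fg = _ansi_index(attr & FOREGROUND_BLUE, attr & FOREGROUND_GREEN, attr & FOREGROUND_RED)
    bg = _ansi_index(attr & BACKGROUND_BLUE, attr & BACKGROUND_GREEN, attr & BACKGROUND_RED)
    fg_code = (90 if attr & FOREGROUND_INTENSITY else 30) + fg
    bg_code = (100 if attr & BACKGROUND_INTENSITY else 40) + bg
    return f"\x1b[{fg_code};{bg_code}m"


# The default gray-on-black attribute 0x07 becomes ESC[37;40m.
default_sgr = attributes_to_sgr(0x07)
```

Calls like this translate cleanly because they only change how subsequent output is drawn; calls that read state back from the buffer have no such one-way mapping.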

@therealkenc

Just realized #2035 is this ask framed differently.

however that limits applications unnecessarily and makes them highly dependent on Windows terminal progress

Not just progress. Behavior. This guy is writing a Tek4010 emulator. He is going to need a PTY if say his code were ever ported to native win32. And conhost sure as heck doesn't know what to do with the bytestream coming down that PTY. Never will. And need not care.

We need a SIGWINCH asynchronous signal for resizing somehow.

That too. Doesn't have to be a signal, mind, if there are some religious/philosophical reasons against it. But if not it needs to be a separate (third rail) HANDLE on which I can WaitForMultipleObjects(), because no one said there is a ReadFile() byte coming, ever.
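For comparison, this is roughly how the POSIX side gets both behaviors at once: an asynchronous `SIGWINCH`, turned into something waitable with the self-pipe trick, so one `select()` call can wait on data and on resizes together. It is a Unix-only sketch; `wait_for_event` and the pipe names are made up for illustration, and it says nothing about how a Windows equivalent would be exposed.

```python
import os
import select
import shutil
import signal

# Self-pipe: the signal handler only writes a byte; the main loop waits on the read end.
resize_r, resize_w = os.pipe()
os.set_blocking(resize_w, False)


def on_winch(signum, frame):
    os.write(resize_w, b"\x00")


signal.signal(signal.SIGWINCH, on_winch)


def wait_for_event(data_fd, timeout=None):
    """Block until data_fd is readable or a resize was signalled.

    Returns ("resize", (columns, lines)), ("data", None) or ("timeout", None).
    """
    readable, _, _ = select.select([data_fd, resize_r], [], [], timeout)
    if resize_r in readable:
        os.read(resize_r, 64)  # drain pending notifications
        size = shutil.get_terminal_size(fallback=(80, 24))
        return "resize", (size.columns, size.lines)
    if data_fd in readable:
        return "data", None
    return "timeout", None
```

The resize notification arrives even if no byte is ever written to the data descriptor, which is the property being asked for above.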

@be5invis
Author

@mintty
There are many absurd console API usages, like reading the screen. That's really a legacy that ... IBM PC has a video card while PDPs don't.
One idea is that ConPTY could have a "sync" mechanism to read back the screen from a console application if some console API applications want to do something strange. Otherwise the console host could simply convert console API calls into VT sequences and forward them to the terminal app.

@therealkenc

therealkenc commented Jul 20, 2019

There are many absurd console API usages, like reading the screen.

Great. Invent a new CSI sequence for that. Send the PCHAR_INFO lpBuffer back as I dunno a base64 gzip. The feature is not supported by Mintty. Yet. No biggie. Not even unusual. Any 6502/Z80 assembler programmer working in 1981 was free to add their own crazy vendor-specific sequence to their company's terminal if they were bored enough on a weekend too.

@mintty

mintty commented Jul 21, 2019

There are many absurd console API usages, like reading the screen.

Great. Invent a new CSI sequence for that.

I assume this concern is more about the other direction of such adaptation, i.e. how would you serve a Windows console program that wants to use that "absurd" feature? You could run a second, hidden console in parallel, to maintain backwards compatibility.

@DHowett-MSFT
Contributor

run a second, hidden console in parallel

The people who hate how ConPTY is implemented today will absolutely hate how it's implemented if we do that. 😁

@mintty

mintty commented Jul 21, 2019

They wouldn't even notice in pure pass-through applications. It might be necessary to solve an otherwise unresolvable dilemma.

@therealkenc

therealkenc commented Jul 21, 2019

run a second, hidden console in parallel

This solution makes no sense because there is no window in sshd.exe nor someoldprogram.exe. Both those programs are text only applications that wouldn't know a Consolas Font from a hole in the ground.

To the point of #2035, neither sshd.exe nor someoldprogram.exe are terminal emulators. Gnome Terminal is a terminal emulator. And gnome-terminal is the only thing that can give the correct answer as to the contents of its screen buffer; i.e. what ReadConsoleOutput() should return. Similarly, the only thing that knows the contents of (VS Code) xterm.js's screen buffer is xterm.js. ConPTY doesn't have a clue what xterm.js's internal screen buffer contains. At best it can only guess by scraping the data passing by.

I assume this concern is more about the other direction of such adaptation,

Yes. And as a practical matter I wouldn't expect the ESC[?GIVEMEBUFFERASIBM feature being added to vteterminal anytime soon. Unless someone can point out the killer app that calls ReadConsoleOutput().

[a hidden window] might be necessary to solve an otherwise unresolvable dilemma.

I think we are closer than that. If conhost wants to read-only scrape the data going by for the sole purpose of keeping a wild-ass-guess as to what the actual terminal emulator's buffer looks like, sure, I can live with that. ReadConsoleOutput() is a red herring. If conhost (call it a urxvtd analogy) wants to maintain a shadow buffer, it can knock itself out.

...So long as it is quiet about it. There is no reason for a pass-through "mode", which is how this issue was framed. There is never a reason for ConPTY to inject a VT sequence into a WriteFile() on a PTY handle -- ever. Daniel over on the VS Code team can't possibly care; because he has his own terminal emulator. PTYs don't care about VT100 sequences. Never heard of them.

[Then we need SIGWINCH (or equivalent thereof). Which isn't related to VT100 sequences or screen buffers either.]

@DHowett-MSFT
Contributor

DHowett-MSFT commented Jul 21, 2019

@therealkenc,

Like it or not, conhost is the API server that, regardless of whether it presents a window, makes all existing Win32 console applications work. It must continue to make those applications work, because organizations really ~~hate~~ love when Microsoft rolls through with a new standard and tells everybody to drop what they're doing and throw out thousands of lines of code.

I think you're looking to turn this request, and this project, into something it's not. You may be attempting to turn CreatePseudoConsole into something it's not, too. Things work the way they work because we need to maintain compatibility with the thirty years of applications written since the VGA text mode buffer became the "official" design inspiration for how consoles should work in DOS and Windows.

ConPTY exists to--narrowly-stated--allow an application that understands a number of sequences as specified by an xterm-256color terminfo to host a windows console application; to wit: an application that would otherwise run in conhost should be able to run in a "terminal emulator" of sufficient compatibility. It's not intended to support a TEK4010 application (those are not windows console subsystem applications), and it's not likely to want to support a TEK4010 terminal emulator that is expecting to receive a bytestream from a TEK4010 application. That guy will probably end up doing what everybody ELSE who doesn't want to write a windows console subsystem application does: use pipes, because they don't have the same compatibility requirements (née guarantees) as the windows pseudoconsole infrastructure.

Through that lens, a "passthrough" mode is required. A console application by default, and this cannot be changed for compatibility reasons, starts up in a mode where it just has full access to all of the stupid Win32 console APIs that no terminal emulator developer wants to countenance. Nobody should be able to read back the contents of a terminal buffer, local or remote, that they wrote to. Nobody should be able to write into the offscreen section of the buffer, because there's no guarantee anywhere else that it actually exists. But, they do. Developers use this. Applications expect this, because they were written as windows console subsystem applications. A passthrough mode--mode!--is the only way we can offer an application a way to say "I promise I won't use the old ways" while still being a windows console subsystem application. That's the first step we can make towards ConPTY being the dumb pipe you want it to be.

If you'd like to debate whether the Windows Console was the right choice, or was well-designed, I'm happy to have you do it--but not here.


Two asides.

  1. If you're looking for a better SIGWINCH, follow issue #281, "WINDOW_BUFFER_SIZE_EVENT generated during window scrolling".
  2. On machines where isatty(3) is a libc-provided fixture, you also have openpty(3). When somebody spawns a child application with a stdin/stdout hooked up to file descriptors they get back from openpty(3), isatty(3) suddenly returns 1. That application will, more often than not, decide that what that means is that it can send back VT100 escape sequences. That's what it means to most applications to be "on a tty." Sure, it's a terrible abstraction and a poor design and applications should be smarter than this, but they're not. The Windows Pseudoconsole fits right in, here, with the understanding that "I've allocated a PTY, which means I want VT".
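The `openpty(3)` / `isatty(3)` behavior described in the second aside is easy to observe. The sketch below uses Python's `os.openpty` and `os.isatty` wrappers on a Unix-like system; `choose_output` is a hypothetical stand-in for the "am I on a tty? then send VT" decision that many applications make.

```python
import os

# A plain pipe is not a terminal, so isatty() reports False for it.
pipe_r, pipe_w = os.pipe()
pipe_is_tty = os.isatty(pipe_w)

# openpty() returns a (master, slave) pair; the slave end looks like a terminal.
master_fd, slave_fd = os.openpty()
pty_is_tty = os.isatty(slave_fd)


def choose_output(fd: int, text: str) -> bytes:
    """Typical application logic: emit VT color only when the stream is a tty."""
    if os.isatty(fd):
        return f"\x1b[32m{text}\x1b[0m".encode()
    return text.encode()
```

A child process handed `slave_fd` as its stdout would take the first branch, which is the "I've allocated a PTY, which means I want VT" contract in practice.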

@therealkenc

I'm not intending to criticize the design or implementation -- at all. That was not the intent. I am trying (more slowly than intended) to find a solution to open-for-a-year issue WSL#3279. Of which Biswa96 has a very good start.

@zadjii-msft
Member

woah this thread got pretty out of hand over the weekend.

The stars aligned Friday, and I actually got a chance to play around with implementing a passthrough mode for conpty. I'm pretty happy with how it works so far, so I think it needs a spec and some polish, and maybe we can ship it one day. Here's the approach I've been taking:

  • I introduced a new ENABLE_PASSTHROUGH_MODE to the SetConsoleMode flags.
  • When a commandline application enables passthrough mode, and the console is currently attached to a terminal (it's in conpty mode), any text they write to the console is written straight to the terminal as well, with no munging by conhost.
  • In passthrough mode, conpty stops "rendering" any changes to its buffer.
  • If a commandline application knows it's not going to be doing anything with the console API, it can safely set passthrough mode to be able to talk directly to the terminal.
  • [NOT DONE YET] If a commandline app in passthrough mode tries to call any API's, we'll either:
    • Convert the effects of that API call to VT, and pass that through to the terminal. We'll do this for API's where this is reasonable - SetConsoleTextAttribute is a good example.
    • For some APIs where the terminal doesn't really matter, we'll just keep doing what we're currently doing (case in point GetConsoleProcessList).
    • For APIs that don't make sense for conhost to be able to respond in passthrough mode, or we can't create a VT sequence to perform the requested operation, we'll return E_UNSUPPORTED_API (or a real error) indicating that API isn't supported in passthrough mode. Case in point: ReadConsoleOutput*, ScrollConsoleScreenBuffer, SetConsoleDisplayMode, etc.
      • Since the client app was updated to add support for passthrough mode, it should be able to be updated to handle this error case as well.
  • conhost also processes the strings that a commandline app emitted. This is to try and keep conpty's state in sync. This is important for the following.
  • When a client app exits passthrough mode, there's a chance that the terminal is in a torn state from the conpty. Case in point, if cmd.exe launched wsl.exe, and wsl enabled passthrough, did some stuff, then wsl exited, and cmd restored the console back to non-passthrough mode (as it needs to use the API). When passthrough mode is exited, conpty will redraw its screen, to re-sync the terminal to what conpty believes the buffer looks like.
    • For things like just launching wsl.exe or ssh.exe directly, or for mintty, this won't be as important, as the client will always be in passthrough mode, and the conpty will never "exit" passthrough.
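The three-way handling of API calls in the [NOT DONE YET] item can be pictured as a lookup table. This is a sketch only: `Policy`, `API_POLICY`, `dispatch`, and `UnsupportedInPassthrough` are hypothetical names, the classification covers just the APIs named in this thread, and rejecting anything unlisted is an assumption rather than something stated above.

```python
from enum import Enum


class Policy(Enum):
    TRANSLATE_TO_VT = "translate"  # emit an equivalent VT sequence to the terminal
    HANDLE_LOCALLY = "local"       # terminal is irrelevant; answer as usual
    REJECT = "reject"              # no sensible VT equivalent; fail the call


# Classification based on the examples named in this thread only.
API_POLICY = {
    "SetConsoleTextAttribute": Policy.TRANSLATE_TO_VT,
    "GetConsoleProcessList": Policy.HANDLE_LOCALLY,
    "GetConsoleMode": Policy.HANDLE_LOCALLY,
    "SetConsoleMode": Policy.HANDLE_LOCALLY,
    "ReadConsoleOutput": Policy.REJECT,
    "ScrollConsoleScreenBuffer": Policy.REJECT,
    "SetConsoleDisplayMode": Policy.REJECT,
}


class UnsupportedInPassthrough(Exception):
    """Stand-in for the E_UNSUPPORTED_API style error mentioned above."""


def dispatch(api_name: str, passthrough: bool) -> Policy:
    if not passthrough:
        return Policy.HANDLE_LOCALLY  # outside passthrough every API works as before
    # Assumption: anything not classified is rejected while in passthrough mode.
    policy = API_POLICY.get(api_name, Policy.REJECT)
    if policy is Policy.REJECT:
        raise UnsupportedInPassthrough(api_name)
    return policy
```

Because the client opted in to passthrough itself, it is in a position to catch the rejection and fall back, which is the argument made in the nested bullet above.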

This provides a way where we can be sure that apps that weren't updated for running in a pty will still have access to the entire console API, but apps that want to live in the new world can say "I promise I know what I'm doing", and run even smoother in conpty. I believe there's going to be an incredibly small intersection of apps that want to use VT and also call things like ReadConsoleOutput. Most apps are either going to be console-like, using the API heavily, and might not be as likely to be updated for such a mode. However, for *nix-like apps that aren't going to be using the API so much and primarily speak VT, this is an excellent option. This flag creates a clear distinction between the two when running in a pty.

@mintty

mintty commented Jul 25, 2023

Windows console API has a lot more functionality

You mean like reading back the screen contents? An exotic feature rarely useful and certainly not relevant in the WSL domain.

less convinced I am that it's worth the effort

That's why I had tried to drag this discussion over to a WSL issue. For the Windows Terminal application, it may be dispensable to provide terminal transparency as it's a terminal itself. For WSL however, there is serious need for transparent terminal access, both local and remote, so passthrough mode without any Windows console legacy burden is essential for the WSL launcher.

@christianparpart

christianparpart commented Jul 25, 2023

The Windows console API has a lot more functionality than your average VT terminal, and even with the most advanced VT capabilities there will likely be things we just can't reproduce

@j4james hey, I am curious to know what Windows console API offers that doesn't exist as VT sequence or extension just yet. OTOH, I think if one can count the number of features that are available on the conhost side but not on the VT side, it might make sense to introduce VT sequence extensions for those few and still put conhost on top of VT. I do not want to mandate here anything, I am just curious to know what conhost is more advanced in. :)

On the "your average VT terminal" argument, I'd love to bring some of the main features of Contour to Windows, which I can't until either ConPTY supports them, or a passthrough mode is available. Currently my Windows version of Contour will always be inferior to the versions on other platforms, sadly. Not sure how to solve this without ConPTY passthrough in the future :)

@jerch

jerch commented Jul 25, 2023

@christianparpart I think the buffer read caps are on the annoying side of things - as far as I remember the console API allows complete buffer access (in the sense that the console app "owns" the console and thus the buffer state). That's almost impossible with non rect-based VT mechanics, or at least will probably be limited to the active cursor area only on all TEs. I still think that most buffer access primitives could be mimicked in VT.
The hard unsolvable problem might be the exclusiveness that the console API kinda provides to apps with a quite strict process IO coupling/isolation - a thing that many people also wish for terminals in linux/macos, but it is just not possible there with the current TTY/PTY abstraction (though it's not a matter of VT mechanics but of the POSIX terminal API).

@j4james
Collaborator

j4james commented Jul 25, 2023

@j4james hey, I am curious to know what Windows console API offers that doesn't exist as VT sequence or extension just yet.

You can find a list of the console APIs here:
https://learn.microsoft.com/en-us/windows/console/console-functions

If you click through each of them, you should see a "Tip" section which explains which of them do or don't have a VT equivalent. If you think you can convert all of them into VT sequences, please do feel free to contribute PRs. A lot of the framework is already in place to do this.

it might make sense to introduce VT sequence extensions for those few

We aren't even at the point where most conpty terminals are supporting the standard VT sequences we would need. The chances of everyone agreeing to a bunch of Windows-specific extensions on top of that doesn't seem very likely.

I think what some of you really want is just a pass through mode for WSL. If that's the case, you can presumably build your terminal as a Linux GUI app running on WSL. But ConPTY is specifically for Windows console application support. If you aren't interested in that, this isn't the issue for you.

@palves

palves commented Jul 25, 2023

There is a set of Windows console applications that want the feature (in my case, GDB (GNU Debugger) testing), which don't need anything from the console API that the supported subset of standard VT sequences doesn't already provide. IMO, there could be a console API that such applications could call to enable some restrictive/subset of the API, only. Call it "passthrough mode", or "no intermediate buffer mode" or some such, and when that mode is active, the problematic console APIs (like complete buffer access) would just fail with an error.

@zadjii-msft
Member

FWIW that's my preferred approach to solving this. A console mode that a client app can enable to say "I solemnly swear I am up to good". Something like WSL would opt-in, because it knows it's only ever going to do VT.

That doesn't work as well for mixed applications that want to do both. But that would let legacy console apps rely on conpty's regeneration for their own needs, and modern apps just use VT and all the new features that come with it.

@j4james
Collaborator

j4james commented Jul 25, 2023

If that's the way we want to go, that's fine, but it means the Windows shells don't gain any benefit from it. For example, you wouldn't be able to TYPE a sixel file, or ECHO a Contour-specific escape sequence and expect it to work. Personally I'd consider this a waste of time, but I can accept that's all some people might care about.

@jerch

jerch commented Jul 25, 2023

To avoid rewriting the console API into wonky VT counterparts, wouldn't the following work:

  • make stream-IO/passthrough VT-mode default
  • additionally make legacy console API fully abstract on both sides (process & TE side)
  • a TE can decide whether to implement the legacy console primitives, if it does - processes can call into console API as before, if not - processes may not call into legacy console (may need special process flags?)
  • TE operates as the single source of truth for console buffer primitives - it's the TE's responsibility to get data from its internal terminal buffer to feed the TE side of the console API
  • provide a standard TE on OS side fully implementing legacy console primitives, so things like AttachConsole still works
  • last step - retire ConPty 😺

Ofc there are details to overcome, like the question whether the console primitives can be made blocking or need true memory sharing & possible security implications.

@csdvrx

csdvrx commented Aug 1, 2023

On a sidenote - recently had a similar image forth and back encoding/decoding need for state serialization within xtermjs, and did not vote for sixel, but QOI (reasonably fast in wasm with good enough compression - much better than any PNG handling offered by the browser engines).

@jerch , am I right to say that we had a few standards already existing (sixel, kitty, ...) and now we've got one more?

I think it's just sad that the desire for technical perfection comes before practical concerns for something that would work just about everywhere if it leveraged the existing ecosystem.

So like @j4james, I think not being able to reach some baseline level of functionality is a waste of time:

If that's the way we want to go, that's fine, but it means the Windows shells don't gain any benefit from it. For example, you wouldn't be able to TYPE a sixel file, or ECHO a Contour-specific escape sequence and expect it to work. Personally I'd consider this a waste of time

Actually, I'd consider that not just a waste of time, but a sad waste of time.

@jerch

jerch commented Aug 1, 2023

am I right to say that we had a few standards already existing (sixel, kitty, ...) and now we've got one more?

Nope, that serialization format is for TE internal usage only, not intended for outside with an explicit sequence. QOI yields the best compression/performance ratio covering RGBA in 32bit. Since xtermjs is browser based, we cannot just store raw bytes somewhere on the disk for session restore, as other desktop TEs would do here.

@mominshaikhdevs

what is the latest update on this?

@zadjii-msft
Member

Nothing to share at this time. We'll make sure to update this thread when there is. In the meantime, might I recommend the Subscribe button?
That way you'll be notified of any updates to this thread, without needlessly pinging everyone on this thread ☺️

github-merge-queue bot pushed a commit that referenced this issue Jul 1, 2024
## Summary of the Pull Request

This PR introduces basic support for the Sixel graphics protocol in
conhost, limited to the GDI renderer.

## References and Relevant Issues

This is a first step towards supporting Sixel graphics in Windows
Terminal (#448), but that will first require us to have some form of
ConPTY passthrough (#1173).

## Detailed Description of the Pull Request / Additional comments

There are three main parts to the architecture:

* The `SixelParser` class takes care of parsing the incoming Sixel `DCS`
  sequence.
* The resulting image content is stored in the text buffer in a series
  of `ImageSlice` objects, which represent per-row image content.
* The renderer then takes care of painting those image slices for each
  affected row.

The parser is designed to support multiple conformance levels so we can
one day provide strict compatibility with the original DEC hardware. But
for now the default behavior is intended to work with more modern Sixel
applications. This is essentially the equivalent of a VT340 with 256
colors, so it should still work reasonably well as a VT340 emulator too.

## Validation Steps Performed

Thanks to the work of @hackerb9, who has done extensive testing on a
real VT340, we now have a fairly good understanding of how the original
Sixel hardware terminals worked, and I've tried to make sure that our
implementation matches that behavior as closely as possible.

I've also done some testing with modern Sixel libraries like notcurses
and jexer, but those typically rely on the terminal implementing certain
proprietary Xterm query sequences which I haven't included in this PR.

---------

Co-authored-by: Dustin L. Howett <[email protected]>
@microsoft-github-policy-service microsoft-github-policy-service bot added the In-PR This issue has a related PR label Aug 1, 2024
@lhecker lhecker closed this as completed in 450eec4 Aug 1, 2024
@microsoft-github-policy-service microsoft-github-policy-service bot added the Needs-Tag-Fix Doesn't match tag requirements label Aug 1, 2024