Voice Command #5434

Closed
MariaLiama opened this issue Apr 21, 2020 · 15 comments
Labels
  • Area-Extensibility: A feature that would ideally be fulfilled by us having an extension model.
  • Area-Input: Related to input processing (key presses, mouse, etc.)
  • Issue-Feature: Complex enough to require an in depth planning process and actual budgeted, scheduled work.
  • Needs-Tag-Fix: Doesn't match tag requirements
  • Product-Terminal: The new Windows Terminal.

@MariaLiama

Description of the new feature/enhancement

Voice Command to control the terminal

Proposed technical implementation details (optional)

@MariaLiama MariaLiama added the Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. label Apr 21, 2020
@ghost ghost added Needs-Tag-Fix Doesn't match tag requirements Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Apr 21, 2020
@JushBJJ

JushBJJ commented Apr 21, 2020

If microsoft approves this, we should probably brainstorm ideas on how this would work.

@zadjii-msft
Member

I honestly don't see how this would be effectively useful; however, I'll acknowledge that this is a valid feature request.

There's definitely a 0% chance that this gets implemented by our team outside of a hackathon. I will throw this into our mega-thread for extension ideas though, because I could definitely imagine some other third-party developer trying their hand at this feature, and the Terminal should be able to support them.

@zadjii-msft zadjii-msft added Area-Extensibility A feature that would ideally be fulfilled by us having an extension model. Area-Input Related to input processing (key presses, mouse, etc.) labels Apr 21, 2020
@zadjii-msft zadjii-msft added this to the Terminal Backlog milestone Apr 21, 2020
@zadjii-msft zadjii-msft added the Product-Terminal The new Windows Terminal. label Apr 21, 2020
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Apr 21, 2020
@WSLUser
Contributor

WSLUser commented Apr 21, 2020

Technically this could be useful for those with disabilities (other than those who can't speak or speak well). Next thing you know, the Console/Terminal supports virtual keyboards with key presses being entered via blinks of an eye.

@carlos-zamora
Member

Implementation-wise, I guess you would have to do the following...

  • have voice commands to invoke keybinding actions
  • use UIA to navigate/read the contents of a terminal buffer
  • (the hard part) Input:
    • use speech-to-text to "type" commands into a terminal
    • bonus points if you can figure out mouse interactivity haha
    • figure out some way to send key-chords directly to the terminal (i.e.: ctrl-r)

Fun thought experiment. It would definitely be a fun extension to implement too.
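
As a rough illustration of the first and last bullets above, here is a minimal Python sketch of how an already-transcribed utterance might be routed to a keybinding action, a key chord, or plain typed text. Everything in it is an assumption made for illustration: the phrase table, the `interpret` helper, and the `VoiceResult` type are hypothetical and not part of Windows Terminal, and the action names are only examples. Speech recognition itself and actually delivering the result to the terminal are out of scope here.

```python
from dataclasses import dataclass

# Hypothetical mapping from spoken phrases to action names.
# The phrases and this table are illustrative assumptions, not a Terminal API.
ACTION_PHRASES = {
    "new tab": "newTab",
    "close pane": "closePane",
    "split pane": "splitPane",
}

# Spoken modifier words mapped to the short names used when building a chord.
MODIFIERS = {"control": "ctrl", "ctrl": "ctrl", "alt": "alt", "shift": "shift"}


@dataclass(frozen=True)
class VoiceResult:
    kind: str   # "action", "chord", or "text"
    value: str


def interpret(utterance: str) -> VoiceResult:
    """Classify a transcribed utterance as an action, a key chord, or text."""
    cleaned = " ".join(utterance.split())   # collapse runs of whitespace
    phrase = cleaned.lower()

    # 1. Exact phrase match: invoke a keybinding action.
    if phrase in ACTION_PHRASES:
        return VoiceResult("action", ACTION_PHRASES[phrase])

    # 2. "press <modifiers...> <key>": build a chord such as "ctrl+r".
    if phrase.startswith("press "):
        words = phrase[len("press "):].split()
        mods, key = words[:-1], words[-1]
        if all(m in MODIFIERS for m in mods):
            return VoiceResult("chord", "+".join([MODIFIERS[m] for m in mods] + [key]))

    # 3. Anything else is treated as text to "type" into the terminal.
    return VoiceResult("text", cleaned)


if __name__ == "__main__":
    for spoken in ("New Tab", "press control r", "git status"):
        print(spoken, "->", interpret(spoken))
```

The ordering (actions first, then chords, then free text) is one possible design choice; a real extension would also need some way to type a literal phrase like "new tab" as text.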

@DHowett-MSFT DHowett-MSFT removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Apr 25, 2020
@DHowett-MSFT
Contributor

Yeah, this is an interesting extension idea. Thanks! Triaged, backlog, requires Area-Extensibility.

@rjperrella

Microsoft already has voice recognition technology built into Windows (since Vista if memory serves). It already knows how to do the "hard parts" that @carlos-zamora mentioned in his comment. Is there not an API you can interface with to make voice recognition work better? At the moment, for example, it seems the voice recognition software is unable to "see" what was entered in, so it can't make selections.

@speechchemistry

I currently use typing mode with NATO phonetics. For example, load Windows Speech Recognition (WSR), say "Start typing", then say "Lima Sierra Enter", and you will see what is in the current directory. I realise that sounds a bit tedious, but in the long run it's faster than the normal WSR dictation mode because of the improved accuracy. Microsoft has hardly touched WSR since Windows 7 though, so it has not kept up with the accuracy of Dragon. Unfortunately Dragon does not fully work on Windows Terminal. How come? PuTTY works OK with Dragon v15 "spell mode", and it's unlikely that Nuance have optimised it for such a niche app. So is PuTTY using a more standard form of text input than Windows Terminal? It would be so useful for me with my disability if Windows Terminal was fully compatible with Dragon Speech Recognition.
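
To make the NATO-phonetics idea concrete, here is a small Python sketch of the translation step: turning a spoken sequence such as "Lima Sierra Enter" into the characters that would be typed. It is only an assumption about how such a mapping could look, not how WSR or Dragon work internally; the `spell` function and the `SPECIAL` table of extra words are hypothetical.

```python
# NATO phonetic alphabet: each code word stands for its first letter.
_CODE_WORDS = (
    "alfa bravo charlie delta echo foxtrot golf hotel india juliett kilo lima "
    "mike november oscar papa quebec romeo sierra tango uniform victor whiskey "
    "xray yankee zulu"
).split()
NATO = {word: word[0] for word in _CODE_WORDS}
# Common alternative spellings a recogniser might produce.
NATO.update({"alpha": "a", "juliet": "j", "x-ray": "x"})

# Hypothetical extra words for non-letter keys (an illustrative assumption).
SPECIAL = {"enter": "\n", "space": " ", "dash": "-", "dot": "."}


def spell(utterance: str) -> str:
    """Translate a sequence of spoken code words into the text to type."""
    typed = []
    for word in utterance.lower().split():
        if word in NATO:
            typed.append(NATO[word])
        elif word in SPECIAL:
            typed.append(SPECIAL[word])
        else:
            # Fail loudly rather than guess at an unrecognised word.
            raise ValueError(f"unrecognised word: {word!r}")
    return "".join(typed)


if __name__ == "__main__":
    print(repr(spell("Lima Sierra Enter")))  # 'ls\n'
```

Restricting the vocabulary to a few dozen distinct-sounding words is what gives this style of input its accuracy advantage over free dictation.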

@charliecalvert

Voice recognition does work in the terminal. It would be useful to see a list of valid commands. So far I know " press" and "go to beginning/end of line". What else is available?

@JushBJJ

JushBJJ commented Feb 6, 2021

How do you enable it? @charliecalvert

@charliecalvert

charliecalvert commented Feb 6, 2021

@JushBJJ Make sure you have the Speech Recognition dialog running and set to Listening. This ancient video has the basics.

@zadjii-msft zadjii-msft modified the milestones: Terminal Backlog, Backlog Jan 4, 2022
@carlos-zamora
Member

Voice Access seems relevant to this.

We should make sure this has a pleasant experience with Windows Terminal. It really just uses the UIA API, so nothing crazy should happen (knock on wood).

@zadjii-msft zadjii-msft added the Needs-Discussion Something that requires a team discussion before we can proceed label Mar 24, 2023
@zadjii-msft
Member

Note: Discussion:
Can we close this now that Voice Access is GA?

@rjperrella

Just tried it out on Terminal. Seems to work. I think you can close it.

@charliecalvert

charliecalvert commented Mar 25, 2023 via email

@zadjii-msft
Member

Oh I was using "GA" as "Generally available", i.e., no longer just in Insiders builds.

I'm gonna call this one - I'd bet that Voice Access is better than anything we'd be able to build ourselves. And if it isn't, then I'm sure that's feedback that they'd appreciate to help make VA even better.

Thanks all!

@zadjii-msft zadjii-msft removed the Needs-Discussion Something that requires a team discussion before we can proceed label Mar 27, 2023
@microsoft-github-policy-service microsoft-github-policy-service bot added the Needs-Tag-Fix Doesn't match tag requirements label Mar 27, 2023
10 participants