TagUI for Desktop Applications --> use visual automation #113

ArulKarthickKuppusamy · 2018-04-03T07:19:47Z

Hi Ken,

We are trying to implement TagUI to automate desktop applications. It would be great to have some documents or videos on automating desktop applications.

If possible, please share the details to [email protected]

Thanks,
Arul

kensoh · 2018-04-03T09:51:49Z

Hi Arul, thanks for asking this. TagUI relies on visual recognition to automate desktop applications.

More details here. Steps that support visual automation are click, hover, type, select, read, show, save, snap. For example, the automation flow below tries to send an email through Outlook by looking for the best matches of images of the respective UI elements.

Attaching the sample images for reference - samples.zip. They won't work on Windows Outlook (or macOS Outlook of different versions), as the UI icons look different across OSes and versions. But they are useful to look at to get a better idea of how visual recognition is used to control UI actions.

Helper function visible() can also be used to detect whether an image is visible.

click outlook-icon.png
click new-email.png
enter mail-body.png as Hi Whoever,\n\nAttached are the M1 numbers.\n\nRegards,\nKen
enter subject-field.png as M1 Lucky Numbers
enter to-field.png as [email protected]
click attach-button.png
click numbers-icon.png
click choose-button.png
click send-button.png

Below is another example of visual automation.

It uses OCR to grab text from a PDF (alternatively use Python libraries or other CLI tools), then types and prints a thank-you letter from MS Word. Attached images - word_samples.zip

click minimize.png  
dclick receipt.png
wait 2 seconds
read page.png to receipt_text
write receipt_text to receipt.txt
click close_pdf.png

dclick letter.png
wait 8 seconds
dclick address.png
type page.png as John Lim[enter]123 ABC Street[enter]Singapore 1234567[clear]
dclick name.png
type page.bmp as John
dclick amount.png
type page.png as $123.00
click file.png
click print.png
click confirm.png
click close_word.png
click dontsave.png

kensoh · 2018-04-12T08:07:39Z

Attaching the sample images for reference. They won't work on Windows Outlook (or macOS Outlook of different versions), as the UI icons look different across OSes and versions. But they are useful as samples to get a better idea of how visual recognition is used to control UI actions - samples.zip

kensoh · 2018-05-09T18:30:47Z

Re-posting a great comment from @adegard here since this issue is related to AHK and is still open.

Thank you @kensoh for your answer. I'm a beginner developer, so a virtual display seems a little bit complicated to me for now...

About AutoHotkey, I would like to make a little tool for editing TagUI scripts, if possible... Can I share a repository with you to work on it? It is a little menu to help remember the main commands in English (activated by Ctrl + left click). It is not complete, but I could share it with you and other users: https://github.com/adegard/tagui_scripts

I read about AI Singapore and other blogs on RPA. It seems that for beginners UiPath is a bit complicated and RPA Express is too big a program to install... So in my opinion TagUI is a very good alternative, simple and light. Please continue your project, even if it's so hard to maintain ;-)

kensoh · 2018-05-10T02:35:54Z

Hi @adegard wow, looks cool! I've just tried out your AHK TagUI commands helper. I think I have to create a new section on the TagUI home page to link to tools and stuff that the community creates 😄

PS - thanks very much for your feedback and encouragement! Yes TagUI will continue to be maintained to make RPA accessible to a broader user community than large organizations with deep pockets.

adegard · 2018-05-10T13:00:59Z

OK, thanks @kensoh
So I will complete the helper tool, for my personal use. I need it so that I don't have to remember all the commands in 6 months! So I will copy all your examples into it to make it more friendly.

I'm not using TagUI on a server but on my personal PC, so for me a headless script combined with cron (like the z-cron tool) is very important! But at the same time, I need some tool to "accelerate" the process of script production, because we have a lot of things to automate!
I will do my best to complete the AHK script!
Thanks again

kensoh · 2018-06-14T20:58:52Z

I'm only looking at 3 items in pipeline for TagUI before hitting maintenance mode.

integrating with desktop apps - TagUI for Desktop Applications --> use visual automation #113
assistant for writing scripts - TagUI Writer 1.01 : helper tool for coding (most useful commands for beginners) #188
for loop break and continue - For loop bugfix - explore enabling break and continue within for loops #216

I may reach out to other open-source RPA software maintainers to look at collaboration. I was thinking yesterday that if we can make a great open-source RPA tool and pass it on to @microsoft or another large tech company to maintain, it can put pressure on commercial RPA tools to raise the quality and ease-of-use of their free versions. That should lead to the largest impact on the RPA ecosystem.

kensoh · 2018-06-15T08:51:31Z

Besides the Outlook example above, users can use the vision step to send custom commands to Sikuli to do things like typing complex keystroke sequences. There also seems to be a trend towards using computer vision for UI automation of desktop apps. This is happening for commercial RPA software and also for startups such as http://www.intellibot.io.

Furthermore, I can't see a sensible way to harmonize the steps API for AutoHotkey or RoroScript with TagUI. They are all different, powerful tools, but forcing an integration for the sake of integrating is senseless. Users will be better off writing the automation flows directly in those tools and using the run step or api step to invoke those parts of the automation, if they still want to manage the whole flow from within TagUI.

Because of this, I have decided to abandon efforts to integrate natively with AHK or RoroScript, and instead use the effort to review possible ways to improve Sikuli's visual automation integration. Folks who want integration with desktop apps, just give a shout here with your use scenarios and let's see what can be done to run those automation workflows using TagUI-Sikuli's native integration.

CC @Aussiroth @lohvht - we can discuss next week some examples of use scenarios for desktop apps, and explore ways to make it easy + accurate to run visual automation on them.

kensoh · 2018-06-15T08:58:21Z

One idea is to make it super simple to create customized workflows for different desktop apps. For example, having a 'module' for Excel 20XX and a 'module' for Outlook 20XX, where each module is nothing more than folders with images of UI elements that we can either create ourselves or let users submit as PRs.

And perhaps couple that with some automation flows that can be called via tagui steps to do some action, eg tagui excel/create_new_sheet (that also means the tagui step needs to support sending parameters as part of the step). @adegard's screen-capture tool will come in very handy 😄


kensoh · 2018-06-18T10:06:03Z

The above commit adds visual automation for type page.png as text

supports [enter] and [clear] keywords just like the standard type step for webpages
trigger word is page.png and page.bmp, just like the steps snap, read, show, save

Prior to this, the type step could only type into a UI element on screen, eg type search_bar.png as 123
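To illustrate how keywords such as [enter] and [clear] can sit inside the text of a type step, here is a minimal Python sketch that splits the text into plain text and special keys. It only shows the idea and is not how TagUI or Sikuli actually implements it; the function name split_keystrokes is made up for this example.

```python
import re
from typing import List, Tuple

# Keywords mentioned in this thread; anything else is treated as plain text.
KEYWORDS = ("[enter]", "[clear]")


def split_keystrokes(text: str) -> List[Tuple[str, str]]:
    """Split typed text into ("text", ...) and ("key", ...) parts, in order."""
    # The capturing group makes re.split keep the keywords in the result.
    parts = re.split(r"(\[enter\]|\[clear\])", text)
    result = []
    for part in parts:
        if not part:
            continue  # skip empty strings produced at the edges
        if part in KEYWORDS:
            result.append(("key", part[1:-1]))  # drop the square brackets
        else:
            result.append(("text", part))
    return result


for kind, value in split_keystrokes("John Lim[enter]123 ABC Street[enter]"):
    print(kind, repr(value))
```

A real implementation would then send each "text" part as normal typing and each "key" part as the matching special key.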

kensoh · 2018-06-18T16:15:41Z

I have looked through Sikuli's docs and can't find anything else that should be implemented directly as part of TagUI steps. For those niche custom commands, the vision step can be used - more details of Sikuli commands here - http://doc.sikuli.org and here - http://sikulix-2014.readthedocs.io/en/latest

Closing the issue for now. The screen capture utility to facilitate capturing image snapshots can be done as part of #188. The modules idea above is worth exploring when the time is ripe (for community-contributed images of elements). Also copying @Aussiroth @lohvht for further inputs.

kensoh · 2018-07-04T06:32:32Z

User question - just to clarify, what is page.png? And what do the highlighted lines of code mean?

click minimize.png
dclick receipt.png
wait 2 seconds
read page.png to receipt_text
write receipt_text to receipt.txt
click close_pdf.png

dclick letter.png
wait 8 seconds
dclick address.png
type page.png as John Lim[enter]123 ABC Street[enter]Singapore 1234567[clear]
dclick name.png
type page.bmp as John
dclick amount.png
type page.png as $123.00
click file.png
click print.png
click confirm.png
click close_word.png
click dontsave.png

My reply

For visual automation, TagUI looks out for .png or .bmp names instead of element identifiers referring to webpage UI (user-interface) elements.

read page to xxx normally means read the text contents of the webpage to variable xxx. read page.png to receipt_text uses visual recognition and OCR (optical character recognition) to read the text on the whole screen to the variable receipt_text. It's trying to capture the text from the PDF file to save into a text file.

More details of the visual automation here -
https://github.com/kelaberetiv/TagUI#visual-automation

write receipt_text to receipt.txt saves the variable to a text file receipt.txt

More details of all the TagUI steps here -
https://github.com/kelaberetiv/TagUI#steps-description
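The rule described above (a .png or .bmp name means visual automation, and page.png or page.bmp means the whole screen) can be sketched in a few lines of Python. This is a simplified illustration under those assumptions only, not TagUI's actual parsing code; classify_target is a hypothetical helper name.

```python
IMAGE_EXTENSIONS = (".png", ".bmp")
WHOLE_SCREEN_NAMES = ("page.png", "page.bmp")


def classify_target(target: str) -> str:
    """Return "screen", "image" or "web" for the target of a step."""
    name = target.strip().lower()
    if name in WHOLE_SCREEN_NAMES:
        return "screen"  # trigger word: act on the whole screen
    if name.endswith(IMAGE_EXTENSIONS):
        return "image"   # find the best match of this image on screen
    return "web"         # otherwise treat it as a web element identifier


for target in ["page.png", "receipt.png", "page", "search_bar.png"]:
    print(target, "->", classify_target(target))
```

So read page.png to receipt_text falls under "screen", dclick receipt.png under "image", and read page to xxx under "web".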

vijendra-impetus · 2018-07-05T10:23:14Z

@kensoh ,

I tried the below steps:

dclick receipt.png
wait 2 seconds
read page.png to receipt_text
write receipt_text to receipt.txt
click close_pdf.png

But after the message below it got stuck and nothing happened. I tried several times, clicking on different images in the folder, but it is not clicking on any image.

tagui D:\TagUI_Windows\word_samples\pdfread
[starting sikuli process]

START - automation started - Thu Jul 05 2018 15:17:15 GMT+0530 (India Standard Time)

click D:/TagUI_Windows/word_samples/confirm.png

kensoh · 2018-07-05T13:13:38Z

Hi @vijendra-impetus recently a user had a similar problem when using visual automation on Windows. It just hangs after running, even when Sikuli and Java have been installed.

This is the solution that worked for her, see here if it helps your situation - #229

If not, can you paste the contents of the tagui_windows.log file in src\tagui\tagui.sikuli here to see what the error messages in the backend are?

vijendra-impetus · 2018-07-10T11:34:28Z

Hi @kensoh ,

As per the solution, the logs printed in the log files look like this:

+++ running this Java
java version "1.8.0_171"
Java(TM) SE Runtime Environment (build 1.8.0_171-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.171-b11, mixed mode)
+++ trying to run SikuliX
+++ using: -Xms64M -Xmx512M -Dfile.encoding=UTF-8 -Dsikuli.FromCommandLine -jar c:\tagui\src\tagui.sikuli\sikulix.jar -r tagui.sikuli
Jul 03, 2018 11:40:41 AM java.util.prefs.WindowsPreferences
WARNING: Could not open/create prefs root node Software\JavaSoft\Prefs at root 0x80000002. Windows RegCreateKeyEx(...) returned error code 5.
[tagui] START - listening for inputs

[tagui] FINISH - stopped listening

But when I check the log file you mentioned at location src\tagui\tagui.sikuli, the log file (tagui_windows.log) is completely empty. There is nothing in the log file.

kensoh · 2018-07-13T01:14:51Z

Hi @vijendra-impetus can you check inside the tagui\src\tagui.sikuli folder, is there a runsikulix file? That file should be there if installation is completed.

For installation, see these steps (now in midst of updating visual automation in main documentation to have these details in tutorial) - https://github.com/kelaberetiv/TagUI/blob/master/src/media/RPA%20Workshop.md#visual-automation

mathiasx88 · 2019-02-14T07:36:06Z

Hi Kenson,

Recently I have been trying out the TagUI web automation Chrome extension. After the script is generated, when I try to run it, it indicates that the web elements are not found. Can you please advise? Some of the elements respond, but some do not.

https://www.google.com/
click .gLFyf.gsfi
enter .gLFyf.gsfi as github[enter]
click .aajZCb input:nth-child(1)
click .bkWMgd:nth-child(1) .LC20lb

kensoh · 2019-02-15T11:03:03Z

Hi @mathiasx88 the recording is not foolproof, you can try using XPath by inspecting directly from your web browser, for example - https://github.com/kelaberetiv/TagUI#find-xpath-of-web-element

After you copy the XPath using the example in the link above, you can perform TagUI actions on the element using the familiar steps. Besides copying from the browser, it is a good investment to learn XPath and write your own XPath locators. XPath is very expressive and very useful for selecting web elements.
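As a small illustration of why a hand-written XPath locator can be more stable than recorded class names, here is a Python sketch using the standard library's xml.etree.ElementTree, which supports a limited subset of XPath. The markup below is made up for this example and is not the real Google page.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page markup (assumption for illustration only).
PAGE = """
<html><body>
  <form>
    <input class="gLFyf gsfi" name="q" type="text"/>
    <input class="btn" name="btnK" type="submit" value="Search"/>
  </form>
</body></html>
"""

root = ET.fromstring(PAGE)

# Select the search box by its name attribute instead of a generated class.
search_box = root.find(".//input[@name='q']")
print(search_box.attrib["type"])

# The same style of locator can list every input on the page.
all_inputs = root.findall(".//input")
print(len(all_inputs))
```

In a browser the equivalent locator would be written as //input[@name='q'], which keeps working even if the generated class names change.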

mathiasx88 · 2019-02-22T07:57:07Z

Hi, @kensoh

I am able to run the TagUI script directly from the command prompt now. But when I try to run it for Firefox, I encounter an error. Below is the screenshot of the error. Would you be able to advise? Thanks a lot.

kensoh · 2019-02-24T10:09:22Z

Hi @mathiasx88 yes, Firefox had an overhaul from v60 and SlimerJS is not compatible yet. More details here on using Firefox (for example using an older version or automating it visually) - #344 (comment)

mathiasx88 · 2019-05-28T04:40:48Z

Hi @kensoh

May I check: I tried to use the following command to clear a text field that comes with a default value of 65, but whenever I run the command, it does not clear. Can you please advise?

type /html/body/div/section[4]/div/div[2]/div/form[1]/div[13]/div[1]/input as [clear]8938392[enter]

kensoh · 2019-05-31T15:39:24Z

It might be that the XPath is wrong, or there may be other reasons, but it is hard to take a look without replication steps.

oai1228 · 2019-10-16T06:08:55Z

@kensoh
Hi I have some questions

I wrote code below:
dclick /Users/desktop nate.png
wait 3
snap page
snap logo
snap page as nate_sample.png
snap logo as nate_sample2.png
wait 3

In cmd,
START - automation started - Wed Oct 16 2019 15:05:45 GMT+0900 (Korean Standard Time)

dclick /Users/議곗슦??desktop nate.png
....

It does not work well.
Can you explain visual automation, and why that code does not work?

kensoh · 2019-10-16T08:51:33Z

Hi @oai1228 I think there cannot be a space in the file name - dclick /Users/desktop nate.png
Try using something simple like nate.png without a space to see if it works.

Visual automation requires Java SDK (64-bit), see here for details -
https://github.com/kelaberetiv/TagUI#visual-automation

Finally, check the log files in the tagui/src/tagui.sikuli folder to see what the error message is.
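For the file name issue above, a quick way to spot image targets containing a space before running a flow is a small check like the Python sketch below. It is a hypothetical helper, not part of TagUI, and it only looks at the simple steps click, dclick and hover, where the rest of the line is the target.

```python
from typing import List, Tuple

# Steps whose whole remainder of the line is the target (assumption for this sketch).
SIMPLE_STEPS = ("click", "dclick", "hover")
IMAGE_EXTENSIONS = (".png", ".bmp")


def find_spaced_image_targets(flow_text: str) -> List[Tuple[int, str]]:
    """Return (line number, target) for image targets that contain a space."""
    problems = []
    for number, line in enumerate(flow_text.splitlines(), start=1):
        step, _, target = line.strip().partition(" ")
        target = target.strip()
        is_image = target.lower().endswith(IMAGE_EXTENSIONS)
        if step in SIMPLE_STEPS and is_image and " " in target:
            problems.append((number, target))
    return problems


flow = "dclick /Users/desktop nate.png\nwait 3\nclick start.png"
for number, target in find_spaced_image_targets(flow):
    # Suggest a name without spaces, eg by using underscores.
    print(f"line {number}: {target!r} -> try {target.replace(' ', '_')!r}")
```

Renaming the image file itself (and the reference in the flow) to a name without spaces is then a manual step.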

oai1228 · 2019-10-22T08:29:58Z

Hi Ken, I have a problem again,

START - automation started - Tue Oct 22 2019 17:24:52 GMT+0900 (Korean Standard Time)
dclick c:/TagUI/tagui/src/samples/ever.png

and then cmd does not work.
I have already downloaded Java SDK (64-bit)

and I can't understand: where does the program click on the picture?

kensoh · 2019-10-22T11:21:05Z

Thanks @oai1228, looks like no other users have encountered this problem before, some next steps to try -

take an image of your Windows Start button and name it start.png
in your automation script, write the one line click start.png
run the automation and check the log file in tagui\src\tagui.sikuli

yoga212121 · 2024-06-26T08:41:16Z

Hey @kensoh, is it possible for my flow to enter login credentials on some webpage while I am simultaneously writing a report on another site, or does it have to be left undisturbed during the flow? Is it possible for website actions such as click, type, etc. to run in the background while I am performing some other operation?

kensoh added query feature and removed query labels Apr 3, 2018

kensoh mentioned this issue Apr 6, 2018

What are some of the main limitations of TagUI? #116

Closed

kensoh mentioned this issue May 6, 2018

How to do TagUI visual automation using OCR #152

Closed

kensoh mentioned this issue May 22, 2018

Can TagUI automate Windows, mainframe applications as well along with web applications? #171

Closed

kensoh self-assigned this May 28, 2018

kensoh removed their assignment Jun 13, 2018

kensoh mentioned this issue Jun 14, 2018

Some ideas for TagUI development #83

Closed

kensoh changed the title ~~How to Implement TagUI for Desktop Applications~~ Using TagUI for Desktop Applications - review possible Sikuli visual automation enhancements Jun 15, 2018

kensoh changed the title ~~Using TagUI for Desktop Applications - review possible Sikuli visual automation enhancements~~ TagUI for Desktop Applications - review possible Sikuli visual automation enhancements Jun 15, 2018

kensoh changed the title ~~TagUI for Desktop Applications - review possible Sikuli visual automation enhancements~~ TagUI for Desktop Applications - explore Sikuli visual automation enhancements Jun 15, 2018

kensoh self-assigned this Jun 15, 2018

kensoh added a commit that referenced this issue Jun 18, 2018

#113 - visual automation for type page.png as text

5ce0390

- supports [enter] and [clear] keywords just like the standard type step for webpages - trigger word is page.png and page.bmp, just like steps snap, read, show, save

kensoh closed this as completed Jun 18, 2018

kensoh mentioned this issue Jun 20, 2018

Clearing input fields and sending modifier keys such as Ctrl key #155

Closed

kensoh changed the title ~~TagUI for Desktop Applications - explore Sikuli visual automation enhancements~~ TagUI for Desktop Applications --> use Sikuli visual automation Jun 21, 2018

kensoh changed the title ~~TagUI for Desktop Applications --> use Sikuli visual automation~~ TagUI for Desktop Applications --> use visual automation Jun 21, 2018

Aussiroth mentioned this issue Jun 27, 2018

Getting live visual automation to work #225

Closed

kensoh mentioned this issue Jul 4, 2018

Where can I find sample TagUI scripts showing visual automation examples #230

Closed

kensoh mentioned this issue Jul 4, 2018

May I know if it is possible to use TagUI to automate sending email reminders? #232

Closed

sangasangasanga mentioned this issue Jul 25, 2018

Can tagui scrape from pdf file in the desktop application? #250

Closed

kensoh mentioned this issue Sep 4, 2018

Remote Desktop or Citrix based automation support in TagUI #272

Closed

kensoh mentioned this issue Sep 29, 2018

Mail read support in Tagui #281

Closed

kensoh mentioned this issue Feb 15, 2019

Chrome recorder script not working for some elements #339

Closed

kensoh mentioned this issue Oct 24, 2019

How can we send emails from TagUI as part of an automation flow? #134

Closed

idiogolima mentioned this issue Nov 6, 2019

Wait 60 seconds hangs script - run with chrome or headless option #615

Closed

kensoh mentioned this issue Jan 23, 2020

[TagUI] OCR for Mac - here are a few samples using visual automation #698

Closed

theMoe mentioned this issue Jul 15, 2020

Visual automation doesn't work - to check more with user on why image not found #845

Closed

chuenlim mentioned this issue Feb 2, 2021

Reading text from desktop applications - use OCR or copy text to clipboard #926

Closed
