TagUI for Desktop Applications --> use visual automation #113
Comments
Hi Arul, thanks for asking this. TagUI relies on visual recognition to automate desktop applications. More details here. Steps that support visual automation are click, hover, type, select, read, show, save, snap. For example, the automation flow below tries to send an email through Outlook by looking for the best matches of images of the respective UI elements. Attaching the sample images for reference - samples.zip. They won't work on Windows Outlook (or on different versions of macOS Outlook), as the UI icons look different across OSes and versions. But they are good to look at to get a better idea of how visual recognition is used to control UI actions. The helper function visible() can also be used to detect whether an image is visible.
Below is another example of visual automation: using OCR to grab text from a PDF (alternatively, use Python libraries or other CLI tools), followed by typing and printing a thank-you letter from MS Word. Attached images - word_samples.zip
Re-posting a great comment from @adegard here since this issue is related to AHK and is still open. Thank you @kensoh for your answer. I'm a beginner developer, so a virtual display seems a little bit complicated to me for now... About AutoHotkey, I would like to make a little tool for editing TagUI scripts, if that is possible... Can I share a repository with you to work on it? It is a little menu to help remember the main commands in English (activated with Ctrl + left click). It is not complete, but I could share it with you and other users: https://github.com/adegard/tagui_scripts I read about AI Singapore and other blogs on RPA. It seems that for beginners UiPath is a bit complicated and RPA Express is too big a program to install... So in my opinion TagUI is a very good alternative, simple and light. Please continue your project, even if it's so hard to maintain ;-)
Hi @adegard wow, looks cool! I've just tried out your AHK TagUI commands helper. I think I have to create a new section on the TagUI home page to link to tools and stuff that the community creates 😄 PS - thanks very much for your feedback and encouragement! Yes, TagUI will continue to be maintained to make RPA accessible to a broader user community than large organizations with deep pockets.
OK, thanks @kensoh. I'm not using TagUI on a server but on my personal PC, so for me a headless script combined with cron (like the z-cron tool) is very important! But at the same time, I need some tool to "accelerate" the process of script production, because we have a lot of things to automate...!
I'm only looking at 3 items in pipeline for TagUI before hitting maintenance mode.
I may reach out to other open-source RPA software maintainers to look at collaboration. I was thinking yesterday that if we can make a great open-source RPA tool and pass it on to @microsoft or another large tech company to maintain, it can put pressure on commercial RPA tools to raise the quality and ease of use of their free versions. That should lead to the largest impact on the RPA ecosystem.
Besides the Outlook example above, users can use the vision step to send custom commands to Sikuli to do things like typing complex keystroke sequences. There also seems to be a trend towards using computer vision for UI automation of desktop apps. This is happening in commercial RPA software and also in startups such as http://www.intellibot.io. Furthermore, I can't see a sensible way to harmonize the steps API for AutoHotkey or RoroScript with TagUI. They are all different, powerful tools, but forcing an integration for the sake of integrating is senseless. Users will be better off writing the automation flows directly in those tools and using the run step or api step to invoke those parts of the automation, if they still want to manage the whole flow from within TagUI. Because of this, I have decided to abandon efforts to integrate natively with AHK or RoroScript and instead use the effort to review possible ways to improve Sikuli's visual automation integration. Folks who want integration with desktop apps, just give a shout here with your use scenarios and let's see what can be done to run those automation workflows using TagUI-Sikuli's native integration. CC @Aussiroth @lohvht - we can discuss some examples of use scenarios for desktop apps next week, and explore ways to make it easy + accurate to run visual automation on them.
One idea is to make it super simple to create customized workflows for different desktop apps. For example, having a 'module' for Excel 20XX and a 'module' for Outlook 20XX, where each module is nothing more than folders with images of UI elements that we can either create ourselves or let users submit as PRs. Perhaps these could be coupled with some automation flows that can be called via the tagui step to do some action, e.g. tagui excel/create_new_sheet (which also means the tagui step needs to support sending parameters as part of the step). @adegard's screen-capture tool will come in very handy 😄
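To make the 'module' idea a little more concrete, here is a minimal Python sketch of how a folder of UI element images could be indexed by element name. It is only an illustration of the proposal, not anything TagUI implements: the function name index_module_images, the "excel" folder and the file names are all hypothetical.

```python
import tempfile
from pathlib import Path

# Image types are taken from the thread (.png and .bmp); everything else is hypothetical.
IMAGE_SUFFIXES = {".png", ".bmp"}

def index_module_images(module_dir):
    """Map element names (file stems) to image paths for one module folder."""
    images = {}
    for path in sorted(Path(module_dir).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            images[path.stem] = path
    return images

# Demo with a throwaway folder standing in for a hypothetical "excel" module.
with tempfile.TemporaryDirectory() as tmp:
    module = Path(tmp) / "excel"
    module.mkdir()
    for name in ("new_sheet.png", "save_button.bmp", "README.txt"):
        (module / name).touch()
    # Non-image files such as README.txt are ignored.
    print(sorted(index_module_images(module)))
```

With an index like this, a community-contributed module would just be a folder that anyone can extend by dropping in another screenshot.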
- supports [enter] and [clear] keywords just like the standard type step for webpages
- trigger word is page.png and page.bmp, just like steps snap, read, show, save
above commit adds visual automation for
Prior to this, the type step could only type into a UI element on screen, e.g. type search_bar.png as 123
I have looked through Sikuli's docs and can't find anything else that should be implemented directly as part of the TagUI steps. For niche custom commands, the vision step can be used - more details of Sikuli commands here - http://doc.sikuli.org and here - http://sikulix-2014.readthedocs.io/en/latest Closing the issue for now. The screen capture utility to facilitate capturing image snapshots can be done as part of #188. The modules idea above is worth exploring when the time is ripe (for community-contributed images of elements). Also copying @Aussiroth @lohvht for further inputs.
User question - just to clarify, what is page.png? And what do the highlighted codes mean?
My reply - For visual automation, TagUI looks out for .png or .bmp names instead of element identifiers referring to webpage UI (user-interface) elements. read page to xxx normally means read the text contents of the webpage into the variable xxx. read page.png to receipt_text uses visual recognition and OCR (optical character recognition) to read the text on the whole screen into the variable receipt_text. It's trying to capture the text from the PDF file to save into a text file. More details of the visual automation here - write receipt_text to receipt.txt saves the variable to a text file, receipt.txt. More details of all the TagUI steps here -
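The rule described above (an identifier ending in .png or .bmp is treated as an image to find on screen, anything else as a web element identifier) can be sketched in a few lines of Python. This is only an illustration of the rule, not TagUI's actual source code. The function name is_visual_target is hypothetical, and treating the suffix case-insensitively is an assumption.

```python
# Illustrative only: this mirrors the rule described in the thread,
# it is not how TagUI is actually implemented.
VISUAL_SUFFIXES = (".png", ".bmp")

def is_visual_target(identifier):
    """Return True if the identifier names an image file rather than a web element."""
    # Lower-casing the identifier is an assumption made for this sketch.
    return identifier.strip().lower().endswith(VISUAL_SUFFIXES)

for target in ("page", "page.png", "search_bar.png", "//input[@name='q']"):
    kind = "visual automation" if is_visual_target(target) else "web element"
    print(f"{target!r}: {kind}")
```

Under this rule, read page to xxx is handled as a web step, while read page.png to receipt_text is handled by visual recognition and OCR.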
@kensoh, I tried the steps below: dclick receipt.png But after the message below it got stuck and nothing happened. I tried several times, clicking on different images in the folder, but it does not click on any image.
START - automation started - Thu Jul 05 2018 15:17:15 GMT+0530 (India Standard Time)
click D:/TagUI_Windows/word_samples/confirm.png
Hi @vijendra-impetus, recently a user had a similar problem when using visual automation on Windows. It just hangs after running, even when Sikuli and Java have been installed. This is the solution that worked for her, see here if it helps your situation - #229 If not, can you paste the contents of the tagui_windows.log file in tagui\src\tagui.sikuli here, to see what the error messages in the backend are?
Hi @kensoh, after applying the solution, the output printed looks like this: +++ running this Java [tagui] FINISH - stopped listening But when I check the log file you mentioned (tagui_windows.log) in tagui\src\tagui.sikuli, it is completely empty. There is nothing in the log file.
Hi @vijendra-impetus, can you check inside the tagui\src\tagui.sikuli folder whether there is a runsikulix file? That file should be there if installation is completed. For installation, see these steps (now in the midst of updating visual automation in the main documentation to have these details in a tutorial) - https://github.com/kelaberetiv/TagUI/blob/master/src/media/RPA%20Workshop.md#visual-automation
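The two checks suggested above (is there a runsikulix file, and does tagui_windows.log contain anything) can be scripted. Below is a minimal Python sketch: the folder and file names come from this thread, while the function name check_sikuli_folder is hypothetical, and matching any file whose name starts with "runsikulix" is an assumption in case the file carries an extension on some platforms.

```python
from pathlib import Path

def check_sikuli_folder(folder):
    """Report whether the visual automation files mentioned in the thread are present."""
    folder = Path(folder)
    report = {
        "folder_exists": folder.is_dir(),
        "runsikulix_present": False,
        "log_present": False,
        "log_empty": None,  # stays None when there is no log file to inspect
    }
    if not report["folder_exists"]:
        return report
    # Assumption: accept "runsikulix" with or without a file extension.
    report["runsikulix_present"] = any(
        p.is_file() and p.name.startswith("runsikulix") for p in folder.iterdir()
    )
    log_file = folder / "tagui_windows.log"
    if log_file.is_file():
        report["log_present"] = True
        report["log_empty"] = log_file.stat().st_size == 0
    return report

# Adjust the path to wherever TagUI is unzipped on your machine.
print(check_sikuli_folder("tagui/src/tagui.sikuli"))
```

A missing runsikulix file points at an incomplete installation, while an empty log with the file present matches the situation reported above.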
Hi Kenson, recently I have been trying out the TagUI web automation Chrome extension. After the script is generated, when I try to run it, it indicates that some web elements are not found. Can you please advise? Some of the elements get a response, but some do not.
Hi @mathiasx88, the recording is not foolproof. You can try using XPath by inspecting the element directly in your web browser, for example - https://github.com/kelaberetiv/TagUI#find-xpath-of-web-element After you copy the XPath using the example in the link above, you can perform TagUI actions on the element using the familiar steps. Besides copying from the browser, it is a good investment to learn XPath and write your own XPath locators. XPath is very expressive and very useful for selecting web elements.
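To show why hand-written XPath locators tend to hold up better than copied positional paths, here is a small Python sketch using the standard library's xml.etree.ElementTree, which supports a limited subset of XPath. The page markup and the attribute names (country_code, mobile) are made up for illustration. In a browser or in TagUI the attribute-based locator would be written as //input[@name='country_code'].

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified page markup (well-formed so that ElementTree can parse it).
PAGE = """
<html>
  <body>
    <div>
      <form>
        <div><input name="country_code" value="65"/></div>
        <div><input name="mobile" value=""/></div>
      </form>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Positional path: breaks as soon as the layout gains or loses a wrapper element.
by_position = root.find("./body/div/form/div[1]/input")

# Attribute-based path: keeps working as long as the name attribute is stable.
by_attribute = root.find(".//input[@name='country_code']")

print(by_position is by_attribute)   # both locators resolve to the same element
print(by_attribute.get("value"))     # the field's default value
```

The positional form is what a copied XPath often looks like, whereas the attribute form says what the element is rather than where it sits.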
Hi @kensoh, I am able to run the TagUI script directly from the command prompt now. But when I try to run it for Firefox, I encounter an error. Below is a screenshot of the error. Would you be able to advise? Thanks a lot.
Hi @mathiasx88, yes, Firefox had an overhaul in v60 and SlimerJS is not compatible yet. More details here on using Firefox (e.g. using an older version or automating it visually) - #344 (comment)
Hi @kensoh, may I check: I tried to use the following command to clear a text field that comes with the default value 65, but whenever I run the command, the field is not cleared. Can you please advise? type /html/body/div/section[4]/div/div[2]/div/form[1]/div[13]/div[1]/input as [clear]8938392[enter]
It might be that the XPath is wrong, or there may be other reasons, but it is hard to investigate without replication steps.
@kensoh I wrote the code below: In cmd, dclick /Users/議곗슦??desktop nate.png does not work well
Hi @oai1228, I think there cannot be a space in the file name - Visual automation requires Java SDK (64-bit), see here for details - Finally, check the log files in the tagui/src/tagui.sikuli folder to see what the error message is.
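A quick pre-flight check on image paths can catch the problem mentioned above before a flow is run. The Python sketch below flags whitespace and a wrong file type. It also flags non-ASCII characters, which is purely an assumption prompted by the garbled path in the question and not a confirmed TagUI limitation. The function name image_path_problems is hypothetical.

```python
from pathlib import PurePath

# .png and .bmp are the image types mentioned in this thread.
ALLOWED_SUFFIXES = {".png", ".bmp"}

def image_path_problems(path):
    """Return a list of human-readable problems found in an image path."""
    problems = []
    # The thread only mentions the file name, this sketch checks the whole path.
    if any(ch.isspace() for ch in path):
        problems.append("path contains whitespace")
    if PurePath(path).suffix.lower() not in ALLOWED_SUFFIXES:
        problems.append("file is not .png or .bmp")
    # Assumption, not confirmed in the thread: non-ASCII paths may also cause trouble.
    if not path.isascii():
        problems.append("path contains non-ASCII characters (unconfirmed risk)")
    return problems

for candidate in ("D:/TagUI_Windows/word_samples/confirm.png",
                  "/Users/me/desktop nate.png"):
    print(candidate, "->", image_path_problems(candidate) or "looks fine")
```

Renaming the image to something like nate.png in a folder without spaces is the simplest way to rule this cause out.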
Thanks @oai1228, looks like no other users have encountered this problem before, some next steps to try -
Hey @kensoh, is it possible for my flow to enter login credentials on some webpage while I am writing a report on another site at the same time, or does the machine have to be left undisturbed during the flow? Is it possible for website actions such as click, type, etc. to run in the background while I am performing some other operation?
Hi Ken,
We are trying to implement TagUI for automating desktop applications. It would be great to have some documents/videos related to automating desktop applications.
If possible, please share the details to [email protected]
Thanks,
Arul