Ask HN: Deep learning enabled GUI automation tools?

RMPR · on April 13, 2020

Not really a tool, but Python is widely used for deep learning, so you can combine PyTorch, TensorFlow, [insert your DL framework here] with PyAutoGUI [1] to achieve exactly what you're asking. If you feel PyAutoGUI is too "manual", I built a kind of frontend for it [2].

[1]: https://github.com/asweigart/pyautogui

[2]: https://github.com/rmpr/atbswp

swayson · on April 13, 2020

I have been looking into pyautogui and wondering how I can hook up a custom backend for bounding box detection, which does not appear to be supported.

I guess wrapping pyautogui might be the way to go. Is my understanding correct?

atbswp looks very valuable, thanks for sharing.

RMPR · on April 13, 2020

> wondering how I can hook up a custom backend for bounding box detection, which does not appear to be supported.

You can take a screenshot with:

    import pyautogui
    screenshot = pyautogui.screenshot()  # returns a PIL Image of the screen

Your neural network can then give you the coordinates of what you want, and you can act on them with pyautogui afterwards. In many cases a neural network is even overkill; take a look at this (https://vimeo.com/352072921): the script takes a screenshot of the webpage, recognizes the currently highlighted word with pytesseract, and types it in with pyautogui. Simple.
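To make the screenshot, detect, act loop concrete, here is a minimal sketch rather than a definitive implementation. It assumes a `detect` callable that you supply yourself (hypothetical; it could wrap a PyTorch or TensorFlow model, or an OCR lookup) and that returns a `(left, top, width, height)` box in screen pixels, or `None` when nothing is found. The helper names `box_center` and `click_detected` are made up for illustration and are not part of pyautogui.

```python
from typing import Callable, Optional, Tuple

# Assumed box layout: (left, top, width, height) in screen pixels.
Box = Tuple[int, int, int, int]


def box_center(box: Box) -> Tuple[int, int]:
    """Return the pixel at the center of a (left, top, width, height) box."""
    left, top, width, height = box
    return left + width // 2, top + height // 2


def click_detected(detect: Callable[[object], Optional[Box]]) -> bool:
    """Take a screenshot, run `detect` on it, and click the box it returns.

    `detect` is a hypothetical stand-in for your own model. It receives the
    PIL image from pyautogui.screenshot() and returns a box or None.
    Returns True if something was clicked, False otherwise.
    """
    import pyautogui  # imported lazily because it needs a display to load

    screenshot = pyautogui.screenshot()
    box = detect(screenshot)
    if box is None:
        return False
    x, y = box_center(box)
    pyautogui.click(x, y)
    return True
```

You would call it as `click_detected(my_model_fn)`, where `my_model_fn` is your own function. One caveat: on some HiDPI setups the screenshot resolution may not match the coordinate space that `pyautogui.click` uses, so the box might need rescaling first.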

swayson · on April 13, 2020

This is great, thanks!!!