Hacker News
Ask HN: Deep learning enabled GUI automation tools?
3 points by swayson on April 13, 2020 | 4 comments
I was wondering, has anybody found really good tools, potentially cross-platform, for GUI automation that leverage image detection from computer vision models, say convolutional neural networks?

How about open-source alternatives?




Not really a tool, but Python is widely used for deep learning, so you can combine PyTorch, TensorFlow, [insert your DL framework here] with PyAutoGUI [1] to achieve exactly what you're asking. If you feel PyAutoGUI is too "manual", I built a kind of frontend for it [2].

[1]: https://github.com/asweigart/pyautogui

[2]: https://github.com/rmpr/atbswp


I have been looking into pyautogui, wondering how I can hook up a custom backend for the bounding box detection, which appears not to be supported.

I guess wrapping pyautogui might be the way to go. Is my understanding correct?

atbswp looks very valuable, thanks for sharing.


> wondering how I can hook up a custom backend for the bounding box detection, which appears not to be supported.

You can take a screenshot with:

    pyautogui.screenshot()

Your neural network can then give you the coordinates of what you want, and you can act on them with pyautogui afterwards. In many cases a neural network can even be overkill. In this example (https://vimeo.com/352072921), the script takes a screenshot of the webpage, recognizes the currently highlighted word with pytesseract, and types it in with pyautogui. Simple.
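
To make the screenshot, detect, act loop a bit more concrete, here is a rough sketch of how a wrapper with a pluggable detector could look. It is not the script from the video and not anything pyautogui ships. The names `Box`, `box_center`, `find_word_box`, `ocr_detector` and `click_detected` are made up for illustration, and the `(left, top, width, height)` box layout is an assumption. The only library calls used are `pyautogui.screenshot()`, `pyautogui.click(x, y)` and `pytesseract.image_to_data(...)`.

```python
from typing import Callable, Optional, Tuple

# (left, top, width, height) in screen pixels. This layout is an assumption
# of the sketch; adapt it to whatever your own model returns.
Box = Tuple[int, int, int, int]
Detector = Callable[[object], Optional[Box]]


def box_center(box: Box) -> Tuple[int, int]:
    """Return the pixel at the center of a (left, top, width, height) box."""
    left, top, width, height = box
    return left + width // 2, top + height // 2


def find_word_box(ocr_data: dict, word: str) -> Optional[Box]:
    """Find `word` in a dict shaped like pytesseract's image_to_data output."""
    for i, text in enumerate(ocr_data["text"]):
        if text.strip().lower() == word.lower():
            return (ocr_data["left"][i], ocr_data["top"][i],
                    ocr_data["width"][i], ocr_data["height"][i])
    return None


def ocr_detector(word: str) -> Detector:
    """Build a detector backed by OCR instead of a neural network."""
    def detect(image) -> Optional[Box]:
        import pytesseract  # lazy import: only needed when OCR is used
        data = pytesseract.image_to_data(
            image, output_type=pytesseract.Output.DICT)
        return find_word_box(data, word)
    return detect


def click_detected(detect: Detector, take_screenshot=None, click=None) -> bool:
    """Take a screenshot, run the detector, click the center of its box.

    `detect` is a hypothetical callback wrapping your own model (or OCR).
    Returns False when the detector finds nothing.
    """
    if take_screenshot is None or click is None:
        import pyautogui  # lazy import so the pure helpers work headless
        take_screenshot = take_screenshot or pyautogui.screenshot
        click = click or pyautogui.click

    box = detect(take_screenshot())
    if box is None:
        return False
    x, y = box_center(box)
    click(x, y)
    return True
```

With that in place, something like `click_detected(ocr_detector("Submit"))` would try the OCR route, and a CNN-based detector would just be another function that takes the screenshot and returns a box or `None`. Keeping the detector as a plain callback is one possible design; it means pyautogui itself needs no changes.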


this is great thanks!!!



