Heeey! I built a macOS copilot that has been useful to me, so I open-sourced it in case others would find it useful too.
It's pretty simple:
- Use a keyboard shortcut to take a screenshot of your active macOS window and start recording the microphone.
- Speak your question, then press the keyboard shortcut again to send your question and screenshot off to OpenAI Vision.
- The Vision response is presented in context, overlaid on the active window, and also read back to you as audio.
- The app keeps running in the background, only taking a screenshot/listening when activated by keyboard shortcut.
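The press-once-to-start, press-again-to-send behavior above boils down to a small toggle. Here's a minimal sketch of that logic; `onStart` and `onSend` are hypothetical stand-ins for the app's actual capture and upload code, and in Electron the returned handler would be wired to a global shortcut (Electron's `globalShortcut.register` works even while the app sits in the background):

```javascript
// Toggle between "idle" and "recording" on each shortcut press.
// onStart would take the screenshot and start recording the mic;
// onSend would stop recording and send audio + screenshot to OpenAI.
function createShortcutToggle(onStart, onSend) {
  let recording = false;
  return () => {
    if (!recording) {
      recording = true;
      onStart();
    } else {
      recording = false;
      onSend();
    }
  };
}

// In the Electron main process, the handler would be registered like:
// globalShortcut.register('CommandOrControl+Shift+P', handler);
// (the accelerator string here is illustrative, not the repo's actual binding)
```

So the first press starts a capture session and the second press ends it and fires off the request, without the app needing any visible UI in between.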
It's built with Node.js/Electron, and uses the OpenAI Whisper, Vision, and TTS APIs under the hood (BYO API key).
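For the Vision step, the screenshot travels inside a regular chat-completions request as a base64 data URL alongside the Whisper transcript. A hedged sketch of building that payload (the model name, prompt handling, and `max_tokens` value are illustrative assumptions, not necessarily what the repo uses):

```javascript
// Pair the transcribed question (from Whisper) with the window screenshot,
// encoded as a base64 PNG data URL, in a single chat-completions payload.
function buildVisionPayload(transcript, screenshotBase64) {
  return {
    model: 'gpt-4-vision-preview', // illustrative; check the repo for the model it actually uses
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: transcript },
          {
            type: 'image_url',
            image_url: { url: `data:image/png;base64,${screenshotBase64}` },
          },
        ],
      },
    ],
    max_tokens: 300, // illustrative cap
  };
}

// This payload would be POSTed to https://api.openai.com/v1/chat/completions
// with an "Authorization: Bearer <your API key>" header -- the BYO-key part.
```

The response text can then be both rendered in the overlay window and sent to the TTS endpoint to be spoken back.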
There's a simple demo and a longer walk-through in the GitHub readme https://github.com/elfvingralf/macOSpilot-ai-assistant, and I also posted a different demo on Twitter: https://twitter.com/ralfelfving/status/1732044723630805212
I was skimming through the video you posted and was curious.
https://www.youtube.com/watch?v=1IdCWqTZLyA&t=32s
code link: https://github.com/elfvingralf/macOSpilot-ai-assistant/blob/...