Show HN: Open-source macOS AI copilot using vision and voice (github.com/elfvingralf)
430 points by ralfelfving on Dec 12, 2023 | 159 comments
Heeey! I built a macOS copilot that has been useful to me, so I open sourced it in case others would find it useful too.

It's pretty simple:

- Use a keyboard shortcut to take a screenshot of your active macOS window and start recording the microphone.

- Speak your question, then press the keyboard shortcut again to send your question + screenshot off to OpenAI Vision.

- The Vision response is presented in-context, overlaid on the active window, and spoken to you as audio.

- The app keeps running in the background, only taking a screenshot/listening when activated by keyboard shortcut.

It's built with NodeJS/Electron, and uses OpenAI Whisper, Vision and TTS APIs under the hood (BYO API key).
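
If you're curious what the round trip looks like, here's roughly the shape of the three API calls with the openai Node SDK -- a simplified sketch, not the exact code in the repo (file handling, window capture and audio playback are all left out):

    const fs = require("fs");
    const OpenAI = require("openai");
    const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

    async function askAboutScreenshot(audioPath, screenshotPath) {
      // 1. Transcribe the recorded question with Whisper
      const transcript = await openai.audio.transcriptions.create({
        model: "whisper-1",
        file: fs.createReadStream(audioPath),
      });

      // 2. Send the question + screenshot to the vision model
      const image = fs.readFileSync(screenshotPath).toString("base64");
      const completion = await openai.chat.completions.create({
        model: "gpt-4-vision-preview",
        max_tokens: 500,
        messages: [{
          role: "user",
          content: [
            { type: "text", text: transcript.text },
            { type: "image_url", image_url: { url: `data:image/png;base64,${image}` } },
          ],
        }],
      });
      const answer = completion.choices[0].message.content;

      // 3. Turn the answer into speech and save it for playback
      const speech = await openai.audio.speech.create({
        model: "tts-1",
        voice: "alloy",
        input: answer,
      });
      fs.writeFileSync("answer.mp3", Buffer.from(await speech.arrayBuffer()));
      return answer;
    }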

There's a simple demo and a longer walk-through in the GH readme https://github.com/elfvingralf/macOSpilot-ai-assistant, and I also posted a different demo on Twitter: https://twitter.com/ralfelfving/status/1732044723630805212




Did you find that calling it “OSX” in the prompt worked better than macOS? Or was that just an early choice that you didn’t spend much time on?

I was skimming through the video you posted, and was curious.

https://www.youtube.com/watch?v=1IdCWqTZLyA&t=32s

code link: https://github.com/elfvingralf/macOSpilot-ai-assistant/blob/...


No, this is an oversight by me. To be completely honest, up until the other day I thought it was still called OSX. So the project was literally called cOSXpilot, but at some point I double-checked and realized it's been called macOS for many years. Updated the project, but apparently not the code :)

I suspect OSX vs macOS has marginal impact on the outcome :)


Haha, makes perfect sense, thanks for the reply!


Heh. I remember calling it Mac OS back in the day and getting corrected that it's actually OS X, as in "OS ten," and hasn't been called Mac OS since Mac OS 9. Glad Apple finally saw it my way (except it's cased macOS).


You should add an option for streaming text as the response instead of TTS, and maybe text in place of the voice command as well. I have been tire-kicking a similar kind of copilot for a while -- hit me up on Discord @jonwilldoit


There are definitely some improvements to be made in shuttling the data between interface <-> API; all of that was done in a few hours on day 1, and there are a few things I decided to fix later.

I prefer speaking over typing, and I sit alone, so probably won't add a text input anytime soon. But I'll hit you up on Discord in a bit and share notes.


Yeah, just some features I could see adding value and not being too hard to implement :)


> text in place of the voice command as well

That would be great for people with a Mac mini who don't have a mic.


Hmmm... what if I added functionality that uses the webcam to read your lips?

Just kidding. Text seems to be the most requested addition, and it wasn't on my own list :) Will see if I add it; it should be fairly easy to make it configurable and render a text input window with a button instead of triggering the microphone.

Won't make any promises, but might do it.


People with a Mac mini may not have a webcam, either!


It was a joke.


Added text input instead of voice as an option today.


Wrote some similar scripts for my Linux setup, that I bind with XFCE keyboard shortcuts:

https://github.com/samoylenkodmitry/Linux-AI-Assistant-scrip...

F1 - ask ChatGPT API about current clipboard content

F5 - same, but opens editor before asking

num+ - starts/stops recording microphone, then passes to Whisper (locally installed), copies to clipboard

I find myself rarely using them however.


Nice!


Make sure to set OpenAI API spend limits when using this or you'll quickly find yourself learning the difference between the cost of the text models and vision models.

EDIT: I checked again and it seems the pricing is comparable. Good stuff.


I think a prompt cost estimator might be a nifty thing to add to the UI.

Right now there's also a daily limit on the Vision API that kicks in before it gets really bad -- 100+ requests, depending on what your max spend limit is.


I love it! I’ve been circling around a similar set of ideas, although my version integrates with the web-based ChatGPT:

https://news.ycombinator.com/item?id=38244883

There are some pros and cons to that. I'm intrigued by your stand-alone macOS app.


Love it! Will definitely use this when a quick screenshot will help specify what I am confused about. Is there a way to hide the window when I am not using it? i.e. I hit cmd+shift+' and it shows the window, then when the response finishes reading, it hides again?


There's a way for sure, it's just not implemented. Allowing for more configurability of the window(s) is on my list, because it annoys me too! :)


Annoyance Driven Development™


Currently imagining my productivity while waiting 10 seconds for the results of the `ls` command.


It's a basic demo to show people how it works. I think you can imagine many other examples where it'll save you a lot of time.


The demo on Twitter is a lot cooler, partially because you scroll to show the AI what the page has. Maybe there's a more impressive demo to put on the GH too?


Just used it with the digital audio workstation Ableton Live. It is amazing! Its tips were spot-on.

I can see how much time it will save me when I'm working with software or a domain I don't know very well.

Here is the video of my interaction: https://www.youtube.com/watch?v=ikVdjom5t0E&feature=youtu.be

Weird these negative comments. Did people actually try it?


So glad when I saw this, thanks for sharing! Music production in Ableton was exactly the spark that lit this idea in my head the other week. I tried to explain to a friend who doesn't use GPT much that with Vision, you can speed up your music production and learn how to use advanced tools like Ableton more quickly. He didn't believe me. So I grabbed an Ableton screenshot off Google and used ChatGPT -- then I felt there had to be a better way, realized that I have my own use-cases, and it all evolved into this.

I sent him your video, hopefully he'll believe me now :)


You may be interested in two proofs of concept I've been working on. I work with generative AI and music at a company.

MidiJourney: ChatGPT integrated into Ableton Live to create MIDI clips from prompts. https://github.com/korus-labs/MIDIjourney

I have some work on a branch that makes ChatGPT a lot better at generating symbolic music (a better prompt and music notation).

LayerMosaic allows you to combine MusicGen text-to-music loops with the music library of our company. https://layermosaic.pixelynx-ai.com/


Oooh. Yes, very interested in MusicGen. I played with MusicGen for the first time the other week and created a little script that uses GPT to create the prompt and params, which are stored in a text file along with the output. I let it loop for a few hours to get a few hundred output files, which let me learn a bit more about what kinds of prompts gave reasonable output (it was all bad, lol!)


Oh, LayerMosaic is dope. I'm not entirely sure how it works, but the sounds coming out of it are good -- so you have me intrigued! Can I read more about it somewhere? I might have a crazy idea I'd like to use this for.


My brain read midjourney until I clicked on the GH link. What a great name, MIDIjourney!


Is it just me or is it incredibly useless?

"Here's a list of effects. Here's a list of things that make a song. Is it good? Yes. What about my drum effects? Yes here's the name of the two effects you are using on your drum channel"

None of this is really helpful and I can't get over how much it sounds like Eliza.


I just made a video where I test it with a proper use case. It helps me find effects to make a bassline more dubby and helps carve out frequencies in the kick drum to make space for the bass.

https://www.youtube.com/watch?v=zyMmurtCkHI


I made that video right at the start, but since then I've asked it, for example, what kind of compression parameters would fit a certain track, and it explained how to find an expert function I would otherwise have had to consult a manual for.


Yeah I thought the same. Ultra generic advice and no evidence it has actually parsed anything unique or useful from the user’s actual composition.


I made another one: https://www.youtube.com/watch?v=zyMmurtCkHI

In the one I posted I was just so amazed how well it worked and didn't really try anything useful. In this video you can see it giving me quite good advice on how to make a bassline dubby and how to carve frequencies out of the kick drum to make space for the bass.

It also looks at spectrograms and gives feedback / takes them into account. I'm pretty amazed.


Did you change the GPT Vision system prompt at all? I wonder if changing it to state getting help with specifically Ableton, and maybe some guidelines around what kind of help you want could make it better?


No. But I found it good enough as it is


I mean it does send a screenshot of your screen off to a 3rd party, and that screenshot will most likely be used in future AI training sets.

So... beware when you use it.


OpenAI claims that data sent via the API (as opposed to ChatGPT) will not be used in training. Whether or not you believe them is a separate question, but that's the claim.


Beware of it seeing a screenshot of my music set? OpenAI will start copying my song structure?

You can turn it on and off. Not necessary to turn it on when editing confidential documents.

You never enable screen-sharing in videoconferencing software?


I completely agree. A huge business with a singular focus isn’t going to pivot into the music business (or any of the myriad use cases the general public throws at it). And if they did use someone’s info, it’s more likely an unethical employee than a genuine business tactic.

Besides, the parent program uses the API, which allows opting out of training or retaining that data.


Yes this makes perfect sense. As we know, businesses definitely do not treat data as a commodity and engage in selling/buying data sets on the open market as a "genuine business tactic". Therefore, since the company in question doesn't have a clear business case for data collection currently, we can be sure this data will never be used against our interests by any company.


Hey, I was working on something to allow GPT-V to actually do stuff on the screen, click around and type, I tested on my Mac and it’s working pretty well, do you think it would be cool to integrate? https://github.com/rogeriochaves/driver


Yes. I think you commented this somewhere else, and I like it. I was considering doing something similar to have it execute keyboard commands, but decided it would have to wait for a future version. I think click + type + performing other actions would be powerful, especially if it can do it quickly and accurately. Then it's less about "How do I do X?", and more "Can you do X for me?".


I've been wanting to build something like this by integrating into the terminal itself. Seems very straightforward and avoids the screenshotting. So you would just type a comment in the right format and it would recognise it:

    $ ls 
    a.txt b.txt c.txt

    $ # AI: concatenate these files and sort the result on the third column
    $ #....
    $ # cat a.txt b.txt c.txt | sort -k 3
This already works brilliantly by just pasting into CodeLLaMa, so it's purely terminal integration to make it work. All I need is the rest of life to stop being so annoyingly busy.


I wrote a simple command line app to let me quickly ask a question in the terminal - https://github.com/edwardsp/qq. It outputs the command I need and puts it in the paste buffer. I use it all the time now, e.g.

    $ qq concatenate all files in the current directory and sort the result on the third column
    cat * | sort -k3


yep absolutely - have seen a few of those. And how well they work is what inspires me to want the next parts, which are (a) send the surrounding lines and output as context - notice above I can ask it about "these files" (b) automatically add the result to terminal history so I can avoid copy/paste if I want to run it. I think this could make these things absolutely fluid, almost like autocomplete (another crazy idea is to actually tie it into bash-completion so when you press tab it does the above).

CodeLLama with GPU acceleration on a Mac M1 is almost instant in response; it's really compelling.


Yes, that's a good suggestion. I've just pushed a change to my utility to provide the paste buffer along with the question. This does mean you need to select the lines first but will work with your exact question now. It's actually useful to quickly provide more data when asking a question where I would have needed to think more about how to phrase the question previously. Btw it automatically puts the output into the paste buffer so there is no need to manually copy the result before pasting.

Of course, full integration with the terminal would be good!


This is very cool! Thank you for working on it and sharing it with us.


Thank you for checking it out! <3


I have a tangential question: my dad is old. I would love to be able to have this feature, or any voice access to an LLM, available to him via an easy-to-press external button. Kind of like the big "easy button" from Staples. Is there anything like that which can be made to trigger a keypress, perhaps?


I personally have no experience with configuring or triggering keyboard shortcuts beyond what I learned and implemented in this project. But with that said, I'm very confident that what you're describing is not only possible but fairly easy.


Nice! Built something similar earlier to get fixes from chatgpt for error messages on screen. No voice input because I don't like speaking. My approach then was Apple Computer Vision Kit for OCR + chatgpt. This reminds me to test out OpenAI's Vision API as a replacement.

Thanks for sharing!


Thanks! You could probably grab what I have, and tweak it a bit. Try checking if you can screenshot just the error message and check what the value of the window.owner is. It should be the name of the application, so you could just append `Can you help me with this error I get in ${window.owner}?` to the Vision API call.
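
Something roughly like this -- the variable names are hypothetical and the exact shape of the window object depends on how it's exposed in the code, but it's just string concatenation before the Vision call:

    // Hypothetical sketch: prepend the active application's name to the question
    // before it goes into the Vision API call.
    const appName = window.owner; // should hold the application name, e.g. "Xcode"
    const question = `Can you help me with this error I get in ${appName}? ${transcribedQuestion}`;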


I would love to have something like this but using an open source model and without any network requests.


LLaVA, Whisper and a few bash scripts should be able to do it. I don't know how helpful the model is with screenshots though.

1. Download LLaVA from https://github.com/Mozilla-Ocho/llamafile

2. Run Whisper locally for speech to text

3. Save screenshots and send to the model, with a script like https://til.dave.engineer/openai/gpt-4-vision/


Probably in three months, approximately.


I misread the title and thought this was an app you run on a laptop as you drive around... which if you think about it, would be pretty useful. A combined vision/hearing/language model with access to maps, local info, etc.


It would be really cool, and I think we're not very far away from this being something you have on your phone.

The pilot name comes from Microsoft's use of "Copilot" for their AI assistant products, and I tried to play on it with macOSpilot which is maco(s)pilot. I think that naming has completely flown over everyone's heads :D


Nice project, any plans to make it work with local LLMs rather than "open"AI?


Thanks. Had no plans, but might give it a try at some point. For me, personally, using OpenAI for this isn't an issue.


I think that LM Studio has an OpenAI "compliant" API, so if there is something similar that supports vision+text then it would be easy enough to make the base URL configurable and then point it to localhost.

Do you know of a simple setup that I can run locally with support for both images and text?
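
If you find one, the OpenAI-compatible part should make the swap small. A minimal sketch, assuming the openai Node SDK and LM Studio's default local server port (adjust the URL for whatever you end up running):

    // Sketch: point the OpenAI client at a local, OpenAI-compatible server.
    const OpenAI = require("openai");

    const client = new OpenAI({
      baseURL: process.env.LOCAL_API_BASE || "http://localhost:1234/v1",
      apiKey: "not-needed-for-local", // most local servers ignore the key
    });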


People reading this should check out Iris[1]. I’ve been using it for about a month, and it’s the best macOS GPT client I’ve found.

[1]: https://iris.fun/


Oof, $20/month is a lot, when I already have my own OpenAI API key.


I guess having to enter the API key is not a great user experience for regular people who aren’t developers.


With ChatGPT Plus also at $20/mo, and this not replacing ChatGPT, it doesn't stand strong for price-conscious consumers. But I bet there are plenty who don't care.


I can see that.

For me, it did replace ChatGPT for one reason: The convenience of a lightweight Iris window being just a hotkey away.


I wish there was something like this for Windows!


I’ve been looking for a simple way to use voice input on the main ChatGPT website, since it gets tiresome to type a lot of text into it. Anyone have recommendations? The challenge is getting technical words right.


If you're ok with it, you can use the mobile app -- it supports voice. Then you just have the same chat/thread open on your computer in case you need to copy/paste something.


Good idea, yes, I do use the iOS app with voice all the time. But it didn’t occur to me to use the iOS app to start a chat and continue on desktop. The main pain, though, is when I have a lengthy back and forth with GPT-4 discussing an approach or getting some piece of code just right. It often gets tiring enough that I just quickly type with lots of typos, and it still does fine. But I’d rather not have to do that, because these typo-filled chats will be hard to search through later :)


Do you have use case demo videos somewhere? Would be great to see this in action


There's one at 00:30 in this YouTube video (timestamped the link): https://www.youtube.com/watch?v=1IdCWqTZLyA&t=32s


I’d love to see a version of this that uses text input/output instead of voice. I often have someone sleeping in the room with me and don’t want to speak.


Added the text input option today.


You're not the first to request it. Might add it, can't promise tho.


You made real-life Clippy for the Mac! This would be great for other Mac apps too. Add context from currently running apps.


It should work for any macOS app. It just takes a screenshot of the currently active window, you can even append the application name if you'd like.


This looks very cool. Does anyone know of something similar for Windows? (or does OP intend to extend support to Windows?)


Hey, OP here. I don't have a Windows machine so have not been able to confirm if it works, and probably won't be able to develop/test for it either -- sorry! :/

I suspect you should be able to take my code and make it work with only a few tweaks tho; there shouldn't be much in it that is macOS-only.


For testing/development, you can download a free Windows VM here: https://developer.microsoft.com/en-us/windows/downloads/virt...


Have you thought about integrating the macOS accessibility API for either reading text or performing actions?


No, my thought process never really stretched outside of what I built. I had this particular idea, then sat down to build it. I had some idea of getting OpenAI to respond with keyboard shortcuts that the application could execute.

E.g. in Photoshop: "How do I merge all layers" --> "To merge all layers you can use the keyboard shortcut Shift + command + E"

If you can get that response in JSON, you could prompt the user if they want to take the suggested action. I don't see myself using it very often, so didn't think much further about it.
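
Roughly what I had in mind -- purely hypothetical, the JSON shape and the askUser/sendKeystrokes helpers are made up for illustration:

    // Ask the model to flag shortcut answers in JSON, then offer to run them.
    const reply = JSON.parse(completion.choices[0].message.content);
    // e.g. { "answer": "To merge all layers...", "shortcut": ["shift", "command", "e"] }
    if (reply.shortcut) {
      const ok = await askUser(`Run ${reply.shortcut.join(" + ")} for you?`); // hypothetical confirm dialog
      if (ok) sendKeystrokes(reply.shortcut); // hypothetical wrapper around an automation library
    }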


Did you not find the built-in voice-to-text and text-to-speech APIs to be sufficient?


Didn't even think of them to be honest.


Awesome! I love it! I was just about to sign up for ChatGPT Plus, but maybe I will pay for the API instead. So much good stuff coming out daily.

How does the pricing per message + reply end up in practice? (If my calculations are right, it shouldn't be too bad, but sounds a bit too good to be true)


I have a hard time saying how much this particular application cost to run, because I use the Voice+Vision APIs for so many different projects on a near daily basis and haven't implemented a prompt cost estimator.

But I also pay for ChatGPT Plus, and it's sooo worth it to me.

If you'd like to skip Plus and use something else, I don't think my project is the right one. I'd STRONGLY suggest you check out TypingMind, the best wrapper I've found: https://www.typingmind.com/


Wow, thanks for sharing that link, I've been looking for something like this :)


It's not working for me; I get a "Too many requests" HTTP error.


Hmm.. OpenAI bunches a few things into that error. IIRC this could be because you're out of credits / don't have a valid payment method on file, but it could also be that you're hitting rate limits. The Vision API could be the culprit; while it's in beta you can only call it X times per day (X varies by account).

Make the console.log calls for the three API requests a bit more verbose to find out which call is causing this, and whether there's more info in the error body.
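
Something along these lines around each call should be enough -- a generic sketch assuming the openai Node SDK's error object, with visionPayload standing in for whatever is sent today; it's not the exact code in the repo:

    try {
      const completion = await openai.chat.completions.create(visionPayload);
      console.log("Vision API ok:", completion.usage);
    } catch (err) {
      // Status + error body helps tell rate limits apart from billing issues.
      console.error("Vision API failed:", err.status, err.message);
      if (err.error) console.error("Error body:", err.error);
    }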


Very cool, would love to have a Windows version of this.


I've not tried this on Windows, but might actually work if you run the packager. Try it. If it doesn't work, there shouldn't be too much that is macOS specific -- so you should be able to tweak the underlying code to work with Windows with fairly few changes.


This is brilliant!


Glad you liked it!


I was following these two projects by the same user on GitHub, which make similar things possible with local models. Sending screenshots to OpenAI is expensive if done every few seconds or minutes.

https://github.com/KoljaB/LocalAIVoiceChat

While the one below uses OpenAI, I don't see why it can't be replaced with the above project and a local model.

https://github.com/KoljaB/Linguflex


Nice! Although the productivity increase from being able to resolve blockers more quickly adds up to a lot (at least for me), local models would be more cost effective -- and probably feel less iffy for many people.

I went for OpenAI because I wanted to build something quickly, but you should be able to replace the external API calls with calls to your internal models.


Such a shame it uses Vision API, i.e. it can not be replaced by some random self-hosted LLM.


It can be replaced with a self-hosted LLM; simply change the code where the Vision API is being called. That's true for all of the API calls in the app.


Actually it's open source, so it can be replaced by some random self-hosted LLM



This is awesome


Thanks, glad you liked it!


> Open Source

> off to OpenAI Vision

Pick one


Welcome to the future where nobody is professional because there is no need for professionals. Just ask Corporate Overlord Surveillance Bot to give you instruction on what to do and how to think. Voilà. You are the master of the Universe. Dunning-Kruger champion for the ages to come.

The problems are obvious: time to reaction, API call limits, and average responses for complex tasks due to the limitations of the vision module. Similar functionality has to be available for free with a local model tuned to these kinds of tasks -- a helper/copilot. Apple and Microsoft will include helper models in the OS soon. Let's hope they are generous and don't turn this into a local data-gathering funnel (I have my doubts).


Worth mentioning that if you are in a corporate environment, running a service that sends arbitrary desktop screenshots to a 3rd party cloud service is going to run afoul of pretty much every security and regulatory control in existence


I assume that anyone capable of cloning the app, starting it on their machine, and obtaining + adding an OpenAI API key understands that some data is being sent offsite -- and will be aware of their corporate policies. I think that's a fair assumption.


that's a fair assumption. feels like swiftcoder is just trying to gotcha


You're telling me... the cloud... is other people's computers?!


The control for that is that endpoints should be locked down to prevent the installation of non-approved apps. Any org under regulatory controls would have some variation of that. It's safe to assume an org's users are stupid or nefarious and build defences accordingly.


This is exactly why in https://github.com/OpenAdaptAI/OpenAdapt we have implemented three separate PII scrubbing providers.

Congrats to the op on shipping!


True, but also true of other screen capture utilities that send data to the cloud. Your PSA is true, but hardly unique to this little utility. And probably not surprising to the intended audience.


A lot of negative comments here. However, I liked it!

Perfect Show HN and a great start of a product if the author wants to.


Thank you, it's my first GH project & Show HN.. and.. yeah.. learning here :D


Also think this is fun.

In general I’m pretty excited about LLM as interface and what that is going to mean going forward.

I think our kids are going to think mice and keyboards are hilariously primitive.


Before we know it, even voice might be obsolete when we can just think :) But maybe at that point even thinking becomes obsolete, because the AIs are doing all the thinking for us?!


Please include "OpenAI-based" in the title. (Now many people here are disappointed).


Fair point, didn't think it would matter so much. Can't edit it any more, otherwise I'd change it to add OpenAI to the title!


Great. I created `kel` for terminal users. Please check it out at https://github.com/qainsights/kel


Very cool! Have you had much luck with Llama models?

I made Clipea, which is similar but has special integration with zsh.

https://github.com/dave1010/clipea


Yes, I used Langchain for Llama.


Clipea is cool.


Thanks!


Chatblade is another good one: https://github.com/npiv/chatblade


e-e-e-electron... for this..


Ah yes, cause what's better than building a real, working MVP? Learning Rust for half a year just so you can 'optimize' the f out of an app that does two REST calls.


To be fair, this does sound like the kind of app that would benefit from being able to launch instantly, and potentially registering with the OS as a service in a way that cross-platform frameworks like Electron cannot easily accommodate. But Rust would not be the easiest choice to avoid those limitations.


I don't know man. I'm new to development, it's what I chose, probably don't know any better. Tell me what you would have chosen instead?


Don't mind them—there's a certain subset of HN that is upset that web tech has taken over the world. There are some legitimate gripes about the performance of some electron apps, but with some people those have turned into compulsive shallow dismissals of any web app that they believe could have been native.

There's nothing wrong with using web tech to build things! It's often easier, the documentation is more comprehensive, and if you ever want to make it cross-platform, Electron makes it trivial.

If you were working for a company it might be worth considering the trade-offs—do you need to support Macs with less RAM?—but for a side project that's for yourself and maybe some friends, just do what works for you!


Thank you for the explanation! At the end of the day, I'm a newbie and I'm in it to learn something new with each project. Next time I'll probably try my hand at a different framework.


I just watched a video about building a startup. One of the key points was to use what you know to get to an MVP. Don't fret over which language or library to use (unless the goal is to learn a new framework). Just get building. I may not be a pro dev, but there is one thing I have learned over the years from hanging out amongst all of you: it doesn't matter if you are using emacs or vim, tabs vs spaces, or Java vs Python -- the end product is what matters at the end of the day. Code can always be refactored.

Good luck in your development journey.


My two cents: I think you made a good, practical choice. If you're happy with Electron, I'd say stick with it, especially if you have cross-platform plans in the future.

If you want to niche down into a more macOS specific app, you could learn AppKit and SwiftUI and build a fully native macOS app.

If you want to stay cross-platform, but you're not happy with Electron, then it might be worth checking out Tauri. It provides a JavaScript-based API to display native UI components, but without packaging a V8 runtime with your app bundle. Instead, it uses a native JavaScript host e.g. on macOS it uses WebKit, so it significantly reduces the download size of your app.

In terms of developing this into a product, on one hand it seems like deep integration with the host OS is the best way to build a "moat", but then again, Apple could release their own version and quickly blow a product like that out of the water.


I think the parent comment is a shallow dismissal, but since you're asking, I would have built in SwiftUI


What's important is to get a product out there. Nobody cares what stack you use. just us geeks. don't get discouraged. you did well :)


electron's a really nice option, especially for people that aren't interested in porting their apps or spending too much time on development

this is a macOS specific app it seems - if you want better performance and more integration with the OS, i'd recommend using swift


Time to learn Swift in the next project then! Thank you for the deets.


The good news is you already have a tool to help you with the inevitable Xcode issues. *grin*


ignore the naysayers; nice job building out your idea


Thank you! I got pretty thick skin, but always a bit of insecurity involved in doing something the first time -- first public GH repo and Show HN :D


[flagged]


Apparently we aren't, so I changed it :D


[flagged]


According to your definition, no program that connects to a web service using REST is open source. That is absurd.

Open Source is defined by a license, not by what a program does. Also, it's trivial to connect such a program to another image recognition model if anyone wants.


I just think it's a bit misleading to call something "open-source (xyz)" if it wholly depends on a proprietary service that provides (xyz). If you made an open-source implementation of the Discord client, it'd be misleading to call it "Open-source direct messaging and communities app". At least in that example it would be possible to reimplement the backend and make it truly open-source, but so far nobody has been able to reimplement GPT-4V in a way that it's nearly as useful. Hence why some people get super excited when they see "Open-source [...] AI (using vision)" and super disappointed when it's just another wrapper for OpenAI


No. That is a tone-deaf and disingenuous interpretation. If I read “open-source AI copilot”, I pretty strictly read this as implying that the model is open-source. And I’m far from an open-source purist! Hell, I’ve never once made a snarky comment about OpenAI “not being open” or anything!


This is definitely clickbait. Do you see any other GitHub URL on HN that needs to stick "open source" in its title?


So fork it and change it to use Ollama or whatever you want. It’s open source.


Why the need to fork it, though? Shouldn't it be as simple as changing the URL of the API? If it is not so simple, then perhaps time for some standardization?


>Why the need to fork it, though?

Because you want it to behave differently than it does? It seems you are the one who wants things to be different and is complaining because they are not. Lucky for you, this is open source, so you can go ahead and fork it and change what you don't like!


fork it and contribute back to the main, i should have said :)


I'm contributing back not in code but in the form of architectural and UX advice :)


[flagged]


We can, if someone builds it. :)

I'm new to development, and this is what I went with. Don't know any better.


You made an open source tool and shared it with the world. Use whatever you want and don't feel bad about it. GP can port it to whatever native framework they want


if you want it, build it :)


Sure we can. Write them.

Oh, you want someone else to write them. For free.

That's different, then. Carry on!


[flagged]


That's ok, I'm with you -- it's more of a joke with friends that developed after I started making tutorials. I primarily use a normal facial expression, and they said I should use one with an open mouth -- so I did for funsies, and I've been doing it a few times over the months, although I feel it's a bit cringe.


There, changed it.


hey - i know there's quite some negativity in this thread, but i just wanted to let you know that you don't have to change something if you don't want to

if that specific facial expression is getting you more views, i see that as justified. it's up to you whether or not you want to keep the thumbnail


It depends on which audience they care about more. Do they want the HN crowd—old and cynical curmudgeons who dislike most of what is trending on social media—or the peers who cause such things to trend? If they decide they want the HN types as an audience, listening to their feedback is a good start.


I appreciate the comment, I really do! Fwiw I play around with thumbnails and titles quite a bit to try to learn what seems to work, and I think the underlying sentiment in your comment rings true for most of my audience. Aaand, it may be a coincidence, but the views just hockey-sticked when I changed it :D


Every time I see someone making that face I wanna stick a slimy cold hotdog in their mouth


I like hotdogs :)


“macOSpilot runs NodeJS/Electron”

Lost me.



