GPT3/DALL-E2 in Discord, chat like ChatGPT, generate images, and more (github.com/kav-k)
148 points by kaveenk on Dec 29, 2022 | 103 comments



I fear the day that we get a run-on-your-own-machine version of ChatGPT. Within a week I think virtually every public signup forum will be crushed by bots which act and talk more naturally than some humans.

You’ll just expect that most of the users on discord are bots unless you know them irl.


> You’ll just expect that most of the users on discord are bots unless you know them irl.

Maybe if all you ever exchanged with someone was a few sentences about inconsequential stuff. ChatGPT isn't even close to being able to consistently act like a single human being with interests, plenty of long term memory, and a life outside a chat room.

Right now the performance of ChatGPT in your average discord chat room would be absolutely awful. Nobody is bothering to write their messages as proper prompts a state-of-the-art AI could understand. People don't even just talk about one thing at a time - having multiple threads of conversation interspersed with each other is quite common and would absolutely confuse any current AI. There's also in-jokes and subtle references to things said days ago. People try to only use minimal information to convey what they mean, and often make figuring that out a small puzzle/joke to keep the conversation interesting.


> Maybe if all you ever exchanged with someone was a few sentences about inconsequential stuff.

Like usually on HN, you give humans too much credit. Most people never exchange anything else with anyone but inconsequential stuff. IRL and online. The AI will give itself away by being too eager to write content and too grammatically/semantically correct and not random and inconsistent enough.

Go to some places (Instagram is pretty good for that) where people basically only communicate in emojis and memes, and where every sentence longer than 4 words is misunderstood because no one actually has reading comprehension.

> Right now the performance of ChatGPT in your average discord chat room would be absolutely awful.

Most of the internet, including many discord channels are already absolutely awful.

So yeah, it is a problem. For me, outside a few Discord channels, subreddits, lobsters and HN, most people responding are already bots, or humans that could just as well be bots. And it's only a matter of time before that takes over the few places that are still ok.

I will probably move to meetups offline more and more. The opposite of what I did my whole life.


> Most people never exchange anything else with anyone but inconsequential stuff. IRL and online.

If my interaction with them is that shallow, does it matter if they are real or a bot?


No, that’s kind of my point. It doesn’t matter for most people, so there will be some (futile) resistance and that is it.

The problem is when AIs become such high quality that they can influence people’s minds and behaviours through psychological tricks. This is also already happening but mostly humans are creating that content which is upvoted and responded to by their bots.

This is going to be AI soon-ish and that is a problem imho. But not something we will be able to do much about except KYC people as rigorously as banks do (bye bye privacy, unless this is implemented well) or making them pay to post. Neither will kill all bots, but both will shrink their reach, as suddenly it’s a very costly affair to do at scale (buying real people and paying to post).


Spelling mistakes can easily be added, as can reply delays. I think the mainstream will soon realize that Sybil resistance is not just important but essential for any forum or social network in the future. We need ways to ID users as humans, preferably in a way that preserves privacy for those who do not want to have their real identity linked to random internet comments.


You can do KYC with a KYC specialized company and agree on some way of not having the social media company getting hold of that info, just ‘user4639284757483858 is a real human and they are not already on your platform’.


Until they get hacked and your entire identity is leaked, not just Facebook and HN but all other accounts, linked to your government issued ID or SSN.


> ChatGPT isn't even close to being able to consistently act like a single human being with interests, plenty of long term memory, and a life outside a chat room.

Is it really not close, or just not done yet? They're already using human feedback loops to get it as good as it is generally, is it a huge step to then allow personalised feedback loops so that the local instance remembers its own "personality"? (I'm not an expert so maybe that is a huge, unrelated leap in a way that I can't see?)


It's not complicated to explain. The model can handle 4000 tokens at once. So all you can do is work with the limitations of this window. You can use part of it to quote the previous interactions, and part of it for the response. If your content is too large, you need to summarise it. There are AIs for that too. If the output is too large, you need to split it in multiple rounds. It is pretty hard to work around this limitation, for example to write a whole novel.

I think we need LLMs capable of reading a whole book at once, about 100K tokens. I hope they can innovate the memory system of transformers a bit, there are ideas and papers, but they don't seem to get attention.
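As a rough sketch of the window budgeting described above: keep the newest turns that fit, and leave room for the reply. Everything here is illustrative. `count_tokens` is a crude word-count stand-in for a real tokenizer, and the 4000-token window and 1000-token reply reserve are assumptions, not measured values.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace-separated word.
    return len(text.split())


def build_prompt(history, new_message, window=4000, reserve_for_reply=1000):
    """Keep the most recent turns that fit in the window, leaving room for the reply."""
    budget = window - reserve_for_reply - count_tokens(new_message)
    kept = []
    # Walk backwards so the newest turns are kept first.
    for turn in reversed(history):
        cost = count_tokens(turn)
        if cost > budget:
            break  # older turns would have to be dropped or summarised
        kept.append(turn)
        budget -= cost
    kept.reverse()  # restore chronological order
    return "\n".join(kept + [new_message])


history = ["a b c", "d e", "f g h i"]
print(build_prompt(history, "x y", window=10, reserve_for_reply=3))
```

Turns that fall off the front are exactly the ones a summariser would have to compress if they still matter.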


Is there a "law of tokens" growth for LLMs, ala Moore's Law, but for LLM capabilities based upon token capacity?


Complexity is quadratic in sequence length. For 512 tokens it is 262K, but for 4000 tokens it becomes 16M and goes OOM on a single GPU. We need about 100K-1M tokens to load whole books at once.

Since 2017 there have been hundreds of attempts to bring O(N^2) to O(N), but none of them replaced the vanilla attention yet in large models. They lose on accuracy. Maybe Flash attention has a shot (https://arxiv.org/abs/2205.14135).
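To make the quadratic growth concrete, here is the arithmetic as a small script. It counts entries in a single N x N attention score matrix; the 2 bytes per value (fp16) and the "one head, one layer" framing are assumptions for illustration, and real memory use depends on the implementation.

```python
def attention_entries(seq_len: int) -> int:
    # Every token attends to every token, so the score matrix is N x N.
    return seq_len * seq_len


def score_matrix_bytes(seq_len: int, bytes_per_value: int = 2) -> int:
    # Assumes fp16 storage (2 bytes per score) for one head in one layer.
    return attention_entries(seq_len) * bytes_per_value


for n in (512, 4000, 100_000):
    entries = attention_entries(n)
    gigabytes = score_matrix_bytes(n) / 1e9
    print(f"{n:>7} tokens: {entries:>14,} scores, {gigabytes:.4f} GB per head per layer")
```

Multiply by the number of heads and layers and it is clear why 100K tokens is out of reach for plain attention.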


Sure, that is chatGPT in late 2022.

What about Open.ChatGPT in mid 2024?


You are focusing on the application not on the capability. ChatGPT was trained for a different purpose so if you used it for discord it would be recognizable.

Not unlike a person that wrote newspaper articles all his life and now has to use discord.

If you trained it on discord messages it would probably be indistinguishable from most discord users.


> If you trained it on discord messages it would probably be indistinguishable from most discord users.

You might get an average of all discord users, or all discord users at once. Neither would seem like a real person.

For instance someone who is interested in and capable of talking about any topic will not look real. A normal discord user will not contribute to conversations they don't care about or know nothing about.


After a 10-message-long discussion, probably (I'm not sure, but let's say you're right). But you won't have a 10-message-long discussion with anybody if there are 1010 users and only 10 of them are human.

AFAIR Discord requires a phone number, and that's the main reason spam isn't a problem, so maybe we're already there.


> But you won't have a 10-message-long discussion with anybody if there are 1010 users and only 10 of them are human.

In such a place it doesn't really matter whether anyone is real, does it? You're probably there to get a few laughs or something, not build relationships with people.

Though there's more serious places with real people. Off the top of my head there's two large discord servers I'm active in: One is to find people to play a certain video game with, the other is a community of developers who use somewhat niche technology and probe each other's brains for knowledge that often can't be found on the internet. In either community a chat bot would be immediately obvious.


I use Discord a lot; I'm on around 20 servers. Most of them are in the 10-50 user range and exist for a specific purpose (like playing D&D or computer games, remote working in a small team, talking with a group of friends). These won't be affected because they are invite-only.

Then there are open servers with 1000+ users, usually for developing some open source project or talking about $RANDOM_HOBBY. These would definitely be affected if not for the requirement of giving your phone number to access Discord.

You can imagine the problems that 1000 bots posting procedurally generated bug reports could cause :)


It will probably look like a proper conversation until you start looking at the usernames.


> ChatGPT isn't even close to being able to consistently act like a single human being with interests, plenty of long term memory, and a life outside a chat room.

Wait until GPT4 comes out next year...


Whatever helps you sleep at night.


I think it is not that far away, my prediction is

- the value of real face 2 face interaction for both business and personal life will go up again.

- We will see more of twitter's pay to participate models as an attempt to verify real human beings.

- Online advertising will waste trillions of dollars and be late to realize they are selling to bots.


Not clear how to deal with this either - you can improve authentication but it won’t prevent the properly auth’ed users from running LLMs. You can watermark the output of officially vended LLMs (Scott Aaronson seems to be working on that) but nothing is gonna prevent people from running non-watermarked versions


It's basically too late: as soon as one rich person decides to train a model and dump it, it's game over. I feel we might actually be experiencing the last months of an internet where you can expect to be talking to a real human.

Every year the cost of training these models drops so they won't be out of reach for long.


Eternal September in late 90s(?)

Now we're coming to an Eternal (AI) Winter..

Spooky to think about, but you could be right. :-/


I, for one, welcome our new robot overlords.

Too long already has been the struggle to distinguish the phraseologists from those who think. Finally here is the societal pressure to develop means to distinguish what brings progress from all the semantic sugar that merely brings feel-good.


You can just ask "Are you a LLM? What is <large number> + <large number>?" to reveal.


> I fear the day that we get a run on your own machine version of ChatGPT.

I can't wait. I think we may see what Vernor Vinge envisioned - that we'll use AI-ish tech on our own computers to mediate between the massive flow of information on the internet and ourselves.


On the bright side, we can use this tech to keep spammy sales representatives busy for years.


Time to invest into Facebook where users are clearly identified.

How can HN and its open registration survive?


doesn't help.. you can "turn" existing users..


Seems like the zombie apocalypse is finally coming, but it's not exactly like in the movies.


This is coming. I've been thinking that you should perhaps start to be skeptical of any content from accounts created after 2022.

Thinking of online reviews etc. where you don't have a proper log and can't follow a user properly.

They are really bad as is, but they could be made completely useless in a millisecond. Even proper reviews with images etc could be absolutely trivial to make and for $10 you could automatically generate rave reviews for your product and trash any number of your competitors.

We will probably need to identify users somehow, which will only serve the centralized FAANG version of internet that at least I despise.


This is my worry as well, and I think the governments (out of all entities) could assist here.

I (EU member) already have an ID card and a government issued electronic signature, perhaps a service like reddit could just ask me to sign something to verify that I'm a real and unique human being.

Of course there are all kinds of risks of people being hacked, no throwaways, leaked signatures being traded, government refusing to issue IDs to whistleblowers on the run, and so on.


The API for GPT-3 is pretty cheap (and even cheaper on alternatives like https://textsynth.com/). Either it's already happening, or people don't want to use resources to destroy the internet, or idk what.


their bullshit pointless essays are out of place even here


This is an early plot for a blade runner scenario.


Requiring a phone # might be a good starting point.


You're not necessarily wrong, but you have to admit there's something funny about solving the problem of cutting-edge artificial intelligence so advanced it can pretend to be human... by requiring a telephone number. Mashup of the newest tech with some of the oldest, I guess.


Eternal January ...


I've built a GPT3 and DALL-E2 integration for Discord. With it, you can ask GPT3 questions, have full conversations with GPT3 (just like ChatGPT), generate, vary, and redo images, and even optimize text prompts for image generation!

The conversation functionality also has medium term memory, and you can chat with it infinitely without having to worry about context/token limits. Moreover, in the next few days, I will be implementing long term and permanent memory for the bot using embeddings!


Cool project!

I think ChatGPT is a different model from GPT-3, which you are using.

From https://openai.com/blog/chatgpt/:

> ChatGPT is fine-tuned from a model in the GPT-3.5 series


Thank you! You’re correct that ChatGPT is different from davinci. They’re both built on GPT3.5, but ChatGPT is trained on more conversational influences. However, davinci is really strong and will behave mostly like ChatGPT with the right conversation prompt :)


Yes, the GPT-3.5 model has been fine-tuned using RLHF; this is the text-davinci-003 you can use through OpenAI's API.

Not sure if ChatGPT has some additional fine-tuning, as you can get similar responses using text-davinci-003 with the Chat prompt and a temperature setting between 0.3 and 0.7.


Do you think GPT4 or GPT5, being multimodal, will do something similar?


Dead internet conspiracy theory is becoming more real every day.


Curious how this performs relative to ChatGPT? IIUC text-davinci-003 isn't the same model as ChatGPT. I've struggled to get satisfactory responses to prompts through the API compared to the chat interface, and am basically in a holding pattern right now waiting for the new models to be released on the API.


It works fine (using GPT3.5/text-davinci-003) with a conversation look-back (plus auto-summarization of the look-back when it is too long). Many people claim it's more factual too.


How do you enable auto-summarization? Is that an option you can toggle?


I assume you just ask GPT3 in plain English. Crazy times.


Hey! This is an option in the settings. You can type !gp to view settings and !gs to change them. For your case, run !gs summarize_conversations True and then !gs summarize_threshold (token amount to summarize at).


Try temperature 0. And make sure you include all of the conversational context and instructions with examples in each API call.

It's definitely not the same model but similar in a lot of ways as far as I can tell. You could also try code-davinci-002 if that's what you are doing.
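A minimal sketch of what "include everything in each call" can look like as a request body for the completions endpoint as it existed at the time of this thread. The field names (`model`, `prompt`, `temperature`, `max_tokens`, `stop`) are that API's request fields; the `Human:`/`AI:` labels, the example pair, and the `max_tokens` value are illustrative assumptions. Nothing is sent over the network here.

```python
import json


def build_completion_request(instructions, examples, conversation, user_message):
    """Assemble one self-contained request body; the API keeps no state between calls."""
    lines = [instructions, ""]
    # Few-shot examples go first so the model sees the expected format.
    for question, answer in examples:
        lines += [f"Human: {question}", f"AI: {answer}"]
    lines += conversation  # prior turns, already formatted as "Human: ..." / "AI: ..."
    lines += [f"Human: {user_message}", "AI:"]
    return {
        "model": "text-davinci-003",
        "prompt": "\n".join(lines),
        "temperature": 0,    # as suggested above, for the most repeatable output
        "max_tokens": 256,   # illustrative cap on the reply length
        "stop": ["Human:"],  # stop before the model invents the next user turn
    }


body = build_completion_request(
    instructions="The following is a conversation with a helpful assistant.",
    examples=[("What is 2 + 2?", "4")],
    conversation=["Human: Hi", "AI: Hello! How can I help?"],
    user_message="Summarise our chat so far.",
)
print(json.dumps(body, indent=2))
```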


You can’t include “all” the context due to the prompt token length limitations. You have to use techniques for surfacing the right context that fits in the limited context window and send that along. There are techniques that keep a running, continuously compressed summary, and techniques that use embeddings to home in on relevant chunks of past conversation and context, then send those along based on some ranking and the size limitations.
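A toy version of the embedding-based approach described above: rank stored chunks by cosine similarity to the query, then pack the best ones into a token budget. The vectors are hand-written stand-ins (a real system would get them from an embedding model), and `select_context` is a hypothetical helper name.

```python
import math


def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_context(query_vec, chunks, token_budget):
    """chunks is a list of (text, vector, token_count); returns the texts to send."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    picked, used = [], 0
    for text, _vec, tokens in ranked:
        # Greedy packing: take the most relevant chunks that still fit.
        if used + tokens <= token_budget:
            picked.append(text)
            used += tokens
    return picked


chunks = [
    ("we talked about cats", [1.0, 0.0], 3),
    ("we talked about dogs", [0.9, 0.1], 3),
    ("we talked about taxes", [0.0, 1.0], 2),
]
print(select_context([1.0, 0.0], chunks, token_budget=6))
```

The greedy packing is the simplest possible ranking rule; it will sometimes fill leftover space with a weakly related chunk, which a similarity threshold would prevent.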


Right, I am familiar with that stuff; I was just trying to answer for the case where the useful context fits within the max_tokens. Your startup looks really interesting. Kind of similar to some of the plans I have for aidev.codes. It seems for code you would usually need to send a whole file, or maybe a whole function? Because if you break it up too much it doesn't necessarily "compute". I dunno. Are you using code-davinci-002 from OpenAI, and if so, have you managed to get your rate limit increased? They seem to ignore my support requests. text-davinci-003 is pretty good too. One experiment I have done for getting context is just to say "given this user update request, and this directory listing, which files would you need to inspect"; then the next prompt includes those files and asks to modify them to complete the request, with a specific format for identifying filenames in the output.
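The two-step experiment at the end of that comment could be wired up roughly like this. It is only a sketch: the prompt wording, the `=== filename ===` delimiter, and all three function names are made up for illustration, and the model call itself is left out.

```python
def files_prompt(request, listing):
    # Step 1: ask which files matter, given only the request and a directory listing.
    return (
        "Given this user update request and this directory listing, "
        "which files would you need to inspect? Answer with one filename per line.\n\n"
        f"Request: {request}\n\nListing:\n" + "\n".join(listing)
    )


def parse_files(reply, listing):
    # Keep only names that really exist, in case the model invents one.
    known = set(listing)
    return [line.strip() for line in reply.splitlines() if line.strip() in known]


def edit_prompt(request, files):
    # Step 2: send the chosen files and ask for modified versions in a fixed format.
    parts = [f"=== {name} ===\n{body}" for name, body in files.items()]
    return (
        f"Request: {request}\n\n" + "\n\n".join(parts) +
        "\n\nReturn each modified file in full, preceded by its '=== filename ===' line."
    )


listing = ["main.py", "utils.py", "README.md"]
chosen = parse_files("main.py\n  utils.py \nnotes.txt", listing)
print(chosen)
print(edit_prompt("add logging", {"main.py": "print('hi')"}))
```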


Exactly! This is also what I’ll be implementing within the next few days. Embedding conversation history and dynamically doing semantic search for relevant conversation snippets and building a prompt based on that to send to GPT3


Nice. How do you determine where the relevant parts start and stop? Do the embeddings work per paragraph?


Just wanted to say thank you - you helped me make a major breakthrough in getting the API to behave the way I expect.


The OpenAI terms of service are always fuzzy to me when you provide a service like this. Isn't it violating them? The chatbot gives open access to the service. https://openai.com/api/policies/sharing-publication/ | https://beta.openai.com/docs/usage-policies | https://openai.com/api/policies/service-terms/


i love how rapidly this stuff is getting kludged together. it really feels like AI will take life of its own in the present moment and we will soon be in a world where the machines outsmart us.


Have you tried playing chess with chatgpt? It suggested illegal moves early, with seemingly reasonable "motives". (Not suggesting beating chess as a goalpost) A tool is good if it follows intention. But a tool out of control can be messy.


I use ChatGPT daily for programming help (love it!) but the people saying that machines like this will outsmart us imminently make me wonder if they've even tried it or they're just going by headlines. It goes from genius to full-retard at the snap of a finger, quite frequently so, and as many have already pointed out, appears totally confident in its mistakes. Some are saying this last 10% will just be a matter of polish, but my impression is that, like the old software adage goes, the last 10% will be 90% of the work.


>Some are saying this last 10% will just be a matter of polish, but my impression is that, like the old software adage goes, the last 10% will be 90% of the work.

Even if that last 10% takes three years, that's still a terrifying prospect and a breakneck pace.


To me it's shocking how far it's come in such a short time, so I worry that the last 10% will happen quickly too.


wasnt it super slow?


yes


> Ignore the part about setting up an "ssh key", and just use a password instead.

May I ask why you suggest this? I presume it's because it's simpler to tell users to just use a password, but that's also a pretty bad idea given that you're not even saying to use a secure password.


Thank you for bringing this up because this is something I need to change in the README. You’re correct in presuming that it’s just because it’s simpler to tell users to use a password, but this is not the right way to approach this.


This is pretty cool, and really does work well. It's worth watching your usage with it though, as at 2c/1000 tokens and this bot adding a prompt + building up a history I think this will get up to 8c/message in the conversation.
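For a feel of how the per-message cost climbs, here is the arithmetic at the quoted 2c per 1000 tokens. The 4000-token cap is the window size mentioned elsewhere in this thread, and the base prompt and per-exchange sizes are made-up numbers.

```python
PRICE_PER_1K_TOKENS = 0.02  # dollars, the "2c/1000 tokens" quoted above


def conversation_costs(base_tokens, tokens_per_exchange, exchanges, window=4000):
    """Cost of each successive request as the resent history grows, capped at the window."""
    costs = []
    for i in range(1, exchanges + 1):
        # Every request resends the base prompt plus all exchanges so far.
        total_tokens = min(base_tokens + i * tokens_per_exchange, window)
        costs.append(total_tokens / 1000 * PRICE_PER_1K_TOKENS)
    return costs


costs = conversation_costs(base_tokens=500, tokens_per_exchange=500, exchanges=10)
for turn, cost in enumerate(costs, start=1):
    print(f"message {turn:>2}: ${cost:.2f}")
```

Once the history fills the window, every message costs the full 4000 tokens, which is where the 8c figure comes from.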


Thank you! I agree re: token usage, it will get expensive with extensive use. The image prompt optimizer pre-text itself is also roughly 3500+ tokens, and if you use best_of=2 for example, it’ll basically be like 20 cents per image optimization request. However, I love the optimizer and I think it works amazing, for DALLE, MJ, and SD!


Any similar versions for Slack?


None that I’ve made, but I think there are several slack integrations out there, I see a lot on the OpenAI community server :)


I might be super naive here but can someone explain how this type of thing works? Behind the scene, is the bot hitting the OpenAI APIs?



Yup! Behind the scenes I send AIOHTTP requests to the various OpenAI endpoints.


If you check out models/openai_model.py, you'll see it's using text-curie-001, among others, which is an OpenAI model. This model class is imported in main.py under the name 'Model', which makes me believe the OpenAI API is serving as the main model used for inference.


So I actually have it set to use both the text-davinci-003 and text-curie-001 models. If you’re on low usage mode, that’s when the curie model will be used :)


That's amazing. Good job!


Thank you!


[flagged]


It's interesting you're only complaining about the images. If I ask ChatGPT a question and it answers, would you call that stolen text?


This meme is unsubstantiated gatekeeping.

Everything we do as humans is a form of copying. Our languages, our thoughts, our jobs, our hobbies. We stand on the shoulders of all that came before us.

The ability to imagine or evaluate art is easy. The ability to create it takes time. AI removes the tyranny of opportunity cost in learning this talent at the expense of others. Just as your washing machine removed the manual labor of washing your clothes, freeing you up to do more.

Digital cameras did not make photographers go extinct. And they created so much more previously unpredicted societal wealth that impacted us in ways we couldn't have imagined, such as the creation of online dating (selfies), QR code scanners, interactive translation lenses, 3D scanners (NeRFs), and so much more.

AI/ML will probably turn all of us into our own Spielbergs / Scorseses / Miyazakis, and it probably doesn't stop there.


> AI/ML will probably turn all of us into our own Spielbergs / Scorseses / Miyazakis, and it probably doesn't stop there.

Sure, buddy, just like ChatGPT will turn us into Einsteins, Knuths and Turings (hint: they won’t, you can’t replicate exceptional people with a crutch; exceptional people will be able to utilize it, but not general public)


> Sure, buddy, just like ChatGPT will turn us into Einsteins, Knuths and Turings (hint: they won’t, you can’t replicate exceptional people with a crutch; exceptional people will be able to utilize it, but not general public)

Run-of-the-mill programmers are now building extremely complex solutions thanks to data structures and algorithms invented by the likes of Knuth. Sometimes it helps to know the internals; most of the time, however, people use them knowing only the inputs, outputs, and the name of the API method.


From fact-checking myself, it is worth noting that Stable Diffusion will absolutely copy things in ways that would probably infringe a copyright (not a lawyer and not legal advice), but I found this paper interesting both because some of the examples it gives of clear copying don't look all that copied to me, and because some do, and overall they find a relatively high rate of copying, 1.8%: https://arxiv.org/abs/2212.03860. I believe this is still not peer reviewed.


The counterargument used to be “AI art is transformative, and learning is okay” but now it’s just “human art is copying too” ?


It’s the same thing. An interesting thing to consider is that the Stable Diffusion model was trained on terabytes of images but the output is a 4GB model. Copying would require access to the originals, but clearly the original art is long gone and any output is a new generation from learning.


“Humans are just as dirty” isn’t consistent with the fact that “derivative” uses and “transformative” uses are distinguished.

> clearly the original art is long gone

Counterexample: https://twitter.com/kortizart/status/1588915427018559490


And how much of the revenue generated by that photograph was ever seen by the subject of it?

It's a particularly pertinent example as the model in that photo was actually forced into taking the shot against her will, and had her life subsequently ruined over it, meanwhile further propelling the photographer to riches.

Should we be arguing against the development of cameras to stop this kind of thing from happening?


Who ever said anything about revenue?


People talking about copyright exclusively talk about revenue.

Copyright isn’t meant to gatekeep people from copying each other just because it’s fun to gatekeep. The entire reasoning we give to continue using that system is based on revenue.

So, the answer to your question is: everyone.


You are trying to casually deny GNU GPL. On HN. WTF?


I invite you to think about whether the GPL would be required for what it tries to achieve, in a system that doesn't have copyright at all.


GPL depends on copyright! It literally depends on the parts of copyright frameworks that grant authors the right to have fun gatekeeping others.

FSF guys had been fighting and defeating "you saying we stole GPL'd code is bogus because free internet stuff can't be stolen" for decades. It just makes no sense that you'd assume the foundation of GPL applies to code but not to images.


To be clear, these statements are not incompatible. And this has always been my argument.

I'm ready for AI to replace all laborious job functions and give everyone creative superpowers.


[flagged]


Would you please stop posting repetitive flamewar comments? Regardless of how right you are or feel you are, we have to ban accounts that carry on like this. It's not what this site is for, and destroys what it is for.

https://news.ycombinator.com/newsguidelines.html


Name a single time that technological progress on this scale was stopped.

Your opinion as a connoisseur of the fine works of deviantart is as good as anyone else's here.

The cat's out of the bag and there's nothing you or I can do about it.


We can be ethical about it, ai ethics is important. Currently ai art uses datasets that it has no permission to use. The same company swears not to use copyrighted music in its datasets though.


(Re)invent collective rights management or whatever. This can't go on like this. The United States allowed a company to deliberately flout the law to roll back the protections organized labor achieved in a century plus and destroy the livelihood of cab medallion holders as a side hustle; are we going to repeat that? The recording and movie industries have been steadfast in cracking down on what they perceived as theft, but the art community was caught unawares by this. As a society we need to stop this until a solution is found. Extend copyright in such a way that this is not legal until a solution is found to compensate the artists. This is extremely urgent.


Have you considered why is it that adding "trending on artstation" to a prompt makes the model output the same generic looking shit every single time?


This sounds like a ditch digger trembling in fear upon seeing an excavator.


A better example is the portrait painter trembling at seeing the first example of photography. You can pretty much see at any gallery when the camera changes art.

To me these arguments are by kids and art simpletons.

Art is always changing. As if the only true art is in making marble sculptures of Zeus.


Are you angry? Because if so you better buckle up and try to accept this new phase in our history, because otherwise you can be left behind, crippled. You need to adapt, I am afraid.


You can be angry and adapt, they’re not mutually exclusive.


Looking forward to a Matrix version! So far it's being done just for that privacy-hostile proprietary platform.



