Stable Diffusion was finally enough to push me to install drivers for my laptop’s NVIDIA 3060, which has sat completely unused (never powered on, once I figured out how to keep it powered off!) since I got the machine (I’d have preferred no dGPU at the time, but wanted other features of the laptop that are just about never sold without a fancy dGPU for some reason). Pretty hefty requirements for a casual layman, even though I know this is smaller and more accessible than just about everything that came before. I think I ended up at around 9GB downloaded (which will cost me almost $2 in concrete terms) and 23GB of disk space used (including things like nvidia-dkms, nvidia-utils, cuda and python-pytorch-opt-cuda; all the relevant Arch packages came to about 14GB).
I’m having fun. But I haven’t had much luck getting it to draw the quick brown fox jumping over the lazy dog; a few steps in there are often the shapes of two animals, but it is consistently reduced to just a fox after a bit more. Extensions to the prompt (like reminding it that there are two animals, and trying to separate the two concepts) can improve it a bit, but it still tends to forget there are two animals, or if it gets two, to draw two foxes, or a dog–fox hybrid and a lazy fox. I imagine I could vastly improve my results with img2img and giving it a basic sketch with placeholders for two distinct animals.
It also has a surprisingly poor idea of what an echidna is.
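On the img2img idea: something along these lines (a minimal sketch using the Hugging Face diffusers API rather than the exact scripts I’m running; the model id, file name, and strength value are illustrative assumptions) is what I have in mind:

    # Minimal img2img sketch: start from a rough drawing with two clearly
    # separated animal blobs so the model keeps both of them.
    import torch
    from PIL import Image
    from diffusers import StableDiffusionImg2ImgPipeline

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # Hypothetical sketch file: an orange blob mid-leap over a grey-brown blob.
    init = Image.open("fox_over_dog_sketch.png").convert("RGB").resize((512, 512))

    out = pipe(
        prompt="a quick brown fox jumping over a lazy dog, two distinct animals",
        image=init,            # older diffusers versions call this init_image
        strength=0.6,          # lower = stay closer to the sketch's layout
        guidance_scale=7.5,
    ).images[0]
    out.save("fox_over_dog.png")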
I live in a rural area of Australia where the best internet connection I can get is on the Optus cellular network (I have clear line of sight to a tower 400m away used by fewer than 200 people; it’s actually the best non-commercial supply I’ve ever had for both speed and reliability, typically around 45/15Mbps when I moved to the area five years ago, with less than one observed outage per year, each under an hour, though NBN fibre, where available, should generally be able to beat that these days). Actually, this is cheaper than it would often be, because it depends on what supplier I’m with at the time, which often depends on available introductory offers. My current arrangement amounts to 20¢/GB, the cheapest I’ve ever had (it’s interesting looking back even four years, when the best available was $0.90–$1.10/GB). When I finish the current one in a couple of months, it looks like I’ll switch again and be back in the ballpark I’ve had before, around 30¢/GB. Skip introductory offers and you’re mostly at $0.60–$1.00/GB, or a bit lower with circles.life, but I refuse to use them again because of bad service and shameless illegal conduct that they refuse to acknowledge or do anything about (like sending third-party advertising text messages from CirclesLife, which has been illegal in the absence of explicit consent since the Spam Act 2003).
<rant>
i.e. being a Murdoch cash cow milked by the sycophantic weasels known as the Liberal Party.
Onion chump Abbott and his frenemy Turncoat were the main beneficiaries of forcing Murdoch's and Telstra's decrepit copper/coax quagmire into what was originally designed as a full FttP rollout, already in progress and about 5% complete when they came to power and promptly halted everything to please their puppetmaster.
Next they promised to halve the costs by delivering a slow, copper-throttled NBN. Except they blew the budget out by a factor of four and it's still climbing... already over double the cost of the originally planned FttP rollout. So Murdoch got richer, we get only 5% of the speed we should have gotten, and now we also pay double the monthly fees we would have had if those weasels had just kept their greasy pork-barrelling mitts off our nation-building tax dollars. </rant>
I've been using this recently; it works great if you don't know what you want to make.
Also look into CLIP Interrogator; it basically does image-to-text, turning an image you like into what its prompt could have been. It won't provide everything for you, though, just the main description of the content.
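If you want to run it locally, something like this should be close (assuming the pip-installable clip-interrogator package; the clip_model_name argument has changed between versions, so treat it as an assumption):

    # Reverse-engineer a plausible prompt from an existing image.
    from PIL import Image
    from clip_interrogator import Config, Interrogator

    ci = Interrogator(Config(clip_model_name="ViT-L-14/openai"))
    image = Image.open("reference.jpg").convert("RGB")
    print(ci.interrogate(image))  # main content description plus style modifiers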
Feels like shiny concept art for games is facing a moment similar to what portrait painting faced in the late 19th century. Why pay someone to paint in this generic commercial style when you can get a meaningful automatic result at the push of a button?
Most other styles of illustration seem safer for the moment because they rely more on the illustrator's personality. (I'm not talking about the kind of stuff you buy on Fiverr, but professional designers who mainly get work through their networks.)
I feel like there needs to be a model that fixes faces to clean this up. Humans are so attuned to faces that I can imagine it would take a specialized model to render convincing faces. Maybe there could be a layer to identify and occlude existing pseudo-faces generated by Stable Diffusion and another model to populate the occlusion.
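A rough sketch of that occlude-and-repaint idea (the Haar cascade here is just a stand-in for a proper face detector, and the inpainting model id is an assumption on my part):

    # Detect face regions, mask them out, and let an inpainting model
    # redraw only those pixels.
    import cv2
    import numpy as np
    from PIL import Image
    from diffusers import StableDiffusionInpaintPipeline

    img = cv2.imread("generated.png")
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)

    # White pixels mark the regions the inpainting model is allowed to repaint.
    mask = np.zeros(img.shape[:2], dtype=np.uint8)
    for (x, y, w, h) in faces:
        mask[y:y + h, x:x + w] = 255

    pipe = StableDiffusionInpaintPipeline.from_pretrained(
        "runwayml/stable-diffusion-inpainting").to("cuda")
    fixed = pipe(
        prompt="detailed, realistic human face",
        image=Image.open("generated.png").convert("RGB").resize((512, 512)),
        mask_image=Image.fromarray(mask).resize((512, 512)),
    ).images[0]
    fixed.save("face_fixed.png")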
I find it hilarious how many of these prompts are using "unreal engine 5" to get a good image.
There's a lot (or honestly maybe a small amount) of work to be done to improve these prompt interfaces. Raw projection of your queries into the embedding space is honestly pretty dumb. Like, it'd be nice if we could start by settling the embeddings into images that are "good".
There's a rating feature on the website that lets you rate the results. It's greyed out for me, so I'm not sure if it's a timed feature or a premium thing, but it's there.
Not this website. The prompt interface to the embedding model.
There's no reason one should have to spam "good" sounding phrases like "high quality" into the prompt to get a good image. Direct embeddings of the prompt are stupid.
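To make concrete what "direct embeddings" means here (a minimal sketch using the CLIP text encoder that Stable Diffusion v1 conditions on; the prompts are arbitrary examples): the whole prompt, quality spam included, just becomes one conditioning tensor, so tacking on "high quality, 4k" is just another nudge to that tensor.

    # The prompt is tokenised and pushed through CLIP's text encoder;
    # the resulting (1, 77, 768) tensor is all the image model ever sees.
    import torch
    from transformers import CLIPTokenizer, CLIPTextModel

    tok = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
    enc = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

    def embed(prompt):
        ids = tok(prompt, padding="max_length", max_length=77,
                  truncation=True, return_tensors="pt").input_ids
        with torch.no_grad():
            return enc(ids).last_hidden_state

    plain = embed("a portrait of an astronaut")
    boosted = embed("a portrait of an astronaut, high quality, 4k, 8k, hd")
    # How much does the quality spam actually shift the conditioning?
    print(torch.nn.functional.cosine_similarity(plain, boosted, dim=-1).mean())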
Yeah, and what I'm saying is that this rapidly rising "skill" is just nonsense. This is not a reasonable way to interact with the embedding space. Hopefully we won't be doing prompt engineering at all in the "near" future.
This is like copy-pasting by highlighting, clicking 'Edit', and scrolling down to Copy/Paste, all with the mouse.
No, it is not. Making these models generate interesting images requires you to learn something that is almost like a new kind of language.
As soon as you start to automate this process, for example by adding some default attributes like "4k, 8k, hd" to every prompt, you introduce a huge amount of bias to the output and lose the freedom to get anything outside of those specifiers.
Sure, future iterations will have a better understanding of language input. But knowing exactly how to phrase your prompts will always be a skill that requires eloquent writing to get to the more interesting and appropriate results.
In part that's because using more esoteric language will automatically connect you to a specific subselection of the source material that was described using those more uncommon words when the model was trained. Having an extensive vocabulary and knowing how to wield it is actually a huge boon in this particular field.
"Unreal Engine 5" is just a quick shortcut to output that is detailed, clean, often futuristic and usually looks impressive. But you can go a lot further, for example by manually subtracting weights. Teasing MidJourney with this prompt was entertaining:
clear view of a dense forest::5 plants::-.5 tree::-.5 trees::-.5 foliage::-.5 leaves::-.5 shrubs::-.5 bushes::-.5 blur::-.5 mist::-.5 winter::-.5
Btw, is anybody working on a "language florifier" model yet? I imagine writers would be interested. "Rewrite this story with more emotion and in the style of Kurt Vonnegut, cyberpunk".
Yes, it is stupid. Adding 4k to every prompt introduces bias, yes. That doesn't mean learning the ins and outs of each phrase's bias is a reasonable idea. It's also not guaranteed to be a constant effect. It's great that you can become more skilled at prompts; that doesn't make it a good interaction model. The interface is a tool, and tools are important. That there are people who are great at typewriters doesn't mean typewriters are all that reasonable in the age of computers and word processors.
> But you can go a lot further, for example by manually subtracting weights. Teasing MidJourney with this prompt was entertaining:
This is an example of an improvement from basic prompts. It's still far from a good model. "Guess and check" is basically the worst UX one can create for a design process.
One should be able to specify content separately from style, and layer in stylistic choices in a clear hierarchy. Text is a good model for specifying content. It's a pretty shitty way to specify style. Style is something we could likely convey visually and with palette reference points.
Do DALL-E 2 and Stable Diffusion models regularly get retrained? If so, as they get retrained on model outputs scraped from sites like this, will we see some sort of mode collapse?
Is it reasonable for hobbyists to retrain these models with reduced or custom image sets or would that require a lot of money in compute?
Same license treatment as the training data. Edit: Nobody cares about licenses since they don't want to be asked about how they licensed the training data.
He has given multiple live presentations on the Midjourney Discord server. He's quite happy that his work is helping lots of people make great new art.
I think it's just a meme from users seeing his name in other prompts. In many cases it's used in combination with other artists whose styles aren't at all similar; it doesn't make sense other than as users spamming it to get "good" results.
> Hyperrealistic mixed media image of matt damon bald head resembles !!uncircumcised penis!!, stunning 3d render inspired art by istván sándorfi and greg rutkowski, perfect facial symmetry, realistic, highly detailed attributes and atmosphere, dim volumetric cinematic lighting, 8k octane extremely hyper-detailed render, post-processing, masterpiece
Takes about 10–20 minutes to create 5 images on my 16" M1 Pro 32 GiB MacBook Pro. It takes around 1 minute on my desktop system using a 6 GiB VRAM RTX 3070 Ti.
meta: this site is causing all sorts of graphical glitches when scrolling on a Pixel 6 (Android 13, Firefox). There are flickering boxes filled with pixel junk between each item and in the header.
I’m using https://github.com/basujindal/stable-diffusion to run it since I only have 6GB of VRAM.