Stable Diffusion: Real-time prompting with SDXL Turbo and ComfyUI running locally (reddit.com)
122 points by belltaco 9 months ago | 42 comments



Someone also improved the speed of LCM recently.

Code: https://github.com/discus0434/faster-lcm

Blog post (in Japanese): https://zenn.dev/discus0434/articles/12427b887b4082

I have no idea how this can be used, but they claim 26 fps on an RTX 3090.


I'm sorry... WHAT! I pay attention to the SD subreddit and no one saw this, or somehow it got swept under the rug lol... That's insane!

I wonder if SDXL Turbo + LCM will be a thing, to get to real-time generation.


> I wonder if SDXL Turbo + LCM will be a thing

SDXL Turbo works best (at least from my trials today) with the LCM sampler, producing better results in fewer iterations with it than it does with Euler A.


This was one of the top posts on Reddit yesterday, though not as big as Turbo.


Using the ComfyUI workflow [0], I'm getting really impressive results (obviously not as quick as single-step, but still very fast [1]) at 768x768 with 10 steps, using the LCM sampler instead of Euler Ancestral and setting CFG to 2.0 instead of 1.0.

[0] drop this image on the comfyui canvas: https://comfyanonymous.github.io/ComfyUI_examples/sdturbo/sd...

[1] On a 3080Ti laptop card


I love that they embed entire workflows into the metadata of their images.


Combined with the ComfyUI Manager extension, which provides an index of custom node packages and can install missing ones from a loaded workflow, it makes it very easy to get up and running with a new workflow.


Reading through the comments, it's hard to not think of "Pygmalion and Galatea."


I'm so glad this is now available for free, without needing to ask an artist, or be charged extra by one, for different concepts.

The barrier is really being lowered and this is beautiful.

What a great time to be alive.


Bad news for that: it is only free for noncommercial use, and this isn't just a temporary early-release thing, but the new general direction for StabilityAI:

https://twitter.com/EMostaque


Slightly off-topic, but holy crap, some of these use cases he retweeted are bonkers:

https://twitter.com/toyxyz3/status/1729922123119104476

https://nitter.net/toyxyz3/status/1729922123119104476


I mean, the fact SD has shown it's possible just means that other research groups can also use the same concept...

Someone on Reddit was actually pointing towards a model that's narrower in scope than SDXL but was trained by a single guy on an A100, so there's no reason we can't expect other groups to pop up, or maybe a consortium of freelancers from the fine-tuning community to get together and start their own base model.


Really impressive. Is that video sped up at all? Crazy fast!


Not sped up.

Here's a live demo, but you need to register an account.

https://clipdrop.co/stable-diffusion-turbo


You need to provide an email address and click on a link. Emails from grr.la work.

Yes, it is "register an account", but it is the lowest-friction option I know of.


Nope, and this is just the beginning. Imagine a year from now, once the people who brought us LCM, ControlNet, and IP-Adapter start looking at the possibilities, not to mention fine-tunes on Turbo.


It's not sped up. I tried it locally last night and it was just as fast, running on Windows 11 with an RTX 3090.


I just tried it locally with a 3070 and it was about 3 seconds per render. I'm far from great at this stuff and it was my first use of ComfyUI, so I don't know if that number could be improved on my setup.


On my machine with an AMD RX 7900 XT, it takes ~0.17s per image. Are you using the SDTurboScheduler node?


At the time I hadn't, but now I've used the sample image (the one with the fox in the bottle in the snow) to load up the workflow with sdturboscheduler, and it's still about the same time.

Edit: Even though the UI says SDXL Turbo, I notice that the command prompt is saying SDXL.

  got prompt
  Requested to load SDXLClipModel
  Loading 1 new model
  Requested to load SDXL
  Loading 1 new model
  100%|| 1/1 [00:00<00:00, 11.30it/s]
  Requested to load AutoencoderKL
  Loading 1 new model
  Prompt executed in 4.04 seconds
  gc collect
Edit 2: More info... I noticed that if I just change the steps, it takes less than a second to generate an image. If I just change the prompt, it takes 3+ seconds. I don't know enough about this to know what that means.


Have you used the 7900XT with LLMs at all?


Yup, works just fine. Overall, PyTorch on ROCm 5.6 has been working very well. I'm impressed with how stable it is, given how much hate the AMD driver stack has been getting.


Same on my end. The biggest issue is sometimes finding the correct torch/torchvision versions for different projects, since AMD users are still rare. So the AMD ecosystem is still pretty niche, and it's stable unless you pick a dev or wrong version, which can cause kernel panics! In any case, things look to be improving, but ROCm still has a lot to support (xformers and other core ML libraries).


Doesn't seem that crazy; a 3070 can do 20 steps at 768px in something like 6 to 9 seconds.


I really want to know if it would work on a CPU. I don’t have the money for a graphics card


You can already do that with existing models, but instead of taking a few seconds, generating one image will take at least a minute. Perhaps SDXL Turbo brings that down.


You can run these models on CPU but it would be much slower than the demo.


Ideally you would run SDXL Turbo with OpenVINO optimizations. I'm not aware of any project that supports both, but maybe there is something.


Any help? When loading the graph, the following node types were not found: SDTurboScheduler




Is it the model that makes it really fast?


Yes, but it's not a change to the model architecture, so you can actually subtract out the base SDXL model, merge in any existing SDXL checkpoint, and get a Turbo version of the existing checkpoint.

(In ComfyUI, this just takes loading all three checkpoints -- turbo, base, and the target one -- and using a ModelMergeSubtract node to subtract base from turbo, to get the "turbocharger" alone, and a ModelMergeAdd to add the turbocharger to the target checkpoint.)
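The subtract/add merge described above can also be sketched outside ComfyUI. A minimal illustration, using plain `{name: value}` dicts in place of real tensor state dicts; the function names are hypothetical, not ComfyUI or diffusers APIs:

```python
# Sketch of the "turbocharger" merge: extract the Turbo delta
# (turbo - base) and apply it to any other SDXL checkpoint.
# Real checkpoints hold a tensor per parameter name; floats
# stand in for tensors here.

def extract_delta(turbo, base):
    # Per-parameter difference: what Turbo training changed.
    return {k: turbo[k] - base[k] for k in base}

def apply_delta(target, delta):
    # Add the Turbo delta onto another checkpoint's weights.
    return {k: target[k] + delta[k] for k in target}

# toy one-parameter example
base   = {"w": 1.0}
turbo  = {"w": 1.5}
target = {"w": 2.0}
merged = apply_delta(target, extract_delta(turbo, base))
```

With real models, every parameter name would be processed the same way, which is exactly what the ModelMergeSubtract/ModelMergeAdd node pair does.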


So basically like what they do with the inpainting models. We need a Juggernaut XL Turbo.


Yes, SDXL Turbo can do inference in a single step.


Technically, any SD model can do inference in a single step; SDXL Turbo is just an improvement in quality at a small number of steps (per the comparisons in the announcement, at 4 steps it is a little better than SDXL base at 50 steps).

EDIT: When I say "just", I'm not saying this isn't super impressive, just that the change is in quality of low-step-count generations, not making it possible.
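As a toy illustration of that point, step count is just a sampler parameter, not a model capability. In the sketch below the "denoiser" is a made-up stand-in, not a diffusion model; it only shows that the schedule divides into any N >= 1 steps:

```python
# Toy sampler loop: the noise schedule is simply split into N steps,
# so N = 1 runs mechanically on any model. What Turbo changes is how
# good the result looks at low N, not whether low N is possible.

def sample(denoise, x, steps):
    for i in range(steps):
        t = 1.0 - i / steps           # noise level, decreasing toward 0
        x = x + denoise(x, t) / steps  # one Euler-style update
    return x

fake_denoise = lambda x, t: -x * t    # fake update pulling x toward 0

one_step  = sample(fake_denoise, 10.0, 1)  # single step: runs fine
four_step = sample(fake_denoise, 10.0, 4)  # more steps refine further
```

A base model at 1 step produces a valid but poor image for the same structural reason this loop produces a valid but crude estimate; Turbo's distillation training makes the low-step output actually look good.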


[flagged]


[flagged]


It's a tool. Some people will do something interesting with it. Most people won't.


True. For most, it will be a tool to generate ugly pictures.


True, diffusion models do not cure Sturgeon's Law.

OTOH, there is nothing special about Sturgeon's Law applying to them as well as everything else.


This could be a fact if all you have in mind is kitsch, sure.


I mean, this defeats the point of democratizing art. What's the point of driving the cost of an image to zero and replacing a craft with automated art of that level? Where exactly is the progress here? Are we expected to be impressed by speed or quality? Where is the value? The level of kitsch here is overwhelming. Should we applaud for more kitsch in the world?

"Was einer möchte und nicht kann, wird Kitsch." —Jan Tschichold


> Should we applaud for more kitsch in the world?

Of course. It gets nauseating fast and makes worthwhile messages stand out more, causing the balance to swing to the other side. You're creating something for someone; what's the point if they don't value your carefully crafted message and want kitsch instead?

> Are we expected to be impressed by speed or quality?

Realtime generation in particular makes 3D rendering in the viewport and live painting possible. [1] This kind of tooling makes a lot of difference. What you make with it, either kitsch or something worthwhile, is up to you.

[1] https://www.youtube.com/watch?v=AF2VyqSApjA


> Where exactly is the progress here?

The progress here is that driving the inference phase of AI image generation down to a much smaller time and cost stops it from holding back the rest of the creative process built around that toolset.



