
On my machine with an AMD RX 7900 XT, it takes ~0.17s per image. Are you using the SDTurboScheduler node?



At the time I hadn't, but I've now used the sample image (the one with the fox in the bottle in the snow) to load up the workflow with SDTurboScheduler, and it still takes about the same time.

Edit: Even though the UI says sdxl turbo, I notice that the command prompt is saying sdxl.

  got prompt
  Requested to load SDXLClipModel
  Loading 1 new model
  Requested to load SDXL
  Loading 1 new model
  100%|| 1/1 [00:00<00:00, 11.30it/s]
  Requested to load AutoencoderKL
  Loading 1 new model
  Prompt executed in 4.04 seconds
  gc collect
Edit 2: More info... I noticed that if I just change the steps, it takes less than a second to generate an image. If I just change the prompt, it takes 3+ seconds. I don't know enough about this to know what that means.
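That timing difference is consistent with node-output caching: if only the step count changes, the expensive CLIP text-encoding from the previous run can be reused, whereas a new prompt forces the text encoder to run again. A toy sketch of the idea (the names here are hypothetical, not ComfyUI's actual code):

```python
import hashlib

class NodeCache:
    """Toy content-addressed cache, loosely modeled on the idea of
    skipping re-execution of nodes whose inputs are unchanged."""
    def __init__(self):
        self.store = {}
        self.misses = 0

    def get_or_compute(self, key_parts, compute):
        key = hashlib.sha256(repr(key_parts).encode()).hexdigest()
        if key not in self.store:
            self.misses += 1          # cache miss: run the expensive node
            self.store[key] = compute()
        return self.store[key]

cache = NodeCache()

def encode_prompt(prompt):
    # Stand-in for the expensive SDXLClipModel forward pass.
    return f"embedding({prompt})"

def run(prompt, steps):
    # The conditioning is keyed only by the prompt, not the step count.
    cond = cache.get_or_compute(("clip", prompt), lambda: encode_prompt(prompt))
    return f"sampled {steps} steps with {cond}"

run("a fox in a bottle", 1)   # cold start: encoder runs
run("a fox in a bottle", 4)   # only steps changed: encoder result reused, fast
run("a fox in the snow", 1)   # prompt changed: encoder runs again, slow
print(cache.misses)           # 2
```

Under this model, the 3+ second runs are dominated by re-encoding the prompt (and possibly reloading models), not by the sampling steps themselves.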


Have you used the 7900XT with LLMs at all?


Yup, works just fine. Overall, PyTorch on ROCm 5.6 has been working very well. I'm impressed with how stable it is, given how much hate the AMD driver stack has been getting.


Same on my end. The biggest issue is sometimes finding the correct torch/torchvision versions for a given project, since AMD users are still rare. The AMD ecosystem is pretty niche, and things are stable unless you pick a dev build or the wrong version, in which case you can hit kernel panics. In any case, things look to be improving, but ROCm still has a lot left to support (xformers and other core ML libraries).
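For reference, PyTorch publishes dedicated ROCm wheel indexes, so matching torch/torchvision builds can be installed together from the same index (the ROCm version in the URL should match your installed ROCm stack; pin exact versions if a project requires them):

```shell
# Install matching PyTorch/torchvision builds for ROCm 5.6.
# Versions here are unpinned as an example; a given project may
# expect specific pins (e.g. torch==2.1.0+rocm5.6).
pip install --index-url https://download.pytorch.org/whl/rocm5.6 torch torchvision
```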



