
On my machine with an AMD RX 7900 XT, it takes ~0.17s per image. Are you using the SDTurboScheduler node?



At the time I hadn't, but I've now used the sample image (the one with the fox in the bottle in the snow) to load up the workflow with SDTurboScheduler, and it still takes about the same time.

Edit: Even though the UI says sdxl turbo, I notice that the command prompt is saying sdxl.

  got prompt
  Requested to load SDXLClipModel
  Loading 1 new model
  Requested to load SDXL
  Loading 1 new model
  100%|| 1/1 [00:00<00:00, 11.30it/s]
  Requested to load AutoencoderKL
  Loading 1 new model
  Prompt executed in 4.04 seconds
  gc collect
Edit 2: More info... I noticed that if I just change the steps, it takes less than a second to generate an image. If I just change the prompt, it takes 3+ seconds. I don't know enough about this to know what that means.
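That timing difference is consistent with node-output caching: if only the step count changes, the expensive CLIP text-encoding from the previous run can be reused, whereas a new prompt forces the text encoder to run again. A toy sketch of the idea (the names here are hypothetical, not ComfyUI's actual code):

```python
import hashlib

class NodeCache:
    """Toy content-addressed cache, loosely modeled on the idea of
    skipping re-execution of nodes whose inputs are unchanged."""
    def __init__(self):
        self.store = {}
        self.misses = 0

    def get_or_compute(self, key_parts, compute):
        key = hashlib.sha256(repr(key_parts).encode()).hexdigest()
        if key not in self.store:
            self.misses += 1          # cache miss: run the expensive node
            self.store[key] = compute()
        return self.store[key]

cache = NodeCache()

def encode_prompt(prompt):
    # Stand-in for the expensive SDXLClipModel forward pass.
    return f"embedding({prompt})"

def run(prompt, steps):
    # The conditioning is keyed only by the prompt, not the step count.
    cond = cache.get_or_compute(("clip", prompt), lambda: encode_prompt(prompt))
    return f"sampled {steps} steps with {cond}"

run("a fox in a bottle", 1)   # cold start: encoder runs
run("a fox in a bottle", 4)   # only steps changed: encoder result reused, fast
run("a fox in the snow", 1)   # prompt changed: encoder runs again, slow
print(cache.misses)           # 2
```

Under this model, the 3+ second runs are dominated by re-encoding the prompt (and possibly reloading models), not by the sampling steps themselves.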


Have you used the 7900XT with LLMs at all?


Yup, works just fine. Overall, PyTorch on ROCm 5.6 has been working very well. I'm impressed with how stable it is, given how much hate the AMD driver stack has been getting.


Same on my end. The biggest issue is sometimes finding the correct torch/torchvision versions for a given project, since AMD users are still rare. The AMD ecosystem is pretty niche, and things are stable unless you pick a dev build or the wrong version, in which case you can hit kernel panics. In any case, things look to be improving, but ROCm still has a lot left to support (xformers and other core ML libraries).
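For reference, PyTorch publishes dedicated ROCm wheel indexes, so matching torch/torchvision builds can be installed together from the same index (the ROCm version in the URL should match your installed ROCm stack; pin exact versions if a project requires them):

```shell
# Install matching PyTorch/torchvision builds for ROCm 5.6.
# Versions here are unpinned as an example; a given project may
# expect specific pins (e.g. torch==2.1.0+rocm5.6).
pip install --index-url https://download.pytorch.org/whl/rocm5.6 torch torchvision
```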



