Woaaa love these inputs. Thank you! Wasn't aware of the JAX backend, will check it out.
Right now we're on SD 1.5. We tried SDXL but found the quality improvement to be marginal.
Yes to area prompting/regional control to help people create more complex scenes. I need some design thinking first since it's easy to overbuild and spit out something super complicated. Immediate next step is to def add ControlNet.
I have never gotten such high quality results from simple prompts, even in cloud models like Midjourney/GPT4. The question is how to port even part of that magic over to the diffusers pipeline...
The speed jump is massive on my desktop GPU, probably even more dramatic on cloud hardware, and it may support some things (weight swapping/LoRA swapping/resolution changing/ControlNet) better than JAX.
My issue previously with these prebuilt backends is that you can't tweak them the way you can with sdwebui, and it took a thousand tweaks to make our thing work. Can look into this first to see how customizable it is.
VoltaML is a relatively vanilla diffusers-based backend, so it's not a hairy monster to hack like you may have seen with SAI-based UIs (like Comfy, Fooocus and Automatic).