Apple released the CoreML Stable Diffusion library a little over a year ago [1]. Hugging Face has now released their own version of the example app for it [2].
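For anyone who hasn't looked at the repo: the library ships a Swift package whose pipeline API looks roughly like the sketch below. This is based on the README; the resource path is a placeholder and exact parameter names may differ between versions.

```swift
import Foundation
import CoreML
import StableDiffusion  // Swift package from apple/ml-stable-diffusion

// Placeholder: directory containing the compiled Core ML models (.mlmodelc)
let resourceURL = URL(fileURLWithPath: "/path/to/Resources")

let pipeline = try StableDiffusionPipeline(
    resourcesAt: resourceURL,
    configuration: MLModelConfiguration(),
    reduceMemory: true)  // recommended on iPhones
try pipeline.loadResources()

var cfg = StableDiffusionPipeline.Configuration(prompt: "robot pirates, digital art")
cfg.stepCount = 30
cfg.seed = 93
// Returns [CGImage?]; the closure reports progress and can cancel by returning false
let images = try pipeline.generateImages(configuration: cfg) { _ in true }
```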
The app should be able to run on an iPhone 14 Pro; I believe the requirement is about 6-8 GB of RAM. I was not able to run it on an iPhone 13 mini, because it has only 4 GB of RAM.
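For what it's worth, an app could gate the feature on physical memory. A minimal sketch; the 6 GB cutoff is only my rough estimate from above, not a documented requirement:

```swift
import Foundation

// ProcessInfo reports physical RAM in bytes; the 6 GB threshold is an assumption.
let ramGB = Double(ProcessInfo.processInfo.physicalMemory) / 1_073_741_824
if ramGB < 6 {
    print(String(format: "Only %.1f GB of RAM; on-device SD will likely fail", ramGB))
}
```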
Comparing the App Store listings, it looks like this app has a much simpler interface and far fewer features than Draw Things. Some users might prefer the simplicity of this app.
(Draw Things is by far the most advanced app that supports on-device Stable Diffusion on iOS devices and Apple Silicon Macs. It has a non-standard UI, but is otherwise really good.)
I have to disagree w.r.t. the UI, and with the notion that non-standard is preferable for a pro app.
Having used Draw Things regularly for the last few weeks, I still get confused by certain interactions and UI elements, leading to mistakes and 'lost productivity'. It would greatly benefit from a UX pass, as a more standard UX sets clearer expectations of what will happen when you perform an action.
Don't get me wrong: I appreciate that it was released, for free, and that its capabilities are what they are. I'm merely arguing that a more cohesive UX and pro functionality are not mutually exclusive.
As an example of a 'pro' app, there's Pixelmator Pro, which is a very Mac-assed app. I was able to pick it up and start using it immediately, without tutorials, because its UX is intuitive (to me, as a macOS user), even for the more complicated operations.
Some more examples that I can think of off the top of my head: Proxyman, TablePlus, Kaleidoscope, Tower. The only exception to my observation, based on tools in my daily arsenal: VSCode. Non-standard UX, yet still intuitive.
Everything else that's non-standard feels like I'm battling with the UI daily, even after years of use: Android Studio, Slack, and most of the complicated Electron apps.
Testing on my iPhone 15 Pro: I couldn't find it in the App Store via search, but I looked up the developer and was able to download it from their page. Working so far. The first image took a while (a few minutes, as the app warned me), but subsequent images were a bit faster (~1.5 minutes). The phone does get pretty warm, though.
30 steps at 512x512 resolution (SD v1.5) should take around 35 seconds on an iPhone 15 Pro with Draw Things. 1.5 minutes is too slow. (I am the author of DT.)
It could indeed be faster. The app does not currently use the Neural Engine (ANE) because the ANE has a tendency to crash the app, so it uses only the CPU and GPU. The app also does upscaling, which adds ~10 seconds.
I am going to put the model-related code we use in a public repo soon (it is very similar to https://github.com/liuliu/swift-diffusion but in NHWC format). ANE would be around 25s if it ran. DT's default only uses the GPU, and the 35s is on the GPU (yes, as you said, upscaling would take an extra 10s).
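For readers following along, the ANE-vs-GPU tradeoff in this exchange comes down to Core ML's compute-unit setting. A sketch using the standard MLModelConfiguration API; the comments reflect the numbers quoted above:

```swift
import CoreML

let config = MLModelConfiguration()
// The app's current choice, per the dev above: avoids the ANE crashes
config.computeUnits = .cpuAndGPU
// What would target the ~25s ANE figure, if it ran reliably:
// config.computeUnits = .cpuAndNeuralEngine
// Or let Core ML schedule across CPU, GPU, and ANE:
// config.computeUnits = .all
```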
The description is written by the software developer. The "Information" section near the end lists the requirements for supported devices. The iPhone and iPad entries say "Requires iOS 17.1 or later and a device with the A17 Pro chip or later", and the first iPhones with an A17 Pro chip or later are the iPhone 15 Pro and iPhone 15 Pro Max.
I tried using it to generate some sprites for a game I've been thinking about. It kept telling me it couldn't show me the image because it wasn't safe (I asked for robot pirates). I couldn't see a way to turn off the NSFW protection. Uninstalled it :(
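The app doesn't expose a toggle, but the underlying Apple library does take a safety flag at pipeline construction, so a build from source could skip the filter. A sketch, with the resource path as a placeholder:

```swift
import Foundation
import CoreML
import StableDiffusion

let resourceURL = URL(fileURLWithPath: "/path/to/Resources")  // placeholder
let pipeline = try StableDiffusionPipeline(
    resourcesAt: resourceURL,
    configuration: MLModelConfiguration(),
    disableSafety: true,   // skips the NSFW safety checker entirely
    reduceMemory: true)
```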
Slightly OT, but is there a decent setup for sprite generation out there? Non-phone, I mean. There has certainly been some work on maintaining a consistent style, and even consistent subjects, across runs; does 'character A walking frame 1, character A walking frame 2', etc., work anywhere yet?
This is a surprising thing when first working with these models (especially ones implementing NSFW filters, which are noisy). If you check civitai.com you'll see that there's a lot of... well... porn. There are many LoRAs to download, and a very useful one turns out to be the clothing slider: while I think the intention is to remove clothing, it is just as helpful for adding clothing. Unfortunately, this app doesn't look to support LoRAs, which are essential to getting many of the high-quality images you see floating around.
My guess here is that the model was just trained on too many sexy pirates (these models also have a propensity for producing Asian women, which this one seems to share). It does look like they support negative prompts, but it requires using "##" to separate positive and negative. Interesting design choice. You'll find these negative prompts helpful: disfigured, low quality, child, sexy, nude, extra limbs, ugly hands, and anything in the same vein. What works best depends on the base model, and there is variance between different positive prompts. You may also have more success with something like automatic1111: as long as you're comfortable doing a git clone (you're on HN, so I assume you are), it's a better interface, though I don't know whether there's an Apple ARM build or if bare-metal support has improved since I last checked.
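If anyone is curious what that "##" convention implies mechanically: the library's configuration struct has separate positive and negative prompt fields, so the app presumably just splits on the marker. A hypothetical sketch; the split-and-trim logic is my guess at the app's behavior, not its actual code:

```swift
import Foundation
import StableDiffusion

func makeConfig(from input: String) -> StableDiffusionPipeline.Configuration {
    // Split e.g. "robot pirates ## disfigured, low quality" into the two halves
    let parts = input.components(separatedBy: "##")
    var cfg = StableDiffusionPipeline.Configuration(
        prompt: parts[0].trimmingCharacters(in: .whitespaces))
    if parts.count > 1 {
        cfg.negativePrompt = parts[1].trimmingCharacters(in: .whitespaces)
    }
    return cfg
}
```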
> The iPhone 15 Pro and Pro Max can be configured with up to 1 TB of storage.
On the flip side, the base-model iPhone 15 Pro apparently comes with zero storage upgrades: you get the same 128 GB of base storage as the regular 15, and the 256 GB entry model is reserved for the Pro Max. A bit surprising, given that iPhones don't support expandable storage.
My point is that, like the iPhone 15 Pro Max, they should increase the minimum storage option. The base spec is insulting for someone who pays for a "Pro" tier phone.
- [1] https://github.com/apple/ml-stable-diffusion
- [2] https://github.com/huggingface/swift-coreml-diffusers