I think that probably is to be expected, unfortunately. The model is 2GB; per the article iOS will kill apps that use 2GB on 4GB RAM devices (and the 13 Mini is 4GB).
It works just fine on my iPhone 12 Mini, which has (basically) the same RAM, so it seems like it's something else. They're essentially the same phone in general, though, so I would be surprised if it were a hardware issue.
I did have trouble closing the "adjustments" dialog (upper-right button) due to its close button being underneath the status bar, but found that I could just drag the dialog down to the bottom and it closed.
This is absolutely incredible. It takes about 45 seconds to generate a whole image on my iPhone SE3, which is about as fast as my M1 Pro MacBook was doing it with the original version!
SE 3rd Gen has 4GiB RAM, therefore the app defaults to 384x384 size. This is about 1/2 the computation of your normal run (512x512), and the original version uses the PLMS sampler, which defaults to 50 steps, while this one uses the newer DPM++ 2M Karras sampler, which defaults to 30 steps. All in all, your M1 Pro MBP is still about 4x your SE 3rd Gen in raw performance (although my implementation should be faster than PyTorch, about 2x on M1 chips).
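Back-of-the-envelope, assuming per-image compute scales with pixels × steps: (384² × 30) / (512² × 50) = 4,423,680 / 13,107,200 ≈ 0.34, so the SE's default run is roughly a third of the work of a 50-step 512x512 generation.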
For what it's worth, you can decrease the resolution and use the aforementioned sampler on the PyTorch versions too. The AUTOMATIC1111 web UI supports this, for instance.
I would also welcome the additional optimizations, however.
Extreme respect to the developer for not including the "industry standard" clientside tracking/analytics/phone-home in this app. The fact that this runs locally on-device and doesn't send any information to anyone anywhere about what you're doing on your local device is wonderful.
All apps used to be like this, and now the ones that actually respect user privacy are a rare and glorious exception. Thank you!
It would be awesome if it just quickly took a picture from the front camera for that particular request and then filtered it to finish the "in the style of Andy Warhol".
I have generated some images, and I think it takes less than 1% of the battery per image. That's already much better than most games (for having fun).
It took my battery from 80% to 77% for one generation on the default settings (384², 30 iterations). Less than a minute of compute time to complete a generation.
iPhone battery health reports the battery at 100% health. This is an iPhone SE3.
Amazing how huge the difference in energy consumption is for the system in standby vs going full throttle.
EDIT: I generated 3 more images; every subsequent generation reduced the battery level by another 2%. My phone doesn't seem to heat up at all, interestingly.
Late to the party, but depending on your charger voltage, it may be. E.g. I can charge my MBP on a cheap USB-to-USB-C charger, but as soon as I use it _too_ much, the charge will stall or, worse, drop.
I reached 3Gbps over Verizon 5G in San Antonio last year, and this year I get about 4Mbps over Verizon 5G in Ohio. It's so bad I disabled it. I did read an article that the iPhone 12 (which is what I have) has some kind of radio issue with 5G. Can anyone in here confirm?
Verizon has been real wonky all over Austin. If there's more than a handful of people in the area, bandwidth just goes to the crapper. I'll get a couple hundred Mbps on a good day with no clouds/wind/holding my phone just right in the right spot, but usually get less than 1Mbps on their 5G UW.
> It took a minute to summon the picture on the latest and greatest iPhone 14 Pro, uses about 2GiB in-app memory, and requires you to download about 2GiB data to get started. Even though the app itself is rock solid, given these requirements, I would probably call it barely usable.
also
> Even if it took a minute to paint one image, now my Camera Roll is filled with drawings from this app. It is an addictive endeavor. More than that, I am getting better at it. If the face is cropped, now I know how to use the inpainting model to fill it in. If the inpainting model doesn’t do its job, you can always use a paint brush to paint it over and do an image-to-image generation again focused in that area.
Seems very worth a try. I'm downloading the model right now, it's going a bit slow, ~2MB/s.
This is super cool. I just tried the default prompt on my iPhone 13 with the image size set to 768x512 and using the 3D Model (Redshift v1), and it just crashed the whole phone and restarted it. Just like when I get BSODs at work on my Windows GPU desktop :)
Porting FlashAttention to Metal will be quite hard, because for performance reasons it does a lot of shenanigans to respect the memory hierarchy.
Thankfully, you can probably do something slower but more adapted to your memory constraints.
If you relax this need for performance and allow some re-computation, you can write a qkvatt function which takes q, k, v and a buffer to store the resulting attention, and computes it without needing any extra memory.
The algorithm is still quadratic in time with respect to the attention horizon (although with a bigger constant (2x or 3x) due to the re-computation), but it doesn't need any extra memory allocation, which makes it easy to parallelize.
Alternatively, you can use an extra memory buffer of O(attention horizon × number of threads in parallel) (like FlashAttention) to avoid the re-computation.
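To make the re-computation variant concrete, here is a minimal CPU-side sketch in Swift (assuming square self-attention with Q, K, V as arrays of rows; the real thing would be a Metal kernel, but the structure is the same). Each query row makes three passes over the keys, so the constant is roughly 3x, and the only scratch is O(1) per row:

```swift
import Foundation

func qkvAttention(q: [[Float]], k: [[Float]], v: [[Float]], out: inout [[Float]]) {
    let n = q.count, d = q[0].count
    let scale = 1 / Float(d).squareRoot()
    // Dot product of query row i with key row j, recomputed on demand
    // instead of being stored in an n x n attention matrix.
    func score(_ i: Int, _ j: Int) -> Float {
        var s: Float = 0
        for t in 0..<d { s += q[i][t] * k[j][t] }
        return s * scale
    }
    for i in 0..<n {  // rows are independent, hence trivially parallel
        // Pass 1: row max, for a numerically stable softmax.
        var rowMax = -Float.infinity
        for j in 0..<n { rowMax = max(rowMax, score(i, j)) }
        // Pass 2: softmax denominator.
        var denom: Float = 0
        for j in 0..<n { denom += exp(score(i, j) - rowMax) }
        // Pass 3: recompute each weight and accumulate into the output buffer.
        for t in 0..<d { out[i][t] = 0 }
        for j in 0..<n {
            let w = exp(score(i, j) - rowMax) / denom
            for t in 0..<d { out[i][t] += w * v[j][t] }
        }
    }
}
```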
Concerning the backward pass, it's the same story: you don't need extra memory if you are willing to do some re-computation, or you can use memory linear in the attention horizon to avoid the re-computation.
One interesting thing to notice about the backward pass is that it doesn't use the attn of the forward pass, so that doesn't need to be preserved (only Q, K, V do).
One little caveat of the backward pass (which you only need for training) is that it needs atomic_add to be easy to parallelize. This means it will be hard on Metal (AFAIK it doesn't have atomics for floats, though it does have atomics for integers, so you can probably use fixed-point numbers).
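A hedged sketch of that fixed-point trick, written in Swift with the swift-atomics package for illustration (in a Metal shader you would do the same thing with atomic_int and atomic_fetch_add_explicit); the 16-fractional-bit scale is an arbitrary choice:

```swift
import Atomics  // assumption: the swift-atomics package is available

// Accumulate float contributions via integer atomics: scale to fixed point,
// add atomically, divide back when reading. Overflow handling omitted.
let fixedPointScale: Float = 65536  // 16 fractional bits
let accumulator = ManagedAtomic<Int32>(0)

func atomicAdd(_ x: Float) {
    let fixed = Int32((x * fixedPointScale).rounded())
    accumulator.wrappingIncrement(by: fixed, ordering: .relaxed)
}

func accumulatedValue() -> Float {
    Float(accumulator.load(ordering: .relaxed)) / fixedPointScale
}
```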
CPU offloading doesn't help because Apple already has a shared-memory architecture. The head slicing is similar to https://machinelearning.apple.com/research/neural-engine-tra... I think it would be quite practical only if MPSGraph were less mysterious about its allocation strategy. It is not the ideal way, though. Ideally, FlashAttention / xFormers is the way to go.
It works extremely fast on an iPad Pro M1 (kind of expected, but it's _impressive_), although the app is built as "iPhone only", and strangely enough the iPad crops the upscaled iPhone app so the lower bar of image controls doesn't show at all, which is a pity.
Yup, done. I thought the author would see it better here (it also makes the issue visible to other people stumbling on it), but I have contacted them separately explaining the issue.
Memory compression is a generalization of swap, which is only for dynamic memory; files on disk don't need it because you can just read them off the disk.
The problem is that GPUs don't support virtual memory paging, so they can't page in files, decompress, or swap anything unless you write it yourself, which is a lot slower.
Also, ML models (probably) can't be compressed because they already are compressed; learning and compression are the same thing!
Wait. This comment just blew my mind. Does that imply that you might be able to measure the efficiency of a model by its compressibility? Note, I'm trying to recognize that efficient and accurate are not the same. One could imagine evaluating a model on a 2D performance-and-compression map somehow.
> Also, ML models (probably) can't be compressed because they already are compressed; learning and compression are the same thing!
I feel like they're kind of two sides of the same coin: learning is about putting more information in the same data, while compression is about putting the same information in less data.
I'm wondering if some lossy floating-point compressor (such as zfp) would work.
> I'm wondering if some lossy floating-point compressor (such as zfp) would work.
Well, apparently this can work; Stable Diffusion comes in 32-bit and 16-bit float versions. I'm kind of surprised they both work, but that's lossy compression.
Sure, but 16-bit float is pretty primitive compression, as it does not exploit any redundancy in the input. zfp groups numbers together in chunks, which means that correlated numbers can be represented more precisely. Its algorithm is described here: https://zfp.readthedocs.io/en/release1.0.0/algorithm.html#lo...
I would like to see whether zfp can be applied to something like Stable Diffusion (or other ML models) and give better results than plain floats at the same size.
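For intuition about the grouping idea (this is not zfp's actual transform, just a toy block-floating-point scheme): store one shared exponent per small block plus an 8-bit mantissa per value, so a group of similarly-sized weights costs about 9 bits each instead of 32. A hedged Swift sketch:

```swift
import Foundation

// Toy block floating point: one shared exponent per block, 8-bit mantissas.
func compress(block: [Float]) -> (exponent: Int, mantissas: [Int8]) {
    let maxMag = block.map { abs($0) }.max() ?? 0
    let e = maxMag > 0 ? Int(ceil(log2(Double(maxMag)))) : 0
    let scale = Float(pow(2.0, Double(7 - e)))  // 7 mantissa bits + sign bit
    return (e, block.map { Int8(clamping: Int(($0 * scale).rounded())) })
}

func decompress(exponent e: Int, mantissas: [Int8]) -> [Float] {
    let scale = Float(pow(2.0, Double(7 - e)))
    return mantissas.map { Float($0) / scale }
}
```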
Memory compression? I can't find any good resources to read about it, any hints? I'm having trouble imagining how it could possibly work without totally destroying performance.
It doesn't destroy performance for the simple reason that nowadays memory access is slower than pure compute. If you need to use compute to produce some data to be stored in memory, your overall throughput could very well be faster than without compression.
There has been a large amount of innovation in fast compression and fast decompression in recent years. Traditional compression tools like gzip or xz are geared towards higher compression ratios, but memory compression tends to favor speed; LZ4 is a well-known example of that speed-first family.
It is not compressed swap, the compressed data is still in RAM. The OS just compresses inactive memory, with a couple of criteria to define “inactive”.
iOS uses memory compression but not swap. iOS devices actually have special CPU instructions to speed up compression in up-to-page-size increments, specifically to aid this model [1].
It is not as useful for this case (inference) because the long-held activations (the UNet holds the downsampling passes' activations and uses them for upsampling) are not that much memory (in the range of a few megabytes). For training, it would probably be more useful.
In-memory compression means the memory is inherently dirty memory
On Apple platforms if you mmap a read-only file into the process address space, then it is "clean" memory. It is clean because the kernel can drop it at any time because it already exists on disk. You essentially can offload the memory management to the kernel page cache.
The downside is that if you run up to the limit and the "working set" can't fit entirely in memory, then you run into page faults which incur an I/O cost.
The advantage is that the kernel will drop the page cache before it considers killing your process to reclaim memory.
That said, I don't know the typical access patterns for neural network inference, so I don't know how the page faults would affect performance.
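For reference, on Apple platforms the mapping itself is one line in Swift (the path below is a placeholder):

```swift
import Foundation

// .mappedIfSafe mmaps the file instead of copying it into dirty memory:
// pages fault in lazily and can be dropped and re-read under pressure.
func mapWeights(at url: URL) throws -> Data {
    try Data(contentsOf: url, options: .mappedIfSafe)
}

// Usage: let weights = try mapWeights(at: URL(fileURLWithPath: "/path/to/model.bin"))
```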
This isn't recommended; the decoding takes as much time as processing the next step. I learned that the hard way when I tried displaying the intermediate steps for debugging.
Yeah, running the full decoder takes a while. Though, since the "latent" is just 4 channels and pretty close to representing RGB, you can use a linear combination of latent channels and get a basic (grainy, low-res) preview image like this [0] without much trouble. I expect you could go further and train a shallow conv-only decoder to get nicer preview results, but I'm not sure if anyone's bothered yet.
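Roughly what that looks like, as a hedged Swift sketch: a fixed 4-to-3 linear map applied per latent pixel. The weights here are placeholders, not the community-fitted coefficients; those are obtained by regressing the VAE decoder's RGB output against latents.

```swift
// Map one latent pixel's 4 channels to RGB with a fixed 4x3 matrix `w`
// (placeholder values), then squash roughly into [0, 1] for display.
func previewPixel(_ z: [Float], w: [[Float]]) -> [Float] {
    precondition(z.count == 4 && w.count == 4)
    var rgb: [Float] = [0, 0, 0]
    for c in 0..<3 {
        for i in 0..<4 { rgb[c] += w[i][c] * z[i] }
        rgb[c] = min(max((rgb[c] + 1) / 2, 0), 1)
    }
    return rgb
}
```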
I gave this and other available applications a try, and I don't understand what people see in AI image generation.
A simple prompt generated a person with 3 nostrils, 7 squished fingers, and deformities everywhere I look; it just mashes a bunch of photographs together and generates abominations.
Pay close attention to generated models and you will find details which are simply wrong.
Early cars were terrible too, but here we are. The promise is that future versions of the technology will be able to draw anatomically correct people and images: a computer program that can do in mere minutes what takes a person hours. If you've never wanted a picture of something you can describe but aren't able to draw, then there is no use case for you. For anyone else who has interacted with the world of art and graphic designers or used stock photos, this goes an order of magnitude faster, and is basically free, compared to hiring a skilled professional for hours. It's a game changer for an industry that it sounds like you've just never interacted with.
They were. They were loud and stinky and were unsuitable for dirt roads, spooking horses, causing the UK to basically ban them. Some were powered by steam or coal but those that were powered by gas had a different problem - there were no gas stations. You had to hand crank them to start. Moving goods and people around was already a solved problem with horses and trains and boats.
Cars then: take enormous energy to move very little, and slowly. Main use case then was as a rich person's toy (entertainment). They'll never replace work horses with them.
It's easy, in hindsight, to see cars as inevitable. But you had to see past the shortcomings of the earliest cars to "get it", much like you have to see past the 3 armed monstrosities that current image generation techniques produce and see the promise of the technology. There were undoubtedly those who saw cars as hype, much like image generation is seen today; I'm sure buggy whip manufacturers saw cars as hype and refused to get on what looked like a hype train to them.
I can't speak for others, but I've personally been quite impressed by the DALL-E output. It creates things that would take me hours (if not days) to create, which no other tech I've tried has been able to generate. It feels like it can absolutely replace at least the stock photo industry. It's also terrific for things like blog photos if you don't have the time or talent to create something yourself but want some creative control.
Extensions like DreamBooth, which let you fine-tune the system with your own submitted images, are also quite amazing. Being able to give it just a few photos and say things like "show me surfing in the ocean" and get a reasonable image back.
_Much_ more broadly, this space in AI/ML with GPT-3/DALL-E is exciting because it feels kind of like what the internet was made for. There's too much data on the internet for any one person to ever meaningfully process. But a machine can. And you can ask that machine questions. And instead of getting just a list of references back, you get an "answer". Image generation is the "image answer" part of this system. It's an exciting space because it feels like these systems will affect large chunks of how we use computers.
The 3 nostrils, 7 squished fingers are not that big of a problem, you can run other image enhancing AIs on top of the generated images to fix that, or just use inpainting and give it a few more tries to get it right. The models are also slowly getting better at it.
> What is the use case that I’m missing?
It's generating images from nothing more than a text description; a year ago that was something you'd only see on Star Trek. Now it's real, and we have barely scratched the surface of what is possible.
The images still need some manual work, but try to generate images of that quality and complexity by hand and you might have more appreciation how mindblowing it is that AI can not only do it, but do it in seconds.
On some of the homegrown models (https://rentry.co/sdmodels) these things are already fixed. For the Stable Diffusion "enthusiasts", the tools and models have improved at least 100% since the original release.
It's more of a cool technology that is rapidly advancing. A couple years ago, it couldn't do this much. A couple years from now, it will be much better. It does much more than mash images together, which you would know if you dug into it a bit. That's it, that's the whole thing.
There needs to be some sort of piece or filter that understands body geometries and inverse kinematics to prevent things like generating people with 3 limbs or joints in positions that would not normally be feasible without injury =). It'll come.
Nothing. It's IMHO just a hype of the younger nerdy generation. The real-world applications of NN-based (there is no I in AI) image generation are limited.
One hype comes, another hype goes. IMHO it did not come to stay ;-)
I haven't been able to get any good results with Stable Diffusion (via DiffusionBee on my M1 MacBook Air), but I've seen really good images of other AI generators like Midjourney.
Apple's mobile processors (especially the M1) are waaay faster than a Steam Deck. Even with the optimizations, I bet it would take like half an hour to run on the Deck.
Any tips on how you do "If the face is cropped, now I know how to use the inpainting model to fill it in. If the inpainting model doesn't do its job, you can always use a paint brush to paint it over and do an image-to-image generation again focused in that area." using the app?
Were you focused on just making it work on the iPhone, or do you think you will keep adding functionalities to the app? Do you think it will ever be possible to train one's own model on an iPhone?
I think that fine-tuning the whole model (a.k.a. DreamBooth) on an iPhone would require more RAM / processing power than it currently has. A more viable path is to implement Hypernetwork + Textual Inversion, which is within the capabilities of today's hardware.
Even though I understood very little of that it was still wild fun reading it. I'm glad such wizards exist, because I and most people I know certainly don't qualify.
On a related note, has anyone been able to utilize Apple silicon GPUs? Running CPU-only is incredibly slow (and sad, since I've got these Apple accelerators sitting idle!).
Is there an option to automatically save every image, not to the camera roll but to the local app folder (the same folder that contains this app's data)?
the developer is about to have a MASSIVE hosting bill
The download restarts from 0% if the app is sent to the background, as there does not seem to be a download manager. This is especially problematic for the large 1.5GB file.
I am using Cloudflare R2, which doesn't have egress fee and I am getting about 5k Class B operations right now. Unless Cloudflare changes their end of the deal, I think it is OK.
Great to hear! Please consider introducing a mechanism to suspend the download instead of restarting it; this may be especially valuable for those with slower connections. With the traffic growth you'll be seeing, chances are Cloudflare's enterprise team will soon be in touch ;)
Is P2P torrent type of sharing the load possible under AppStore guidelines (be they iOS or Android)? I've honestly never even thought about this being a thing, but with large shared data that doesn't change, why not?
P2P, as in hosted from other people's phones? I think the issue is that people generally wouldn't be happy with P2P data uploads from their phones (compared to P2P on desktop, where internet is cheaper/faster, and battery isn't an issue).
On my iPad mini 5th generation with A12 the download is fast and fine. But with standard settings it first warns “Device capability warning” and then indeed crashes every time.
Is there a way to solve this?
A12 chip should work, no?
Give it a couple of weeks. Still playing around to see what the optimal UI looks like for such a large screen. 16GiB should be able to generate several images to select from at once.
I have an iPad, the regular non-Air/Pro/M1 one, and when I installed and ran the app, it said "could be device incompatibility" and subsequently crashed.
Does it crash upon downloading models, or when generating? I haven't tested on all the devices, but the 11 Pro seems to have 4GiB RAM and should run at 384x384 resolution (check whether that is the selected resolution at top right).
There are reports that iOS is not happy with how I compute SHA256 for the downloaded model file by loading it all into memory, on the XR (3GiB RAM). If this is happening on other devices, I may need to do streaming hash computation and put up a bugfix.
Yeah, I thought Data(contentsOf:) already did that, but it appears not (tested; it indeed allocated all the memory to load the data). Adding `.mappedIfSafe` to the reading options solved this.
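For anyone curious, streaming the hash is straightforward with CryptoKit as well; a sketch (the 1MiB chunk size is arbitrary):

```swift
import CryptoKit
import Foundation

// Hash the file in chunks so peak memory stays at one chunk, not the
// whole multi-GiB model file.
func streamingSHA256(of url: URL, chunkSize: Int = 1 << 20) throws -> String {
    let handle = try FileHandle(forReadingFrom: url)
    defer { try? handle.close() }
    var hasher = SHA256()
    while let chunk = try handle.read(upToCount: chunkSize), !chunk.isEmpty {
        hasher.update(data: chunk)
    }
    return hasher.finalize().map { String(format: "%02x", $0) }.joined()
}
```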
I frequently see such comments about sites not loading fine on an iPhone 13/14, while they continue to load just fine for me on a 4-year-old Android device (not Pixel / Samsung).
I wonder if it's the hardware or just the blockers that I use. Might be worth trying blockers to see if they make the general browsing experience better on Apple devices.
This is amazing. I'm kind of surprised that it doesn't have an NSFW image blocker. I want to be able to generate NSFW images but it probably should have one enabled by default.
Update: Draw Things uses “One-time photo selection” which according to Settings > Privacy & Security > Photos “Even if your photos were recently shown to you to select from, the app did not have access to your photo library.” Still, I didn’t realize apps could save to Photos without explicitly asking permission.
I don’t recall giving “Draw Things” permissions to access my photo library, yet the app is able to save to my photo library without prompting and able to read existing images.
I may have misunderstood what permissions apps should ask for when saving to the photo library.
I use PHPickerViewController: https://developer.apple.com/documentation/photokit/phpickerv... which runs out-of-process, such that when you select a photo into the app, I have no access to any information about your other photos, and the location info is stripped from what PHPickerViewController passes to me.
When saving the photo, I only use UIImageWriteToSavedPhotosAlbum (https://developer.apple.com/documentation/uikit/1619125-uiim...), which asks for permission to write to the album, not read permission (they are separate). There are more things I could do if I had read permissions (like create a "Draw Things" collection and save to that, rather than saving to the generic Camera Roll). Ultimately I decided not to do that because I don't want more permissions than I minimally, absolutely need.
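For anyone unfamiliar with these APIs, the minimal-permission shape looks roughly like this (a sketch with my own helper names, not the app's actual code):

```swift
import PhotosUI
import UIKit

// Out-of-process picker: the app never gets library read access; iOS hands
// back only the photo the user explicitly picked.
func presentPicker(from vc: UIViewController, delegate: PHPickerViewControllerDelegate) {
    var config = PHPickerConfiguration()
    config.filter = .images
    config.selectionLimit = 1
    let picker = PHPickerViewController(configuration: config)
    picker.delegate = delegate
    vc.present(picker, animated: true)
}

// Add-only save: triggers the "Add Photos" write prompt, no read permission.
func save(_ image: UIImage) {
    UIImageWriteToSavedPhotosAlbum(image, nil, nil, nil)
}
```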
congratulations to liuliu on the launch!