I don't quite understand your criticism here. It's running stable diffusion on your computer via your browser. How would it do this without downloading it and then loading it into RAM?
It'll be the same size download and use about the same RAM if you download it and run it directly without using a browser.
It's not a criticism. I'm just pointing it out. For good or bad, it is what it is. There are two sides to it. For anyone familiar with systems theory, this was the inevitable end game of the web.
However, the web also has a terrible bloat/legacy issue it refuses to deal with. So sooner or later, a new minimal platform will grow from within it, the way the web started in the 90s as the then humble browser. And the web will be replaced.
I had great expectations of WASM, and maybe it can evolve into what we're discussing. But as it is, this system is too limited in two crucial ways.
First, it's explicitly designed to be easy to port existing software to. Like C libraries, say. Sounds good, right? Well, it's not designed as a platform that arose from the needs of users, like the web did, but from the needs of developers porting software, who previously compiled C libraries to JS gibberish and had to deal with garbage-collection bottlenecks, etc. That scope seems fairly narrow. WASM has almost no contact surface with the rest of the browser aside from raw compute. It can't access the DOM, the GPU, or anything else directly (last I checked).
Second, for safety reasons, they eliminated arbitrary jumps from the code. Instead it's structured, like a high-level language, with an explicit call stack and everything. Which is great, except this is a complete mismatch for new-generation languages like Go and Rust, which focus heavily on concurrency: coroutines, generators, async functions, etc. Translating code like that to WASM requires workarounds with a significant performance penalty.
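As a rough sketch of that mismatch (plain JS here, not WASM itself, and the names are made up): a generator suspends mid-function, which a structured target without arbitrary jumps has to emulate with explicit saved state instead of a paused stack.

```javascript
// A generator suspends in the middle of its body between next() calls...
function* counter() {
  let i = 0;
  while (true) yield i++;
}

// ...which a structured, single-stack target must emulate by hoisting the
// paused "stack frame" into explicit state that survives between calls.
function makeCounter() {
  let i = 0; // the saved local, kept in a closure instead of a paused stack
  return { next: () => ({ value: i++, done: false }) };
}

const gen = counter();
const sm = makeCounter();
console.log(gen.next().value, sm.next().value); // 0 0
console.log(gen.next().value, sm.next().value); // 1 1
```

Tools like Emscripten's Asyncify perform this kind of transformation automatically, but at the cost of extra code size and runtime overhead.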
WASM can access the GPU via middle layers like SDL. I.e. you can write a C program that uses OpenGL and compile it, and as long as you pass the right flags to `emcc`, you will barely need to touch or write any glue on the JS side at all.
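For example, a sketch of such a build (`main.c` is hypothetical; `USE_SDL` and `FULL_ES2` are standard Emscripten settings, but check your version's docs):

```sh
# Fetches the SDL2 port, maps OpenGL ES 2 calls to WebGL, and emits the
# .wasm binary plus all the JS glue as index.html/index.js/index.wasm.
emcc main.c -O2 -sUSE_SDL=2 -sFULL_ES2 -o index.html
```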
All of those go through JS, as far as I'm aware. Emscripten bridges everything for you, but technically it's JS underneath, so SDL's calls also go through JS.
You're correct, but I don't see why that matters. From your perspective as a C coder, you get WebGL access without having to write JS; that's what's important. Everything is "mixed together" into assembly in the end anyway, whether it's glue code in JS or browser-side glue code.
I for one am very happy about it. The promise of Java's “write once, run anywhere” is mostly realized now. All we need next is for the browser to ship with interpreted-language runtimes and language-native bindings to the DOM, GPU, etc.
But why is that a criticism? I tried running SD on my computer a few months ago. I spent several hours trying to install the dependencies and eventually gave up. I'm sure it wouldn't have been a big deal for someone familiar with python but for me it was a massive hassle and I eventually failed to make it run at all.
For this one, as long as my browser supports WebGPU (which will be widely supported soon) and I have the system resources, it will run. Barely any technical knowledge needed, doesn't matter what OS or brand of GPU I have. Isn't that really cool? It reduces both technical and knowledge based barriers to entry. Why do people criticize this so strongly?
After seeing the sort of argument you are replying to countless times on HN, I came to a simple conclusion. Some people, especially on HN, just have disdain for anyone who might not be willing to deal with the hassle of running a piece of software, because making it easy invites the "lowest common denominator", and they don't want to "taint" their hobby with the presence of "normies" in what used to be their exclusive domain.
In a similar vein, you can find plenty of comments on HN faulting the massive proliferation of smartphones among the general population throughout the 2010s for "ruining" the web, software ecosystems, application paradigms, etc. There are plenty of things one could potentially criticize smartphones for, and some of that criticism indeed has merit. But this specific point about "ruining" things feels like another version of the same argument above: niche things becoming widely adopted by the masses and "ruining" their "cool kids' club."
Another similar example from an entirely unrelated domain: comic books and their explosion in popularity after Marvel movies repeatedly killed it at the box office. I don't even like Marvel movies, and have barely watched any of them, but the elitism around hating things that become more popular is just silly.
The objection (more surprise than objection) is that web browsers are supposed to be sandboxed environments. They are not supposed to be able to do things that negatively impact system performance. It is surprising that you can do things involving multiple gigabytes of RAM in a web browser. It has nothing to do with what you are using that RAM for, or whether it's cool or not.
I don't think anybody objects to making it easier to run Stable Diffusion, and I think the only way you could come to that conclusion is by intentionally misinterpreting people's comments.
> The objection (more surprise than objection) is that web browsers are supposed to be sandboxed environments. They are not supposed to be able to do things that negatively impact system performance.
I agree with the sandboxing model, but it is orthogonal to WebGPU and impacting system performance. Sandboxing is about making the environment hermetic (for security purposes and such), not about full hardware bandwidth isolation.
First, there is no way for web browsers to have zero system performance impact. Browsers already use hardware acceleration (which you can disable, thus alleviating your WebGPU concerns as well), your RAM, and your CPU.
Second, afaik WebGPU has limits on how much of your GPU resources it is allowed to use (for the exact purpose of limiting system performance impact).
Java's real success was (and still is) on the server - powering a whole generation of internet applications, and creating a cross-vendor ecosystem that stopped MS from leveraging its client dominance to take over the server space as well.
I don't believe Unix/Linux would have survived the Windows server onslaught without Java on the backend and the web on the front.
It's true that the server is where Java has been most successful, by a large margin.
But it was never Java's "original premise", which is what the comment you are replying to was about. According to their (very heavy-handed) marketing at the time, Java was supposed to be for native desktop applications and for "applets". But yeah, in the many years it took for those promises to truly become hollow, Java carved out a surprisingly robust niche for itself on the enterprise server.
Also, I am skeptical of this last sentence of yours. The thing that resisted the Windows server onslaught, broadly, was the wide range of free-as-in-speech-and-as-in-beer backend technologies, like Perl, PHP, Python, Postgres, and some other things that start with "P", as well as, yeah, Java. Java played a role, but it was just one of many.
Java was created from the beginning for embedded devices. Most people don't realize that it has been there since the beginning, on each Nokia 3310 device all the way up to most Android apps on the newest smartphones.
On the desktop we had Swing, which was OK-ish for building GUI apps (albeit still behind Borland's tooling), and then they totally lost sight of the desktop with JavaFX, which was created without listening to the community and then abandoned, while also refusing to improve Swing. Quite a pity.
> Java was created from the beginning for embedded devices
This is technically not true, as far as I know. That whole idea of Java ME, the different "profiles", all that stuff happened around 1998, which is definitely not "from the beginning". Though, looking it up now, apparently the Java Card stuff got started a little earlier than that (which I didn't know/notice at the time, probably because it apparently wasn't initially a Sun initiative, so I'm guessing Sun's self-promotion didn't mention it in the really early days).
But depending on what your point is, maybe my first paragraph is merely a technical quibble, not a substantive disagreement. Maybe your point is that Java's success has been, in part, due to its ubiquity in small-but-not-tiny devices like "feature phones". Fair enough, I guess, and if that's your point then it doesn't really matter if it was truly "from the beginning", or just "one of the earliest pivots" (which I think is more accurate).
Myself, my point is that DrScientist's reply to quickthrower2 is, as a reply, just straight-up wrong wrong wrong. Java's original premise was twofold: web applets, and desktop apps that didn't need maximum performance (note that Swing was not the original Java GUI toolkit; I've forgotten the name of the thing that preceded it, but Swing was certainly much better). Building servers was NOT part of Java's original premise. And quickthrower2 is right: the web ate that original premise. Java had to pivot to live, and did.
I'm getting too pedantic here, but the historical revisionism is winding me up.
Other people have pointed to relevant pages where you can read: "In 1985, Sun Microsystems was attempting to develop a new technology for programming next generation smart appliances, which Sun expected to be a major new opportunity"
This was common knowledge in that decade.
From memory, I don't recall Java being focused on the server side until much later, in the 2000s, with Tomcat and JBoss making a lot of strides; I can't say I was a fan of either. Maybe that is when the person you mention first saw Java trying to compete for whatever space was left of the web to take. I never got the impression AWT was relevant; that's why it wasn't even mentioned, as everyone seemed to be using only Swing, except for some god-awful projects in the gov domain.
For embedded developers (phones, smartcards, electronic devices, ...) it was well-established since the early days because, IMHO, it was _easy_ to use/deploy/maintain compared to other options. Even looking at the options available today, it is still near the top, although C++ has made quite a fantastic comeback with Arduino, albeit while continuing to be a pain in the rear to debug.
Definitely agree. Java massively succeeded on the server. I admit I was a Java enthusiast in 2000, and then my jobs were all C#, which is approximately Java :-) better in some ways and not as good in others.
Or Java was too heavy for the computers at the time for people to use "applets" for everyday things (i.e., go to a new website and do a thing on it).
Flash et al. also failed to catch on for long.
The web browser's success might have something to do with never-ending feature creep, as opposed to "this can do everything, but as such it's broken and vulnerable".
Applet implementations were terrible - the problem wasn't so much Java (though early pre-JIT versions were slow), but the interface between the browser and the applet.
Memory leaks abounded in particular.
Life cycle management was difficult as well.
Note that the interface had to be implemented in each and every browser separately - compounding the problem - since for applets to be viable, they had to work well on all the major browsers.
Not blaming the people who worked on it - I suspect the original design was put together in a rush, the work was under-resourced, and it required coordination between multiple parties.
There's however the important detail that this company has been doggedly working toward that end: first co-opting a competitor's browser engine (Safari's WebKit), then forking it, then taking over the web standards process and putting every possible API into the web, including access to USB devices and so on, so they can make an OS around it.
Because if it's the web, Google sees it. And if everything is the web, then Google sees everything.
And Apple had in turn co-opted KDE’s KHTML and KJS projects to start WebKit. An illustrious lineage.
(I remember several awesome hobby OS projects ported KHTML to get a really good browser back in those days. It was a really solid and portable codebase and much tidier than Firefox.)
Google doesn't see anything just because it uses web technology. For instance, the payroll system I use is a web app, but Google doesn't see my company's payroll data. What Google sees is a marketing blurb about the payroll system.
Google sees everything that is public and everything that uses their ad network, including data from apps that don't use the web at all.
We still have that in the form of PWAs. I don't mean the web "runtime"/webview, but all the cruft that Windows ships with... the endless ads, multiple UI and config layers, Office trial, the stupid games, OneDrive, Skype/Teams, the endless notifications, Bing everywhere... it's an over-monetized in your face nightmare on every fresh boot.
If not for DirectX and Windows-only games, I'd totally ditch it. Maybe when Proton gets there.
As bandwidth increases and the web sandbox matures, it’s fascinating to watch the evolution towards apps you just use rather than download and install and maintain. This will bother some but for the masses it opens a lot of doors.
I love web apps because they mean I have to trust the developer a lot less than with native apps. Of course there are still things you'd have to monitor (e.g. network requests) to fully trust any web app. A good solution could be something like OpenBSD's pledge, to allow me to prove to the user nothing malicious is possible (e.g. by disabling fetch and any new requests, even from src attributes, altogether).
As a sandbox, I especially like that there's a dropdown with a huge list of things a website/app can do. Many of them on by default, but I have total control over that. And of course the API for asking.
"Hey this game wants to use your motion controls and USB gamepad." Okay sure.
Yeah, the sandbox is nice, but it doesn't go far enough. Let's say I build a JSON viewer. Why should the page have any ability to make network requests? What I'm asking for is the ability to pledge that I'm not going to make any network requests.
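Something in this spirit already exists as a rough approximation: a Content-Security-Policy can forbid the page from making requests, and (unlike a promise made in script) the page can't loosen it afterwards. A minimal sketch, assuming the viewer ships all of its assets inline:

```html
<!-- With default-src 'none', fetch/XHR/WebSockets and loads via src
     attributes are all blocked; only the inline script/style exceptions
     below are allowed. A CSP can only be tightened, never relaxed,
     by the page itself. -->
<meta http-equiv="Content-Security-Policy"
      content="default-src 'none'; script-src 'unsafe-inline'; style-src 'unsafe-inline'">
```

The gap relative to pledge is that the user still has to inspect the policy to trust it; there's no browser UI surfacing "this page has pledged away the network."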
Yes, and I am coming round to the idea that WebGPU is useful like this for cases other than realtime interactive WebXR pages with streaming multiplayer live state and loaded up with draw calls etc. There is a simplicity to curating the experience through the browser like this and there isn't any easier way to get SD up and running so I hope these kinds of projects keep getting support. Thanks for building this OP!
AI models can get quite memory intensive while running. I have seen a primitive image improvement AI eat up over 80 GB of RAM on high res images. The data of the model itself "only" used up 4 GB.
I enabled the requested chrome://flags in Brave, but it still doesn't work. I haven't downloaded Chrome on any of my M1 Macs, and I don't plan to start now.
I tried on latest (normal) Chrome, beta and canary/nightly and enabled both options one by one and let it relaunch but still wouldn't work at all. ¯\_(ツ)_/¯
It's running on only a single thread, so I think the specs are a little less relevant. It takes about 80 s per iteration, and I ran the 4 iterations set by default, so a bit over 5 minutes.
Hold on, to run your demo does one have to click the "Load Model" button before doing anything? 'cos what I see is a form that is greyed out with the error message still at the top:
> You need latest Chrome with "Experimental WebAssembly" and "Experimental WebAssembly JavaScript Promise Integration (JSPI)" flags enabled!
Now I'm wondering whether the top message goes away once the flags are enabled?
> Hold on, to run your demo does one have to click the "Load Model" button before doing anything?
Yes. I thought it wouldn't be good if it downloaded 3.5 GB as soon as you opened the page.
>Now I'm wondering whether the top message goes away once the flags are enabled?
No, I haven't added any checks for that (and I'm not sure how the first one can even be checked properly), so it's just an info bar. Which is, admittedly, misleading.
It works on canary on M1 mac and Windows w/ an NVIDIA RTX GPU. I believe there are custom command line options that have to be passed to make it work. The MLC site has the deets that work.
Nah, I don't use Chrome so I don't have it installed. I'm not a web developer, so testing across different platforms isn't useful to me. I've used StableDiffusion before, so hacking around to make this demo work in my browser isn't particularly interesting either.
I agree with the poster 100%. I'm convinced any Google application immediately sucks up every iota of data it possibly can at install time / first launch. It’s not worth it to me either.
Why is it that implementing something in wasm stalls for so long, while doing it as a JS feature is so fast? Anyone have insights? As an outsider, it feels like wasm is being developed impossibly slowly.
Implementing something new in JS can be done relatively easily using a slow path, where you just write some privileged JS or C++ and then wrap it, without doing any optimizations. Then if it gets popular the vendors can optimize it at their own pace.
Implementing a new feature in WebAssembly is a bit more complex due to its execution model and security constraints. I expect it's also just the case that a lot of these new WASM features are very complex - promise integration is super nontrivial to get right, so are WebAssembly GC and SIMD.
JS Promises in something like their modern form were first played around with in ~2010, and it was ~2016 before browsers were shipping them natively. Good standards can take a while!
Because it basically covers what PNaCl, the Java plugin, the Flash plugin, Silverlight, and asm.js were doing.
Anything beyond those use cases is really meh, especially given how clunky compiling and debugging WASM code tends to be.
Then we have all those startups trying to reinvent bytecode executable formats on the server, as if it weren't something that has been done every couple of years since the late 1950s.
> Because it basically covers what PNaCl, the Java plugin, the Flash plugin, Silverlight, and asm.js were doing.
Right, but it doesn't right now? Like, you can't just write arbitrary code as you would with a Java plugin or a PNaCl C++ plugin. Wasm is extremely difficult to use for those use cases.
> Then we have all those startups trying to reinvent bytecode executable formats on the server, as if it weren't something that has been done every couple of years since the late 1950s.
Yes, because people really want this and the solutions have all been fraught with security issues historically.
I didn't say WASM is without flaws, I said the predecessors had flaws but that the premise is valuable, which is why we keep trying it over and over again.
Notably, the first paper is about exploitation of WebAssembly processes. That's valuable, but the flaw of previous systems wasn't that the programs running in them were exploitable - it was that the virtual machines themselves were. Some of this was due to the fact that the underlying virtual machines, like the JVM, were de facto unconstrained, and the web use case attempted to bolt constraints on after the fact; obviously WebAssembly has been designed differently.
I hope wasm sees more mitigations, but I also expect that wasm is going to be a target primarily for memory-safe languages, where these problems are already far less of an issue. And to reiterate, the issue was not the exploitation of programs but the exploitation of the virtual machines' isolation mechanisms.
Out of curiosity, what are use cases/applications of this?
So what I know is that this generates images in the browser rather than on a server. The only thing I can think of is not having to refresh the page in order to change an image or generate a new one. Which... hmm, well, that could mean websites whose visual design changes in real time? And maybe changes in a way that would be functionally relevant/useful? That does seem pretty cool, although I'm not sure how useful Stable Diffusion is for generating UI components/visual aspects of a site.
Any hardware! As long as that hardware is overpowered for the job, so that the browser overhead is acceptable. Oh and it needs internet. Oh and it needs a reasonably large screen because padding and margins. Oh and it needs quite a bit of RAM to start. Maybe not any hardware.
UNET takes about 1:10 on WebGPU and around a minute on CPU in one thread. VAE takes 2 minutes on CPU and about 10 seconds on GPU. That's probably because most GPU ops for VAE are already implemented, but for UNET they are not, so in the latter case the browser is just tossing data back and forth between GPU and CPU on each step.
If this is fast enough, then you could use it to render images locally for personal use. Websites could deliver prompts only, perhaps rendering different images for different users. At that point, what does it mean for copyrights? Is the model itself copyrighted or does the system break down?
> If this is fast enough, then you could use it to render images locally for personal use. Websites could deliver prompts only, perhaps rendering different images for different users.
That's a fascinating possibility, but we're very far from that world right now: elsewhere in the thread it's mentioned that this actively uses 8 GB of RAM. And I doubt many web designers would accept the risk that a model misinterprets a prompt, produces distorted output (like the wrong number of fingers on someone's hands), or accidentally produces sexual or violent content in a context where it's not intended.
For many generative image models today, people often pick the best of a dozen or more images, and the others that they throw away may actually be quite bad.
The quality and predictability of the models would need to be significantly higher than they are now in order to routinely illustrate websites dynamically.
But I don't want to say that we'll never get there. All of the recent models are doing things now that would have been considered inconceivable just a few years ago. (Compare https://xkcd.com/1425/ where it may even be a challenge to explain the issue behind the joke to some younger readers!)
Browser dev tools (like most developer tools, really) hook into the application flow to do additional work compared to running without them. Depending on what the application does, that can mean it has to do a lot of extra work and needs a lot of extra memory, just to process and store all the extra information.
I don't know the specifics of why the slowdown is so extreme in this case, usually it has a negligible impact. But I'm guessing it's related to what I wrote above.
Another thing that slows a lot of things down is when the application uses the console. Before you open the inspector, those methods are essentially no-ops and just get skipped, but once it's open, all of those strings have to get copied, and that can slow things down quite a bit.
This isn't unique to the web, either: adding the verbose flag to most Linux file utilities and then operating on a large set of files will be slower than without the verbose flag, too, just because printing to stdout takes time.
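One common mitigation (a generic sketch, not something from this thread; the names are made up) is to hide expensive log-message construction behind a flag, so the cost disappears entirely when debugging is off:

```javascript
// Pass a closure instead of a pre-built string: when DEBUG is false,
// the closure is never called, so the big object is never stringified.
const DEBUG = false;

function debugLog(makeMessage) {
  if (DEBUG) console.log(makeMessage());
}

const bigObject = { pixels: new Array(10000).fill(0) };
debugLog(() => `state: ${JSON.stringify(bigObject)}`); // skipped: DEBUG is false
```

This way, the only per-call cost on the hot path is a branch, regardless of whether the inspector is open.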
Even more impressively, they followed up with support for several Large Language Models: https://webllm.mlc.ai/