How come you always have to install some version of PyTorch or TensorFlow to run these ML models? When I'm only doing inference, shouldn't there be easier ways of doing that, with automatic hardware selection etc.? Why aren't models distributed in a standard format like ONNX, and inference on different platforms solved once per platform?
>How come you always have to install some version of PyTorch or TensorFlow to run these ML models?
The repo is aimed at developers and has two parts. The first adapts the ML model to run on Apple Silicon (CPU, GPU, Neural Engine), and the second allows you to easily add Stable Diffusion functionality to your own app.
If you just want an end user app, those already exist, but now it will be easier to make ones that take advantage of Apple's dedicated ML hardware as well as the CPU and GPU.
>This repository comprises:
>
>- python_coreml_stable_diffusion, a Python package for converting PyTorch models to Core ML format and performing image generation with Hugging Face diffusers in Python
>
>- StableDiffusion, a Swift package that developers can add to their Xcode projects as a dependency to deploy image generation capabilities in their apps. The Swift package relies on the Core ML model files generated by python_coreml_stable_diffusion
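To give a rough idea of what that first part involves, here is a minimal sketch of a PyTorch-to-Core ML conversion using coremltools on a toy model. The repo's python_coreml_stable_diffusion package wraps a much more involved version of this for the Stable Diffusion sub-models (text encoder, UNet, VAE decoder), so treat the model, shapes, and file names here as illustrative placeholders, not the repo's actual code:

```python
# Illustrative sketch: convert a toy PyTorch model to Core ML with coremltools.
# The model, input shape, and output name are placeholders.
import torch
import coremltools as ct

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3, padding=1),
    torch.nn.ReLU(),
)
model.eval()

example_input = torch.rand(1, 3, 64, 64)
# Core ML conversion works from a traced (or scripted) TorchScript model.
traced = torch.jit.trace(model, example_input)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=(1, 3, 64, 64))],
    convert_to="mlprogram",  # newer ML Program format, saved as a .mlpackage
)
mlmodel.save("ToyModel.mlpackage")
```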
That's done in professional contexts. When you only care about inference, onnxruntime does the job well (including for Core ML [1]).
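For instance, here is a minimal sketch using onnxruntime's Python API, assuming a build where the CoreML execution provider is available (onnxruntime falls back to CPU for anything it can't place there); the model file and input shape are placeholders:

```python
# Illustrative sketch: run an exported ONNX model with onnxruntime,
# preferring the CoreML execution provider and falling back to CPU.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # placeholder path to an exported ONNX model
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```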
I imagine that here Apple wants to highlight a more research-oriented/interactive use, for example to allow fine-tuning SD on a few samples from a particular domain (a popular customization).
Most models seem to be distributed by/for researchers and industry professionals. Stable Diffusion, for example, is state-of-the-art technology.
People who can't get the models to work by themselves given the source code aren't the target audience. There are other projects, though, that do distribute quick and easy scripts and tools to run these models.
Apple stepping in to get Stable Diffusion working on their platform is probably an attempt to get people to take their ML hardware more seriously. I read this more like "look, ma, no CUDA!" than "Mac users can easily use SD now". This project seems to be designed so that the upstream SD code can easily be ported to macOS without special tricks.
Seconded. I wish there were a way to work with ML models from native code rather than through some Python scripting interface. I believe TensorFlow gets there with C++, but it works only from C++ and not through an FFI.
It would increase my interest in experimenting with these models 1000% at the least. I really can't be bothered to spend hours fucking around with pip/pipenv/poetry/virtualenv/anaconda/god knows what other flavour of the month package manager is in use. I just want to clone it and run it, like a Go project. I don't want to download some files from a random website and move them into a special directory in the repo only created after running a script with special flags or some bullshit. I want to clone and run.
It's one of the reasons I recently ported the Whisper model to plain C/C++. You just clone the repo, run `make [model]` and you are ready to go. No Python, no frameworks, no packages - plain and simple.
Apple has their own mlmodel format, but they can't distribute this model as a direct download due to the model's EULA. The first task is to translate the model.
I mean, it is a legal time bomb in general[0], with a non-standard license that has special stipulations in an amendment. Do you really want to incur the weeks of lead time it would take Legal to review the legality of redistributing this model?
Not really. You're not responsible for how users use products you distribute. The license is passed along to them; they would be the ones violating it.
No, I'm not. If you have supporting precedent for your position (that a licensor can be held liable for the unpreventable actions of a licensee), I would like to see it.
In professional contexts (apart from individual apps distributed by small creators / indiehackers), models are usually run in native code (typically C++) using standardized runtimes: TensorRT (for Nvidia devices), onnxruntime (vendor-agnostic), etc.