How come you always have to install some version of PyTorch or TensorFlow to run these ML models? When I'm only doing inference, shouldn't there be easier ways of doing that, with automatic hardware selection etc.? Why aren't models distributed in a standard format like ONNX, and inference on different platforms solved once per platform?
>How come you always have to install some version of PyTorch or TensorFlow to run these ML models?
The repo is aimed at developers and has two parts. The first adapts the ML model to run on Apple Silicon (CPU, GPU, Neural Engine), and the second allows you to easily add Stable Diffusion functionality to your own app.
If you just want an end user app, those already exist, but now it will be easier to make ones that take advantage of Apple's dedicated ML hardware as well as the CPU and GPU.
>This repository comprises:
>
>- python_coreml_stable_diffusion, a Python package for converting PyTorch models to Core ML format and performing image generation with Hugging Face diffusers in Python
>
>- StableDiffusion, a Swift package that developers can add to their Xcode projects as a dependency to deploy image generation capabilities in their apps. The Swift package relies on the Core ML model files generated by python_coreml_stable_diffusion
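To give a rough idea of what that first part involves, here is a minimal sketch of a PyTorch-to-Core ML conversion using coremltools on a toy model. The repo's python_coreml_stable_diffusion package wraps a much more involved version of this for the Stable Diffusion sub-models (text encoder, UNet, VAE decoder), so treat the model, shapes, and file names here as illustrative placeholders, not the repo's actual code:

```python
# Illustrative sketch: convert a toy PyTorch model to Core ML with coremltools.
# The model, input shape, and output name are placeholders.
import torch
import coremltools as ct

model = torch.nn.Sequential(
    torch.nn.Conv2d(3, 16, kernel_size=3, padding=1),
    torch.nn.ReLU(),
)
model.eval()

example_input = torch.rand(1, 3, 64, 64)
# Core ML conversion works from a traced (or scripted) TorchScript model.
traced = torch.jit.trace(model, example_input)

mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="image", shape=(1, 3, 64, 64))],
    convert_to="mlprogram",  # newer ML Program format, saved as a .mlpackage
)
mlmodel.save("ToyModel.mlpackage")
```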
That's done in professional contexts. When you only care about inference, onnxruntime does the job well (including for Core ML [1]).
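For instance, here is a minimal sketch using onnxruntime's Python API, assuming a build where the CoreML execution provider is available (onnxruntime falls back to CPU for anything it can't place there); the model file and input shape are placeholders:

```python
# Illustrative sketch: run an exported ONNX model with onnxruntime,
# preferring the CoreML execution provider and falling back to CPU.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # placeholder path to an exported ONNX model
    providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder shape
outputs = session.run(None, {input_name: dummy})
print(outputs[0].shape)
```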
I imagine that here Apple wants to highlight a more research-oriented/interactive use, for example to allow fine-tuning SD on a few samples from a particular domain (a popular customization).
Most models seem to be distributed by/for researchers and industry professionals. Stable Diffusion, for example, is state-of-the-art technology.
People who can't get the models to work by themselves given the source code aren't the target audience. There are other projects, though, that do distribute quick and easy scripts and tools to run these models.
Apple stepping in to get Stable Diffusion working on their platform is probably an attempt to get people to take their ML hardware more seriously. I read this more like "look, ma, no CUDA!" than "Mac users can easily use SD now". This project seems to be designed so that the upstream SD code can easily be ported to macOS without special tricks.
Seconded. I wish there were a way to work with ML models from native code rather than through some Python scripting interface. I believe TensorFlow gets there with C++, but it works only from C++ and not through an FFI.
It would increase my interest in experimenting with these models 1000% at the least. I really can't be bothered to spend hours fucking around with pip/pipenv/poetry/virtualenv/anaconda/god knows what other flavour of the month package manager is in use. I just want to clone it and run it, like a Go project. I don't want to download some files from a random website and move them into a special directory in the repo only created after running a script with special flags or some bullshit. I want to clone and run.
It's one of the reasons I recently ported the Whisper model to plain C/C++. You just clone the repo, run `make [model]` and you are ready to go. No Python, no frameworks, no packages - plain and simple.
Apple has their own mlmodel format, but they can't distribute this model as a direct download due to the model's EULA. The first task is to translate the model.
I mean, it is a legal time bomb in general[0], with a non-standard license that has special stipulations in an amendment. Do you really want to incur the weeks of lead time it would take Legal to review the legality of redistributing this model?
Not really. You're not responsible for how users use products you distribute. The license is passed along to them; they would be the ones violating it.
No, I'm not. If you have supporting precedent for your position (that a licensor can be held liable for the unpreventable actions of a licensee), I would like to see it.
In professional contexts (apart from individual apps distributed by small creators / indiehackers), models are usually run in native code (typically C++) using standardized runtimes: TensorRT (for Nvidia devices), onnxruntime (vendor-agnostic), etc.