
How come you always have to install some version of PyTorch or TensorFlow to run these ML models? When I'm only doing inference, shouldn't there be an easier way, with automatic hardware selection etc.? Why aren't models distributed in a standard format like ONNX, with inference on different platforms solved once per platform?



>How come you always have to install some version of PyTorch or TensorFlow to run these ML models?

The repo is aimed at developers and has two parts. The first adapts the ML model to run on Apple Silicon (CPU, GPU, Neural Engine), and the second allows you to easily add Stable Diffusion functionality to your own app.

If you just want an end user app, those already exist, but now it will be easier to make ones that take advantage of Apple's dedicated ML hardware as well as the CPU and GPU.

>This repository comprises:

    python_coreml_stable_diffusion, a Python package for converting PyTorch models to Core ML format and performing image generation with Hugging Face diffusers in Python

    StableDiffusion, a Swift package that developers can add to their Xcode projects as a dependency to deploy image generation capabilities in their apps. The Swift package relies on the Core ML model files generated by python_coreml_stable_diffusion
https://github.com/apple/ml-stable-diffusion
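
For reference, the plain PyTorch route that the first part converts from looks roughly like this with Hugging Face diffusers (a sketch; the model id and output filename are just examples, and you'd normally also pick a device/dtype):

    # minimal Hugging Face diffusers sketch (plain PyTorch path);
    # the model id and filename here are examples, not from Apple's repo
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
    image = pipe("an astronaut riding a horse on mars").images[0]
    image.save("astronaut.png")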


That's done in professional contexts. When you only care about inference, onnxruntime does the job well (including with Core ML [1]).

I imagine that here Apple wants to highlight a more research/interactive use, for example fine-tuning SD on a few samples from a particular domain (a popular customization).

[1] https://onnxruntime.ai/docs/execution-providers/CoreML-Execu...
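
For a rough idea of what [1] looks like in practice (a sketch; the model path and input shape are placeholders), you ask onnxruntime for the CoreML execution provider and let it fall back to CPU:

    # minimal onnxruntime sketch; "model.onnx" and the input shape are placeholders
    import numpy as np
    import onnxruntime as ort

    # prefer the CoreML execution provider, fall back to CPU where unavailable
    sess = ort.InferenceSession(
        "model.onnx",
        providers=["CoreMLExecutionProvider", "CPUExecutionProvider"],
    )
    feed = {sess.get_inputs()[0].name: np.zeros((1, 3, 224, 224), dtype=np.float32)}
    outputs = sess.run(None, feed)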


Most models seem to be distributed by/for researchers and industry professionals. Stable Diffusion is state-of-the-art technology, for example.

People who can't get the models to work by themselves given the source code aren't the target audience. There are other projects, though, that do distribute quick and easy scripts and tools to run these models.

Apple stepping in to get Stable Diffusion working on their platform is probably an attempt to get people to take their ML hardware more seriously. I read this more like "look, ma, no CUDA!" than "Mac users can easily use SD now". The module seems to be designed so that the upstream SD code can easily be ported back to macOS without special tricks.


Seconded. I wish there were a way to work with ML models from native code rather than through some Python scripting interface. I believe TensorFlow gets there with its C++ API, but that only works from C++, not through FFI.


It would increase my interest in experimenting with these models 1000% at the least. I really can't be bothered to spend hours fucking around with pip/pipenv/poetry/virtualenv/anaconda or whatever other flavour-of-the-month package manager is in use. I just want to clone it and run it, like a Go project. I don't want to download some files from a random website and move them into a special directory in the repo, one only created after running a script with special flags or some bullshit. I want to clone and run.


It's one of the reasons I recently ported the Whisper model to plain C/C++. You just clone the repo, run `make [model]` and you are ready to go. No Python, no frameworks, no packages - plain and simple.

https://github.com/ggerganov/whisper.cpp


PyTorch has libtorch as its purely native library. There are also Rust bindings for libtorch:

https://github.com/LaurentMazare/tch-rs

I used this in the past to make a transformer-based syntax annotator. Fully in Rust, no Python required:

https://github.com/tensordot/syntaxdot


If you're okay with the Nvidia ecosystem, check out TensorRT.


Apple has their own mlmodel format, but they can't distribute this model as a direct download due to the model's EULA. The first task is to translate the model.
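
The translation step is roughly what coremltools is for. A hedged sketch of the general idea, with a small torchvision model standing in for the actual SD components (the repo converts the U-Net, text encoder, etc.):

    # sketch of PyTorch -> Core ML translation with coremltools;
    # a small torchvision model stands in for the real SD components
    import torch
    import torchvision
    import coremltools as ct

    model = torchvision.models.resnet18(weights=None).eval()
    example = torch.rand(1, 3, 224, 224)
    traced = torch.jit.trace(model, example)  # TorchScript first
    mlmodel = ct.convert(
        traced,
        inputs=[ct.TensorType(shape=example.shape)],
        convert_to="mlprogram",
    )
    mlmodel.save("ResNet18.mlpackage")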


What part of the SD license prohibits that?


No part of it.


I mean, it is a legal time bomb in general[0], with a non-standard license that has special stipulations in an attachment. Do you really want to incur the weeks of lead time it would take Legal to review the legality of redistributing this model?

0: https://github.com/CompVis/stable-diffusion/blob/main/LICENS...


Redistributing that model to end users that violate Attachment A seems like a minefield.


Not really. You're not responsible for how users use products you distribute. The license is passed along to them; they would be the ones violating it.


Are you an attorney?


No, I'm not. If you have supporting precedent for your position (that a licensor can be held liable for the unpreventable actions of a licensee) I would like to see it.


In professional contexts (apart from individual apps distributed by small creators / indie hackers), models are usually run from native code (typically C++) using standardized runtimes such as TensorRT (for Nvidia devices), onnxruntime (hardware-agnostic), etc.


DiffusionBee is a completely self-contained app that lets you play with this stuff trivially, no installs required.

https://diffusionbee.com/


But it's not optimised to work with Apple's Core ML (yet), is it?


It's pretty fast. On an 8GB M2 MacBook Air, it produces more than 2 images per minute using the default settings.

E.g., it's about 20x as fast as InvokeAI, which doesn't have an FP16 option that works on a Mac.


I don't know, but it seems extremely fast.


If you want it and it doesn't exist, why not simply do it yourself? It's open source, no?



