PyTorch at the Edge: Deploy 964 TIMM Models on Android with TorchScript (dicksonneoh.com)
102 points by dnth on Feb 15, 2023 | 34 comments



Model deployment is painful. Running a model on a mobile phone?

Forget it.

The frustration is real. I remember spending nights exporting models to ONNX, and it still failed me. Deploying models on mobile for edge inference used to be complex.

Not anymore.

In this post, I’m going to show you how you can pick from over 900 SOTA models on TIMM, train them using best practices with Fastai, and deploy them on Android using Flutter.
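For context, the core of the article's pipeline is loading a TIMM model and exporting it to TorchScript's lite-interpreter format for the mobile app to load. A minimal sketch of that export step (the model name and input size are placeholder choices, not necessarily the article's):

    import timm
    import torch
    from torch.utils.mobile_optimizer import optimize_for_mobile

    # Any of the 900+ TIMM classifiers can be swapped in here.
    model = timm.create_model("convnext_tiny", pretrained=True)
    model.eval()

    # Trace to TorchScript with a dummy input matching the expected shape.
    example = torch.randn(1, 3, 224, 224)
    scripted = torch.jit.trace(model, example)

    # Optimize and save in the lite-interpreter format mobile runtimes load.
    optimized = optimize_for_mobile(scripted)
    optimized._save_for_lite_interpreter("convnext_tiny.ptl")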


Are there any object detection models? We went with Apple's CoreML, which works great, but this is cool. Running the PyTorch version of our model took way too long for inference.


The pytorch_lite package also supports YOLOv5 models.

I posted about it on my LinkedIn a while ago:

https://www.linkedin.com/posts/dickson-neoh_deploying-object...


That's really cool. I see you have much faster inference compared to the PlayTorch example. What size YOLOv5 are you using?



MediaPipe looks really cool. I haven't tried it. Have you?


Yes, although haven't deployed on mobile. Works well in the browser.


Some other interesting links for deployment of ML on mobile:

* https://liuliu.me/eyes/stretch-iphone-to-its-limit-a-2gib-mo... (really good technical post on getting Stable Diffusion running on iOS)

* https://google.github.io/mediapipe/getting_started/getting_s... Google's mediapipe, which has a bunch of models that run on both iOS and Android


Badass. I've spent the last week digging into ARM optimization for these models because it's really fascinating how close we are to local deployment for this stuff - writeups like these should help spread awareness.


Thanks a lot! I hesitated to write this piece, actually, thinking it wasn't going to be valuable. I'm glad you found value in it!


Thanks for writing! In academia we're getting the next step operational: training on Android. Any advice on what to watch out for?

Obviously you need a bit of patience and lots of volunteer devices. With unsupervised continuous learning this is solved, in emulation; see "G-Rank: Unsupervised Continuous Learn-to-Rank for Edge Devices in a P2P Network" [1]. The optimal learning rate is left as an exercise for the developer. (Disclaimer: our own work; I run a lab working on systems for peer-to-peer machine learning.)

[1] https://arxiv.org/abs/2301.12530


You're welcome! I'm not sure I'm the right person to advise on this, but this idea is also known as federated learning, right?


Update - After some code optimization, I got the inference time down to below 100ms. The lowest I got on my Pixel 3 XL is 37ms!

https://dicksonneoh.com/portfolio/pytorch_at_the_edge_timm_t...


This is great. I've got a few days' downtime and wanted to hone my skills a little, so this is a great starter. I've skimmed the article and it all looks very doable for my level.


Thank you for the feedback! Let me know if you have questions :)


What went wrong with ONNX? Why didn't it work out?


There is nothing wrong with ONNX itself; the limitations are on PyTorch's side.

1. PyTorch model files are neither portable nor self-contained: a saved model is a pickled Python object containing the weights, so you need the original Python class code to load and run it.

Because running a model requires real Python code, PyTorch suffers from numerous issues when porting to non-Python targets such as ONNX.

PyTorch offers a way to export to ONNX, but you will encounter various errors. [1]

Sure, you might be lucky enough to troubleshoot a specific model and export it to ONNX, but if your objective is to export all 964 models from a model zoo (TIMM), it is almost impossible. (A sketch at the end of this comment makes the self-containment issue concrete.)

2. There are organizational and cultural problems too. Because of the above, a PyTorch model needs to be designed with portability in mind from the beginning. But porting and serving models is what engineers do, whereas researchers, who design the models, don't care about portability when writing papers. So it is often hard to deploy SOTA models that come out of academic research.

[1] https://pytorch.org/docs/stable/onnx.html#limitations
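To make point 1 concrete, here is a minimal sketch contrasting a pickled checkpoint with a self-contained TorchScript export (TinyNet is a hypothetical stand-in, not any particular model):

    import torch
    import torch.nn as nn

    class TinyNet(nn.Module):  # hypothetical stand-in model
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(4, 2)

        def forward(self, x):
            return self.fc(x)

    model = TinyNet()

    # Pickles the whole object: loading later requires the TinyNet class
    # definition to be importable, so the file is not self-contained.
    torch.save(model, "model_full.pt")

    # Saves only tensors, but you still need the class code to rebuild the model.
    torch.save(model.state_dict(), "weights.pt")

    # TorchScript compiles the model into a self-contained archive that
    # runs without the Python class (e.g. via the C++/mobile runtime).
    torch.jit.script(model).save("model_scripted.pt")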


> PyTorch offers a way to export to ONNX but you will encounter various errors. [1]

I mean, sure, there are limitations, but this greatly exaggerates their impact in my experience. I'd be curious to hear from anyone for whom these have been serious blockers; I've been exporting PyTorch models to ONNX (for CV applications) for the last couple of years without any major issues (and any issues that did pop up were resolved in a matter of hours).


So I tried converting an ASR model [0] to ONNX a year or two back. It was really painful. The pain could largely be ascribed to:

(1) code that is very dynamic, making it hard for PyTorch to convert the modules to TorchScript (which it does before converting them to ONNX)

(2) ops that were simply not available in ONNX, especially torch.fft, but also some others

[0] https://github.com/burchim/EfficientConformer
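As an illustration of (2), here is a made-up module (SpectralBlock is not from the linked repo) of the kind that used to fail export, since torch.fft ops had no ONNX mapping on older opsets:

    import torch
    import torch.nn as nn

    class SpectralBlock(nn.Module):  # hypothetical module
        def forward(self, x):
            # torch.fft ops had no ONNX mapping on older opsets,
            # so export fails at this node.
            return torch.fft.rfft(x, dim=-1).abs()

    dummy = torch.randn(1, 16000)

    # Raises an unsupported-operator error on the PyTorch/opset
    # versions available back then.
    torch.onnx.export(SpectralBlock(), dummy, "block.onnx", opset_version=13)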


How did this happen? A pickle is not a sensible storage format: it's insecure, hard to version, and not very portable. Isn't a model basically a big matrix of numbers?


Not in PyTorch. A model is a set of Python dictionaries containing state, plus Python module/class objects. I don't know why the PyTorch team did it this way, but that's what happened. Maybe it boils down to point #2 above.


How does Snapchat do filters in real-time?


Not sure if Snapchat is using this, but I've built the same basic thing with this running in the browser (it also runs on both iOS and Android): https://google.github.io/mediapipe/solutions/face_mesh.html
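For a rough idea of what the face-mesh solution returns, here's a minimal sketch using MediaPipe's Python bindings (the comment above used the browser API; the image path is a placeholder):

    import cv2
    import mediapipe as mp

    # Static-image face mesh; "face.jpg" is a placeholder path.
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as mesh:
        img = cv2.imread("face.jpg")
        results = mesh.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        if results.multi_face_landmarks:
            # 468 3D landmarks per detected face, enough to anchor filters.
            print(len(results.multi_face_landmarks[0].landmark))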


I read it and found that this applies to mobile apps (using a Flutter library). That is great, and I was wondering if there is a similar library for JavaScript to run in the browser or Node.js (I could not find one, other than ONNX).


With Flutter you can also build web apps:

https://flutter.dev/multi-platform/web


Tinygrad in shambles


How?


What is the freemium limit at Kaggle and Colab?


Very cool.


Thank you!


Have you had a chance to play with Pytorch 2.0's export yet?


Not yet! But I heard things are going to be a lot faster in 2.0. Have you tried it?


this is seriously cool!


Thank you!!!



