PyTorch at the Edge: Deploy 964 TIMM Models on Android with TorchScript (dicksonneoh.com)
102 points by dnth on Feb 15, 2023 | 34 comments



Model deployment is painful. Running a model on a mobile phone?

Forget it.

The frustration is real. I remember spending nights exporting models to ONNX, and it still failed me. Deploying models on mobile for edge inference used to be complex.

Not anymore.

In this post, I’m going to show you how you can pick from over 900 SOTA models on TIMM, train them using best practices with Fastai, and deploy them on Android using Flutter.
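For context, the core of the article's pipeline is loading a TIMM model and exporting it to TorchScript's lite-interpreter format for the mobile app to load. A minimal sketch of that export step (the model name and input size are placeholder choices, not necessarily the article's):

    import timm
    import torch
    from torch.utils.mobile_optimizer import optimize_for_mobile

    # Any of the 900+ TIMM classifiers can be swapped in here.
    model = timm.create_model("convnext_tiny", pretrained=True)
    model.eval()

    # Trace to TorchScript with a dummy input matching the expected shape.
    example = torch.randn(1, 3, 224, 224)
    scripted = torch.jit.trace(model, example)

    # Optimize and save in the lite-interpreter format mobile runtimes load.
    optimized = optimize_for_mobile(scripted)
    optimized._save_for_lite_interpreter("convnext_tiny.ptl")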


Are there any object detection models? We went with Apple's CoreML, which works great, but this is cool. Running the PyTorch version of our model took way too long for inference.


The pytorch_lite package also supports YOLOv5 models.

I posted about it on my LinkedIn a while ago:

https://www.linkedin.com/posts/dickson-neoh_deploying-object...


That's really cool. I see you have much faster inference compared to the PlayTorch example. What size YOLOv5 are you using?



MediaPipe looks really cool. I haven't tried it. Have you?


Yes, although haven't deployed on mobile. Works well in the browser.


Some other interesting links for deployment of ML on mobile:

* https://liuliu.me/eyes/stretch-iphone-to-its-limit-a-2gib-mo... (really good technical post on getting Stable Diffusion running on iOS)

* https://google.github.io/mediapipe/getting_started/getting_s... Google's mediapipe, which has a bunch of models that run on both iOS and Android


Badass. I've spent the last week digging into ARM optimization for these models because it's really fascinating how close we are to local deployment for this stuff - writeups like these should help spread awareness.


Thanks a lot! I hesitated to write this piece, actually, thinking it wasn't going to be valuable. I'm glad you found value in it!


Thanks for writing! In academia we're getting the next step operational: training on Android. Any advice on what to watch out for?

Obviously you need a bit of patience and lots of volunteer devices. With unsupervised continuous learning this is solved, in emulation; see "G-Rank: Unsupervised Continuous Learn-to-Rank for Edge Devices in a P2P Network" [1]. The optimal learning rate is left as an exercise for the developer. (Disclaimer: our own work; I run a lab working on systems for peer-to-peer machine learning.)

[1] https://arxiv.org/abs/2301.12530


You're welcome! I'm not sure I'm the right person to advise on this, but this idea is also known as federated learning, right?


Update - After some code optimization, I got the inference time down to below 100ms. The lowest I got on my Pixel 3 XL is 37ms!

https://dicksonneoh.com/portfolio/pytorch_at_the_edge_timm_t...


This is great. I've got a few days' downtime and wanted to hone my skills a little, so this is a great starter. I've skimmed the article and it all looks very doable for my level.


Thank you for the feedback! Let me know if you have questions :)


What went wrong with ONNX? Why didn't it work out?


There is nothing wrong with ONNX itself; the limitations are on PyTorch's side.

1. PyTorch model files are neither portable nor self-contained: a saved model is a pickled Python object containing the weights, so you need the original Python class code to load and run it.

Because running a model requires real Python code, PyTorch suffers from numerous issues when porting to non-Python targets such as ONNX.

PyTorch offers a way to export to ONNX, but you will encounter various errors. [1]

Sure, you might be lucky enough to troubleshoot a specific model and export it to ONNX, but if your objective is to export all 964 models from a model zoo (TIMM), it is almost impossible. (A sketch at the end of this comment makes the self-containment issue concrete.)

2. There are organizational and cultural problems too. Because of the above, a PyTorch model needs to be designed with portability in mind from the beginning. But porting and serving models is what engineers do, whereas researchers, who design the models, don't care about portability when writing papers. So it is often hard to deploy SOTA models that come out of academic research.

[1] https://pytorch.org/docs/stable/onnx.html#limitations
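To make point 1 concrete, here is a minimal sketch contrasting a pickled checkpoint with a self-contained TorchScript export (TinyNet is a hypothetical stand-in, not any particular model):

    import torch
    import torch.nn as nn

    class TinyNet(nn.Module):  # hypothetical stand-in model
        def __init__(self):
            super().__init__()
            self.fc = nn.Linear(4, 2)

        def forward(self, x):
            return self.fc(x)

    model = TinyNet()

    # Pickles the whole object: loading later requires the TinyNet class
    # definition to be importable, so the file is not self-contained.
    torch.save(model, "model_full.pt")

    # Saves only tensors, but you still need the class code to rebuild the model.
    torch.save(model.state_dict(), "weights.pt")

    # TorchScript compiles the model into a self-contained archive that
    # runs without the Python class (e.g. via the C++/mobile runtime).
    torch.jit.script(model).save("model_scripted.pt")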


> PyTorch offers a way to export to ONNX but you will encounter various errors. [1]

I mean, sure, there are limitations, but this greatly exaggerates their impact in my experience. I'd be curious to hear from anyone for whom these have been serious blockers; I've been exporting PyTorch models to ONNX (for CV applications) for the last couple of years without any major issues (and any issues that did pop up were resolved in a matter of hours).


So I tried converting an ASR model [0] to ONNX a year or two back. It was really painful. The pain could largely be ascribed to:

(1) code that is very dynamic, making it hard for PyTorch to convert the modules to TorchScript (which it does before converting them to ONNX)

(2) ops that were simply not available in ONNX, especially torch.fft, but also some others

[0] https://github.com/burchim/EfficientConformer
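As an illustration of (2), here is a made-up module (SpectralBlock is not from the linked repo) of the kind that used to fail export, since torch.fft ops had no ONNX mapping on older opsets:

    import torch
    import torch.nn as nn

    class SpectralBlock(nn.Module):  # hypothetical module
        def forward(self, x):
            # torch.fft ops had no ONNX mapping on older opsets,
            # so export fails at this node.
            return torch.fft.rfft(x, dim=-1).abs()

    dummy = torch.randn(1, 16000)

    # Raises an unsupported-operator error on the PyTorch/opset
    # versions available back then.
    torch.onnx.export(SpectralBlock(), dummy, "block.onnx", opset_version=13)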


How did this happen? A pickle is not a sensible storage format: it's insecure, hard to version, and not very portable. Isn't a model basically a big matrix of numbers?


Not in PyTorch. A model is a set of Python dictionaries containing state, plus Python module/class objects. I don't know why the PyTorch team did it this way, but that's what happened. Maybe it boils down to point #2 above.


How does Snapchat do filters in real-time?


Not sure if Snapchat is using this, but I've built the same basic thing with this running in the browser (it also runs on both iOS and Android): https://google.github.io/mediapipe/solutions/face_mesh.html
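For a rough idea of what the face-mesh solution returns, here's a minimal sketch using MediaPipe's Python bindings (the comment above used the browser API; the image path is a placeholder):

    import cv2
    import mediapipe as mp

    # Static-image face mesh; "face.jpg" is a placeholder path.
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as mesh:
        img = cv2.imread("face.jpg")
        results = mesh.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        if results.multi_face_landmarks:
            # 468 3D landmarks per detected face, enough to anchor filters.
            print(len(results.multi_face_landmarks[0].landmark))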


I read it and found that this applies to mobile apps (using a Flutter library). That is great, and I was wondering if there is a similar library for JavaScript to run in the browser or Node.js (I could not find one, other than ONNX).


With Flutter you can also build web apps:

https://flutter.dev/multi-platform/web


Tinygrad in shambles


How?


What is the freemium limit at Kaggle and Colab?


Very cool.


Thank you!


Have you had a chance to play with Pytorch 2.0's export yet?


Not yet! But I heard things are going to be a lot faster in 2.0. Have you tried it?


this is seriously cool!


Thank you!!!



