Hacker News
Human Pose Estimation with Deep Learning (nanonets.com)
135 points by ole_gooner on May 8, 2019 | 13 comments



Thanks for posting this. About two months ago I was trying my hand at something similar, and I think this will help me take it a few steps further.

I was working on an ad blocker for live TV, especially news channels. Annoyed by how many ad sequences news channels run, I wanted to build a dashboard showing a color-coded state for each channel, {green -> news content | red -> ads | yellow -> not sure}, so that I would instantly know which channel to switch to.

I used pose detection as one of the heuristics to figure out whether there is an anchor on the screen. I used openpose with golang; it worked pretty well to start with, but later it felt a bit limited. (I have no ML background.)
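The anchor heuristic and the green/red/yellow state could be combined roughly like this. This is only a minimal sketch in Python, not the setup described above: the keypoint names, the confidence cutoff, and the two ratio thresholds are all hypothetical, and each frame is assumed to arrive as a list of detected people mapping keypoint name to confidence.

```python
from typing import Dict, List

# Hypothetical keypoint names; real pose estimators use their own naming.
UPPER_BODY = ("nose", "left_shoulder", "right_shoulder")


def anchor_visible(people: List[Dict[str, float]], min_conf: float = 0.5) -> bool:
    """True if any detected person has a confident head and both shoulders."""
    return any(
        all(person.get(name, 0.0) >= min_conf for name in UPPER_BODY)
        for person in people
    )


def channel_state(frames: List[List[Dict[str, float]]],
                  news_ratio: float = 0.7, ads_ratio: float = 0.3) -> str:
    """Map a window of frames to green (news), red (ads), or yellow (not sure)."""
    if not frames:
        return "yellow"  # no evidence yet
    ratio = sum(anchor_visible(f) for f in frames) / len(frames)
    if ratio >= news_ratio:
        return "green"
    if ratio <= ads_ratio:
        return "red"
    return "yellow"
```

Scoring a sliding window of frames rather than a single frame keeps the state from flickering when the anchor is briefly off screen.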


Browsing the abstracts and papers through this continuously updated overview is more interesting (to me): https://paperswithcode.com/task/pose-estimation


How fast are the algorithms? Do any of them work in real time?


Certainly. For example, https://storage.googleapis.com/tfjs-models/demos/posenet/cam... will run in your browser. https://github.com/CMU-Perceptual-Computing-Lab/openpose is also a decent choice for realtime pose estimation, although I'm not sure if they have a web demo anywhere.


They should work in real time. At least, the architecture looks like an end-to-end neural network, so like other CNN-based models it should also work in real time after quantization etc.


They showed an app like this running on a phone, laggy but in real time, at I/O yesterday.


Yeah, nowadays with post-training quantization, quantization-aware training, and compact architectures like SqueezeNet, models are becoming small enough to run smoothly on a phone.
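For anyone wondering what post-training quantization does to the weights, here is a minimal sketch of affine 8-bit quantization in plain Python. It is an illustration under simple assumptions (one scale and zero point for the whole list, weights only), not the implementation of any particular framework; real toolchains also calibrate activations and often use per-channel scales. The function names are made up for this example.

```python
from typing import List, Tuple


def quantize(weights: List[float]) -> Tuple[List[int], float, int]:
    """Affine-quantize floats to the range 0..255; returns (values, scale, zero_point)."""
    # The represented range must include 0 so that 0.0 maps to an exact integer.
    lo, hi = min(min(weights), 0.0), max(max(weights), 0.0)
    scale = (hi - lo) / 255 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Recover approximate floats from the 8-bit values."""
    return [(v - zero_point) * scale for v in q]
```

Each weight is stored in one byte instead of four, at the cost of a rounding error on the order of the scale.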


Interesting, wonder what their use case was?


I can think of a couple of use cases, none of them particularly useful though. Imagine you had the pose data for the image and the pose data of someone else, even an imaginary someone else thanks to GANs.


If it was at Google I/O, it will probably be some consumer use case. There are also a lot of commercial use cases for this technology, though. For example, you can train a simple classifier on top of the pose output to record time spent on different types of activities, which is useful for measuring productivity.
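A classifier on top of pose can be very simple. The sketch below uses a nearest-centroid classifier in plain Python; it assumes each pose is already a flattened, normalized list of keypoint coordinates and that every frame gets exactly one label. The function names and the toy labels are hypothetical, and a real system would likely want a stronger model and some smoothing over time.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence

Pose = Sequence[float]  # flattened, normalized keypoint coordinates (assumed)


def centroids(labeled: Dict[str, List[Pose]]) -> Dict[str, List[float]]:
    """Average the example poses for each activity label."""
    return {
        label: [sum(col) / len(poses) for col in zip(*poses)]
        for label, poses in labeled.items()
    }


def classify(pose: Pose, cents: Dict[str, List[float]]) -> str:
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(cents, key=lambda label: math.dist(pose, cents[label]))


def time_per_activity(frames: List[Pose], cents: Dict[str, List[float]],
                      fps: float) -> Dict[str, float]:
    """Seconds spent in each activity, counting one label per frame."""
    counts = Counter(classify(p, cents) for p in frames)
    return {label: n / fps for label, n in counts.items()}
```

Given a few labeled example poses per activity, `time_per_activity` turns a stream of per-frame poses into a time budget per label.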


Dance Dance Revolution, but without the sensor mat?

https://en.wikipedia.org/wiki/Dance_Dance_Revolution


Just a fun dance recognition app to show off ML.


Hey guys! Author of the article here. Let me know if you have any questions wrt implementation, SOTA, usability etc.



