DensePose – Dense Human Pose Estimation in the Wild

bsenftner · on June 19, 2018

I find this highly interesting from a personal perspective, as I am an author of a Personalized Advertising patent[1] from the early 2000s. I was unable to capitalize on my work, and ultimately sold the patent. Interesting to note, the month the patents expired (due to aggressive early filing dates of my original patents) I noticed Facebook's AR Kit was released, which touched multiple aspects of what was protected by the global patents.

This tech is key for Personalized Advertising where consumers are inserted into still and video advertising in place of current spokespersons and side-by-side with celebrities. Advertising is about to get surreal and the fake news consumers are about to get exploited something unbelievable. "Deep Fakes" for porn is kid stuff compared to what this tech opens: Pandora's Box if you ask me.

[1] https://patents.justia.com/inventor/blake-senftner

xg15 · on June 19, 2018

Respect for your efforts - but I'm also amazed that apparently the tech was already there 18 years ago! Yet nothing was ever written about it.

I wonder what other technological breakthroughs are locked away behind patents.

searine · on June 19, 2018

>This tech is key for Personalized Advertising where consumers are inserted into still and video advertising in place of current spokespersons and side-by-side with celebrities.

What a boring dystopia.

Seriously though, if a company ever did that without my permission I would sue the pants off of them.

hbosch · on June 19, 2018

By creating a Facebook account you should assume they have your permission.

icebraining · on June 20, 2018

No, it's personal data, you need specific consent under the GDPR :)

bsenftner · on June 19, 2018

The idea was to use them as fantasy fulfillment: take any A level film or music artist, and many consumers will pay for a video clip of them as Black Panther or you-name-it high grossing entertainment of the moment. You put yourself in, willingly, and for cost reduction allow product placements.

pavel_lishin · on June 19, 2018

Even assuming that there isn't already some precedent that a checkbox allows them to use your uploaded likeness in whatever they want, is your premise here that you have more time and money to spend in a legal battle with Facebook or Google?

394549 · on June 19, 2018

> Even assuming that there isn't already some precedent that a checkbox allows them to use your uploaded likeness in whatever they want, is your premise here that you have more time and money to spend in a legal battle with Facebook or Google?

There's currently a class-action lawsuit in progress against Facebook's use of facial tagging of Illinois residents: http://www.chicagotribune.com/business/ct-biz-facebook-taggi....

Regardless of how deep Facebook's pockets are, I could see another class-action lawsuit taking them on over recruiting its users into becoming uncompensated spokesmen in deepfake ads hawking products to their friends.

Also the legal situation in other jurisdictions may be less friendly to Facebook's usage of this technology, to the point where their deep pockets won't help them. I'm no international lawyer but I think that's definitely a possibility.

bigiain · on June 19, 2018

And there are _still_ people complaining that the GDPR is not a good idea...

If this becomes "a thing", I fully intend to use my UK citizenship and send GDPR boilerplate deletion requests to all the data brokers, social networks, and digital advertising services I can find.

taejo · on June 20, 2018

> use my UK citizenship

Better hope it becomes a thing before March.

fabianhjr · on June 19, 2018

The issue is that Facebook, Google et al. already get a ton of consent for targeted/personalized advertisement and usage of your images and data. So you might not see this in a mall, but you will surely see it online.

lioeters · on June 19, 2018

Funny and scary that you describe it as a Pandora's box, as an author of a related patent! "Advertising is about to get surreal." Minority-Report-style ads, here they come..

bsenftner · on June 19, 2018

I realized the potential, and wanted to steward the deployment in such a manner that civil society is not subjected to fake news, or privacy attacks such as public and private persons inserted into media against their will. The ultimate reason I was unsuccessful was a firm stance against pornography, which VC, media partners and angels insisted on an ability to exploit.

sequoia · on June 19, 2018

I'm sure it's little consolation, but kudos to you for sticking to your ethics. I sure wish more in the tech community would take this approach!

mottosso · on June 19, 2018

And what approach is that? Delaying the inevitable? :)

jcims · on June 20, 2018

How long will it be before a resourceful individual will be able to meaningfully engineer a biological or synthetic pathogen that could wipe out a good portion of the human race? 1000 years? 200 years? 50 years?

There's nothing wrong with buying a little time while society catches up with the technology.

pixelbash · on June 19, 2018

By around 15 years, not bad?

lioeters · on June 19, 2018

Much respect to making a moral stand! I suppose it's true that others will continue developing the "full potential" (including ethically questionable applications) of the technology, but at least one can make the choice to not contribute to undesired directions.

black_puppydog · on June 19, 2018

Personally, I'm waiting for AR UBlock. That would put the "augmented" in AR. :)

bernardom · on June 19, 2018

Would AR UBlock allow you to replace all the people in ads with people of your choice? So all spokesmen for me could be Gilbert Gottfried in a neon colored t-shirt?

drb91 · on June 19, 2018

I imagine it’d just cancel or black the ads out, I’d hope.

_blrj · on June 19, 2018

Someone here on HN had recently shared some hairstyles, facial accessories, and alterations that prevented common face detection implementations from recognizing your head, which seemed relevant and pretty cool from a tech punk perspective, too.

TeMPOraL · on June 19, 2018

I wouldn't really blame a person for not realizing, back almost 20 years ago, the full extent to which this technology could be used, when coupled with similar tech for audio processing, cheap compute, and the world consuming information primarily from the Internet.

andrepd · on June 19, 2018

Interesting from a technical perspective, sure. Absolutely terrifying from any other perspective.

unit91 · on June 19, 2018

A few years ago, I would have thought this technical feat was amazing, and stopped there. Now I think it's creepy. Ah how times have changed...

ryanblakeley · on June 19, 2018

Is there any other upside to this besides novelty and amusement? It's easy to imagine a dozen applications where this would threaten human privacy and safety.

pavel_lishin · on June 19, 2018

> Is there any other upside to this besides novelty and amusement?

* Automated injury detection. You've got a warehouse, you've got cameras, now you've got an instant alert when one of your workers appears to be injured but out of sight of other workers. You've got street cameras, now you've got automatic detection of someone having a heart attack and lying down on a sidewalk. (Dystopian application: "homeless person detected, deploying zap-drones") Hospitals and old folks' homes could use this, too.

* Lifeguard Assist programs - automatic detection of drowning-like behaviors. (Of course, over-reliance on this would be bad...)

* Children separated from parents might be easier to detect in places like malls, etc. (I'm going to stop listing obvious parenthetical dystopian applications)

applecrazy · on June 19, 2018

I see a huge possibility of replacing expensive mocap hardware and software using this tech, allowing for video games and VFX to become more accessible and use commodity hardware.

I’ve been eyeing things like the Kinect and iPhone X face tracking for this kind of task (for a fun side project I’m working on), but it would be great if I could track at least position and pose of multiple actors in a scene using just a standard webcam or camcorder.

ryanblakeley · on June 19, 2018

Those are good examples. Basically surveying groups of people for body language that indicates a dangerous situation might be unfolding.

TeMPOraL · on June 19, 2018

What's wrong with amusement? Seriously, I'd love to have this in videogames - to import myself and other people into the game. I'd also love to get my hands on a good, fully automated [series of photos] -> [textured 3D human model] pipeline, for anything from silly renders to 3D printing mementos.

Technology is inherently usable both for good and evil. You get both by default. It takes active countermeasures - usually non-tech, like regulations - to limit evil applications without sacrificing the good ones. As a society, we do that to some extent, but unfortunately we're not as successful as I'd like (e.g. if it were up to me, I'd seriously curtail the advertising industry).

ryanblakeley · on June 19, 2018

I didn't say there was anything wrong with amusement. I just asked if there was any utility besides that. If its only value is amusement and the potential negatives are enormous, it's at least worth a discussion.

I don't think it's right to just shrug and say "everything can be used for good or bad". The details matter. If you're talking about a Yo-yo, yeah, you can hit people with it or just amuse yourself; nothing catastrophic is likely to happen. This tech though has greater implications.

namibj · on June 19, 2018

The structure from motion stuff is good enough if you got a couple of pictures from around the human without the human moving (much), but I suspect getting enough pictures without the human moving too far might be difficult.

avenius · on June 20, 2018

I imagine this could be developed to observe e.g. a budding powerlifter's form, pointing out any obvious errors and putting hordes of PTs out of work ;)

bsenftner · on June 19, 2018

This is a key tech behind personalized advertising: replacing actors in ads with anyone. From an Advertising Industry perspective, this is Holy Grail ingredients.

wiz21c · on June 19, 2018

Could you explain why this is Holy Grail ? As someone who's totally hostile to advertisement my brain is not wired to actually envision that holy grail. I ask the question honestly: having my face on a picture with a celebrity rings zero bells for me (even if it was RMS :-))

bsenftner · on June 19, 2018

Perhaps it is not meant for you, but for fans of a given media property popular enough to make creating the consumer insertion profitable. It is a different type of media, beyond product placements; consider the ability for fantasy fulfillment or impressionability. A lot of marketers would love to have such a technology. The idea is to offer it to consumers who willingly insert themselves into desirable scenarios for the media itself. A lot of consumers would do this, and the product placement capabilities are endless.

ryanblakeley · on June 19, 2018

Hard to count that as a positive, but I see your point. Seems like it would benefit a very small group.

skybrian · on June 19, 2018

Driverless cars doing better at recognizing humans?

ryanblakeley · on June 19, 2018

Or projecting human motion. That's a fair one.

cal5k · on June 19, 2018

There's nothing inherently creepy about it. I imagine it will be very useful for adding context to a scene - which is important for self-driving cars, AR, image classification, etc.

If you take a bunch of photos involving people doing things and extract pose information, I'd imagine it would be helpful in figuring out what's going on in other situations that are otherwise dissimilar.

jaxbot · on June 19, 2018

CC-Non commercial is a disappointing license to find. The project is very cool otherwise.

black_puppydog · on June 19, 2018

I wonder how they arrived at Creative Commons' CC-BY-NC as the license. These licenses are not meant for code but for artwork, Creative Commons actually discourage the use of their licenses for code [1]. I recently noticed the same with the FastPhotoStyle code [2] by nvidia, so I'm wondering if there is something that draws their legal departments to this license?

[1]: https://creativecommons.org/faq/#can-i-apply-a-creative-comm...

[2]: https://github.com/NVIDIA/FastPhotoStyle

levesque · on June 19, 2018

If the dataset it was trained on is CC-BY-NC, I'm pretty sure the model also has to be CC-BY-NC. However I think this is not respected, or even considered by most people.

I'd go with limiting how competitors can use it as the main deciding factor.

bigiain · on June 19, 2018

<cynical thought>Note that two of the three team members are from Facebook...

m1sta_ · on June 19, 2018

I'd never thought of this. It is a very interesting idea.

dr_zoidberg · on June 19, 2018

Hmmm... I guess it's been selected that way because it'd cover the model files and the datasets. It's not about the code as much as the datasets/models.

rhizome · on June 19, 2018

They want to be the only people who can sell this where the real money is: military and law enforcement.

detaro · on June 19, 2018

discourage people that aren't just hobbyists messing around from using it?

black_puppydog · on June 19, 2018

yeah, that's what I came up with myself. But I thought a main point, if not the whole point of publishing code for these companies was to appeal to developer-types who are fond of real open source/science. And those should be able to tell the difference...

It's a bit like allowing your scientists to publish their research, but only in prohibitively expensive and thus exceedingly niche journals.

sequoia · on June 19, 2018

Have you tried contacting the author/copyright holder to find out whether you can negotiate a commercial licensing agreement for the software? I assume you're open to paying commercial licensing fees.

nerdponx · on June 19, 2018

Cheaper than a patent.

ricardobeat · on June 20, 2018

Would it be possible to map the first detected pose to a 3D model, scale and deform it to match the pose, and then use each next pose to manipulate the 3D model (vs generating all the vertices again)? This should result in smooth animation, without artifacts, and joint limits might even help with position estimation.

natch · on June 19, 2018

I'm too much of a newbie to figure this out but maybe someone here can tell me: Do they provide the final trained model? Or just a precursor model and code that trains a final model given bring-your-own data?

icebraining · on June 19, 2018

The site says "The dataset will soon be available on this website!", but apparently it's been saying that for at least four months.

natch · on June 19, 2018

Yes thanks. I was asking about the trained model though. I realize you can get the model by training with the data, but I don't believe for a minute that any data they release will be anything but a small fraction of what they trained their real model with.

abrichr · on June 19, 2018

Yes. From https://github.com/facebookresearch/DensePose/blob/master/GE...:

> DensePose should automatically download the model from the URL specified by the --wts argument

acd · on June 19, 2018

It can probably be used to identify potential criminals by the way they walk or by a threatening pose. Police can then screen them, like in the movie Minority Report.

milesokeefe · on June 19, 2018

Do you mean using gait analysis to identify humans and match them to a known criminal database or do you mean finding suspect criminals based on some “criminal” way they walk? If the latter, I don’t think that’s really based on anything more than current cultural profiling.

pavel_lishin · on June 19, 2018

"Take that pep out of your step, citizen!"

acd · on June 22, 2018

I mean by tracking criminals' cell phone locations and watching how they walk. Checking the network effect, i.e. whether a person contacts a previous criminal in the same gang network. Matching photos from surveillance cameras and public social media pictures. By matching walking pose you can pretty much track down criminals.

The power is when you combine the different databases and build a profile of the person. It is very similar to how advertisement companies like Google build up profiles of customers(gmail,search behavior,dns name resolution tracking,cookie tracking) only in a different field.

It is probably even more powerful when you combine physical behavior with online behavior.

sonnyblarney · on June 20, 2018

" I don’t think that’s really based on anything more than current cultural profiling."

Yes, but if those individuals were actually more likely to commit crime, the AI would learn those things anyhow, leaving us with the question: if a specific demographic is considerably more likely to commit crime, and the AI picks up on it, is the AI 'racist'? Because racism is a moral judgement, moreover, the intersectionalists would indicate that it also requires the notion of 'power'.

This is not some novel issue I think and will fast become a real ethical dilemma.

tw1010 · on June 19, 2018

Society definitely needs more law-enforcers with a high false positive rate.

greggman · on June 19, 2018

that product/service already exists in Japan. No idea how effective it is but the company claims it uses deeplearning to recognize suspicious behavior of retail customers and then alert the staff to check.

Sorry I don't have a link. Saw it on a business news program.

_bxg1 · on June 19, 2018

Suddenly that movie's iris-scanning seems quaint.

bsenftner · on June 19, 2018

Makes me wonder how much better they have in closed source to release this openly...

cdibona · on June 19, 2018

It is closed, the non-commercial license means almost no one can use it.

icebraining · on June 19, 2018

Previous thread (I hadn't seen it either): https://news.ycombinator.com/item?id=16289057

bigiain · on June 19, 2018

I _assume_ this would make it fairly easy to do bone-length estimation and comparison, leading to a way to uniquely identify someone from a video feed of them walking - even without any facial features...

Now I wonder how much of that tech is already deployed...

infectoid · on June 20, 2018

Gait analysis has been around for a while. I guess this technique just makes it even easier.

What's crazy to think about is Gait Analysis from orbit https://www.schneier.com/blog/archives/2008/09/gait_analysis...

lunch · on June 19, 2018

I can imagine this tech being used in some pretty interesting/scary ways:

* Generating avatars in Facebook's VR land from photos you're tagged in

* Recognizing a person IRL from photos they're tagged in

simonvc · on June 19, 2018

No-one's said it yet so I will (but not my idea...): this is going to be super useful for a future Oculus AR headset.

Basically, imagine the current oculus go headset, but with cameras on the front, and instead of showing you the actual world, it shows you a game, based on the existing current world, but morphed to look like Starship Troopers or something.

Qworg · on June 19, 2018

Rainbow's End delves deeply into this idea - everyone can choose their reality and share it with others.

simonvc · on June 19, 2018

The Diamond Age - Neal Stephenson, touched on this.

neohaven · on June 19, 2018

I think you mean Syndicate. ;)

_xlr2 · on June 19, 2018

You could achieve similar results for cheaper by taking LSD. There's the added benefit that your brain actually believes it's all real instead of merely perceiving it.

joemi · on June 19, 2018

Since you mentioned LSD, I wonder if technology like this could one day replace hallucinogenic drugs? At least partially - I'm sure some people will always prefer the chemical experience. I've always been pretty interested in the effects of hallucinogens, but could never bring myself to try any of them due to the potential risks.

I wonder how much "your brain actually believes it's all real instead of merely perceiving it" matters... I know there have been a few times where I've gotten so sucked into a movie in a darkened theater that when it ends it's incredibly jarring to be brought back to the real world, and the times that's happened to me haven't even been 3D movies, let alone interactive like AR.

daenz · on June 20, 2018

Speaking from personal experience with psychedelics, it's not really the visual hallucinations that are impactful, but the way your brain thinks about things in a completely different way. So I don't think you can replace that chemical experience with a visual experience.

_xlr2 · on June 19, 2018

What would be interesting is the effect of both together; a sort of Brave New World 2.0. Great fodder for the likes of Black Mirror, at least.

lioeters · on June 19, 2018

Seeing how we live in an age of decadence and self-indulgence - perhaps just driven by instinct to pursue pleasure/curiosity - I'm sure that this is a possible future development: chemically-enhanced augmented reality experiences. "Augmented" in more than one sense of the word.

davidw · on June 19, 2018

There are a variety of technical things where I think "wow, that's pretty clever and/or innovative - hats off!", and then there are things like this where I'm like "OMG wizardry!". But that's one of the things I love about the field of "computer stuff": there's so much interesting stuff I don't know.

ambicapter · on June 19, 2018

Is there any reason this would be restricted to human pose estimation, as opposed to, say, rangefinding of moving objects of a specific roughly-known size?

lwansbrough · on June 19, 2018

I can't tell - is this able to extrapolate pose information into three dimensions, or can it only project onto two dimensional scenes?

gugagore · on June 19, 2018

It's something in between the two possibilities you describe.

Each human pixel in the image is labeled with an index and two coordinates: x, y (u and v are the traditional names, but think of 2D x, y coordinates)

The index specifies which patch that pixel is on, and the x, y coordinates specify where in the patch the pixel is. This is for a pre-specified set of patches that cover a human mesh. See https://github.com/facebookresearch/DensePose/blob/master/no... for more detail.

So, no, it does not extrapolate the full mesh, but also for all human pixels, you are getting 3D information.
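
To make the per-pixel labeling described above concrete, here is a minimal Python sketch. It is not DensePose's actual API or file format: the `Label` tuple, the helper names `lookup_surface_point` and `pixels_on_patch`, the toy data, and the convention that patch index 0 means "background" are all assumptions made up for illustration.

```python
from typing import List, Optional, Tuple

# Hypothetical per-pixel label: (patch_index, u, v).
# Assumption for this sketch: patch_index 0 means "not a human pixel".
Label = Tuple[int, float, float]


def lookup_surface_point(
    labels: List[List[Label]], row: int, col: int
) -> Optional[Tuple[int, float, float]]:
    """Return (patch, u, v) for a human pixel, or None for background."""
    patch, u, v = labels[row][col]
    if patch == 0:
        return None
    return patch, u, v


def pixels_on_patch(labels: List[List[Label]], patch: int) -> List[Tuple[int, int]]:
    """List the (row, col) image positions that were assigned to one patch."""
    return [
        (r, c)
        for r, line in enumerate(labels)
        for c, (p, _, _) in enumerate(line)
        if p == patch
    ]


# Toy 2x3 "image": two pixels on patch 1, one on patch 2, the rest background.
toy = [
    [(0, 0.0, 0.0), (1, 0.25, 0.5), (1, 0.75, 0.5)],
    [(0, 0.0, 0.0), (2, 0.1, 0.9), (0, 0.0, 0.0)],
]

print(lookup_surface_point(toy, 0, 1))  # a human pixel: patch plus (u, v)
print(lookup_surface_point(toy, 0, 0))  # a background pixel
print(pixels_on_patch(toy, 1))
```

The point of the sketch is only that each human pixel carries a patch index plus a 2D position inside that patch, which is what lets you relate image pixels to locations on a body surface without reconstructing a full mesh.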

tastythrowaway · on June 19, 2018

So is this basically 3-D rotoscoping?

calahad · on June 19, 2018

This just popped up in my github feed, looks interesting.

tempodox · on June 20, 2018

Leave it to FB to come up with the most creepy and invasive tracking and surveillance.