Hacker News
PIFuHD: High-Resolution 3D Human Digitization (shunsukesaito.github.io)
89 points by jonbaer on July 18, 2020 | hide | past | favorite | 19 comments



I'm glad I clicked this because it's awesome, but it wasn't at all what I was expecting. Maybe change the title to PIFuHD: High-Resolution 3D Human Digitization?


So I am building a farm robot, and for various reasons I think having a 3D reconstruction of the plants would be useful. I’m hoping that this technique might be useful for reconstructing models of plants from a small number of images. The model doesn’t need to be perfect, just close. I think this will help in doing reinforcement learning on simulated data, among other things.

Great research coming out of Facebook Reality Labs!



This is amazing. Does anyone have any insight on what makes the back view possible?


I’m guessing this was trained on images and their 3D scanned models so it kinda fits the unseen parts of the body.

I can nearly guarantee you that if you try to use an image with something on the back it won't be able to fit it; backpacks don't work, and probably something like the Victoria's Secret angel wings won't either.

You can also see where it really fudges things like the shoes on some of their examples, their examples are also the best case scenario (similar to the StyleGAN2 results) and not necessarily representative of what the model achieves regularly.

I’ve tried using some pictures of my partner in order to 3D print her, but it didn’t really work well. I’m guessing you need a really clean background.


Make sure you use a long focal length, too. A normal wide-angle photo from your camera phone won't produce very desirable results.

This guy on Twitter did some of his own tests and they came out pretty good. https://mobile.twitter.com/Yokohara_h/status/127239671259470...
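To see why focal length matters, here's a back-of-the-envelope illustration (my own numbers, not from the paper): with a pinhole camera, projected size scales as 1/distance, so features at different depths on a face (nose vs. ears, roughly 10 cm apart) get magnified unequally when the camera is close. Stepping back and zooming in (longer focal length) makes that mismatch nearly vanish.

```python
# Toy pinhole-camera arithmetic: how much larger a near feature appears
# relative to a feature 10 cm further away, at two camera distances.
def relative_magnification(camera_dist_m, depth_offset_m=0.10):
    """Ratio of projected size of a near feature vs. one depth_offset_m behind it."""
    return camera_dist_m / (camera_dist_m - depth_offset_m)

close = relative_magnification(0.5)  # arm's-length selfie distance
far = relative_magnification(3.0)    # step back and zoom instead

print(f"at 0.5 m the near feature looks {close:.2f}x larger")  # ~1.25x
print(f"at 3.0 m the near feature looks {far:.2f}x larger")    # ~1.03x
```

A 25% size mismatch between nose and ears is the classic wide-angle "selfie distortion," and it's plausible the model's training data doesn't look like that.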


It requires a video and the person needs to rotate. It doesn't imagine the back view. It appears to be doing optical flow and photogrammetry.


It is not doing photogrammetry and it works on a single image. It has learned what people look like (and training may have used photogrammetry in some way) and it can imagine a good 3D model from even a single image. The last line of the abstract:

“We demonstrate that our approach significantly outperforms existing state-of-the-art techniques on single image human shape reconstruction by fully leveraging 1k-resolution input images.”
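The core idea (from the PIFu line of work) is a pixel-aligned implicit function: a learned network answers, for every 3D point, "is this inside the body?" using image features sampled at that point's 2D projection, which is how unseen regions like the back get plausible geometry from the learned prior. Here's a toy numpy sketch of that evaluation loop; the function names are illustrative and the trained MLP is replaced with a stand-in that just carves out a sphere.

```python
import numpy as np

def image_feature(x, y):
    # Stand-in for features sampled from the input photo at pixel (x, y);
    # the real model uses a deep image encoder here.
    return np.hypot(x, y)

def occupancy(x, y, z):
    # Stand-in for the learned MLP f(feature, depth) -> [0, 1].
    # This toy version marks points inside a unit sphere as occupied.
    return (image_feature(x, y) ** 2 + z ** 2 < 1.0).astype(float)

# Evaluate the implicit function on a 3D grid; the real pipeline then
# runs marching cubes on this occupancy field to extract the mesh.
lin = np.linspace(-1, 1, 32)
X, Y, Z = np.meshgrid(lin, lin, lin, indexing="ij")
field = occupancy(X, Y, Z)
print(f"occupied voxels: {int(field.sum())} / {field.size}")
```

Because the occupancy decision is a learned function rather than triangulation from multiple views, a single image suffices; no photogrammetry is involved at inference time.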


The site clearly shows nice 3D outputs created from a single static image. Even the video models are created from a single frame alone.


The pretrained model is unavailable.

edit: downloading now, maybe the server was overloaded.


If you try it on your own video, make sure the person rotates so the model sees all sides.


I'd recommend watching the "1 minute presentation" video linked at the top of the page. It mentions generating the model from a single image.

https://youtu.be/-1XYTmm8HhE


I replied to this above; this isn't required.


Application - better fitting of clothes bought online? Consumer is happy, the company's return rate drops. Seems like an obvious one, no?


High resolution doesn't mean high accuracy in this case.


Similar to how the early GANs for generating faces were low res. I would expect follow up work to get much better fairly quickly.


You can't create an accurate 3D model from what you can't see. It can be high resolution though...


It’s not about being “accurate” as in representing exactly what it can’t see, but rather being “authentic”: since you don’t care about things like patterns or logos, clothes aren’t that hard to figure out, and neither are body shapes.

Backpacks and other attachments that are on the back or occluded of course can’t be predicted or fitted within the model to represent their real-life counterparts well, not unless you train the model on every possible backpack.

That said, training this model on people’s photographs and their 3D scans, and then possibly on a dataset of clothes and accessories from big brands, might allow you to predict the unseen parts of the image much more accurately: not only filling the gaps based on predictions, but actually matching the specific items of clothing and accessories to draw additional information that way.

Most fashion brands already have detailed images of their products from multiple angles and even have 3D views of many items.

Combine that dataset with this model and you’ll probably be able to increase the accuracy of these models considerably.


Tweet of Obama reconstructed as Larry Bird in 3...2...



