This is pretty cool. I'm looking forward to the day when we can not only reconstruct what was removed, but also reconstruct what was never captured by the image sensor yet could be inferred. Can we reconstruct the 3D scene and fill in the parts the camera can't see? For example, given a front view of a person, build a 3D model of that person and texture the clothing on their back.

It seems like if we put together all the advances these deep inference engines are producing, we might be able to reconstruct a reality and let people walk around in it using VR glasses, or with less advanced 3D-shooter-style controls.

It could be like a live Earth view: any photo with a timestamp could contribute to it, and we'd end up with a sweet 4D model of Earth across time and space.




This is not reconstruction. It's the model guessing based on what 'it' has seen before, so it would never reconstruct anything truly unseen, in a broad sense.


Partially correct, but not necessarily. Speaking at an extremely high level, the model is currently trained to fill in the blank space with something like the most probable option per pixel, based on the training data. However, it is conceivable that it could also be trained to, say, insert an object into a scene based on other characteristics found in the scene. In other words, it could be trained to maximize a joint goal, where the second goal involves generating an object.
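
A minimal sketch of that joint-objective idea in PyTorch. Everything here (the toy encoder-decoder, the object_target tensor, the 0.1 weighting) is an illustrative assumption, not the actual model or training setup from the paper:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Inpainter(nn.Module):
        """Toy encoder-decoder that fills masked pixels (stand-in for the real model)."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 3, 3, padding=1),
            )

        def forward(self, image, mask):
            # Concatenate the mask so the network knows which pixels are missing.
            return self.net(torch.cat([image * (1 - mask), mask], dim=1))

    model = Inpainter()
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)

    image = torch.rand(2, 3, 64, 64)                  # ground-truth images
    mask = (torch.rand(2, 1, 64, 64) > 0.8).float()   # pixels to fill
    object_target = torch.rand(2, 3, 64, 64)          # hypothetical "insert this object" target

    pred = model(image, mask)

    # Goal 1: reconstruct the masked region (the "most probable pixels per blank" objective).
    recon_loss = F.l1_loss(pred * mask, image * mask)

    # Goal 2 (hypothetical): also match a target containing an inserted object,
    # down-weighted so it acts as a secondary objective.
    insert_loss = F.l1_loss(pred, object_target)

    opt.zero_grad()
    loss = recon_loss + 0.1 * insert_loss
    loss.backward()
    opt.step()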


I think I follow what you're saying: we aren't going to be able to construct what aliens look like on a distant planet (or some other unknown). We're essentially limited to the information we already have, which is why I expect continued interest in exploring for new information, even as we keep improving our 4D reconstructions.


Like so:

http://vision.princeton.edu/projects/2016/SSCNet/

There's similar work at (at least) Berkeley and Stanford.


That's cool. I liked the use of scene rendering to supply training data to the network.

It'd be nice to see texture prediction on some of the voxels as well: painting the occluded voxels in the scene in addition to texturing the ones visible in the image.

Texture accuracy could be measured by rendering the other side of the bed and seeing how close the texture predictions were.
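
Roughly something like this (NumPy, with placeholder arrays standing in for the voxel grid and a renderer's output from the hidden viewpoint):

    import numpy as np

    pred_colors = np.random.rand(1000, 3)   # predicted RGB for each voxel
    true_colors = np.random.rand(1000, 3)   # RGB observed by rendering the hidden side
    occluded = np.random.rand(1000) > 0.5   # which voxels were not visible in the input view

    # Mean absolute color error on the occluded voxels only.
    texture_error = np.abs(pred_colors[occluded] - true_colors[occluded]).mean()
    print(f"mean occluded-voxel texture error: {texture_error:.3f}")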

Now this would be quite a challenge, but if you could train a network to predict D given RGB, you'd have RGBD and could maybe use internet video to build some structure. Use something like a SLAM algorithm to get the camera position, then detect when an object is viewed from the occluded side, and you'd get a lot of texture-prediction training data from real-world internet video.
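
A rough sketch of the geometry step in that pipeline (NumPy), with a constant depth map standing in for the network's prediction, made-up pinhole intrinsics, and an identity pose standing in for whatever the SLAM system would report:

    import numpy as np

    H, W = 120, 160
    rgb = np.random.rand(H, W, 3)
    depth = np.full((H, W), 2.0)     # stand-in for a predicted depth map (metres)

    # Assumed pinhole intrinsics and a world-from-camera pose from SLAM.
    fx = fy = 100.0
    cx, cy = W / 2, H / 2
    pose = np.eye(4)                 # identity pose as a placeholder

    # Back-project every pixel to a 3D point in the camera frame.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    points_cam = np.stack([x, y, depth, np.ones_like(depth)], axis=-1).reshape(-1, 4)

    # Transform into world coordinates; colours ride along for texturing.
    points_world = (pose @ points_cam.T).T[:, :3]
    colors = rgb.reshape(-1, 3)
    print(points_world.shape, colors.shape)   # (19200, 3) (19200, 3)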



