Hacker News
Walk through a 3D model of Y Combinator (matterport.com)
126 points by wildpeaks on Oct 30, 2014 | 50 comments



Looks like the 3D model is just for navigating the street-view-like set of panoramic photospheres. That's a necessary compromise for now: you get much more realistic photos until structure, material, and lighting can be captured and rendered perfectly (the benefit is clear looking at their scan of the old YC building). But if this is to be useful for VR (and it definitely would be), they'll need something in between those panoramic photospheres.


I couldn't help but notice that it doesn't let you zoom in on the "dollhouse" view, i.e. the one actually showing the 3D model. No way to see how these models look up close.

This'll work great for navigating buildings—I can imagine it being a hit with real-estate sites—but the hiding of detail in the 3D view suggests the tech wouldn't work as well for other applications, like VR. (I'd love to be wrong here, though!)


The dollhouse view is actually a low-detail version of the 3D model, with low-resolution textures; it's just not meant for close inspection :) Load time and performance have been high priorities for us, particularly because we also support mobile devices. We might do something like streaming in a higher-quality mesh/textures in the future and allow closer inspection.
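
(Not their actual pipeline, just a rough sketch of the kind of level-of-detail switching that streaming a higher-quality mesh implies, using three.js; the sphere geometries are stand-ins for the real dollhouse/room meshes:)

    import * as THREE from "three";

    // Toy LOD setup: show a coarse mesh from afar and swap in a denser one
    // up close. The geometries are placeholders, not Matterport's assets.
    const scene = new THREE.Scene();
    const material = new THREE.MeshNormalMaterial();
    const coarse = new THREE.Mesh(new THREE.SphereGeometry(1, 8, 8), material);
    const dense = new THREE.Mesh(new THREE.SphereGeometry(1, 64, 64), material);

    const lod = new THREE.LOD();
    lod.addLevel(coarse, 50); // used when the camera is 50+ units away
    lod.addLevel(dense, 0);   // used up close
    scene.add(lod);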

As for VR... we're playing around with it a lot, and it actually looks really great! Hopefully it'll get in front of more people soon.


As a second ask, why does the dollhouse view have all the noise and random parts of the ceiling visible? It makes it difficult to 'see through' to the ground while navigating, and I suspect cutting the ceiling off completely would make for a better visualization.


It's actually harder than it seems to cut off the ceiling: houses may have all sorts of weird heights, angles and corners, doors, arches... Any simple solution might work for 80-90% of models, but our viewer has to work for 100%. So for now we only remove faces with backface culling; we're going to do better soon though :)
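
(For anyone unfamiliar: backface culling just tells the GPU to skip triangles whose front side faces away from the camera, so a ceiling whose fronts point down into the room disappears when viewed from above. A minimal sketch on a raw WebGL context, not the viewer's actual code:)

    // Minimal sketch: enable backface culling on a fresh WebGL context.
    const canvas = document.createElement("canvas");
    const gl = canvas.getContext("webgl") as WebGLRenderingContext;
    gl.enable(gl.CULL_FACE); // turn face culling on
    gl.cullFace(gl.BACK);    // discard triangles whose fronts face away from the camera
    gl.frontFace(gl.CCW);    // counter-clockwise winding counts as front-facing (the default)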


I think it's too early to know what will and won't ultimately work for VR. VR is going to be a very imperfect representation of physical reality for a while. But I think representations that are unrealistic in all sorts of ways still have a lot of potential for tricking us enough to be fun and useful. There are going to be tradeoffs, and we won't know which ones work until they're tried.

This is a cool thesis. I would definitely like to see how this could work in VR.


Tssk. They should fix the misleading "model" illustrations... the 2012 one is a render, the 2014 one is a photo!

Duplicate of https://news.ycombinator.com/item?id=8524360


Yeah - they build the model from photos, but they limit the movement to the spots in the model where the photo was taken, then show you the photo. It's the same thing Microsoft Photosynth did a few years ago.

Saying "Our model quality has changed a lot in 2 years as well." and then presenting the 2014 photo is pretty sketchy. An accurate comparison would be viewing the model from the same position as the photo, using the model, not the photo. Or allowing arbitrary angles - I bet if you look under the machines on the front desk on the actual model, they've melted into the desk a little bit.

Or - I do understand the quality is not there yet - say 'Our viewing experience has changed a lot in 2 years as well.' and show the old viewer and new 2D/3D combined viewer.


I think the "Dolls House" view is a render.

TBH I just want to know when it auto-compiles into Doom WAD files (or the modern equivalent) so I can run through famous buildings shooting monsters.


You could bring the entire .obj into the Unreal Engine first-person template right now. Having whole rooms as models (rather than BSP brushes) is actually pretty common now. You might have to fix lightmaps and maybe make a low-poly version first.


Correct, they should compare the 2012 model quality to the 2014 dollhouse view (not the "streetview-esque" photos), which is still better.


I got curious how much space it would take to have enough spherical panoramas of the YC office to walk freely around it like in a first person shooter game. No 3D model involved, just a panorama for every possible position the user might want to walk to.

I think 60 frames per second would be smooth enough. A walking speed of 5 km/hour is roughly 1.4 meters/second. The per-frame distance is then about 0.02 meters.

Looking at the map, the YC office seems to be about 30 x 40 meters. Imagining a grid with lines every 0.02 meters overlaid on it, you would need about 3,000,000 panoramas. That might sound like way too many, but wait! We live in the future.

One spherical panorama of 10000x5000 pixels seems to be about 4 MB JPEG-compressed, so you would need only 12 terabytes of space. Also, since you need 60 of these each second, you need storage that can move 240 MB/sec, which is lower than the current speeds of SSDs.

A 1 TB SSD seems to cost about $400, so for only $4800 you would have enough speedy space to store the panoramas: enough to explore the whole YC building with no snapping at all between frames, with complete realism. Actually you could even do stereo 3D, as you already have the data.

How we could actually take those 3,000,000 panoramas is a different question entirely. Even if you had a Double Robotics bot with a spherical camera attached going around the space, snapping 10 panoramas per second, it would take about 4 days to complete. While that itself is tolerable, how it would know its position and control its movement to the required accuracy I have no idea.
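
(Back-of-the-envelope in code, using only the estimates above:)

    // Rough numbers for the panorama-grid idea; all inputs are estimates.
    const areaM2 = 30 * 40;                            // office footprint, m^2
    const spacingM = 0.02;                             // grid spacing between panoramas
    const panoramas = areaM2 / (spacingM * spacingM);  // 3,000,000
    const panoSizeMB = 4;                              // ~4 MB per 10000x5000 JPEG
    const storageTB = (panoramas * panoSizeMB) / 1e6;  // ~12 TB
    const bandwidthMBps = 60 * panoSizeMB;             // 240 MB/s at 60 fps
    const costUSD = storageTB * 400;                   // ~$4800 at $400 per 1 TB SSD
    const captureDays = panoramas / 10 / 86400;        // ~3.5 days at 10 panoramas/s
    console.log({ panoramas, storageTB, bandwidthMBps, costUSD, captureDays });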

Still it blows my mind that storing all that would be possible.


Microsoft Research published an early version of what you describe in 2004, "Image-Based Interactive Exploration of Real-World Environments":

Paper: http://research.microsoft.com/en-us/um/redmond/groups/ivm/iv...

Web Page, Video, Slides: http://research.microsoft.com/en-us/um/redmond/groups/ivm/iv...

Quoting: "Whereas many previous systems have used still photography and 3D scene modeling, we avoid explicit 3D reconstruction because it tends to be brittle."


A simple optimization could probably make things much better. If you think about it, the delta between panoramas corresponding to two adjacent points is very small. So compressing image data across panoramas, instead of compressing each one separately, would, I think, make it _so_ much better.
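
(A toy sketch of the idea; a real system would use motion-compensated, video-style compression, but even a plain per-pixel delta against a neighbouring panorama is mostly near-zero values, which compress far better:)

    // Toy delta encoding between two decoded panoramas (raw RGB byte buffers).
    function deltaEncode(base: Uint8Array, next: Uint8Array): Int16Array {
      const d = new Int16Array(base.length);
      for (let i = 0; i < base.length; i++) {
        d[i] = next[i] - base[i]; // mostly ~0 for adjacent capture positions
      }
      return d;
    }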

Interesting thought exercise, thanks.


Such a grid would mean you can move in a straight line in only 8 directions; any other direction of movement would need to be rounded to one of those 8. I think this could result in quite 'jumpy' movement, but maybe a human wouldn't be able to notice such jumps?


I would definitely notice a jump like that in a video game; I'm guessing you'd need over 100 directions to make it seem smooth.


I wonder how much being stuck on a 2D plane would matter. Otherwise, how many degrees of freedom do we need on the vertical axis?

Also, it assumes you can only walk continuously through the space, right? What about standing still and swaying your head less than 2 cm?


I'm not sure copying the controls from Google Maps is a good thing; I've always hated them, and for me they're incredibly frustrating to use. It could simply be that I'm so used to FPS controls that these feel too uncanny-valley. But I've had years to get used to them and they still feel awkward.

I appreciate it could just be a personal thing (I've never asked anyone else what they think), but I seem to regularly look up by accident because click-to-location often competes with viewport manipulation, and then you can't get the view 'level' again, which is mildly OCD-annoying. The controls also feel backwards, as you have to pull right to go left, pull down to look up, etc. Also, the up/down sensitivity seems exaggerated compared to left/right, but that's probably because it's windowed; I don't actually know.

I can't describe very well what's wrong, it just generally 'feels' wrong.

Google isn't exactly known for its UIs, and as far as I can remember that control system was their first go at doing it, and they've never changed it.


So how long until I have a .bsp of that (or whatever the kids use these days) and can fire my rocket launcher at the bus?


If you're curious what the camera+tripod setup looks like when it's taking photos, I found this "selfie" in the 3D model for the Four Seasons Silicon Valley, Presidential Suite: http://i.imgur.com/clHfeC5.png


"view inside" mode feels more like a street view style interpolation of 360° photos, not a 3d model.


I told a landscape designer it should be possible to walk around a property with a camera and then automatically build a 3D model. Turns out others are working on it. Good job. FYI, what he does is take measurements, make a 2D top-down drawing, and then start sketching concepts for hardscape ideas around the building.

Ultimately he'd want a drone to fly around a house and automatically create a 3D model and 2D plan. That would probably be exceptionally useful. He's pretty efficient at doing measurements and drawings, so fully automatic is almost the only way for it to be useful to him. Of course that's just one guy, but I thought the example might be helpful.


I tried a Rift last week for the first time and the first thing I thought of was getting a drone to do a scan of a room. People could use it to set up their "home space" like in Snow Crash/Gibson's stuff.


The Cooper Hewitt (a Smithsonian museum in New York) recently 3d scanned itself: http://www.cooperhewitt.org/2014/10/10/3d-scanning-the-carne...

The model is freely available (CC0) in FBX ("full geometries and color textures including interiors") and STL formats: http://www.cooperhewitt.org/about/mansionmodel/

http://www.3dsystems.com/ did the scanning/photography.


I've been interested in the two approaches to mapping out cities and larger places. Google, for example, has Street View cars building a point cloud and texture map of parts of the world.

In video games like GTA, L.A. Noire, and Watch Dogs, real cities are mapped out using a sort of "conceptual compression" that keeps the landmarks but somehow brings the space closer together. My sense is that this is a labor-intensive process, but what if it could be automated?

It would be an interesting way to explore a place, though with the obvious pitfall that what's not included in the map "doesn't exist".


The process would still be somewhat labor-intensive, since someone would have to decide which landmarks are significant.

GTA, specifically, doesn't use all the landmarks they find - they instead use rough facsimiles that give the same impression. It's really rather brilliant - a different design, but somehow it's familiar if you've been there or seen that.

Ultimately, for video games, they are still designed by hand, because of the "if it's not there, it doesn't exist" problem and for a host of other reasons. Unfortunately, for space reasons, most of it is non-interactive in a meaningful way.

There's some interesting work being done by ESRI that could hopefully lead to virtual city designs that are almost fully interactive. Imagine GTA where every building could be entered and every object could be interacted with because they are generated on the fly.

Sort of a realistic Minecraft. Very sort of, but still.


I would argue the "significant landmark" problem has essentially been solved as a side effect of online photo sharing sites (originally Flickr, now everything) -- the most frequently photographed things are the landmarks, and the number of photos scales with the significance of the landmark. When you search for "Rome" on Flickr, the clusters of photos that pop out as being 3D-reconstructable are precisely the landmarks.

See this popular work from 2009, "Building Rome in a Day": http://grail.cs.washington.edu/rome/

Quoting: "The data set consists of 150,000 images from Flickr.com associated with the tags "Rome" or "Roma". Matching and reconstruction took a total of 21 hours on a cluster with 496 compute cores. Upon matching, the images organized themselves into a number of groups corresponding to the major landmarks in the city of Rome. Amongst these clusters can be found the Colosseum, St. Peter's Basilica, Trevi Fountain and the Pantheon."


Google Chrome 38.0.2125.104 on Xubuntu 12.04 LTS with a modern Nvidia graphics card... and apparently my "device may not be supported".

I thought we were in the 21st century :/


Does WebGL usually work for you? What do you see at http://get.webgl.org/ ?
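
(If you want a quick local check, a minimal probe like this is roughly what that page does: just ask the browser for a WebGL context.)

    // Quick probe for WebGL support.
    const probe = document.createElement("canvas");
    const ctx = probe.getContext("webgl") || probe.getContext("experimental-webgl");
    console.log(ctx ? "WebGL context available" : "no WebGL context");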


Hi, I downloaded the 3D model you have on your site. It looks really good. Here's a link to a standalone build I made using Unity that lets you move around the model (OS X only).

https://mega.co.nz/#!e9EmiaqK!89u5YTkFFGHFxpCVFNZnE22uetkLBY...

Is this the highest resolution model and textures currently possible with the camera?


Got it working. A few questions:

Is this photogrammetry?

If it makes an obj, why can't I move arbitrarily?

How does it compare to AgiSoft Photoscan or 123D catch?


You can call it photogrammetry; it uses both 2D imagery and 3D depth data from depth sensors (think Microsoft Kinect or Google Tango) to make the 3D model. The viewer limits your movement when you are inside so that it can project panorama images onto the model: you get the image quality of 2D images while you're still in a 3D model. 3D scanning tech just isn't good enough yet that you would be happy with how it looked up close if you're trying to sell a house, for instance.

I don't know Photoscan or 123D Catch well enough to comment specifically, but in general many other 3D companies focus on scanning small objects or features, while we do large buildings/indoor spaces/rooms better (faster/better quality/cheaper/more convenient) than anyone else I've seen :)


Thanks. Photoscan's worth checking out - there are a few people doing large spaces with drones, as well as the usual object scans you mention. See http://www.agisoft.com/community/showcase/ They've done churches, cliffs, and whole valleys before: http://www.theastronauts.com/2014/03/visual-revolution-vanis...

That said, Photoscan's UI is incredibly poor, and the software has bugs (particularly around CUDA), so I'd be interested in alternatives.

123D Catch (from Autodesk) does cloud-based processing like you guys.


That's amazing, even considering the $4500 for the camera (which would probably pay for itself in a couple of months if you're a small architecture/design studio).

I've lost touch with architectural 3D "capture" - who else is working in this space? Are there any consumer-level offerings?


Looks very cool, but I can't get the demo to work at all.

On iPad it said "upgrade to iOS 8"; on W7/FF33 it said "Oops, something went wrong"; on W8.1/FF33 it went all the way through loading and then got stuck with just one pixel of the progress bar left to go. :-|


Hey, dev here. Thanks for not giving up on the first try :) WebGL is still pretty new, so it has some quirks... but you seem to have been particularly unlucky. Does WebGL usually work on your W7/FF33 setup? (Does http://get.webgl.org/ work?) As for the W8.1/FF33 case, does it work if you try again, or is it still the same?


On W8.1 it cycles between "500 Internal Server Error" (with a blank page, and not just for the above link but for the homepage as well) and the stuck case. When it gets stuck, here's the console from Firebug:

http://i.imgur.com/lNKLbj5.png

The Typekit errors are due to referrers being blocked; this is very common and never disastrous. Disqus and Google Analytics are just blocked at the domain level.


On W7/FF33 - yes, get.webgl.org works, as do other WebGL sites.


Nice modelling. "Y" U NO have computers? This place looks like a wardroom :)


I can't make sense of the recurring subscription plans on the "buy a camera" page.

"3 free models per month"

So the camera by itself (plus whatever software) doesn't just spit out a point cloud I can do whatever I want with?


Not relevant to the 3D model, but the text in this article is really difficult to read for me:

http://i.imgur.com/ctPgj2v.png

Chrome 38.0.2125.104 on a Mac.


Doesn't work on FF 33.0 Ubuntu x86_64. It just hangs when it's almost finished with the loading bar.


This is really cool technology; I felt like I really was walking into a building I had never seen before.


Nice! I really like how smooth it is to walk around in the high-resolution images (3D model)! :)


Works flawlessly on the iPhone 6 with iOS 8.


Really cool, but TBH I was a little frustrated with the navigation interface. Are there supposed to be only 2 degrees of freedom? I would love to turn my head without clicking the mouse.


Looks like the work of Autodesk ReCap.

http://www.autodesk.com/products/recap/overview


To be clear, I mean it looks like it, not that it is.


Y Combinator's physical space looks a lot different than I remember it, too. It looks like they have expanded.


So cool.



