Army photogrammetry technique makes 3D aerial maps in minutes (techcrunch.com)
125 points by jonbaer on Nov 17, 2019 | 53 comments



I was looking for the deep learning. But there is none in the patent and I don't think there is any in the process.

"Method 100 automatically processes FMV data for delivering rapid two-dimensional (2-D) and three-dimensional (3-D) geospatial outputs." According to [0041] 'Method 100' is FIG 1? Looking at FIG 1, it talks a lot about clusters and metadata. From [0026] and [0039] and [0064] it seems like a cluster might be a collection of images containing a point of interest. [0089] says they do measure the altitude of the aircraft. Since they have the metadata of camera FOV they can place each image in a 3D world. I think they take a lot of redundant images and they have many views of each object of interest, which is enough to give them a (x,y,z) estimate for their points of interest.

"The increased processing speed of the system is achieved by configuration of system components to perform novel geometric calculations rather than by increased processing capability."

This gives me the idea that there is a ridiculous amount of data per scene.

I suppose when you control your own data collection, the methods can be completely different from what is worked on in academia. Compare with single-image depth estimation, or even depth maps from video for self-driving.

I guess it's good that deep learning isn't being used here. Now that I think of it, it is not appropriate.

I like this part lol: "Method 100 further may be automated using various programming scripts and tools such as a novel Python script generator."


Yup, there is no need to involve deep learning if you have an efficient algorithmic solution pinned down. Too often people try to use AI as a substitute for thought on the normal, tractable problems.


The point is to reach the singularity, so by making computers smarter and people not think, you advance this project two steps for the price of one! ;)


I pursued this a little further out of curiosity and found the inventor's CV: [0]

He had a publication listed there from the "9th NATO Military Sensing Symposium, 2017", whose papers/presentations are made public: [1]. Surprising! But I didn't find Massaro's paper there. I also searched for his "Videogrammetric Mapping from Unmanned Aerial Systems" but did not find it in ISPRS or elsewhere. A commercially available software package that seems to use a similar method: [2]

The Techcrunch article is likely a derivative of the earlier Techlink article: [3]

[0]: http://mason.gmu.edu/~rmassaro/CV_Massaro_complete_20180816....

[1]: https://www.sto.nato.int/publications/STO%20Meeting%20Procee...

[2]: https://www.pix4d.com/blog/underwater-mapping-videogrammetry

[3]: https://techlinkcenter.org/new-us-army-software-rapidly-conv...


IME, introducing deep learning to a problem like this causes more problems than it solves.


I'll throw https://github.com/AIBluefisher/EGSfM into the discussion as an example of an (AFAIK) state-of-the-art structure-from-motion tool. It takes images/video and reconstructs the relative positioning of the individual images/frames to each other, as well as the camera parameters required for multi-view stereo techniques like https://arxiv.org/abs/1903.10929 to generate meshes.


You have been able to do this for a long time.

Photogrammetry apps with a good, automated drone pipeline:

https://www.dronedeploy.com/

https://www.pix4d.com/

Generic photogrammetry:

https://www.capturingreality.com/

https://www.agisoft.com/

https://github.com/alicevision/meshroom

I don't really understand what's unique about this patent; structure from motion has been around for ages, from video too. Maybe they have some optimisation around the processing, which is traditionally quite compute-intensive.


Stereoscopy is very difficult in computer vision if you want accuracy. There are certainly many unexplored possibilities and strategies for matching two images. Something optimized for buildings, trees, vehicles, and their geometric shapes seems likely.

What I ask myself is why they don't use some form of active projection, like lasers. If I were to build a drone, it would have a ton of lasers...


You want to use the sun as a light source. Because the numerical aperture is very small, most of your illumination never makes it back to the drone, so the power you'd need reaches infeasible levels from about a few hundred meters onwards.


Photogrammetry isn't really about stereoscopy; it's about building a dense point cloud representation based on many images from different angles, then transforming that into a mesh (and often mapping the textures from the photos onto it).
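
For the cloud-to-mesh step, a minimal sketch using Open3D's Poisson surface reconstruction (filenames hypothetical; assumes a dense cloud from any SfM/MVS pipeline):

    # Minimal sketch of the cloud-to-mesh step with Open3D's Poisson
    # reconstruction (filenames hypothetical; any SfM/MVS dense cloud works).
    import open3d as o3d

    pcd = o3d.io.read_point_cloud("dense_cloud.ply")
    # Poisson needs oriented normals; estimate them from local neighborhoods.
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.5, max_nn=30))
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(pcd, depth=9)
    o3d.io.write_triangle_mesh("mesh.ply", mesh)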

You can also do point clouds with lidar from drones; it just tends to be more expensive and heavier - https://enterprise.dji.com/news/detail/how-lidar-is-revoluti...


Isn't that called stereoscopy? Matching points seen from different positions, given your viewport coordinates, lets you extract point coordinates via triangulation. That would net you a heightmap, if you manage to match everything present in all images, which could then be transformed into a mesh.

But sure, just having a good camera is probably a lot cheaper than using any form of projection. Probably also significantly faster.


Drone laser scanning is a thing! It's just more expensive, heavier, and more power-hungry (so power consumption increases from two directions: the laser itself draws power, and its weight costs flight time).


You can do this yourself crudely from any airplane window. Before you get too high, take two pictures of the same target (building or mountain) within 1 or 2 seconds (try both, so 2 pairs of photos, because it depends on your altitude). Then open the 2 pictures in an app where they are overlaid and you can switch between them quickly (I use irfanview on Windows--still after all these years). Each eye will pick up elements of the corresponding right and left photo, giving an approximation of 3D.

https://en.wikipedia.org/wiki/Wiggle_stereoscopy

If you got 2 good pictures (good subject at the right altitude and separation) you could put them into a stereoscopic program and create a nice stereogram (either red-blue colored or side-by-side, depending on your preferred viewing device).
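
If you'd rather script it than use a dedicated stereoscopic program, a rough Pillow sketch covering both variants (filenames hypothetical; assumes the pair is already roughly aligned):

    # Red-cyan anaglyph from an aligned left/right pair:
    from PIL import Image

    left = Image.open("left.jpg").convert("RGB")
    right = Image.open("right.jpg").convert("RGB").resize(left.size)

    r, _, _ = left.split()    # red channel from the left-eye photo
    _, g, b = right.split()   # green/blue channels from the right-eye photo
    Image.merge("RGB", (r, g, b)).save("anaglyph.jpg")

    # ...or a wiggle stereogram as an animated GIF:
    left.save("wiggle.gif", save_all=True, append_images=[right],
              duration=120, loop=0)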

What the software in the article does is compute all the parallax from multiple images to give essentially a 3D scan of the landscape. It has been theoretically possible for a long time; maybe this is one of the first to do it successfully in practice.


Man, that default example there is terrible. The "street in Cork" is pretty good.

That old gif of the T-Rex from The Lost World is always a good example[1] but I suppose it's probably not licenced the right way for Wikipedia.

[1] https://i.imgur.com/DDce0hO.gif


This would be a very neat effect in a horror movie, like a monster running out of the shadows toward the viewer, and would probably work well as the colors are often desaturated in the darker scenes.


In theaters you'd run into the problem of only having 24 frames per second to deal with.


The street in Cork looks strangely toy-like. Kinda like a tilt-shifted photo, but without the tilt-shifting.


Probably because you don't get that much parallax in real life, but you would if it were the size of a small model.


Reminds me of the bouncing, dancing scenery from Bowser's Kingdom in Super Mario Odyssey: https://www.youtube.com/watch?v=PnT8w10euCk&t=57m0s (should start at 57 minutes 0 seconds)


> maybe this is one of the first to do it successfully in practice.

Photogrammetry software has been commercially available for a while. I played around with Agisoft PhotoScan [0] some time ago, and while the results were impressive, it did take a lot of human interaction to select the right source images and parameters.

I think maybe the Army has automated this pipeline to go from drone camera -> 3D map without needing a human.

[0] https://www.agisoft.com/


Exactly. I can't wait until I can do this with my own drone--though maybe I shouldn't hold my breath for the open source package because it seems to have military applications right now.

There's a 1000-ft hill behind my house, and I thought it would be cool to get a 3D print of it. You can do this now with some online apps, based on the terrain model that you see in Google Maps. But the public data is "only" 30m resolution (https://en.wikipedia.org/wiki/Shuttle_Radar_Topography_Missi...), whereas this technique could probably achieve 1m resolution.
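
For the 3D-print step, going from an elevation grid to an STL takes surprisingly little code. A rough sketch with numpy-stl, assuming the tile is already cropped and loaded as a 2D array (the filename and grid spacing are placeholders, and the result is an open shell rather than a watertight solid):

    # Rough sketch: turn an elevation grid (say, an SRTM tile cropped to the
    # hill, as a 2D numpy array of heights in metres) into an STL surface.
    import numpy as np
    from stl import mesh

    z = np.load("hill_heights.npy")   # shape (rows, cols), metres; placeholder file
    cell = 30.0                       # SRTM grid spacing in metres
    rows, cols = z.shape

    tris = []
    for i in range(rows - 1):
        for j in range(cols - 1):
            a = (j * cell, i * cell, z[i, j])
            b = ((j + 1) * cell, i * cell, z[i, j + 1])
            c = (j * cell, (i + 1) * cell, z[i + 1, j])
            d = ((j + 1) * cell, (i + 1) * cell, z[i + 1, j + 1])
            tris.append([a, b, c])    # two triangles per grid cell
            tris.append([b, d, c])

    surface = mesh.Mesh(np.zeros(len(tris), dtype=mesh.Mesh.dtype))
    surface.vectors[:] = np.array(tris)
    surface.save("hill.stl")          # open shell, not watertight; a slicer
                                      # may need walls/base added before printing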


>Exactly. I can't wait until I can do this with my own drone--though maybe I shouldn't hold my breath for the open source package because it seems to have military applications right now.

I produce 3D models from photogrammetry from drones for the Engineering & Construction industry in Australia. The free software to use is VisualSFM.

I recommend flying your drone over your target in a sort of 4-circle Venn diagram, with the overlapping center the actual subject you want to scan. Have the drone pointing inward to the center of each circle while flying, the camera being at a ~55deg angle. You're essentially circle-strafing in 4 circles with a little overlapping bit in the middle.

Take video footage of the whole trip and extract individual frames... can't remember if VisualSFM does that for you or not.
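
If it doesn't, a minimal OpenCV sketch for pulling every Nth frame (filename hypothetical); keeping roughly 2 frames per second of 30 fps video still gives heavy overlap without flooding SfM with near-duplicates:

    # Extract every 15th frame from drone footage for SfM input.
    import os
    import cv2

    os.makedirs("frames", exist_ok=True)
    cap = cv2.VideoCapture("flight.mp4")   # placeholder filename
    i = saved = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if i % 15 == 0:
            cv2.imwrite(f"frames/frame_{saved:05d}.jpg", frame)
            saved += 1
        i += 1
    cap.release()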


If you've got a DJI drone, you can do this for tens of dollars using inexpensive software and services.

I've used the SparkPro iOS app (for mission planning and automated flight with a programmed set of waypoints with camera instructions).

And this service, which will stitch images together (free for small enough areas; I needed a plot about 200 x 200m, and it didn't cost anything): https://www.mapsmadeeasy.com/

Takes a bit of thinking and experimenting to get a good result. You need four or more times overlap for everything you want good stitching and elevation data from. (And the DJI specs on their cameras are misleading: I wasn't expecting the FoV spec to be diagonal, so my first attempt didn't overlap as much as I'd intended...)
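
For anyone else caught out by this, here's the diagonal-to-horizontal conversion in code (the 83.5-degree input is just an example value, not a quoted DJI spec):

    # Convert a diagonal FoV spec to the horizontal FoV you actually plan
    # overlap with, for a 4:3 sensor.
    import math

    def horizontal_fov(diag_fov_deg, aspect_w=4, aspect_h=3):
        diag = math.hypot(aspect_w, aspect_h)
        half = math.atan(math.tan(math.radians(diag_fov_deg) / 2) * aspect_w / diag)
        return math.degrees(2 * half)

    print(horizontal_fov(83.5))  # ~71.1 degrees, noticeably narrower than the spec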


Depending on where you live, the USGS may already have LIDAR data for your area: https://usgs.entwine.io/


Thanks to all replies for the tips.


No worries. Check out /r/photogrammetry for a 50/50 mix of hobbyists and professionals.


I used to do this with just my smartphone (and the default photo editing apps on my phone). It's quite easy to make an image that fuses well, but it's quite hard to get a photo of human-scale stuff that maintains scale correctly.


I suspect the default focal length for smartphone cameras is a bit too wide-angle. I haven't tried recently, but zooming in a bit (say 1/4 to 1/3) would help, as would tracking the target (panning slightly so the main subject stays in the center).

It all really depends on how close the subject is. You need to exaggerate the separation a little, but not too much. In an airplane, the time between shots (and speed, of course) determines the separation. If you're doing the same thing on the ground, it depends on how far away your subject is, but you can step 1-2 feet sideways for your second shot. You just have to experiment to find what works.

I have tried on an iPhone before, but the default "slide" between photos isn't fast enough to get the effect, so I had to download them to my PC. Maybe there's an iOS app for it now.

Edited to add: I knew someone in the age of film who screwed 2 SLRs to a piece of wood about 16 inches apart, for 3D landscape photography. He used 2 mechanical remote shutter releases to make sure the photos were in sync. A similar rig with DSLRs and electronic remotes should be easy and more precise (to get the same settings and synchronous release). I'm surprised I haven't seen it done yet.


This has been a thing for some time for drones:

http://opendronemap.org/

http://pix4d.com/

etc


Photogrammetry has been around for centuries. Post-war, before digital imaging and fast computers, people would manually match features in images to derive terrain models.

https://ibis.geog.ubc.ca/courses/geob373/lectures/Handouts/H...


As the article mentions, what's new is a technique to render using only a single photo (instead of multiple angles) and to do it in real time.


Please quote. My understanding is that they managed to find a way to do it without human supervision, in almost real time, using video (= multiple images).



https://www.bbc.com/news/magazine-13359064

British intelligence used overlapping photography to create a 3D map of occupied Europe during WW2. In more recent times, the maps were used to identify possible archaeological sites in Scotland.


I have been looking for photogrammetry-based 3D modeling software. Most existing photogrammetry solutions produce a point cloud, which usually does not result in nice geometry. Often, we as humans know that a certain object has a certain geometry. Adding that knowledge could result in much more accurate models. This type of software should be interactive, using human input to recalculate the 3D positions of certain points. It would be nice if this could be integrated with an existing 3D modeling package, such as Blender.


Any good photogrammetry software that can use handheld camera shots? I have a few hundred clay figurines my mother made that I have wanted to digitize in order to free up some space. I always thought 3d scanning required some kind of special scanner so this gives me hope.


Meshroom, as suggested by others. If you do this, I think you'd do well to spend a little time setting up a little "studio" for it. A rotating plate, e.g. a record player to place the items on, plus even lighting, would help a lot. If possible, two cameras on tripods at different heights would be good.

You can do it handheld for sure, just make sure to have plenty of light and put the items on a small pedestal so you can get low angles of them without a table getting in the way. But for several hundred objects, it's going to take a while.



There are plenty of 3d scanning apps for phones. I don't know which one to recommend but at a glance this one looks nice

https://www.qlone.pro/


How can this get a patent today?


It's easy: You submit a bullshit patent application, wait to find out what criteria it was rejected for, add more meaningless technical bullshit related to that topic, then repeat the process until the patent office feels they have extracted enough fees from you. You are now free to sue companies for using technology that sounds like your patent.


One of the movies the South Park guys made used this technique to get an R rating out of a gratuitously NC-17 movie. They included an over-the-top, 20-minute graphic sex scene, and they cut and cut and cut until there was nothing left but a short gross sex scene and the "regular" movie.

See also the duck from battle chess.


They mention farming applications... but is there any chance of this actually becoming available for farmers?


The US government, particularly the US military, has been responsible for quite a significant number of scientific breakthroughs, either through its ability to coordinate human and material resources towards a particular end, or through its sheer abundance of the latter.

Sadly or happily, capitalism, in particular venture capital investment in startups, has largely surpassed it on the material-resources part of the equation in recent times, but it still falls significantly short on the human-coordination aspect of things.


The trillions of dollars "investment" couldn't have hurt either :)


Is this substantially different from how we do 3D reconstruction from motion indoors?


It's approximately the same technique.

Outdoor 3D reconstruction is actually easier, because outdoors you have highly varied and detailed scenes, which enables dense feature matching and accurate dense depth map calculations. Also, you probably have decent GPS priors* and a preplanned flight path, which makes localization of each RGB image pretty easy.

* This is a big advantage that usually can't be replicated indoors. GPS is more accurate outside, and the significantly larger flight path means that GPS' main issue, low precision, isn't as much of a barrier.


Agreed, the GPS priors are very helpful -- the ability to specify prior known positions/orientations of photos, to some known error tolerances, is one of the power features in Agisoft Metashape Pro, which I wish was available in more photogrammetry software. Being able to know the actual nearby photos a priori and avoid the costly O(N^2) pairwise comparison is a great benefit.
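
A sketch of what that pruning amounts to: index the per-image GPS fixes (here fake local x-y coordinates in metres) in a KD-tree and only keep pairs within some radius:

    # Keep only image pairs whose cameras were close, instead of all O(N^2).
    import numpy as np
    from scipy.spatial import cKDTree

    positions = np.random.rand(500, 2) * 1000  # stand-in for real flight-log fixes
    pairs = cKDTree(positions).query_pairs(r=60.0)
    print(f"{len(pairs)} candidate pairs instead of {500 * 499 // 2}")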

Some research I did a few months ago was about trying to find ways to effectively bring in that advantage of outdoor photogrammetry into indoor environments, by piggybacking onto AR HMDs with built-in SLAM: https://www.youtube.com/watch?v=ldrGpGrOaZc


GPS is more accurate because the location of the satellites is known to a high degree. If you knew the location of the WiFi access points to the same degree, you could get pretty similar accuracy indoors. Apple indoor mapping gets 5-10m with dedicated staff walking the space and inputting their position on a mobile map (wide open spaces are where the biggest challenge is). If more APs adopt WiFi RTT, then that's actually going to give you much more GPS-like behavior indoors.
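
At its core, RTT positioning is just trilateration against APs whose positions you know to survey accuracy; a toy least-squares sketch with made-up numbers:

    # Solve a receiver position from noisy ranges to four APs at known spots.
    import numpy as np
    from scipy.optimize import least_squares

    aps = np.array([[0.0, 0.0], [20.0, 0.0], [0.0, 15.0], [20.0, 15.0]])
    true_pos = np.array([12.0, 6.0])
    ranges = np.linalg.norm(aps - true_pos, axis=1) + np.random.normal(0, 0.3, 4)

    def residuals(p):
        return np.linalg.norm(aps - p, axis=1) - ranges

    print(least_squares(residuals, x0=np.array([10.0, 7.5])).x)  # ~[12, 6]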


I'm not sure how useful a 5-10 metre accuracy is going to be for indoor photogrammetry. 10 metres cubed is probably larger than the majority of indoor photo sets.

Now, for a military drone, I think they get better accuracy than 5 metres, but also the distances involved in an outdoor dataset mean that inaccuracies in the initial position estimate are much less significant.


You can get accuracy down to a few centimetres thanks to RTK GNSS. Most professional drones offer that now, though it is usually rather expensive.


It's also a question of timing precision. The GPS signal is entirely designed around precise timing, whereas WiFi isn't. There are a number of variables, like buffer lengths, that can be measured around to some extent, but it's never going to be as precise.


Just using the very coarse RSSI signal, Apple was getting 5-20m accuracy depending on the venue five years ago, and Google gets ~15m accuracy indoors worldwide (with different mapping techniques).

You're thinking at the software level, though. Look up 802.11mc - that enables cm-level mapping.



