Grab is messing up OpenStreetMap data in Southeast Asia (techcrunch.com)
188 points by danso on Dec 20, 2018 | 38 comments



"The problems came to a head in November when the Open Street Map Foundation’s board of directors rejected membership requests for “more than 100 applicants” from GlobalLogic, thereby restricting the number of outsourced representatives working on maps for Grab and other clients of the agency.

“There had been a mass sign-up of 100 new accounts on 15.11.2018 from India, most coming from one single IP address from a company “well known” to OpenStreetMap. There had been a larger amount of complaints regarding edits from that company, who provide “mapping services” to other companies,” read a circular issued by the board."

This is poor journalism. If you read the linked post ( https://lists.openstreetmap.org/pipermail/osmf-talk/2018-Nov... ) it says the complete opposite. The new members were not rejected. (The request to block them was rejected)


There's also the implication that without OSMF affiliation you can't edit the map, which is wrong.


Yes that's totally wrong. The OSMF (now) has about 1,000 members, and OpenStreetMap has ~4,200,000 registered accounts.


SE Asia is a mapping nightmare. There are frequent duplications, and Google Maps is unreliable at best.

My first gym appointment was on Mao Tze Tung Ave, Phnom Penh. I turned up at the correct number at the correct time, to learn that I was 30 mins ride away over the wrong side of town. This is common. House numbers relate to plot numbers, plot numbers relate to developers, street numbers relate to an original plan of the city that is unrecognisable now. Drench it in rain for six months and nothing looks recognisable or anything like what you started out with. There are three developers building the same number house on the same street now. You want a Grab? hahahahahahahhahhhahhahah


The meta issue here is that they didn’t have a plan to then validate the changes afterwards. And Grab’s contractors being from India know that streets can be unpredictable and change since it’s assuredly the same for them too and not just a quirk in Thailand. It’s an astonishing disconnect from reality based on western standards of paying attention, but I think this is pretty normal in SE Asia and causes a lot of problems. Hopefully they’ll be able to shed these habits at some point.


> Hopefully they’ll be able to shed these habits at some point.

...or, you know, companies like Grab can host their own data and provide their own services.


Having a single spot where broadly useful data can live is sort of the point of OSM.

Grab would already be using the OSM data in bulk to host their own services. That's sort of the model that the OSM community has pursued, the openstreetmap.org website and associated services are a tech demo with no service guarantee, so you wouldn't want to rely on them much for a business.


But then the data is tied up in their proprietary service. Like, I appreciate the attempt to contribute this kind of thing back, even if it was a bit ham-fisted.


> But then the data is tied up in their proprietary service.

It doesn't need to be proprietary. They can follow the lead of OSM and publish their data so that others can use and edit it.


OSM has full edit history like wikipedia, so 'undoing the damage' doesn't need to be hard.

They could have a simple flag on edits that says whether they were made from satellite data and, if so, how old that data is. Real-world surveys shouldn't be overwritable by older satellite data without good reason (e.g. spam, or a badly done survey).
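
The rule proposed in this comment could be sketched roughly as follows. This is only an illustration: OSM has no such built-in flag, and the `Edit` type, the `"survey"`/`"imagery"` values, and `may_overwrite` are hypothetical names invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Edit:
    source: str       # "survey" or "imagery" (hypothetical flag values)
    data_date: date   # when the survey was done or the imagery was captured


def may_overwrite(existing: Edit, incoming: Edit,
                  reason: Optional[str] = None) -> bool:
    """Return True if `incoming` may replace `existing` under the proposed rule."""
    # An explicit justification (e.g. spam, bad survey) always unblocks the edit.
    if reason:
        return True
    # Block imagery-derived edits that are not newer than an on-the-ground survey.
    if existing.source == "survey" and incoming.source == "imagery":
        return incoming.data_date > existing.data_date
    return True
```

Under this sketch, imagery captured after the survey can still replace it, while older imagery needs a stated reason.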


Objects are dependent on each other and new, legitimate edits can happen on top of damaging ones which we would not like to revert automatically.

Yes, there are tools in place which can revert edits, but there is still manual checking required. The annoying thing is that companies sometimes go ahead and "invest" money into edits which turn out to be garbage and then volunteers need to revert or fix them up.


There are also a couple of subtle details of the OSM editing model that could make it a lot easier for a knowledgeable vandal to make changes that are rather more painful to undo than they should be.


One word. Merging. The database is massively multiuser, plus the process is far worse for cartographic data than for code.


You can add a "source" tag to changesets. But if someone doesn't want to use that, they can just turn it off.


I'm a little confused as to why Grab would hire a firm to update OSM via satellite images. OSM had to be more accurate than the outdated sat. images so why was there a need in the first place?


Plenty of volunteer mapping is based on satellite data. People justifiably get irritated when they actually do a physical survey and then it gets overwritten remotely.

As far as the imagery, DigitalGlobe provides relatively up to date imagery for most of the planet. The errors are probably more from overinterpreting the satellite data than from it being obsolete.


Not defending Grab, but OSM has lots of false negatives. Satellite imagery does not have to be outdated: you can get imagery less than 6 months old, accurate enough to extract roads and buildings, from Digital Globe or Airbus.

Shameless plug: At tensorflight.com we have deep learning models that can do just that.


6 months old is outdated, depending on how fast things change on the ground.

Even 2 months old can be outdated (e.g. can be missing houses built since, or showing houses that no longer exist). In some areas, this can be a very significant effect.


For most use cases it's more cost effective to use existing, slightly old imagery. On the other hand, for some use cases, it's necessary to get a very fresh image and commission an airplane flight (or a drone if the area is very small), e.g. hurricane damage mapping.


It's more cost effective if your choices are to use the two vintages of imagery, sure. I understand why imagery can be stale.

My point was that OSM data can be much more accurate than the imagery. Or not, depending on location.


OpenStreetMap has a lot of things left to map. Sometimes, some correct data is better than no data.


I guess it depends on the satellite image source, which determines the cost.

I'm sure you can use the newest/best data for your company because you can pass on the costs to your clients.


We indeed pass the cost of imagery to customers, but it's flexible. I.e. given the location and customer requirements we have a decision tree that picks the best image source from multiple providers based on freshness, spatial resolution, cost, and some other factors.
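
The commenter's actual decision tree is not described, so the following is only a guess at the general shape of such a selection: filter sources by the customer's freshness and resolution limits, then take the cheapest. The `ImageSource` fields, `pick_source`, and any numbers passed to them are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ImageSource:
    name: str
    age_days: int         # freshness: days since capture
    cm_per_pixel: float   # spatial resolution (smaller is sharper)
    cost: float           # price for the area, arbitrary units


def pick_source(sources: List[ImageSource], max_age_days: int,
                max_cm_per_pixel: float) -> Optional[ImageSource]:
    """Pick the cheapest source that meets the freshness and resolution limits."""
    eligible = [s for s in sources
                if s.age_days <= max_age_days
                and s.cm_per_pixel <= max_cm_per_pixel]
    if not eligible:
        return None  # nothing qualifies; a fresh capture would be needed
    # Cheapest first; break cost ties in favour of the fresher image.
    return min(eligible, key=lambda s: (s.cost, s.age_days))
```

A real system would presumably weigh more factors than these three, as the comment itself notes.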


How does that tree approximately look? Which providers are there?


> Which providers are there?

In the "high spatial resolution" range, i.e. 50 cm per pixel or less, the well-known providers everyone is using are Digital Globe, Airbus, and Nearmap. Above 50 cm it starts to get hard to confidently see enough detail.

Google Maps' terms of service actually prohibit using satellite imagery for feature extraction ( https://cloud.google.com/maps-platform/terms/ ).

Bing is the best on the cheap end. Actually, Bing donated imagery for OSM mapping purposes ( https://wiki.openstreetmap.org/wiki/Bing_Maps#Bing_Aerial_Im... ).


I think that they wouldn't use only satellite imagery, but also GPS data from the fleet and user reports. See also https://qz.com/1481849/grab-southeast-asias-biggest-ride-hai...


When OSM got started, the idea was to use GPS track data almost exclusively, or at least for the finer resolution stuff. The satellite imagery available freely at the time wasn't that good anyway.

But cleaning a bunch of GPS tracks can be a real mess. Many GPS tracks don't include the error/accuracy measurements and just pretend to be infinitely-small points scattered in the general vicinity of what might be a road. I think people tend to overestimate how accurate their satnav or phone's GPS is, because in most uses it's "locked" to a road or other map feature.

Anyway, not an intractable problem by any means, but preprocessing and cleaning GPS data to derive the actual terrain can be quite difficult. Paying somebody to click repeatedly on a satellite image might be temptingly easier.
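
One small piece of the cleaning described above can be sketched like this: discard fixes that carry no accuracy estimate (or a poor one), then combine the rest with an inverse-variance weighted mean so that precise fixes count for more. The `Fix` type, the use of local planar coordinates in metres, and the function names are assumptions made for the example; real track processing involves far more than this.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Fix:
    x_m: float                    # easting in metres (local planar coordinates)
    y_m: float                    # northing in metres
    accuracy_m: Optional[float]   # reported horizontal error; None if missing


def usable_fixes(fixes: Iterable[Fix], max_error_m: float) -> List[Fix]:
    """Drop fixes with no accuracy estimate or with an error above the limit."""
    return [f for f in fixes
            if f.accuracy_m is not None and 0 < f.accuracy_m <= max_error_m]


def weighted_centre(fixes: List[Fix]) -> Tuple[float, float]:
    """Inverse-variance weighted mean of fixes that passed `usable_fixes`."""
    if not fixes:
        raise ValueError("no usable fixes")
    # Weight each fix by 1 / error^2, so a 1 m fix outweighs a 2 m fix 4:1.
    weights = [1.0 / (f.accuracy_m ** 2) for f in fixes]
    total = sum(weights)
    x = sum(w * f.x_m for w, f in zip(weights, fixes)) / total
    y = sum(w * f.y_m for w, f in zip(weights, fixes)) / total
    return x, y
```

Tracks that omit accuracy entirely, which the comment says is common, would be dropped wholesale by this filter, which is part of why the cleanup is hard.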


That was by necessity, IIRC: almost everything else would be under restrictive licenses. The open data landscape has changed a lot since.


Aren’t you able to purchase fresh sat images on request? I’m pretty sure I’ve seen these services online.


Fresh sat images are much more expensive than old ones


How much?


In many places, OSM can be improved with satellite imagery


Yes, agreed. In built-up areas it's not like roads move very often.

I wonder if someone couldn't do "deltas" of satellite imagery to produce some sort of heatmap of how quickly various areas are changing. That would give you an idea of what areas are 'safe' to map based on satellite imagery, and what areas are changing too quickly and need to be mapped some other way.

It would also give you a way of automatically flagging stale data on the vector map; if something hasn't been updated in 12 months and you know from imagery that the area is under heavy development, those map features are probably not trustworthy anymore.
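
A toy version of this idea, assuming two already aligned, equally sized greyscale tiles with pixel values in the range 0 to 1: score each tile by its mean absolute pixel difference, then flag features that are both old and located in a high-change tile. The function names and the default thresholds are hypothetical, and real change detection would also have to handle lighting, season, and registration errors.

```python
from datetime import date
from typing import List, Sequence


def change_score(before: Sequence[Sequence[float]],
                 after: Sequence[Sequence[float]]) -> float:
    """Mean absolute pixel difference between two equally sized image tiles."""
    diffs: List[float] = [abs(a - b)
                          for row_b, row_a in zip(before, after)
                          for b, a in zip(row_b, row_a)]
    if not diffs:
        raise ValueError("empty tiles")
    return sum(diffs) / len(diffs)


def probably_stale(last_edit: date, today: date, score: float,
                   max_age_days: int = 365, threshold: float = 0.2) -> bool:
    """Flag a feature that is old AND sits in a tile that changed a lot."""
    return (today - last_edit).days > max_age_days and score >= threshold
```

Computing `change_score` per tile across a region would give the kind of heatmap the comment describes.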


Development Seed has a project Urchn that was trying to do something like that: https://medium.com/devseed/make-sense-of-urban-change-fabb9f...

The idea was to take satellite imagery of the same location from different timestamps and I believe use deep learning to determine if anything in the imagery changed.

I don't think they've released any code for that project yet though.

But for determining whether or not the imagery could be used in a certain region, it might make more sense to just compare the timestamps of the vector data and the imagery. For instance, if the road doesn't appear in the imagery, but the vector data timestamp is after the imagery timestamp, then keep the road. And don't add a road if it was covered by a map segment that was deleted some time between the imagery timestamp and the current time.
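
The two timestamp rules in the paragraph above could be written down as a pair of checks. This is a sketch only; the function names are invented, and it assumes something else has already matched imagery-derived roads to existing or deleted map ways.

```python
from datetime import datetime
from typing import Optional


def keep_existing_road(seen_in_imagery: bool, road_edited: datetime,
                       imagery_taken: datetime) -> bool:
    """Rule 1: a road missing from the imagery survives if it was mapped later."""
    if seen_in_imagery:
        return True
    # Absent from the imagery: trust the map only if its edit is the newer source.
    return road_edited > imagery_taken


def add_road_from_imagery(imagery_taken: datetime,
                          deleted_at: Optional[datetime]) -> bool:
    """Rule 2: skip a traced road if a way there was deleted after the capture."""
    if deleted_at is None:
        return True  # nothing was ever deleted at this spot
    # A deletion newer than the imagery suggests the road is gone in reality.
    return deleted_at <= imagery_taken
```

A `False` result from either check is probably best treated as "needs human review" rather than as an automatic delete or skip.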


Not sure how that's going to be possible, but there's a need for big companies to "fork" open data before massively manipulating it. And merge with main system if everything works for them, hence contributing positively instead of screwing everything up.


That's the Mapbox way. /s


WTF

Grab is spending big $ to improve the maps and contribute back to the community.

How about less bitching and more helping them spend their $ better.

And a side note, Grab and the other products like it are revolutionising Asia.

They are giving the poor access to more rides, giving drivers consistent work, wiping out the tuk tuk mafias, making the vehicles far more efficient and clean, and letting people who don't speak the local language take trips with local drivers easily.

I've never seen anything like it. It's a revolution. It's how tech can truly change the world.


> improve the maps

The contention of the "bitching" is that they are not actually doing that.



