Modifying Telegram's “People Nearby” feature to pinpoint people's homes (owlspace.xyz)
232 points by todsacerdoti on Feb 5, 2021 | 68 comments



Even the author’s suggestion of “close”, “far”, “very far” would reveal enough information, even with noise being added, to more or less pinpoint someone’s location. Measuring the transition from “far” to “close” would give you a data point, and enough data points would let you model (and therefore subtract) any added noise.

If your app really needs to have a feature like this, it needs to have aggressive rate limiting that makes it impossible to gather a statistically significant number of samples before someone changes their location.


Would this work?

1. Overlay the world with a hex grid.

2. Make "close" mean in the same grid cell, "far" mean in an adjacent grid cell, and "very far" mean somewhere else.

3. In sparsely populated areas, merge groups of 7 cells into a larger cell.

4. Add some kind of random delay to people who are moving around to reduce the information you get if you are stationary and they cross a cell boundary. The idea here is that if you are not moving and they are "close" and then they change to "far", you would know that they just crossed one of your cell's boundaries. In many places there might only be a small number of places where people cross those boundaries, and so you'd be able to narrow them down quite a bit.
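
Steps 1 and 2 above could be sketched roughly as follows. This is only an illustration under stated assumptions: positions are already projected to flat x/y coordinates in metres, the hexes are pointy-top with an arbitrary `size` of 1000, and the function names are invented here. The cell merging from step 3 and the random delay from step 4 are not covered.

```python
import math

def hex_round(q, r):
    """Round fractional axial hex coordinates to the nearest hex cell."""
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error so that
    # q + r + s == 0 still holds after rounding.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr > ds:
        rr = -rq - rs
    return rq, rr

def cell_of(pos, size):
    """Map a planar (x, y) position to its pointy-top hex cell (axial q, r)."""
    x, y = pos
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 / 3 * y) / size
    return hex_round(q, r)

def hex_distance(a, b):
    """Number of cell steps between two axial hex coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2

def nearness(me, other, size=1000.0):
    """Same cell is 'close', adjacent cell is 'far', anything else 'very far'."""
    d = hex_distance(cell_of(me, size), cell_of(other, size))
    if d == 0:
        return "close"
    if d == 1:
        return "far"
    return "very far"
```

Because the answer depends only on the two cells, moving around inside your own cell never changes what you are told about the other person.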


Yes, this would work. The problem with random error is that it can be averaged out. The error vector must be consistent. One way to provide a consistent vector is quantization, which is as you propose where you assign a point deterministically to a nearby discrete set (in this case hexagonal grid centers).
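
A small sketch of the difference, with made-up numbers and a square grid instead of hexagons for brevity: fresh noise on every query averages out, while a deterministic snap returns the same point no matter how often it is queried.

```python
import random

def noisy_report(true_pos, sigma, rng):
    """Report the true position plus fresh Gaussian noise on every query."""
    x, y = true_pos
    return (x + rng.gauss(0, sigma), y + rng.gauss(0, sigma))

def snapped_report(true_pos, cell):
    """Report the centre of the square grid cell containing the position."""
    x, y = true_pos
    return ((x // cell + 0.5) * cell, (y // cell + 0.5) * cell)

def mean(points):
    """Component-wise average of a list of (x, y) points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

rng = random.Random(0)
true_pos = (1234.0, 5678.0)  # metres, hypothetical

# 10,000 queries with 500 m noise: the error of the average shrinks
# roughly like sigma / sqrt(n), i.e. to a few metres.
estimate = mean([noisy_report(true_pos, 500.0, rng) for _ in range(10_000)])

# 10,000 queries against a 1 km grid: every answer is identical,
# so repeated polling reveals nothing beyond the cell itself.
distinct_snapped = {snapped_report(true_pos, 1000.0) for _ in range(10_000)}
```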

This still leaves some issues with non-static positions, or temporal variation, as you noted in (4), where boundary crossings (or worse, movement near 3-cell boundaries) allow better precision. I think a good solution in this case is to also add temporal quantization (limited update rate of position), as well as some hysteresis (to avoid back-and-forth between cells for people living near borders). This way you cannot pinpoint the exact time the transition occurred (you cannot locate the person in space-time), and with hysteresis you cannot tell he is consistently near a border.

edit: Interestingly, all of those suggestions appear elsewhere in this thread! They all appear to be, more or less, forms of quantization.


I must thank you for your precise verbiage. The way you put this has given me a lot of clarity about how one can solve this particular problem, and is definitely going to be helpful in any future similar issues I might face as well. Thanks for writing this out.


I think without quantizing you can trade off between error size and update frequency, no?


This would prevent precise triangulation of stationary targets, but it still leaks a potentially unacceptable amount of information. Imagine an attacker who mapped out all of the cells. This is not too hard, as you see close/far shifts when you cross cell boundaries.

Now place a device monitoring people nearby in each cell. This allows you to geotag people to the cell and monitor their movements. If you cross correlate with other tracking services (such as cell phone tracking datasets) you will be able to identify the individual pretty quickly.

For the above to work tracking does not have to be continuous. As long as a person sometimes is trackable you will accumulate information that allows for cross correlation. With intermittent tracking it would just take a lot longer.


I'm not sure rate limiting is really an option if the feature is to stay useful. Most people are in the same place from more or less 9pm to 5-6am (shifted for people doing late shift work), and most people have a steady home, so you could gather that data over many nights.


Definitely true. Although, if it took a month to gather the dozens of samples the author used in the article rather than minutes, a dedicated tracker would probably have more success using more traditional stakeout methods.

And then, if you’re someone who is concerned about that kind of surveillance, you’re probably not someone who is sharing your location on an app like this.


Per-account rate limiting doesn't work so long as you can have as many Telegram accounts as you like.


Nah you just quantise location like filoleg described. No need for random noise or rate limiting.


I always felt that if I were going to provide a location, I would add randomness every time the query returns a result... but even then, with enough queries you'd be painting a circle over the user's actual location. So the real way to do this correctly is to lump people into the nearest intersection (like in the middle of a road intersection). And all queries related to that user would move them there. Then again, what about extremely rural people that have 1 intersection that pinpoints just them?

There's no good way to return location data about other people.


Here’s an interesting article on Tinder’s solution to this problem: https://robertheaton.com/2018/07/09/how-tinder-keeps-your-lo...


The article asks why they don’t just use grid snapping only.

If you lived on a boundary it would be very clear because your location would change very often, perhaps just by walking to the kitchen.


Perhaps you could add some sort of hysteresis such that it continues to report you as being in the previous grid square unless you go >1/2 a grid square distance away from it
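
One way to read that suggestion, sketched for a square grid with invented names and sizes: the reported cell only changes once the position is more than half a cell width outside the previously reported cell.

```python
def update_cell(prev_cell, pos, cell=1000.0, margin=0.5):
    """Return the grid cell to report, with hysteresis around prev_cell.

    prev_cell is the (column, row) reported last time, or None on first use.
    The report only changes once pos is more than margin * cell outside
    the previously reported cell.
    """
    x, y = pos
    if prev_cell is not None:
        cx, cy = prev_cell
        slack = margin * cell
        inside_x = cx * cell - slack <= x <= (cx + 1) * cell + slack
        inside_y = cy * cell - slack <= y <= (cy + 1) * cell + slack
        if inside_x and inside_y:
            return prev_cell
    return (int(x // cell), int(y // cell))
```

Someone whose kitchen sits just across a boundary from their bedroom would then keep reporting whichever cell they were in first, instead of flipping back and forth.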


It could use neighborhood. Use GPS to locate your neighborhood then give your location as the townhall or central park or whatever in that area.


Changing what you use as the boundary doesn't change the fact that if you're close enough to the actual boundary, you will jump a lot. So you have to go quite large with the boundary for it to limit the pinpointing. Having larger areas within your boundaries makes the feature much less 'useful' though.

Which granularity: Harlem/Hell's Kitchen etc? West Harlem/East Harlem etc? Manhattan/Long Island? New York/New Jersey etc.?

Feature wise you would probably want at least something like Harlem/Hell's kitchen granularity and there are unfortunately enough people living on the borders of all of these that you could pinpoint those just from GPS inaccuracies.


It can be easier. One could geohash the location with a certain precision, say 5 = ±2.4 km, and only display people in that geohash and the neighbouring geohashes.

https://en.m.wikipedia.org/wiki/Geohash
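
For illustration, a minimal sketch of the standard geohash encoding plus a naive way to list a cell and its eight neighbours. `visible_cells` is a name made up here; it finds neighbours by shifting the point by one cell width and re-encoding, and it ignores wrap-around at the poles and the ±180° meridian.

```python
_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"

def geohash_encode(lat, lon, precision=5):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits = 0
    bit_count = 0
    use_lon = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits <<= 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(_BASE32[bits])
            bits = 0
            bit_count = 0
    return "".join(chars)

def visible_cells(lat, lon, precision=5):
    """Geohash of the point plus its 8 neighbours (no pole/antimeridian handling)."""
    total_bits = 5 * precision
    lon_bits = (total_bits + 1) // 2
    lat_bits = total_bits // 2
    dlon = 360.0 / (1 << lon_bits)  # cell width in degrees
    dlat = 180.0 / (1 << lat_bits)  # cell height in degrees
    return {
        geohash_encode(lat + i * dlat, lon + j * dlon, precision)
        for i in (-1, 0, 1)
        for j in (-1, 0, 1)
    }
```

A server following this idea would list another user only if that user's geohash is in the viewer's `visible_cells` set, without ever returning a distance.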


The functional component of this technique is still just quantization (as in the parent). One might argue that this is actually more complicated.


Naively, for democratic countries, I'd imagine you could piggyback off existing political divisions, in the UK that would be electoral wards or shire/city districts. Such divisions should span a reasonable range of people, rather than a geography. In Scotland it would be "council areas".

Though maybe some places have political divisions with only one or two people in that would seem strange?


The service itself will know how many users are where and can make its own boundaries. "Same city" would often be pretty useless for finding people to meet immediately in real life.


Or you could add a 'salt' to the location, before adding randomness, so there is an unknown offset.


That’s a good idea, although if you were able to poll often enough and the location was always updated, you might be able to work out the offset by looking at positions when travelling (e.g. if travelling down a desert road). Probably an extreme attack vector though.


I suppose instead of adding Gaussian random noise to the coordinates, you could draw the random perturbations from a power law distribution with long tails so it’s less clear what the “center of the circle” is from many random draws.


You need to lump people into higher and higher level "intersections" (district, city center, region, state, continent, planet) along with other identifying data, until at least "k" people are in each group and can not be told apart: https://en.wikipedia.org/wiki/K-anonymity
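
A rough sketch of that coarsening, assuming each user comes with a made-up region path ordered from coarsest to finest. It only considers location (not the "other identifying data") and is not a complete k-anonymity implementation.

```python
from collections import Counter

def k_anonymous_labels(users, k):
    """Pick, for each user, the finest region that contains at least k users.

    users maps a name to a tuple of regions from coarsest to finest,
    e.g. ("EU", "Berlin", "Mitte"). An empty tuple in the result means
    that not even the coarsest region had k users.
    """
    counts = Counter()
    for path in users.values():
        for depth in range(1, len(path) + 1):
            counts[path[:depth]] += 1

    labels = {}
    for name, path in users.items():
        depth = len(path)
        # Walk up the hierarchy until the group is large enough.
        while depth > 0 and counts[path[:depth]] < k:
            depth -= 1
        labels[name] = path[:depth]
    return labels

# Hypothetical example data.
example_users = {
    "alice": ("EU", "Berlin", "Mitte"),
    "bob": ("EU", "Berlin", "Mitte"),
    "carol": ("EU", "Berlin", "Spandau"),
    "dave": ("EU", "Vienna", "Neubau"),
}
example_labels = k_anonymous_labels(example_users, k=2)
```

With `k=2`, alice and bob keep their district, carol is only reported at city level, and dave only at the top level.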


Wow, this is quite a throwback, because that's almost the exact same way of triangulating a location that me and a few of my classmates discovered in regards to Tinder around 6-7 years ago (which got patched up shortly after). Just a note, it was done purely out of academic interest and was not ever used in any capacity other than just finding out that it was possible.

Basically, Tinder used to give you only the distance to a user in the UI, but the exact coordinate location of the user in their API responses. They patched it up and made it so that it only returns the distance in the API as well. And that's when it became really similar to the current Telegram situation. This was before we got our hands on this. However, reading older articles and blog posts about those "times before us" was what gave us the idea. Mostly because it seemed like their fix was not sufficient enough to prevent anyone (assuming they know basic trigonometry) from pinpointing the location just as easily, but with a few extra steps added.

Knowing the distance between you and another user, you could quickly spoof your GPS location to 3 different coordinates that would create 3 circles all intersecting in one small area. You could easily pick coordinates for those circle centers based on the change in distance to the user, so if you picked a bad coordinate for one of the circles, you can adjust and pick a better one based on the feedback you got. E.g., if the first circle center coordinate was 1 mile away, but the second one was 5 miles away, and the third one was even further away, you should probably try re-picking the 2nd and 3rd circle better, since the goal is to not move those far away, but to have them have a similar distance to the user location, just from different directions.
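
The geometry behind this can be sketched in a few lines. This is a flat-plane approximation with invented coordinates, not the code we used back then: subtracting the circle equations pairwise leaves two linear equations in the unknown position.

```python
import math

def trilaterate(p1, r1, p2, r2, p3, r3):
    """Locate a point from three known positions and the distances to it.

    Works on a flat plane; the three positions must not lie on one line.
    """
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    # Circle 1 minus circle 2, and circle 1 minus circle 3.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    c2 = r1**2 - r3**2 - x1**2 + x3**2 - y1**2 + y3**2
    det = a1 * b2 - a2 * b1
    if det == 0:
        raise ValueError("the three positions are collinear")
    # Cramer's rule for the 2x2 linear system.
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# Hypothetical target and three spoofed observer positions (metres).
target = (300.0, 400.0)
observers = [(0.0, 0.0), (1000.0, 0.0), (0.0, 1000.0)]
distances = [math.dist(o, target) for o in observers]
found = trilaterate(observers[0], distances[0],
                    observers[1], distances[1],
                    observers[2], distances[2])
```

With exact distances the result is the target itself; with rounded distances it lands in a small area around it, which is why picking observers at similar distances from different directions helps.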

Shortly after, Tinder fixed it in a much smarter way. Instead of assigning each user to a precise location and reporting a distance to them in the API response, they would break up the map into a grid of roughly 1mile by 1mile squares (or maybe hexes or maybe slightly different size? I am very rusty on the actual details of their fix, but the principle is still the same), and then assign each user to one of those squares. So the API would instead give you the distance between the center of the square you are assigned to and the center of their square.
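
As a sketch of that principle only (square 1-mile cells on a flat plane are an assumption, as are the function names; Tinder's real implementation may differ):

```python
import math

CELL = 1.0  # cell edge in miles, assumed

def cell_center(pos, cell=CELL):
    """Centre of the square grid cell that contains pos."""
    x, y = pos
    return ((math.floor(x / cell) + 0.5) * cell,
            (math.floor(y / cell) + 0.5) * cell)

def reported_distance(me, other, cell=CELL):
    """Distance between the centres of the two users' cells."""
    return math.dist(cell_center(me, cell), cell_center(other, cell))
```

Any two positions inside the same pair of cells produce the same number, so spoofing your GPS and trilaterating can at best recover the centre of the other user's cell.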

AFAIK, that last approach they used to solve the issue is still unbeaten, and it makes sense as to why, since it is logically pretty robust at its core (plus/minus minor optimizations and improvements, of course).


> Tinder used to give you only the distance to a user in the UI, but the exact coordinate location of the user in their API responses

Wasn’t there another one that just sent out coordinates and left the distance finding to the client?


No idea about other apps, as it was all done purely for fun and for learning purposes, so we didn't really have any interest in potentially implementing the exact same thing but for different apps. Especially since there is no real fun or learning happening when the API just spits out the exact coordinates back at you. To clarify, this was already patched by the time we set our sights on it.

>another one that just sent out coordinates and left the distance finding to the client

That's exactly what Tinder used to do back then, they simply sent out the coordinates and left distance calculations to the client. Given that Tinder (being one of the most popular dating platforms that was also known to be one of the most "tech-forward" ones) did that, I have zero doubt that some other dating platforms could have been using similar methods for determining the distance between users around that time.


It's very interesting that you can use such a feature to triangulate a location, I never thought of that.

I discovered a "secure" and "private" dating app, that just sent the location of the users directly through the API, and then it was up to the clients to do the calculation: http://kaspergrubbe.com/teazr-a-secure-dating-app-with-secur...


Tinder used to do that as well with location data.

Also they used to send your raw birth date over the wire (in order to display your age to other users, calculated on the frontend) until I told them to stop.


The HelloTalk app works the same way. If a user chooses to share their location, it sends the raw GPS coordinates (afaik the only things limiting them are the device's accuracy and whether the user chose not to share "precise" location in iOS) to the server. Then, these get added to a sqlite database on the client and the way they get obscured is when you zoom in past a certain point on the map, their code turns off the dot that shows their position.


>you can use such a feature to triangulate a location, I never thought of that.

Of course this is exactly what people are going to do with that kind of feature. Sharing location data is such an obvious thing to get exploited. It is a basic human instinct to take any new thing to the worst places it can go. For certain features, the first question should be how this can get exploited for uses other than how we want to use it. If nobody in the room can come up with a way, then you need different people in the room.


> He was under the impression that using https and Parse everything was secure.

That's not something you want to hear from the developer of a "secure" service...


you're right, that _does_ look like a perfect location for a cozy couch


Tinder had a similar problem where you could triangulate a user's location to within 100 feet.

https://techcrunch.com/2014/02/20/problem-in-tinder-dating-a...


I read an account a few years ago of somebody using this "feature" during a military training exercise to find the "enemy" camps and call in simulated "artillery" strikes. The people on the receiving end got hopping mad and couldn't figure out how he kept finding them so quickly.


Similar account, but here's an article about Fitbit revealing soldiers/base information: https://www.washingtonpost.com/world/a-map-showing-the-users...


Hi. I work for Fitbit, but don't speak for Fitbit.

I am, however, curious about how people on HN talk about Fitbit, so I subscribe to HNWatcher alerts. Unfortunately, I only got an alert about this comment a week later.

If I did officially speak for Fitbit, this reply would be more diplomatic.

Personally, I'm losing patience with this lie being casually repeated so many times.

If you would even bother to read the article you yourself linked to, you'd see that it was Strava that revealed this information, not Fitbit. Fitbit's role in the story was to display the "Are you sure?" message when soldiers explicitly chose to share their Fitbit data with Strava.

There is no excuse for saying Fitbit did the revealing. If you were serious about privacy and security, you'd be careful about accusing the right party.


Clever and fair game. A real opposing force would do the same given the chance.


Well, hopefully, the soldiers would not be carrying and using their personal cell phones in a situation with a real opposing force.


I mean, one important method of causing them not to do that is to deal them a humiliating series of losses in a wargame.


I found a 4chan greentext story along these lines: https://www.reddit.com/r/Military/comments/9u2zib/the_proper...


A long time ago DeviantArt had a similar problem. If I remember correctly it allowed you to sort people by distance to you, or something like that. The feature was exploited in much the same way and they had to remove it.


Prior discussion about the same subject (229 comments): https://news.ycombinator.com/item?id=25641399


> One morning I woke up and found that Telegram implemented a new feature called “People Nearby”. If you choose to share your location publicly on Telegram, you’ll appear in a list for users who are physically close to you.

As always, it feels like this "discovery", at least for the headline (which, let's be honest, it's what most people read anyway) is based on glancing over the fact that this feature is opt-in, and that 99,999% of Telegram users do not use it. There's nothing in the headline indicating whether this is a critical data leak or simply expected behavior.

Sure, it definitely falls under unexpected usage of the data, but at the end of the day, the data was shared willfully through user action.


Second paragraph :

> If you’ve never heard of this feature and you suddenly feel the urge to delete your Telegram account forever, let me stress something very important: “People Nearby” is opt-in. By default, no one can see how far away you are on Telegram. You’ll only ever end up in other people’s lists by pressing the “Make Myself Visible” button. If you choose to try it out, remember to disable it once you’re done.


As I said, my issue is mostly with the headline.


Fair - although I'm not big on shifting the blame from people only relying on headlines to form an opinion. That's on them, so-called "clickbait" or not.


If you have "Last Seen & Online" set to "My Contacts", then those people will see you in People nearby even if "Make Myself Visible" is turned off


Do you have a source for this claim? I don’t find this exception documented.


I don't know why I am being downvoted. I tested it then and I have tested it again now. I thought this was the expected behavior. It shows me (who is in their contact list and on a completely different network) as online 100m away.


> the data was shared willfully through user action.

So it's opt-in, even if you don't tell people what it does when you click that button?

The headline is totally accurate - "Modifying Telegram's “People Nearby” feature to pinpoint people's homes"

> 99,999% of Telegram users do not use it.

Given I can see many people around me, this is not true, unless the population around me is orders of magnitude bigger than it is.


Lol if this is intended behaviour, no one should use Telegram ever again.


I wrote a small app for this a few weeks ago. So nearby users can be tracked anywhere.

https://github.com/tejado/telegram-nearby-map


> pinpoint people's homes

In the context of a city, I guess it makes sense that this is fairly anonymous.

In my moderately more rural part of the world, where owning a home is much more common, a person's house is part of the public record, and very easy to look up.

Granted that doesn't work for people renting, but the number of people renting instead of owning drops pretty dramatically once you start looking at demographics older than 25-30 around here.


If the author is here, they should look up geocaching tools; they will find programs that solve the trilateration problem exactly using the same oblate spheroid model of the earth that GPS uses.


I wonder if this would work. Divide up the world into a grid. Once you have the cell associated with your GPS location, randomly pick a nearby cell. Stick with this mapping from cell to cell forever and keep this mapping only on the phone. Don’t send GPS coordinates. Send the cell instead. Have the server define nearness by cell nearness. The server should never send cells to other users, just their nearness.

This way, your actual location is as fuzzy as how randomly you pick cells and the size of cells. You are also introducing a skew that the other party cannot compensate for. It’s your secret and by not repeatedly recalculating it, you aren’t vulnerable to the other party calculating the mean cell.

So if each cell is say 1 mile across and you pick randomly from say a 3 by 3 grid around your cell, then your location can only be localized to within a 3x3 mile square.

The tradeoff is that near isn’t so near but the tradeoff can at least be tuned by cell randomness and nearness. If you want to be more private, pick randomly from a larger number of nearby cells. If you want to know more people, widen the circle that you define to be near. Most people probably want defaults.

One difficulty may be when there are cells that are less likely to be populated. It may make sense to chop up the world into varying size cells based on a combination of area and population density.
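
A possible sketch of the fixed, secret cell-to-cell mapping. One substitution to be aware of: instead of storing a random table on the phone, this derives the offset from a per-device secret with HMAC, which gives the same "pick once, keep forever" behaviour without storage. The names and the 3x3 default are assumptions.

```python
import hashlib
import hmac

def reported_cell(true_cell, device_secret, radius=1):
    """Map the true grid cell to a fixed, secret nearby cell.

    The offset is derived from a secret that never leaves the device,
    so the same true cell always maps to the same reported cell.
    radius=1 picks from the 3x3 block around the true cell.
    """
    cx, cy = true_cell
    digest = hmac.new(device_secret, f"{cx},{cy}".encode(), hashlib.sha256).digest()
    span = 2 * radius + 1
    # Use 8 bytes per axis so the modulo bias is negligible.
    dx = int.from_bytes(digest[:8], "big") % span - radius
    dy = int.from_bytes(digest[8:16], "big") % span - radius
    return (cx + dx, cy + dy)
```

Only the reported cell would be sent to the server. Querying repeatedly reveals nothing new, because the offset for a given cell never changes.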


I wonder about the idea of providing a product and emphasizing security... and providing an option in it that undoes all that security.

It seems like those are two incompatible / conflicting things.


How about replacing People Nearby with Friends Nearby where friends != contacts but a limited subset, maybe 10 so it's a list easy to keep control of, specifically enabled for this feature? You should offer a contact to track you and become (temporarily) a friend. The friend must agree to track you. Tracking is one way only. Unfriending somebody must be secret.


When an app asks for your location for the first time you can choose not to give an exact location. Doesn’t that fix this problem?


By default you don't share your location with others. You have to activate it yourself in the settings.


What I mean is that when any app asks the user for the first time to grant geolocation privileges, the OS shows a dialog where you can choose not to share a precise location. If you use that they cannot triangulate you that easily.


Which OS?

I think on Android such questions are only about energy preservation, e.g. low-resolution data can be provided via WiFi information, while more precise can require GPS. And if there happens to be more precise information available at the same time (e.g. you use Google Maps), then the app will receive that.

I haven't seen such a dialog, though, so I'm uncertain if it reflects the regular Android location precision system.


Android apps have to request [0] Location Permissions to access the device's location at all. There are different permissions for foreground and background access and for coarse and fine precision, but it does require some explicit permission no matter what.

Now, the app's behavior when denied that permission (i.e., whether you will be allowed to use Tinder at all if you deny location permissions) is up to the developer.

[0] https://developer.android.com/training/location/permissions


iOS. Decent discussion with pic of dialog:

https://9to5mac.com/2020/08/12/ios-14-precise-location/


>I believe that this is an exceptionally unneccessary feature for an app that prides itself with caring about their users' privacy.

Does the author mean "for the sake of people's safety don't sell them kitchen knives"?

This feature is intended for organizing outdoor events. And it proved itself quite useful, for example, during Hong Kong protests.


If only Fire Eagle had caught on. Its quantization method is much stronger than the way Telegram works.

https://en.wikipedia.org/wiki/Fire_Eagle


I wonder if the 100s issue can be addressed by spamming locations around the target and finding the center of the hole...


Then release the drones and target the user.



