Passive TCP/IP Geo-Location (geoloc.foremski.pl)
184 points by blaskov on June 26, 2016 | 86 comments



I see the bug. If the ping times are high enough (~250ms), it will happily create circles that exceed the area of the earth, and the Google Maps API will happily draw negative circles which exclude the user's likely location.

Really there should be a max function that shades the entire earth (or ignores it since that result really can't tell you anything).
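A minimal sketch of such a cap, in Python. The signal speed (200 km per millisecond, roughly two thirds of c) and the function names are assumptions for illustration; the constants the demo actually uses are not stated in this thread.

```python
# Half of Earth's equatorial circumference: no point is farther away than this.
EARTH_HALF_CIRCUMFERENCE_KM = 40_075.0 / 2


def rtt_to_radius_km(rtt_ms, speed_km_per_ms=200.0):
    """Upper bound on distance implied by a round-trip time, capped at the antipode.

    speed_km_per_ms is an assumed signal speed, not a value taken from the demo.
    """
    one_way_ms = rtt_ms / 2.0
    return min(one_way_ms * speed_km_per_ms, EARTH_HALF_CIRCUMFERENCE_KM)


def is_informative(rtt_ms, speed_km_per_ms=200.0):
    """True if the resulting circle still excludes some part of the globe."""
    return rtt_to_radius_km(rtt_ms, speed_km_per_ms) < EARTH_HALF_CIRCUMFERENCE_KM


print(rtt_to_radius_km(12))   # a short RTT gives a small, useful circle
print(is_informative(250))    # a long RTT covers the whole globe
```

A measurement for which `is_informative` returns False could simply be skipped when drawing.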


If ping times are over 250ms the user's likely location is China (but could also be online via satellite, or in space). Of course, if the user is in China, Google Maps won't display.


Eh? RTT between London and Sydney for example is around 300ms and testing that page, I see a 316ms RTT from Singapore (in London).

Theoretical minimum RTTs based on the speed of light for systems on opposite sides of the world are nice and all, but they don't take into account the realities of packet processing on the internet.


Yes but speaking statistically, since virtually nobody lives in Australia or London versus China, the user's likely location is China.


..and those of actual cable routing (esp. on land), and the speed of light in fiber vs. vacuum, that alone adding 66ms for a great circle.
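The fiber-versus-vacuum figure can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the refractive index of 1.5 is an assumed typical value for optical fiber, and the path is taken to be one full equatorial great circle.

```python
C_VACUUM_KM_PER_S = 299_792.458    # speed of light in vacuum
FIBER_REFRACTIVE_INDEX = 1.5       # assumed typical value for optical fiber
EARTH_CIRCUMFERENCE_KM = 40_075.0  # one full equatorial great circle


def travel_time_ms(distance_km, speed_km_per_s):
    """Time in milliseconds to cover distance_km at the given speed."""
    return distance_km / speed_km_per_s * 1000.0


vacuum_ms = travel_time_ms(EARTH_CIRCUMFERENCE_KM, C_VACUUM_KM_PER_S)
fiber_ms = travel_time_ms(
    EARTH_CIRCUMFERENCE_KM, C_VACUUM_KM_PER_S / FIBER_REFRACTIVE_INDEX
)
extra_ms = fiber_ms - vacuum_ms

print(f"vacuum {vacuum_ms:.1f} ms, fiber {fiber_ms:.1f} ms, extra {extra_ms:.1f} ms")
```

Under those assumptions the extra delay comes out near 67 ms, in line with the figure quoted above.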


Or they're on Comcast with wifi in a large apartment building. I have a 250ms ping, and I'm definitely not in China.


I did this a few years ago, a simple tool that pinged places all over the world and calculated where you should be.

This only ever gets as close as your ISP, or perhaps a local routing center if your ISP has multiple (mine has only one in Amsterdam, I live on the other side of the country). This means it's less accurate than your typical GeoIP database.

Edit: I'm no longer sure this is the same. The title mentions "passive", and from another comment I understand that the author meant that you can find someone's location if you are the man in the middle, observing traffic to various places around the world. This is definitely something I had not considered yet!


It isn't passive. See lines 64-73 of the code [1]. This seems very similar to what you describe, but rather than you pinging a bunch of places around the world, it has a bunch of places around the world essentially pinging you.

If you check the HTML source, it's src'ing scripts from machines around the world (I assume that, whatever host is being used, the author has spun up a VPS in each of their data centers). Here's an example of one [2]. Each of these sends you a constant value over and over again, and the server side measures the latency, before finally reporting the average round-trip time to you.

[1]: http://geoloc.foremski.pl/tcpinfo.go

[2]: http://46.101.226.53/DE_Frankfurt.js


I'm thinking this might become pretty important with IPv6, no? My ISP advertises a prefix I can do SLAAC with, but they won't allow you to assign an arbitrary address outside that prefix (which seems reasonable-ish for routing). GeoIP databases will have a harder time with IPv6, but if ISPs only use prefixes like this then it'll still be manageable..


> This website demonstrates IP address geo-location by passively measuring TCP/IP round-trip times of web requests made to a few servers spread around the world.

I don't get how that could be called passive. This is making your browser issue requests or am I missing something?


I believe the author's point was you could determine the location by monitoring other applications and their network calls.

That being said, I highly doubt there are many requests being made to Bangalore, India and this appears to require requests to a wide array of locations all over the world.


Aside from the disproportionate amount of traffic going to America (I'm from Europe), there was still a significant portion of traffic that went all over the world when I last checked my private traffic.

Additionally I once made a map of destinations from our school's "security lab" network. This is 30 minutes of traffic: https://snag.gy/aJqrg2.jpg

The map is a modified version of the one generated by Wireshark (circles are proportional to the amount of data going from/to that IP address). It seems to use Maxmind's GeoIP database.


It requires precisely what services such as Cedexis provide.

You visit a website that uses one or more CDNs for its big resources, and uses Cedexis to choose a CDN node. When you first visit, Cedexis instructs your browser to retrieve small resources from some or all CDN nodes. Based on timings and throughput, it then chooses a CDN node to use for big resources.

If Cedexis does its job well and obtains the right data for Cedexis' purposes, then the passive observer also gets optimally usable data.


On the contrary, it appears the most accurate servers are ones nearby, which you'd contact a lot if CDNs are doing their job right.


I agree passive is a bit odd phrasing. Though, with how many 3rd party assets websites load nowadays, I suspect this could be done without introducing any additional requests by passively monitoring the already-existing 3rd party requests.


I think the author means passive in the sense that no known GeoIP DB is used.


That would be the opposite of passive: using a geoIP database is passive; probing from/to various locations is active.

Another comment mentioned someone monitoring your traffic and noting your RTT (roundtrip time) for various IP addresses around the world. That seems much more likely.


Not passive at all, as it loads content from different sources. Plus the circles are all over the place without any logic.


The ping times seem sensible, but I can't work out any correlation between them and the map. Just seems like a random collection of circles...


You're in the most heavily shaded region. Some of the areas reach most of the way around the world, so they look like a lighter circle that you're actually outside, some are smaller and look darker which you're inside, some are about halfway and look like a sine wave, but they're all really circles. (Damn you, Mercator!)


I assume each circle has a radius proportional to the RTT, thus the area in which the circles overlap is roughly where you should be. The limited number of servers plus the wide range of RTTs gives a massive margin of error, but this could probably be improved easily with some extra servers and some kind of kNN-ish analysis.
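To make the overlap idea concrete, here is a rough Python sketch that tests whether a candidate point lies inside every circle. It is not the demo's code: the server coordinates are approximate, the RTT values are made up for illustration, and the 200 km/ms signal speed is an assumption.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    # Clamp to 1.0 to guard against rounding error near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def inside_all_circles(lat, lon, measurements, speed_km_per_ms=200.0):
    """measurements is a list of (server_lat, server_lon, rtt_ms) tuples."""
    return all(
        haversine_km(lat, lon, slat, slon) <= (rtt_ms / 2.0) * speed_km_per_ms
        for slat, slon, rtt_ms in measurements
    )


# Approximate server locations with hypothetical RTTs in milliseconds.
measurements = [
    (50.11, 8.68, 10.0),     # Frankfurt
    (40.71, -74.01, 90.0),   # New York
    (1.35, 103.82, 180.0),   # Singapore
]

print(inside_all_circles(52.37, 4.90, measurements))     # candidate: Amsterdam
print(inside_all_circles(-33.87, 151.21, measurements))  # candidate: Sydney
```

Scanning a grid of candidate points with `inside_all_circles` would give the shaded common area; the smallest circle does most of the work.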


The best circle is about 1500 miles off from my actual location. The most overlapped area isn't even in the right continent, let alone country. I think massive margin of error is a bit of an understatement.


Well, in my case the JavaScript somehow places the center of the Singapore circle in the middle of the Indian Ocean, close to Antarctica. The San Francisco and Bangalore circles are in South America.


That's not an inclusive circle; it means anything within the circle is _not_ within RTT reach of, e.g., Bangalore or Singapore.


The delay component of the round-trip time is the sum of:

1. delays in buffer queues
2. serialization delays
3. processing delays
4. propagation delays

While propagation delay usually makes up the largest portion, that is not always the case. Furthermore, routing efficiency varies widely, depending on the IX peering and Internet upstream arrangements of your ISP. It also helps to know the topology of the underlying physical networks, or at least the major POPs.

This is like measuring intensity of the Sun with the naked eye.


I think there's a bug in the circle placement. They don't appear to be correct for me. I'm assuming they're supposed to be centred on the locations listed.


I thought that too at first, but they are correct. It looks like this because these are circles on a globe. The outline of a very big circle around a location is the same as the outline of a small circle around the point on the opposite side of the globe.

Look at the color-tint to find out what is inside and what is outside a circle.
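The big-circle/small-circle equivalence can be written down in a few lines of Python. This is a sketch on a spherical Earth; the function names, the Singapore coordinates (approximate), and the 18,000 km radius are illustrative assumptions.

```python
# Half of Earth's circumference: the distance from any point to its antipode.
EARTH_HALF_CIRCUMFERENCE_KM = 40_075.0 / 2


def antipode(lat, lon):
    """Point diametrically opposite (lat, lon), both in degrees."""
    return -lat, (lon - 180.0 if lon > 0 else lon + 180.0)


def equivalent_small_circle(lat, lon, radius_km):
    """Circle around the antipode whose outline matches the given big circle.

    Returns (lat, lon, radius_km). The inside of the big circle is the
    outside of the returned one.
    """
    alat, alon = antipode(lat, lon)
    return alat, alon, EARTH_HALF_CIRCUMFERENCE_KM - radius_km


# A hypothetical 18,000 km circle around Singapore (about 1.35 N, 103.82 E).
print(equivalent_small_circle(1.35, 103.82, 18_000.0))
```

The returned center lands in northwestern South America, which is why a slow ping to Singapore shows up as a small unshaded circle there.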


Ah yes, I see. My ping times to Singapore are so bad that it looks like a circle over Colombia/Ecuador, i.e. the other side of the globe, but what looks like a circle is actually what's outside the circle.


While GeoIP databases might be more accurate, this method may in some situations provide location information about users behind proxies. Lame self-reference, a discussion on the topic: https://deadhacker.com/2011/03/13/predicting-location-of-one...


Sort of. You measure the latency from the user to the proxy + the proxy to the site, so what you end up with is the approximate location of the proxy and a radius around it where the user could be. But the radius is liable to be 3000 miles, and you can't even assume the user isn't really closer to the proxy than that because the user could be using more than one proxy or anything else that adds fixed network latency (including on purpose to prevent this).

Which is the same problem with using it to determine if the user is "too far" from the server. You can get extra latency (and thus false positives) whenever there is bufferbloat or a lame corporate network that routes traffic from New York to New York via a network appliance in California or similar.


3000 miles is still a relevant clue if the target user's continental location isn't already known. Additionally, even in this very weak form it provides useful data for anti-fraud metrics, for example for payment processors like PayPal, which may use [total latency minus ping latency] to feed a metric indicating possible proxy use.

But I also assumed an investigative organization that took the idea further would use the latency information with a map of latencies between major internet exchanges. This would increase accuracy and usefulness, though the entire idea still only works to provide indefinite clues.
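As a rough illustration of the [total latency minus ping latency] idea, here is a Python sketch. The function names and the 50 ms threshold are hypothetical; nothing here describes what any payment processor actually does.

```python
def excess_latency_ms(total_rtt_ms, ping_rtt_ms):
    """Latency not explained by the path to the connecting IP address.

    total_rtt_ms: round trip measured end to end (e.g. at the application layer).
    ping_rtt_ms:  round trip measured to the IP address the connection comes from.
    """
    return max(0.0, total_rtt_ms - ping_rtt_ms)


def possible_proxy(total_rtt_ms, ping_rtt_ms, threshold_ms=50.0):
    """Flag a connection whose excess latency exceeds an assumed threshold."""
    return excess_latency_ms(total_rtt_ms, ping_rtt_ms) > threshold_ms


print(possible_proxy(180.0, 20.0))  # large gap between the two measurements
print(possible_proxy(35.0, 30.0))   # measurements roughly agree
```

A flag like this is only a weak signal, since bufferbloat or a slow last hop produces the same gap.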


Assuming that the user only uses one proxy, you can use GeoIP to get his proxy location, and then if you account for the proxy-webserver distance, you can get a pretty accurate geolocation.


> Assuming that the user only uses one proxy, you can use GeoIP to get his proxy location, and then if you account for the proxy-webserver distance, you can get a pretty accurate geolocation.

A congested network link can easily add more than 100ms of latency due to buffering. At the speed of light that's thousands of miles.

You can't even say that if the latency between the proxy and the browser is only 10ms then the user is physically close to the proxy, because you don't know if the user is running the browser on a VPS with something like X forwarding.
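For scale, a quick Python check of the "thousands of miles" claim. It assumes the 100 ms is extra round-trip time and that it gets mistaken for propagation delay at the vacuum speed of light; the function name is made up for this sketch.

```python
C_VACUUM_KM_PER_MS = 299.792458  # speed of light in vacuum, km per millisecond
KM_PER_MILE = 1.609344


def phantom_distance_miles(extra_rtt_ms, speed_km_per_ms=C_VACUUM_KM_PER_MS):
    """One-way distance wrongly attributed to the user when extra_rtt_ms of
    buffering delay is interpreted as propagation delay."""
    one_way_ms = extra_rtt_ms / 2.0
    return one_way_ms * speed_km_per_ms / KM_PER_MILE


print(round(phantom_distance_miles(100.0)))
```

That works out to a little over 9,000 miles of error, and still several thousand miles if a slower fiber speed is passed as `speed_km_per_ms`.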


This is wildly inaccurate. I'm in the U.S. (not even a southern state) and it thinks I'm somewhere well off the coast of Peru. Perhaps my results are atypical, or there's a rendering problem? Perhaps this could be improved with more servers and a better algorithm for resolving the times?


Actually, what it's trying to tell you is that you're not in Peru. If you're 220 ms away from Singapore, you must not be in Colombia, since a speed-of-light transmission would take longer than that to make a round trip.

For me, the most specific fix is provided by New York and San Francisco. Together, they can tell that I'm somewhere in the US, Canada, or Mexico. (But, of course, you could have figured that out from GeoIP.)


Ah yes, that makes more sense. Thanks for the help. :)


Don't forget that this is a proof-of-concept and is not meant to be a reliable solution for you to replace GeoIP with, but rather a concept to play with.

The first thing that comes to mind would be combining this with peer-to-peer pinging - which if the swarm is big enough could potentially provide a fairly decent geolocation mechanism.


Really cool tool.

If he added a few more servers and set a max ping rt it would become a lot more accurate.

And it doesn't really replace GeoIP, but implemented correctly it can be used very effectively for latency-based routing (similar to what AWS has).


Honestly this does not seem to be a very smart thing to do at all. Your ping times are probably just going to reveal which coast of the US you are on (unless you're speedtest.com or something like that).

If you want city-level accuracy, at least in the US, just use a GeoIP lookup: https://geoiptool.com/ This finds my city perfectly, unlike the link above (which says I'm somewhere on the US West Coast).


Geoiptool thinks I'm in Romania. As does Maxmind's free IP location database[1]. However, Google thinks I'm in New Delhi, until I grant permission to upload my actual location. Then it puts the marker right on my rooftop in Massachusetts.

I'm actually using a VPN (euro214.vpnbook.com), which, based on the name, should be somewhere in Europe. But who knows? If I disconnect the VPN, Google and Geoiptool both get my town right, and Maxmind is one town over. I'm using Verizon FiOS.

As for the geolocator, the ping times range from 172 ms (Frankfurt, Germany) to 524 ms (Singapore). Too large to be of use. With the VPN disconnected, it lucks out with a 12 ms ping to New York, resulting in a yellow circle accurately indicating that I am in the northeast US.

Looking at the no-VPN map[2], there are small circles corresponding to the antipodes[3] of Singapore (purple) and Bangalore (blue). Without the New York ping, the map would be completely useless.

[1]https://dev.maxmind.com/geoip/geoip2/geolite2/

[2]https://i.imgur.com/nKw8u1D.png

[3]http://www.antipodesmap.com/


The whole point is to not use a geoip database and instead attempt to find your location from numerous server requests. Hence 'concept' as this wouldn't really be feasible for any real world usecase.


How does Geo IP Tool make these predictions? Like, where is the data it's associating my IP with coming from or how are they obtaining it?


They start with WHOIS data and add what they can from there to improve accuracy. Quite a few residential ISPs register blocks of IP addresses (subnets) to a city, town, neighborhood, etc.
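The lookup side of that can be sketched with Python's standard `ipaddress` module. The table below is entirely hypothetical: it uses documentation-only address ranges and made-up place names, and a real database would be far larger and indexed more efficiently.

```python
import ipaddress

# Hypothetical block registrations using documentation-only ranges (RFC 5737).
BLOCKS = {
    ipaddress.ip_network("192.0.2.0/24"): "Example City",
    ipaddress.ip_network("198.51.100.0/24"): "Sample Town",
    ipaddress.ip_network("198.51.100.128/25"): "Sample Town, north side",
}


def lookup(ip):
    """Return the location of the most specific block containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    matches = [net for net in BLOCKS if addr in net]
    if not matches:
        return None
    # The longest prefix is the most specific registration.
    return BLOCKS[max(matches, key=lambda net: net.prefixlen)]


print(lookup("198.51.100.200"))
print(lookup("203.0.113.9"))
```

The answer is only as good as the registration: a block registered to a head office resolves to that office no matter where the machine sits.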


Little Snitch[1] does a very good job of preventing this sort of attack. Even if they give names to the servers that make me think they're something I want to allow, the time I spend clicking the "Allow" button is well outside the margin of error of the latency measurements. I'm in NYC and the tool places me in Frankfurt, Germany.

+1 for Little Snitch.

[1] https://www.obdev.at/products/littlesnitch/index.html


Yes, I use and love Little Snitch too, but don't you have a rule that says Browser = Allow all?


Nope.

It only took me a few days to work out whitelists/blacklists for the sites I use often, i.e. most citicards.com subdomains get an "allow" but cardoffer.citicards.com gets a "deny". On other sites I come across it's usually trivial to whitelist the domains that provide their functionality, and most adservers and tracking servers I've already blocked.

Given browsers are the main place I get tracked, putting an allow all for my browser seems to defeat the purpose.

That said, it was pretty annoying the first few days.


I see. I use uBlock Origin for the browser, and Little Snitch for application level connections. Similar strategy though, starting with deny all, and gradually building a whitelist to get sites functional.


Nice tool, is there something similar (in terms of UX) for Linux?

For browser I usually use Noscript... Not the same because it only blocks JS files though.


I have a question about geolocation when using Google Compute. All of the geolocation services identify my server as US-based, when in fact it is located in Asia (as ping and traceroute reveal).

I was wondering whether the HN big heads could explain it to me :) I suspect it is somehow related to how Google operates its SDN and that the IP is assigned to a massive AS block.

Nevertheless, I think there is a decent performance risk when CDNs serve from US based cache rather than local.


GeoIP databases work based on the _owner_ of the IP address. The owner can cooperate with a geoip database and provide more granular locations for the blocks they use. Others, like Google apparently, don't and as such, you'll show up as the registered address for the allocated block.


CDNs mostly use network-level data to find the "closest" and/or highest-performance cache, not geography. If your server is in Asia, most CDNs will be able to calculate that Singapore, Tokyo, Hong Kong, etc. have a lower latency to you than a US one that more closely matches the WHOIS data for your IP address.


Nice idea. Digitalocean's Singapore routing seems a bit terrible. From Perth I get 50ms to a hop in Singapore (aarnet), but the endpoint is >100ms!


Seems a bit off for me in Virginia, USA.

http://imgur.com/TblQSSu


Looks about right. The circle around Colombia is not shaded inside; that is, the center of the circle is on the other side of the globe. So Virginia lies in the most heavily shaded region.


Smart, but I don't know how granular you could get unless you had known servers at every possible datacenter.


For me it shows roughly the right continent (if I am on a VPN; otherwise it also includes Greenland and Arctic regions).

But with servers in more locations, and possibly accounting for network topology (could be mined with traceroute?), accuracy could go up quite a bit.


Just use a CDN with decent edge coverage. I'm sure those results can be improved drastically.


You did notice that the backend code is a Go code snippet that needs to run at the end points?


So? I'm sure some CDNs out there allow you to run your own code, hence it could be more granular (as OP was interested in).


I've created a slightly improved version here: https://news.ycombinator.com/item?id=11993451


I'm not impressed so far. The demonstrator on the linked page found out that I live... somewhere in Europe.

You could get a more precise location by simply reverse-DNSsing my IP address.


Works for me! Correctly identified my location as Germany.



Mine is also a bit off... http://i.imgur.com/klwgKbS.png


I think that means you're in Europe.


I am. How do you see this?


The circle around the Indian Ocean, and the 2 near/over South America are inverted. _Inside_ the circle is where you aren't. As such Europe is the place where all circles overlap.


Tried with Tor. Not even close.


Am I looking at the area in which the majority of circles intersect, a la Venn?


Intersect? In my case they intersect in a weird way: http://i.imgur.com/eFqyCEq.png Those circle centres don't even look right.


It's obvious. You are either in the Pacific Ocean or the Atlantic Ocean!


>The common area of all circles show your likely location.


OK... I get that. But, where am I? http://tinypic.com/r/2ushgus/9


Western Europe I think?

It's confusing, some of the circles are shaded inside and some outside. I think the idea is if the ping is high it covers the whole globe except a circle around the antipodes (shaded outside), and if the ping is low it only covers around the server (shaded inside).


Mmm, right. Close but not quite. I like the idea but I think the visualisation needs some work. Too many people seem to have a hard time making sense of it.


So it approximated me being in Frankfurt. Not bad, just about 100 miles away.


Not very accurate - got my location incorrect by over 10000 km.


I apparently live in the middle of the Caribbean. Ha. I wish.


Earlier it had me in California/Oregon.

Now it shows me in Ecuador?

I am actually on the west coast of Canada.


It doesn't even show the right continent for where I am.


Only works if JavaScript is allowed by default.


The demo, yes. The attack, no.


This puts me in one of about 6 locations in the world, each with an accuracy of several thousand kilometers. Probably because I suffer from quite unpleasant packet loss. But I'm not convinced it works.


Some of the circles are inverse circles. I thought it was inaccurate at first too, until I realized the India result was telling me "you're NOT in Peru".


Just done it from the office. I'm definitely "not" in the ocean. http://i64.tinypic.com/28ujms7.png


Lol! Looks like this is saying:

Green: Not in the Indian Ocean.

Blue: Not in the Pacific near Ecuador.

Yellow: Not in the Southern hemisphere.

Red: Somewhere in Europe, Greenland, North Africa, or Svalbard.

In your case, the red circle is the only one that really matters. Yellow took a tiny slice out of it near Iraq.



