Six Degrees of Wikipedia (sixdegreesofwikipedia.com)
294 points by EndXA 10 months ago | 69 comments



Creator here. Glad that HN rediscovered my old side project. All the code is freely available on GitHub [1]. It's running on a tiny f1-micro GCE instance which is currently down due to the traffic. I don't have time to fix it right now, but it should resolve itself once traffic dies down. The data source is also several years out of date at this point, so the links it returns may not match current reality. Other than that, it should still work!

I'm currently building an AI email app called Shortwave [2]. I promise that has much better uptime and more consistent updates!

[1] https://github.com/jwngr/sdow

[2] https://shortwave.com


I have been using Shortwave off and on for quite some time. I absolutely don't appreciate its shifted focus / pivot onto the AI hype train. Shortwave could instead use that focus to add more productivity features.

It's interesting to see you now pitching it as an "AI email app". If it's actually helping you gain more traction, then maybe it's worth riding the wave, but please don't lose focus. It was not meant to be an AI-first email app; it was a better Google Inbox, a better email client.


> I'm currently building an AI email app called Shortwave [2]. I promise that has much better uptime and more consistent updates!
>
> [1] https://github.com/jwngr/sdow
>
> [2] https://shortwave.com

Welp! It's already hugged to death, I guess.


Definitely gonna try it out. I've been frustrated with Gmail's inability to sub-segment its main categories of smart labels in a good way, and it's a long-term battle not to miss important emails amid the clutter of time-waster stuff. I tried another paid client, "mimestream", which delivers a real Gmail desktop app experience but doesn't really address this.


Very fun to play around with. Thank you!


No doubt! this is the UX!


Related:

Six Degrees of Wikipedia - https://news.ycombinator.com/item?id=28595821 - Sept 2021 (67 comments)

Six Degrees of Wikipedia - https://news.ycombinator.com/item?id=27444053 - June 2021 (1 comment)

Show HN: Six Degrees of Wikipedia - https://news.ycombinator.com/item?id=16468196 - Feb 2018 (324 comments)

Six Degrees of Wikipedia - https://news.ycombinator.com/item?id=201513 - May 2008 (7 comments)


Small point but the last link here (from 2008) is a different project with the same title.


Thank you!


This makes it onto the front page of HN like every couple years.

Jacob (its creator) is now my cofounder at Shortwave (https://www.shortwave.com). Check it out if you want to see his latest work :)


This was pretty fun. It was a challenge to find a link that got up to 6

https://www.sixdegreesofwikipedia.com/?source=Korea%20Squash...


Thanks for your comment! You may enjoy this blog post: https://www.sixdegreesofwikipedia.com/blog/search-results-an...


Somewhat related to the Getting to Philosophy phenomenon [0].

Interestingly, there are only 3 degrees between Banana slug [1] and Ergodicity [2]. I thought there would be more, and that Philosophy [3] would be one of them. I was wrong.

[0] https://en.wikipedia.org/wiki/Wikipedia:Getting_to_Philosoph...

[1] https://en.wikipedia.org/wiki/Banana_slug

[2] https://en.wikipedia.org/wiki/Ergodicity

[3] https://en.wikipedia.org/wiki/Philosophy


Shameless plug, semi-related game I made some years ago based on Wikipedia links, where the goal is to find the page that’s _not_ linked to the others. https://havarnov.github.io/oddoneout/


I'll add my shameless plug: https://redactle.net based on uncovering a redacted Wiki article.


All my searches end up 2 degrees from the subject I'm querying for, but the site doesn't show the exact relationship between the subject and the next node, so I can't see the connection. Example: Ricardo Darin -> Ron Gilbert


This is very similar to and likely inspired by the Six Degrees of Kevin Bacon:

https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon


"Six Degrees of Kevin Bacon" is a play on "Six Degrees of Separation", which they link to: https://en.wikipedia.org/wiki/Six_degrees_of_separation


I think it is very interesting to see which pages sit in the middle between two connected ones, aside from the obvious paths. For example, I tried Coinbase <=> Arthur Koestler and "Half Truth" is one of the internal nodes. It's useful from a brainstorming point of view for coming up with new concepts. It's like one of the goals of the I-Ching: to produce randomness in your brain, beyond the idea of prediction.


Interesting. I wonder what are the two most separated pages, i.e. pages with the longest path connecting them.


You go here: https://en.wikipedia.org/wiki/Wikipedia:Database_download Then you analyse the data and draw your own conclusions.

If I had to guess there will be a lot of them but there will be roughly six or so (give or take 10) links on average from one article to another. Even the notion of longest path will be debatable - I know this because people are ... people.

I've run a mediawiki based intranet for quite a while - I wrote this: https://www.mediawiki.org/wiki/Intranet

You can link any page from any other and you can categorise ad infinitum. Also you can use subpages. So in the end it does not really make sense to talk about a simple graph, within WP. There are at least three sets of graphs in the last couple of sentences and I haven't really tried to think about it.


> Even the notion of longest path will be debatable

The only debate would be between the diameter (the longest shortest path, which is likely what is meant here) and the longest path (an NP-hard problem related to the travelling salesman problem). The latter might be a bit slow on the entire Wikipedia dataset.
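To make the "longest shortest path" idea concrete, here is a minimal sketch that runs a breadth-first search from every page and keeps the largest finite distance. The `toy_links` graph and both function names are made up for illustration; on the real dump an all-pairs search like this would be far too slow without sampling or heavier machinery.

```python
from collections import deque

def bfs_distances(graph, start):
    """Hop counts from start to every page reachable by following links."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for nxt in graph.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                queue.append(nxt)
    return dist

def diameter(graph):
    """Longest shortest path over all ordered pairs that are connected at all."""
    best = (0, None, None)  # (hops, source, target)
    for source in graph:
        for target, hops in bfs_distances(graph, source).items():
            if hops > best[0]:
                best = (hops, source, target)
    return best

# Hypothetical toy link graph: page -> pages it links to (links are directed).
toy_links = {
    "A": ["B"],
    "B": ["C"],
    "C": ["A", "D"],
    "D": [],
}

print(diameter(toy_links))
```

Unreachable pairs are simply ignored here, which is one of the choices that makes "the" longest path debatable.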


11. Embleton → McCombie

However, Wikipedia also has "islands" of pages which cannot be clicked out of to other islands.


> Wikipedia also has "islands" of pages which cannot be clicked out of to other islands.

Wonder what the largest island can be where you can't get out. At some point you must have a person who invented, created, discovered, explored, dismissed, destroyed, ... a key concept of what is discussed in that island and from there you get to a country and then have the world.

Having links back might be a bit more limited (depending on categories)


There's a website for this too! Unfortunately I can't remember the URL, and google isn't helping...


It's pretty easy to get 5 degrees with two obscure pages from completely unrelated categories, like a small train stop to a videogame character.


This is delightful!

The UI looks a little ugly, but the UX is really great! I love the fun facts while loading, and how you're able to test the app right on the landing page. The graph is cool too, and I legitimately feel motivated to show this to other people.


Interesting, the automated version of racing wikipedia pages:

https://wikispeedrun.org/

Though the path it gives from Manifest destiny https://en.wikipedia.org/wiki/Manifest_destiny to Homo Sapiens https://en.wikipedia.org/wiki/Human seems to go through a hyperlink not found on the page? Might be buried in versioning or other languages though.


I built a similar website this summer: https://wikisp.fanor.dev ! The code is also open source: https://github.com/haskaalo/wikisp

It's built from the Wikipedia dumps from July. Links in articles shouldn't have changed much by now, but updating the data set takes quite a long time!

It was one of my first projects, and it taught me that gathering and cleaning data in data science is a long task, even if automated.


None of the pages created after 2021 exist on the site. One gets a slightly misleading error message that a page doesn't exist, which is confusing when the page evidently _does_ exist. E.g. https://www.sixdegreesofwikipedia.com/?source=Ever%20Given&t...


My script has been broken since February 2021 :( Documentation on the data source (with links to old data dumps) is on GitHub: https://github.com/jwngr/sdow/blob/master/docs/data-source.m...


Found one with 6 degrees of separation:

Geronimus Polynomials -> Aenictus Raptor

You need to pick something that is only linked to in its "parent" article. So for example Aenictus Raptor is only really linked to on the Aenictus (genus) page, which is almost only linked to by the Ants page.

Don't do anything related to countries, because almost every article lists a country. So it's easy to traverse between them.


It seems buggy to me. I tried Kodiak bear and Gdańsk (https://www.sixdegreesofwikipedia.com/?source=Kodiak%20bear&...). The site found a 2-degree path through the Europe article, but there is no link to the Europe article from the Kodiak bear article.


Maybe there used to be one and was later removed? There's no doubt some kind of indexing, so the data may become stale.


Had the same experience with one of the prompts: Venus Flytrap to Willy Wonka with a missing link from Uncle Sam to Willy Wonka.


This is also an interesting game on Wikitree, especially if you've connected a few of your own relatives: https://www.wikitree.com/index.php?title=Special:Connection&...


As an Unread Information Scientist I find tremendous value in this. It's something worth running locally for research purposes. Great work.


This is more fun than I expected.

I tried the strangest pairings I could think of.

The last one I tried was Barack Obama and Joanna Rutkowska [0].

[0]: https://www.sixdegreesofwikipedia.com/?source=Barack%20Obama...


It says United States links to James Cameron. I can't find where that link is on the page.

https://www.sixdegreesofwikipedia.com/?source=United%20State...


Author posted and said their index was out of date


I tried to go from 'Wernhout', which is the village I grew up in, to C++. Both have Wikipedia entries, and both show up below the input field when typing them in. Then it says 'Start page "Wernhout" does not exist'. Also, it removes the ++ from C++.
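One common way for "++" to vanish is an unescaped "+" in a URL query string, which form-style decoding turns into a space. Whether that is what happens on this site is an assumption; the thread doesn't say. A minimal sketch with Python's standard library, using made-up `source` and `target` parameter values:

```python
from urllib.parse import parse_qs, urlencode

# A query string built by naive string concatenation (hypothetical example).
raw_query = "source=Wernhout&target=C++"
decoded_raw = parse_qs(raw_query)
print(decoded_raw["target"])   # each '+' was decoded as a space

# Building the query with urlencode percent-encodes '+' as %2B instead.
safe_query = urlencode({"source": "Wernhout", "target": "C++"})
print(safe_query)
decoded_safe = parse_qs(safe_query)
print(decoded_safe["target"])  # the title survives the round trip
```

If the title is then trimmed before the lookup, "C  " becomes plain "C", which would match the symptom described.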


Captivating. It's very nice to see new Wikipedia pages I would have never visited before (https://en.wikipedia.org/wiki/Forest_informatics)!


I loved playing this game with friends. We never used this website though.

Instead, one of us would press random to get the "target" article, and then all of us would press random on our own for a starting point.

Whoever got to the "target" first wins!


This is awesome! I was able to validate some trivia: everything in Wikipedia leads back to philosophy… All you have to do is search something and click the hyperlinks randomly and eventually you’ll end up in philosophy…

Very small degrees of freedom so far..



The way I've heard this explained tells you to click on the first link on the page that's not in the disambiguation section. Never found a loop so far, and it pretty quickly homes in on it.
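The first-link rule is easy to sketch once you have a mapping from each page to its first link. The `first_link` dictionary below is entirely made up (building a real one from the dump, with the disambiguation section skipped, is the hard part), and `follow_first_links` is a hypothetical helper that also reports loops rather than spinning forever.

```python
def follow_first_links(first_link, start, target="Philosophy", max_steps=1000):
    """Follow each page's first link until target, a loop, or a dead end."""
    path = [start]
    seen = {start}
    current = start
    while current != target and len(path) <= max_steps:
        nxt = first_link.get(current)
        if nxt is None:
            return path, "dead end"   # page has no recorded first link
        path.append(nxt)
        if nxt in seen:
            return path, "loop"       # came back to a page already visited
        seen.add(nxt)
        current = nxt
    return path, ("reached" if current == target else "gave up")

# Hypothetical data: page -> the first link on that page.
first_link = {
    "Start": "Mid",
    "Mid": "Philosophy",
    "Loop1": "Loop2",
    "Loop2": "Loop1",
}

print(follow_first_links(first_link, "Start"))
print(follow_first_links(first_link, "Loop1"))
```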


We had a game for a while at work of finding paths from random articles to Rome. We never failed, so we stopped playing.



This is similar to Erdős number https://en.wikipedia.org/wiki/Erd%C5%91s_number


And it's been DDOSed by us at HN ;) "Whoops... Six Degrees of Wikipedia is temporarily unavailable. Please try again in a few seconds."


Get this: only 54 paths with 3 degrees of separation from Yuri (manga genre) to the Lockheed Martin F-22 Raptor, found in 13.93 seconds. Wow!


> Start page "ChatGPT" does not exist. Please try another search.

Looks like the data is a little outdated.


Traversing the hyperlinks to find the shortest path gives less interesting results. They should have considered assigning weights to the edges to reflect importance, such as how many times two entities are mentioned together, or whether they are mentioned in each other's first paragraph. This is much weaker than 6 degrees of separation.
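A weighted variant could look like the sketch below, which uses Dijkstra's algorithm in place of a plain breadth-first search. The costs are invented: the assumption is that a strong association (say, a mention in the first paragraph) gets a low cost and an incidental link gets a high one. `cheapest_path` and `weighted_links` are hypothetical names, not anything from the site.

```python
import heapq

def cheapest_path(weighted_links, source, target):
    """Dijkstra's algorithm; weighted_links maps page -> {linked page: cost}."""
    best = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        cost, page = heapq.heappop(heap)
        if page == target:
            break
        if cost > best.get(page, float("inf")):
            continue  # stale heap entry; a cheaper route was already found
        for nxt, weight in weighted_links.get(page, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                prev[nxt] = page
                heapq.heappush(heap, (new_cost, nxt))
    if target not in best:
        return None
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1], best[target]

# Hypothetical costs: the two-hop route through "Hub" uses weak links (cost 5
# each), while the three-hop route uses strong links (cost 1 each).
weighted_links = {
    "A": {"Hub": 5.0, "B": 1.0},
    "Hub": {"Z": 5.0},
    "B": {"C": 1.0},
    "C": {"Z": 1.0},
}

print(cheapest_path(weighted_links, "A", "Z"))
```

With these made-up weights the longer but more meaningful chain wins over the shortcut through the hub page, which is the effect the comment is asking for.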


I recall a website maybe around a decade ago that "sort of" did this, except it was called Hitler Hops - and it posited that any article on Wikipedia was only (I think) 6 articles away from getting to Adolf Hitler. You'd put in an article and it would find how many "hops"/articles it took to get to the guy.

Was an interesting concept, not sure if it's still around though (probably not).


Wikipedia cites 'Clicks to Hitler'


Pretty sure this was a homework assignment I had while I was in school


Yes -> No: 658 paths

No -> Yes: 2 paths

The struggle is real :)


I find this a helpful antidote to a lot of conspiratorial thinking. Namely the idea that you can draw a connection from one donation by a billionaire, to a foundation, to a person who worked at that foundation but then did another thing, etc.

The error in conspiratorial thinking about these links is the assumption that whoever is at the beginning and end of those chains are in close coordination. When really it's an ordinary phenomenon you see in any information that has any relational properties (people, economic transactions etc.)


Went from Vercingetorix to Olivia Rodrigo in four hops.


What does the relationship between the nodes mean?


This is an archetypical graph database query.


It is certainly an archetypical graph problem: shortest path, which there are many ways to solve. Some algorithms are highly efficient. I've tried some graph databases and was quite surprised at how hard these queries were to express in their query languages, and how hard it was to solve them efficiently. I've had better luck with library-based solutions.
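For a sense of what a small library-style solution looks like, here is a sketch that finds every shortest path between two pages with a single breadth-first search, similar in spirit to the "Found N paths with k degrees of separation" output people quote in this thread. It is not the site's actual implementation; `all_shortest_paths` and `toy_links` are made-up names.

```python
from collections import deque

def all_shortest_paths(graph, source, target):
    """Return every minimum-hop path from source to target as a list of pages."""
    dist = {source: 0}
    parents = {source: []}  # page -> predecessors that lie on a shortest path
    queue = deque([source])
    while queue:
        page = queue.popleft()
        if page == target:
            break  # every page one hop closer has already been expanded
        for nxt in graph.get(page, ()):
            if nxt not in dist:
                dist[nxt] = dist[page] + 1
                parents[nxt] = [page]
                queue.append(nxt)
            elif dist[nxt] == dist[page] + 1 and page not in parents[nxt]:
                parents[nxt].append(page)
    if target not in dist:
        return []

    def build(page):
        # Walk the parent links backwards and assemble the full paths.
        if page == source:
            return [[source]]
        return [path + [page] for parent in parents[page] for path in build(parent)]

    return build(target)

# Hypothetical toy link graph: page -> pages it links to.
toy_links = {"S": ["A", "B"], "A": ["T"], "B": ["T"], "T": []}

for path in all_shortest_paths(toy_links, "S", "T"):
    print(" -> ".join(path))
```

The number of degrees is the length of any returned path minus one.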


Being an archetypical query, the shortest path is usually explicitly covered in the documentation: https://neo4j.com/docs/cypher-manual/current/appendix/tutori...


(2018)


Hugged?


To death


> Found 218 paths with 3 degrees of separation from Adolf Hitler to Wet T-shirt contest in 54.73 seconds!

Finally we know!


Lambda Calculus -> Lorena Bobbitt was 4 degrees.

I feel like the existence of pages like "List of notable people from Buffalo, NY" is sort of cheating at this sort of thing.


I love this error page: "Sorry, little Internet Hipster. This page requires Javascript."



