Creator here. Glad that HN rediscovered my old side project. All the code is freely available on GitHub [1]. It's running on a tiny f1-micro GCE instance which is currently down due to the traffic. I don't have time to fix it right now, but it should resolve itself once traffic dies down. The data source is also several years out of date at this point, so the links it returns may not match current reality. Other than that, it should still work!
I'm currently building an AI email app called Shortwave [2]. I promise that has much better uptime and more consistent updates!
I have been using Shortwave off and on for quite some time. I really don't appreciate its pivot onto the AI hype train. Shortwave could use that focus to add more productivity features instead.
It's interesting to see you now pitching it as an "AI email app". If it's actually helping you gain more traction, then maybe it's worth riding the wave, but please don't lose focus. It was not meant to be an AI-first email app; it was a better Google Inbox, a better email client.
Definitely gonna try it out. I've been frustrated with Gmail's inability to sub-segment its main categories of smart labels in a good way, and it's a long-term battle to not miss important emails amid the clutter of time-waster stuff. Tried another paid client, "Mimestream", which delivers a real Gmail desktop app experience but doesn't really address this.
Somewhat related to the Getting to Philosophy phenomenon [0].
Interestingly, there are only 3 degrees between Banana slug [1] and Ergodicity [2]. I thought there would be more, and that Philosophy [3] would be one of them. I was wrong.
Shameless plug, semi-related game I made some years ago based on Wikipedia links, where the goal is to find the page that’s _not_ linked to the others. https://havarnov.github.io/oddoneout/
All my searches end up 2 degrees from the actual subject I'm querying for, but it doesn't show the exact relationship between the subject and the next node, so I can't see the connection. Example: Ricardo Darin -> Ron Gilbert
I think it is very interesting to see which pages sit in the middle between two related ones, beyond the obvious paths. For example, I tried Coinbase <=> Arthur Koestler and "Half Truth" is one of the internal nodes. Useful from a brainstorming point of view for coming up with new concepts. It's like one of the goals of the I-Ching: to produce randomness in your brain, beyond the idea of prediction.
If I had to guess there will be a lot of them but there will be roughly six or so (give or take 10) links on average from one article to another. Even the notion of longest path will be debatable - I know this because people are ... people.
You can link any page from any other and you can categorise ad infinitum. Also you can use subpages. So in the end it does not really make sense to talk about a simple graph, within WP. There are at least three sets of graphs in the last couple of sentences and I haven't really tried to think about it.
> Even the notion of longest path will be debatable
The only debate would be between the diameter (the longest shortest path, which is likely what's meant here) and the longest simple path (an NP-hard problem, closely related to Hamiltonian path and the travelling salesman problem). The latter might be a bit slow on the entire Wikipedia dataset.
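To make the "longest shortest path" reading concrete, here is a minimal sketch that runs a BFS from every node and takes the largest hop count found. The `toy_links` graph is made up for illustration and is not real Wikipedia data; unreachable pairs are simply ignored, which is an assumption about how you'd want to treat islands.

```python
from collections import deque

def bfs_distances(graph, start):
    """Return hop counts from start to every node reachable from it."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

def diameter(graph):
    """Longest shortest path over all reachable ordered pairs of nodes."""
    return max(max(bfs_distances(graph, node).values()) for node in graph)

# Hypothetical toy link graph (directed), not real Wikipedia data.
toy_links = {
    "A": ["B"],
    "B": ["C", "D"],
    "C": ["A"],
    "D": ["E"],
    "E": [],
}

print(diameter(toy_links))
```

This is one BFS per node, so it is quadratic-ish in the worst case. That is fine for a toy graph, but it is still far cheaper than searching for the longest simple path.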
> Wikipedia also has "islands" of pages which cannot be clicked out of to other islands.
Wonder what the largest island can be where you can't get out. At some point you must have a person who invented, created, discovered, explored, dismissed, destroyed, ... a key concept of what is discussed in that island and from there you get to a country and then have the world.
Having links back might be a bit more limited (depending on categories)
The UI looks a little ugly, but the UX is really great! I love the fun facts while loading, and how you're able to test the app right on the landing page. The graph is cool too, and I legitimately feel motivated to show this to other people.
You need to pick something that is only linked to in its "parent" article. So for example Aenictus Raptor is only really linked to on the Aenictus (genus) page, which is almost only linked to by the Ants page.
Don't do anything related to countries, because almost every article lists a country. So it's easy to traverse between them.
I tried to go from 'Wernhout', which is the village I grew up in, to C++. Both have Wikipedia entries and both show up in the suggestions below the input field when typing them in. Then it says 'Start page "Wernhout" does not exist'. Also, it removes the ++ from C++.
This is awesome! I was able to validate some trivia: everything in Wikipedia leads back to philosophy…
All you have to do is search something and click the hyperlinks randomly and eventually you’ll end up in philosophy…
The way I've heard this explained, you click on the first link on the page that's not in the disambiguation section. Never found a loop so far, and it pretty quickly homes in on Philosophy.
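The first-link rule is easy to sketch if you assume you already have a mapping from each page to its first link. Everything here is hypothetical: the `toy_first_link` table is invented, and a real version would have to extract the first non-disambiguation link from each article. The walk stops when it reaches the target, hits a dead end, or revisits a page (a loop).

```python
def follow_first_links(first_link, start, target="Philosophy"):
    """Follow first links from start; return (path, reached_target)."""
    path = [start]
    seen = {start}
    current = start
    while current != target:
        nxt = first_link.get(current)
        # Stop on a dead end or when the walk would revisit a page.
        if nxt is None or nxt in seen:
            return path, False
        path.append(nxt)
        seen.add(nxt)
        current = nxt
    return path, True

# Hypothetical first-link table, not scraped from Wikipedia.
toy_first_link = {
    "Start": "Mid",
    "Mid": "Philosophy",
    "LoopA": "LoopB",
    "LoopB": "LoopA",
}

print(follow_first_links(toy_first_link, "Start"))
print(follow_first_links(toy_first_link, "LoopA"))
```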
Traversing the hyperlinks to find the shortest path gives less interesting results. They should have considered assigning weights to the edges to reflect importance, such as how many times two entities are mentioned together, or whether they are mentioned in each other's first paragraph. This is much weaker than 6 degrees of separation.
I recall a website from maybe a decade ago that "sort of" did this, except it was called Hitler Hops, and it posited that any article on Wikipedia was only (I think) 6 articles away from Adolf Hitler. You'd put in an article and it would find how many "hops"/articles it took to get to the guy.
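A rough sketch of what weighting could look like: give each link a cost that shrinks as the connection gets stronger, then run Dijkstra instead of plain BFS. The costs below are assumed to be 1 / (co-mention count), and the counts and graph are invented. This is not how the site works, just an illustration of the idea.

```python
import heapq

def weighted_path(graph, start, goal):
    """Dijkstra over {node: {neighbor: cost}}; returns a node list or None."""
    best = {start: 0.0}
    prev = {}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node == goal:
            break
        if cost > best.get(node, float("inf")):
            continue  # stale heap entry
        for neighbor, weight in graph.get(node, {}).items():
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                prev[neighbor] = node
                heapq.heappush(heap, (new_cost, neighbor))
    if goal not in best:
        return None
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical costs: 1 / co-mention count, so strong links are cheap.
toy_weighted = {
    "A": {"D": 1.0, "B": 0.1},  # A->D is a direct but weak link
    "B": {"D": 0.2},
    "D": {},
}

print(weighted_path(toy_weighted, "A", "D"))
```

With these made-up weights the two-hop route through B beats the direct A->D link, which is the kind of "more meaningful" path the weighting is meant to surface.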
Was an interesting concept, not sure if it's still around though (probably not).
I find this a helpful antidote to a lot of conspiratorial thinking. Namely the idea that you can draw a connection from one donation by a billionaire, to a foundation, to a person who worked at that foundation but then did another thing, etc.
The error in conspiratorial thinking about these links is the assumption that the people at the beginning and end of those chains are in close coordination. When really it's an ordinary phenomenon you see in any information that has relational properties (people, economic transactions, etc.)
Shortest path is certainly an archetypal graph problem, and there are many ways to solve it. Some algorithms are highly efficient. I've tried some graph databases and was quite surprised at how hard these queries were to express in their query languages and how hard it was to solve them efficiently. I've had better luck with library-based solutions.
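For unweighted links, plain breadth-first search with parent pointers is already enough to recover one shortest path. A minimal standard-library sketch, using a made-up `toy_pages` graph; this is not a claim about how the site itself does it:

```python
from collections import deque

def shortest_path(links, start, goal):
    """Return one shortest path from start to goal as a list, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        if page == goal:
            # Walk parent pointers back to the start.
            path = []
            while page is not None:
                path.append(page)
                page = parent[page]
            return path[::-1]
        for nxt in links.get(page, ()):
            if nxt not in parent:
                parent[nxt] = page
                queue.append(nxt)
    return None

# Hypothetical toy link graph (directed), not real Wikipedia data.
toy_pages = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": ["E"],
    "E": [],
}

print(shortest_path(toy_pages, "A", "E"))
```

Each page is visited at most once, so the cost is linear in the number of pages and links explored.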
[1] https://github.com/jwngr/sdow
[2] https://shortwave.com