From a mathematical perspective, this is actually not that surprising.
Think of the act of clicking the first link in a Wikipedia article as a function that takes in the page you're on and outputs another page.
If you apply this function over and over again (i.e. use the output of one step as the input of the next step), you will eventually enter a loop. The proof of this is simple: there are only a finite number of Wikipedia articles, therefore you must eventually reach an article you've seen before. (This is the same reason systems with a finite number of states cannot be chaotic.)
Since it's necessarily true that all articles will eventually reach a loop when you iteratively click the first link, we have to ask: how unusual is it that they usually reach the SAME loop?
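The pigeonhole argument above can be sketched in a few lines of Python. This is only an illustration: the link table is invented toy data rather than real Wikipedia links, and `follow_first_links` is a hypothetical helper name.

```python
def follow_first_links(start, first_link):
    """Apply the first-link function until a page repeats.

    Returns (tail, loop): the pages visited before entering the loop,
    and the pages that make up the loop itself.
    """
    seen = {}   # page -> position in path
    path = []
    page = start
    # A finite set of pages guarantees this terminates.
    while page not in seen:
        seen[page] = len(path)
        path.append(page)
        page = first_link[page]
    entry = seen[page]  # position where the loop begins
    return path[:entry], path[entry:]


# Toy data, invented for illustration; not real Wikipedia links.
toy_links = {
    "Fish": "Organism",
    "Organism": "Biology",
    "Biology": "Science",
    "Science": "Knowledge",
    "Knowledge": "Science",
}

tail, loop = follow_first_links("Fish", toy_links)
print(tail)  # ['Fish', 'Organism', 'Biology']
print(loop)  # ['Science', 'Knowledge']
```

The dictionary of seen pages is what makes the finiteness argument concrete: the walk can add at most one entry per article before it has to repeat.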
The fact that many lead to the same loop(1) of Argument <-> Logic is unsurprising, too. Wikipedia articles usually define their subject in the first sentence. Defining things works by saying what general category they belong to, then by differentiating them within that.
E.g.
A fish is any member of a paraphyletic group of organisms that consist of all gill-bearing aquatic craniate animals that lack limbs with digits.
Google Inc. (NASDAQ: GOOG) is an American _multinational corporation_ which provides Internet-related products and services [...]
A table is a form of _furniture_ with a flat and satisfactory horizontal upper surface [...]
And so on. If you always walk up the abstraction chain because you're picking the first link (the general category), you'll end up at the root of the categorization tree, which is likely something along the lines of Argument, Logic, Fact etc.
Note, however, that the current state of Wikipedia gives me a different root loop: Logic leads to Philosophy.
(1) Or one of a very limited number of loops; I also saw Science <-> Knowledge.
Another way of thinking about it: There are N nodes that loop back to themselves. When starting at a given node, what are the odds that you'll never hit one of these N nodes? Additionally, we know that many of the N nodes are very general ones, like "truth", under which many topics roll up.
Agreed. I'd love to know how many different cycles the Wikipedia article graph contains. If most articles lead to 1 out of 20 possible cycles, it's much less interesting than if it's 1 out of 5,000,000.
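Counting the cycles is straightforward once you have the first-link table. Here is a rough sketch in plain Python, assuming the whole table fits in a dictionary; the data is invented and `find_loop` / `basin_sizes` are hypothetical names. It re-walks every chain from scratch, so it is fine for a toy but too slow for millions of articles without memoization.

```python
from collections import Counter


def find_loop(start, first_link):
    """Return the loop that `start` eventually falls into, as a frozenset."""
    seen = set()
    page = start
    while page not in seen:
        seen.add(page)
        page = first_link[page]
    # The first repeated page lies on the loop; walk once around it.
    loop = [page]
    nxt = first_link[page]
    while nxt != page:
        loop.append(nxt)
        nxt = first_link[nxt]
    return frozenset(loop)


def basin_sizes(first_link):
    """Map each distinct loop to the number of pages that end up in it."""
    return Counter(find_loop(page, first_link) for page in first_link)


# Toy data, invented for illustration; not real Wikipedia links.
toy_links = {
    "Fish": "Organism", "Organism": "Biology", "Biology": "Science",
    "Science": "Knowledge", "Knowledge": "Science",
    "Word": "Emotion", "Emotion": "Psychology", "Psychology": "Emotion",
}

basins = basin_sizes(toy_links)
for loop_pages, size in basins.items():
    print(sorted(loop_pages), size)
```

The number of keys in `basins` is the number of distinct cycles, and the counts show how lopsided the split between them is.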
I've been looking to do something like this with the NetworkX library in Python, though on an internal company MediaWiki with a much smaller article base. Going to try to visualise much the same thing: the major loops and cliques in the graph.
This is the same reason systems with a finite number of states cannot be chaotic
Without more information, I don't think this is true. It is true if the transition function between states depends on a finite number of previous ones (WLOG if s_n is a function of only s_{n-1}), but I think that it isn't if the transition depends on an infinite number.
(Though, in this case, clearly there is no history, so what you say is true.)
If your model depends on at most a fixed maximum length of history, you can always remodel the system so that it depends only on the most recent state. (Just make a copy of the history part of that state.)
In the case of depending on up to an infinite history, I wouldn't say that system has a finite number of states any longer.
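To make the "copy the history into the state" trick concrete, here is a small sketch under made-up assumptions: the rule is Fibonacci mod 10, which looks at the last two values, and `lift_to_single_state` is a hypothetical helper. The composite state is the tuple of the last k values, so there are still only finitely many states (at most 10**2 here) and the system must cycle.

```python
def lift_to_single_state(rule, k):
    """Turn a rule over the last k values into a rule over one composite state."""
    def step(state):
        assert len(state) == k
        # Drop the oldest value, append the newly computed one.
        return state[1:] + (rule(state),)
    return step


def fib_mod_10(history):
    """Example rule that needs two values of history (chosen for illustration)."""
    return (history[-2] + history[-1]) % 10


step = lift_to_single_state(fib_mod_10, 2)

state = (0, 1)
seen = {}   # composite state -> step at which it first appeared
n = 0
while state not in seen:
    seen[state] = n
    state = step(state)
    n += 1

period = n - seen[state]
print(period)  # 60
```

The loop is exactly the pigeonhole argument again, just applied to tuples of values instead of single values.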
This may be hyper naive, but is this a consequence of most articles beginning by presenting the current topic as a more specific form of another topic?
Ah, so that's why the degree is called Doctor of Philosophy?! Thanks for the enlightenment! I've wondered quite a lot of times why I'm a Doctor of Philosophy in Computer Science, and not Doctor of Computer Science.
Traditionally, a university had four faculties, the lower or artists faculty, and the three higher faculties of theology, law and medicine. Students would start in the artists faculty learning the seven liberal arts (grammar, rhetoric, logic, arithmetic, geometry, music, and astronomy (including astrology)) to become a magister artium (M.A.) and then go to one of the higher faculties to get their doctorate (Th.D, LL.D or M.D., respectively). Philosophy and science (née natural philosophy) developed in the artists faculty and over time became important enough to grant the artists faculty the right to grant doctorates, too, with philosophy leading the way. That's why many science faculties, which split off of the artists faculty over the centuries still grant the title of a doctor of philosophy.
Ah, and the first three of the liberal arts that were the first thing a student learned, grammar, rhetoric and logic, were called the trivium, hence the modern word "trivial" for obvious things everyone should know.
In the above - grammar, rhetoric, logic - formed the "trivium" and - arithmetic, geometry, music, and astronomy - formed the "quadrivium".
Also "trivium" represented the place where three roads would intersect. People would meet here and exchange pleasantries and gossip. That is how you get the current meaning of the word "trivia".
Very interesting, thanks. http://atilf.atilf.fr and Dictionnaire historique de la langue Française both see two meanings for the French "trivial": one is common as in commonplace, and the other is common as in gross and vulgar. The first comes from /trivialis/, as fhars says. The latter comes from /trivium/, the crossroads. I don't know if English also has the second meaning: vulgar.
I get that this is tongue-in-cheek, but philosophy as in "Doctor of Philosophy" doesn't refer to the field of philosophy per se, but to the more general concept of "love of knowledge."
Not sure why you are getting downvotes, but checking over Wikipedia confirms your comment:
In the context of academic degrees, the term "philosophy" does not refer solely to the field of philosophy, but is used in a broader sense in accordance with its original Greek meaning, which is "love of wisdom". In most of Europe, all fields other than theology, law and medicine were traditionally known as philosophy. The doctorate of philosophy as it exists today originated as a doctorate in the liberal arts at the Humboldt University of Berlin, and was eventually adopted by United States universities, becoming common in large parts of the world in the 20th century.[1] In many countries, the doctorate of philosophy is still awarded only in the liberal arts, which is known as "philosophy" in continental Europe.
This is beautiful. I've always thought sites like IMDB and Wikipedia are hypertext expressions of the purest form.
As a nerdy kid, over a few years I read my way through a significant chunk of the early 1990's Groliers Encyclopedia set my Grandma bought me at Krogers. And it was never cover-to-cover reading, instead it was filled with hopping around, page-to-page, volume-to-volume. Start at oscilloscope, but what's a cathode ray? And then it's in television? How does broadcasting work? I'd sit on the floor with a half dozen encyclopedias open around me, the same way we can do now with browser tabs.
Wikipedia has many flaws, but it's a fantastic tool and I'd have killed for it as a 12 year old with 5 volumes of an encyclopedia in my backpack!
Ah yes. To me (to my productivity's detriment...) quite often when I visit Wikipedia it becomes a learning adventure. From one thing you can hop to the other, finding out all sorts of interesting and obscure things. To a curious mind like mine, Wikipedia is heaven!
For myself, I would broaden this and say that almost every link I visit on HN, or every forum post I read leads to more than 1 additional link. Thus three windows of Chrome open with 10+ tabs each :)
I find this drill-down effect to be quite illuminating, if not productive. Strangely, it seems to provide both depth and breadth of topics in relatively equal measures.
I have a site: http://TheWikiGame.com that makes a game out of a related idea (finding the connection between Wikipedia articles).
Recently, I've been giving access to the game data to people (like university researchers, etc) to test different theories on path connections made by real people, etc.
The game has now been running for over 3 years, has about 1.19 million players, playing over 1.37 million games, with about 1.22 million won games (successful start/end article connection).
Got a really cool application for all the game data? I'd love to hear: alex@thewikigame.com
I have a kind of Wiki game that I play aloud with friends. Person A names a topic. Person B must guess whether Wikipedia owns the top Google (incognito/not signed-in) result for that term. Person B gets a point for guessing right, or Person A gets a point if Person B guesses wrong. The fun is in coming up with "stumpers."
As they describe in the page, when you describe something, you are trying to place it within a general framework.
In trying to describe a domain in an encyclopedia, it does little good to use language specific to the domain. If your reader doesn't know what Biology is, it's little use trying to describe it in terms of Microbiology and Immunology. You have to use more general language.
There are few words more general and meaningless in our language than "organization".
sorry to say, but couldn't you have thought about that before posting it on HN? you're on the front page, if your site really runs without caching, you must be wasting an enormous amount of WP's resources.
Either philosophy is a dead end, then, or it is the base of all knowledge. Depends on whether knowledge is a tree or a graph, which is... a philosophical question!
That is, however, a weird Wikipedia page. It seems like the first link is a target to another location on the page, which (I'd assume) would put it in an infinite loop.
What happens for Biology? http://wikiloopr.com/Biology. For me, it goes down until it gets to Biology, and then highlights Biology over and over.
Biology
Natural science
Science
Knowledge
Fact
Proof (truth)
Argument
Philosophy
Problem
Doubt
Belief
Proposition
Logic
Reason
Human
Mammal
Class (biology)
Biological classification
Organism
Biology
The first link on the Wikipedia page for Biology is "Biology (disambiguation)", but the first textual link is "Natural Science". I assume it loops on Biology because it keeps going back and forth between Biology (the article) and Biology (disambiguation), but that doesn't explain why the first redirect is to Natural Science in the very beginning.
Am I just missing something here?
Edit: This no longer happens. Now it properly loops on Philosophy and Proposition.
This is a superior visualisation in my opinion - it actually builds you a tree of links as you enter subsequent searches. Quite interesting to see how article chains cluster into 3 or 4 different "main branches".
I can't replicate it, but I encountered a weird bug. I entered "Star Trek" and hit Enter. The suggested items list appeared and I clicked on "Star Trek". Page titles appeared as normal, but everything appeared twice, like so:
Star Trek
Star Trek
Cinema of the United States
Film
Cinema of the United States
Recording
Data
Level of measurement
Film
...etc until the last page title, "Reality"
Seems like two processes were running in parallel and outputting to the same stream.
I love wiki. I spend hours following links. Having loops is a feature or else I might never stop.
Greek City States ->
Polis
City
Human settlement
Statistics
Data
Level of measurement
Stanley Smith Stevens
United States
Federalism
Politics
Art
Human behavior
Behavior
Organism
Biology
Natural science
Science
Knowledge
Fact
Proof (truth)
Argument
Philosophy
Problem
Doubt
Belief
Proposition loops to Philosophy
When images appear first in the page's source, the wrong link is chosen. I assume you are trying to go for the first link, as a human reader would see it.
I think this is fairly trivially explained in most cases by the fact that Wikipedia is an encyclopedia. Almost by construction, nearly every article starts with "X is a Y" (modulo grammatical niceties) where Y is a more general class to which X belongs. In cases when the article doesn't follow that form, the first linked term is usually still explanatory or definitional in some sense and hence more general, and cases where that doesn't apply (e.g. Obstacle) are unusual enough that they don't generally break the cycles. There are no "Wikipedia axioms" that I know of, so this is pretty much guaranteed to end in a cycle looping through a limited set of terms used to talk about abstract concepts. Hence Philosophy.
Does this hold true with following the xth link on the page? I'd predict it does, but the further x is from 1, the longer the average chain before you get to the loop (by virtue of definitions being at the top of wiki articles, then at some point the average length would hit a consistent level). Would love to see this modified to let you determine x.
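One way to experiment with this, sketched in Python: keep the full ordered link list for each page and make x a parameter. Everything here is an assumption for illustration, including the toy link lists and the name `follow_nth_links`. Unlike the first-link case, a page may have fewer than x links, so the walk can also hit a dead end instead of a loop.

```python
def follow_nth_links(start, links, x):
    """Follow the x-th link (1-based) on each page.

    Returns (tail, loop). If some page has fewer than x links, the walk
    stops there and loop is an empty list.
    """
    seen = {}   # page -> position in path
    path = []
    page = start
    while page not in seen:
        seen[page] = len(path)
        path.append(page)
        outgoing = links.get(page, [])
        if len(outgoing) < x:
            return path, []  # dead end: this page has no x-th link
        page = outgoing[x - 1]
    entry = seen[page]
    return path[:entry], path[entry:]


# Toy data, invented for illustration; not real Wikipedia links.
toy_pages = {
    "Fish": ["Organism", "Gill"],
    "Organism": ["Biology", "Fish"],
    "Biology": ["Science", "Organism"],
    "Science": ["Knowledge", "Biology"],
    "Knowledge": ["Science", "Fish"],
    "Gill": ["Organism"],
}

first_tail, first_loop = follow_nth_links("Fish", toy_pages, 1)
second_tail, second_loop = follow_nth_links("Fish", toy_pages, 2)
print(first_tail, first_loop)
print(second_tail, second_loop)
```

Running it over many start pages for each x and averaging `len(tail)` would test the prediction about chain length.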
Edit: Looks like somebody is editing certain articles that don't have philosophy as the first link to have it. Psychology used to loop with itself through like 5 intermediates.
After having twenty tries that got me into the philosophy loop, I started to wonder what the most basic concepts are that philosophy is "made of". Maybe I could escape the loop that way?
And lo and behold, I tried the word "word" and it loops between "emotion" and "psychology". http://wikiloopr.com/word
Now it loops between "Human" and "Reason". I think there are two secret cabals at work. One to break the Philosophy link and one to thwart these evil schemes.
This reminds me of a game I used to play as an intern with other interns to kill time. Someone would pick a Wikipedia article deep down in the bowels of Wikipedia and we would race to the article from the Wikipedia homepage using only links. We could get to the article surprisingly fast, most within only 5 intermediate page visits from the homepage.
This behaves erratically. Sometimes it treats the coordinates link above the entire article as the "first link" (in the "Michael Phelps" chain), sometimes not (Chicago). Consequently, I believe "Michael Phelps", when the coordinates link is ignored, leads to its own loop that is not the philosophy loop in question.
Nice demo, but I think it is an incorrect assumption that the first link is to a broader subject. For example, for Perú it says: "It is bordered on the north by Ecuador and Colombia, on the east by Brazil" So Perú->Ecuador left me with a big wtf..
Funny... As I was playing with this, somebody changed the order of:
"In philosophy and sociology..."
On the Agency (philosophy) page. This changed the output of the results, but in the end the loop still existed. However, I couldn't help but feel like someone is trying to hack the matrix.
Puts me in mind of Translation Party[1] (which is back up, kudos to WillC and Rick!). There's a lot of entertainment to be had from this sort of vertex-following algorithm.
You could optimize this a bit by caching the loops. I noticed some pauses between Fact and Truth; since that loop always comes up, it would be nice to cache it.
Right now it appears to be looping between Natural Science and Physics. But they're really far apart so you might not see them on your screen at once.
Somehow there was an Emotion/Psychology loop there. Now it goes even further than in your image: the loop is Argument/Philosophy now. Clearly, Wikipedia is changing or the tool was enhanced :)
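A rough sketch of what such a cache could look like in Python. This is not how the site is implemented (I have no idea how it works internally); `make_cached_resolver` and the toy link table are invented for illustration, and the counting function stands in for a slow page fetch.

```python
def make_cached_resolver(first_link):
    """Return resolve(page) -> tuple of pages in the loop that page ends in.

    first_link is any callable mapping a page title to the next title
    (hypothetical here; a real one would fetch and parse the article).
    """
    loop_of = {}  # cache: page -> loop it eventually reaches

    def resolve(start):
        path, index = [], {}
        page = start
        # Stop as soon as we hit a cached page or revisit one on this walk.
        while page not in loop_of and page not in index:
            index[page] = len(path)
            path.append(page)
            page = first_link(page)
        loop = loop_of[page] if page in loop_of else tuple(path[index[page]:])
        # Every page on this walk ends in the same loop, so cache them all.
        for visited in path:
            loop_of[visited] = loop
        return loop

    return resolve


# Toy data, invented for illustration; not real Wikipedia links.
toy_links = {
    "Fish": "Organism", "Organism": "Biology", "Biology": "Science",
    "Science": "Knowledge", "Knowledge": "Science",
    "Table": "Furniture", "Furniture": "Knowledge",
}
fetches = []


def counting_first_link(page):
    fetches.append(page)  # stands in for one slow page fetch
    return toy_links[page]


resolve = make_cached_resolver(counting_first_link)
first = resolve("Fish")      # walks the whole chain
second = resolve("Biology")  # answered from the cache, no fetches
third = resolve("Table")     # fetches only Table and Furniture
print(first, len(fetches))
```

Since people are apparently editing the articles while this is on the front page, a real cache would need a short expiry, otherwise it would keep reporting a loop that no longer exists.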
yesterday, the loop was always the same. since this made it to the top of hacker news, the loop has been changing. look at the revision history of some of those pages and you'll see that people on wikipedia are messing with the loop.
Same but at the boring callcenter job I had at the time. Dell Enterprise support, we would get virtually no calls sometimes and we would make up shit to find on wikipedia and whoever found it from a specified starting point in the least amount of jumps won.