
So... it's like a Freebase clone?



If I understand it correctly, it has more modest goals. Freebase was trying to make the uber-map of all data entities. Wikidata is just trying to make data reuse easier on Wikipedia.

For instance, imagine a table of all the populations of the countries of the world. Today, someone might make a really good one for the French Wikipedia. But then someone has to make it from scratch, all over again, for the Greek Wikipedia. And when someone updates the French one, the Greek one doesn't update, and vice versa.

With Wikidata you can define the data once, and then transclude it to different pages, with translated labels if necessary.
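A rough sketch of that idea, in case it helps. This is a toy in-memory model, not Wikidata's actual schema or API: the `ITEMS` store, the `item_1` id, the labels, and the population number are all made-up placeholders.

```python
# Hypothetical store: one language-neutral value per item, plus per-language labels.
# Nothing here is Wikidata's real data model; ids and numbers are placeholders.
ITEMS = {
    "item_1": {
        "labels": {"fr": "Pays A", "el": "Χώρα Α"},
        "population": 1000,  # placeholder, not real data
    },
}


def render_row(item_id, lang):
    """Render one table row for a given language edition from the shared item."""
    item = ITEMS[item_id]
    return f"{item['labels'][lang]}: {item['population']}"


def update_population(item_id, value):
    """Change the value once; every language edition sees it on the next render."""
    ITEMS[item_id]["population"] = value


if __name__ == "__main__":
    print(render_row("item_1", "fr"))
    update_population("item_1", 1200)
    print(render_row("item_1", "el"))
```

The point is only that the number lives in one place and the label is the sole per-language part.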

The first release attacks the problem of "inter-wiki links". On the left-hand side of some Wikipedia pages, there are links to the equivalent pages in other languages. Check out the ones for http://en.wikipedia.org/wiki/Jimmy_Wales, for instance. Right now these are updated with a system that looks at every possible connection between language editions (scaling at O(n^2)); with Wikidata it will be more manageable.
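To put numbers on the scaling, here is a back-of-the-envelope sketch. It assumes the simplest possible model: every language edition of a page stores a link to every other edition, versus each edition registering once with a central item. The function names are mine, and the real bot-based system is surely more involved.

```python
def pairwise_links(n):
    """Links to maintain when each of n editions links to all the others."""
    return n * (n - 1)


def hub_links(n):
    """Entries to maintain when each of n editions registers once with a shared item."""
    return n


if __name__ == "__main__":
    for n in (4, 50, 250):
        print(n, pairwise_links(n), hub_links(n))
```

Under that model, adding one more language edition to a page touches every existing edition in the pairwise scheme, but only adds one entry in the hub scheme.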


Interestingly, interwiki links sometimes exhibit a kind of semantic drift.

from Omnipedia http://brenthecht.com/papers/bhecht_CHI2012_omnipedia.pdf [pdf]:

""" One major source of ambiguities in the ILL graph is conceptual drift across language editions. Conceptual drift stems from the well-known finding in cognitive science that the boundaries of concepts vary across language-defined communities [13]. For instance, the English articles “High school” and “Secondary school” are grouped into a single connected concept. While placing these two articles in the same multilingual article may be reasonable given their overlapping definitions around the world, excessive conceptual drift can result in a semantic equivalent of what happens in the children’s game known as “telephone”. For instance, chains of conceptual drift expand the aforementioned connected concept to include the English articles “Primary school”, “Etiquette”, “Manners”, and even “Protocol (diplomacy)”. Omnipedia users would be confused to see “Kyoto Protocol” as a linked topic when they looked up “High school”. A similar situation occurs in the large connected concept that spans the semantic range from “River” to “Canal” to “Trench warfare”, and in another which contains “Woman” and “Marriage” (although, interestingly, not “Man”). """
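For what it's worth, the "connected concept" in that quote can be read as a connected component of the interlanguage-link graph, which is why a chain of individually reasonable links can drag in far-off articles. A minimal sketch under that reading: the edge list below is invented for illustration (the `xx:`/`yy:`/`zz:` articles are hypothetical), and this is not the paper's actual algorithm.

```python
from collections import deque

# Invented interlanguage links, treated as undirected edges.
# "xx", "yy", "zz" stand in for hypothetical non-English editions.
EDGES = [
    ("en:High school", "xx:School A"),
    ("xx:School A", "en:Secondary school"),
    ("en:Secondary school", "yy:School B"),
    ("yy:School B", "en:Primary school"),
    ("en:River", "zz:Waterway"),
]


def connected_concept(edges, start):
    """Return every article reachable from start by following links (BFS)."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


if __name__ == "__main__":
    print(sorted(connected_concept(EDGES, "en:High school")))
```

Each hop looks harmless on its own, but the component ends up lumping "High school" together with "Primary school", which is the telephone-game effect the authors describe.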


Wikidata is trying to make re-use possible beyond Wikipedia, as well. In fact, there are already a couple of apps built with it. Here's a trivial genealogy visualization using the API:

https://toolserver.org/~magnus/ts2/geneawiki/?q=Q1339

It's also worth noting that Freebase itself heavily relied on parsing Wikipedia database dumps to build its ontology -- to a large extent Wikidata is giving structure to data that's been in Wikipedia all along.


Thanks, it makes more sense now.


For one thing, Wikidata data has a different intellectual property regime.

Wikidata data is dedicated to the public domain, using http://creativecommons.org/publicdomain/zero/1.0/

Most Freebase data is licensed under a CC-BY license. Details are here: http://www.freebase.com/policies/attribution

A CC-BY license can be a burden, if you really want to fulfill all the terms of the license, namely: "You must attribute the work in the manner specified by the author or licensor." If there are 20,00 authors, are you really going to find out how each one wants you to give them attribution? It's impractical, so what you end up doing is giving what you think is reasonable attribution. But you never really know for sure.

Even worse, some of the material in Freebase is under other licenses, such as CC-BY-SA or GFDL.


Similar, but not exactly the same. To my understanding, Wikibase was introduced to avoid duplicating language-neutral information (like infoboxes) and to improve the coherence of the same subject across multiple projects (like cross-language interwikis). Both are similar goals (and can be realized with the very same tools, like RDF), but the latter is quite specific to Wikimedia projects.



