
Languages don't have bijective mappings of concepts, so this is a hard problem. Do you have an ontology you'd like to propose?



Property P107 (http://www.wikidata.org/wiki/Property:P107) has emerged as Wikidata's de facto upper ontology. It currently consists of six main types: person, organization, event, creative work, term, and geographical feature. It's essentially a clean port of the high-level entities from the GND Ontology -- a controlled vocabulary developed by the German National Library and released last summer (http://d-nb.info/standards/elementset/gnd).

There's a fair amount of debate over that property. Are those current high-level types (person, place, work, event, organization, term) a good fit for a knowledge base that aims to structure all knowledge and not just library holdings? Does classifying subjects like inertia, DNA, Alzheimer's disease, dog, etc. as simply "terms" make sense?

More reading related to Wikidata, ontology and types: https://blog.wikimedia.de/2013/02/22/restricting-the-world/.


No, I mean I was really confused. The term "WikiData" seems to connote data of the tabular type, like a central repository for public data. Though I'm also confused about why the mapping for this particular term ("victory") can't be done in Wikipedia or the Wiki dictionary.


It is, but the problem is whether you are talking about an individual language version of Wikipedia or the Wikipedia project as a whole. In the article they talk about the problem of maintaining Interwiki links on each individual language version, rather than centrally.

This is also just one aspect of Wikidata. The centrality of shared table content is important, too. Why keep data in one specific language version of Wikipedia and point to it from the other versions when you can have a central repository that each language version points to using templates?
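
To make the maintenance argument concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `CENTRAL` store, the `item_victory` id, the article titles, and both function names are made up, and this is not how Wikidata is actually implemented. It only shows that a central record lets each language version derive its links instead of storing its own copy.

```python
# Hypothetical central store: one record per concept, one title per language.
# The item id and titles are illustrative placeholders, not real Wikidata data.
CENTRAL = {
    "item_victory": {"en": "Victory", "de": "Sieg", "fr": "Victoire"},
}


def interwiki_links(item_id, own_lang, store=CENTRAL):
    """Derive the links one language version shows, from the central record."""
    return {
        lang: title
        for lang, title in store[item_id].items()
        if lang != own_lang
    }


def links_to_maintain(n_languages, central):
    """Count entries editors must keep in sync for a single concept."""
    if central:
        # One entry per language, stored once in the central record.
        return n_languages
    # Every language version stores a link to every other version.
    return n_languages * (n_languages - 1)


print(interwiki_links("item_victory", "en"))
print(links_to_maintain(3, central=False), links_to_maintain(3, central=True))
```

With three languages the difference is small (6 entries versus 3), but the per-wiki count grows quadratically with the number of language versions while the central one grows linearly.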


"The term "WikiData" seems to connote data of the tabular type"

There's a strong correspondence between "tabular data" (you probably mean relational) and triples (<predicate,X,Y>). Both are based on first-order predicate logic, so there's actually a natural mapping.
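
A minimal Python sketch of that mapping, under stated assumptions: the row, its column names, the `item_1` key, and the helper names `row_to_triples` and `triples_to_rows` are all hypothetical, and each (subject, predicate) pair is assumed to have a single value. Each non-key column becomes a predicate, the row key is X, and the cell value is Y.

```python
from typing import Any


def row_to_triples(key_column: str, row: dict[str, Any]) -> list[tuple[str, Any, Any]]:
    """Turn one relational row into <predicate, X, Y> triples."""
    subject = row[key_column]
    return [
        (column, subject, value)
        for column, value in row.items()
        # The key column identifies the subject; NULL cells yield no triple.
        if column != key_column and value is not None
    ]


def triples_to_rows(triples: list[tuple[str, Any, Any]]) -> dict[Any, dict[str, Any]]:
    """Group triples by subject to recover one row per subject."""
    rows: dict[Any, dict[str, Any]] = {}
    for predicate, x, y in triples:
        # Assumes one value per (subject, predicate); later triples overwrite.
        rows.setdefault(x, {})[predicate] = y
    return rows


# Hypothetical row for illustration only.
row = {"id": "item_1", "label": "victory", "type": "term"}
triples = row_to_triples("id", row)
print(triples)
print(triples_to_rows(triples))
```

Going from rows to triples is lossless apart from NULL cells; going back is only this simple when predicates are single-valued, which is one place where the two models differ in practice.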



