Wow. This feels like someone has taken a Borges parody and run with it:

> What is the scope of the new "Wikipedia of functions"?

> [...] Vrandečić explained the concept of Abstract Wikipedia and a "wiki for functions" using an example describing political happenings involving San Francisco mayor London Breed:

> "Instead of saying "in order to deny her the advantage of the incumbent, the board votes in January 2018 to replace her with Mark Farrell as interim mayor until the special elections", imagine we say something more abstract such as elect(elector: Board of Supervisors, electee: Mark Farrell, position: Mayor of San Francisco, reason: deny(advantage of incumbency, London Breed)) – and even more, all of these would be language-independent identifiers, so that thing would actually look more like Q40231(Q3658756, Q6767574, Q1343202(Q6015536, Q6669880)).

> [...] We still need to translate [this] abstract content to natural language. So we would need to know that the elect constructor mentioned above takes the three parameters in the example, and that we need to make a template such as {elector} elected {electee} to {position} in order to {reason} (something that looks much easier in this example than it is for most other cases). And since the creation of such translators has to be made for every supported language, we need to have a place to create such translators so that a community can do it.
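
To make the quoted mechanism concrete, here is a minimal sketch of what "abstract content plus a per-language translator" might look like. Everything here (the Python representation, the slot names, the template syntax) is my own invention for illustration; it is not the actual Wikifunctions design.

    # Purely illustrative: in the real proposal every label below would be a
    # language-independent identifier (Q40231, Q3658756, ...), not an English string.
    abstract = {
        "constructor": "elect",
        "elector": "Board of Supervisors",
        "electee": "Mark Farrell",
        "position": "Mayor of San Francisco",
        "reason": {
            "constructor": "deny",
            "what": "the advantage of incumbency",
            "whom": "London Breed",
        },
    }

    # A per-language "translator" is, at its simplest, one template per constructor.
    TEMPLATES_EN = {
        "elect": "{elector} elected {electee} to {position} in order to {reason}",
        "deny": "deny {whom} {what}",
    }

    def render(node, templates):
        """Recursively fill the language-specific template for each constructor."""
        if not isinstance(node, dict):
            return node  # a plain value (in reality, a label looked up per language)
        args = {k: render(v, templates) for k, v in node.items() if k != "constructor"}
        return templates[node["constructor"]].format(**args)

    print(render(abstract, TEMPLATES_EN))
    # Board of Supervisors elected Mark Farrell to Mayor of San Francisco
    # in order to deny London Breed the advantage of incumbency

As the article itself admits, a flat template per constructor only looks this easy in hand-picked examples.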

I'm not sure I'm smart enough to decide if this is all really stupid or not. If I had to summarize my feelings it would probably be along the lines of Q6767574, (Q6015536, Q654880), Q65660.




THAT's the reason? Conveying a sentence as a series of propositions or a tree with case labels was tried in the previous century, without success. It does not offer a good basis for translation, as Philips' Rosetta project, for example, showed. It works for simple cases, but as soon as the text becomes more complex, it runs into all the horrible little details that make up language.

A simple example: in Spanish you don't say "I like X" but "X pleases me". In Dutch you say, "I find X tasty" or "X is good" or something else entirely, depending on what X is. Those are three fairly close languages. How can you encode that simple sentence in such a way that it translates properly for all languages, now and in the future?
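
A toy sketch (invented function names, not any real system) of what per-language renderers for a like(experiencer, thing) constructor would have to do makes the problem concrete:

    # Toy illustration: one template per language is not enough, because the
    # languages disagree about the sentence's basic structure.
    def like_en(thing):
        return f"I like {thing}"

    def like_es(thing):
        # Spanish reverses the roles: the liked thing becomes the grammatical
        # subject and the experiencer is a clitic pronoun ("me").
        return f"me gusta {thing}"

    def like_nl(thing, is_food=False):
        # Dutch picks a different construction depending on what the thing is.
        return f"ik vind {thing} lekker" if is_food else f"ik hou van {thing}"

    # The abstract form like(I, coffee) carries none of the information
    # ("is it food?", "which verb frame?") that these renderers need.
    print(like_en("coffee"), "|", like_es("el café"), "|", like_nl("koffie", is_food=True))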

Symbolic representation isn't going to cut it outside a very narrow subset of language. It might work for highly technical, unambiguous, simple content, but not in general. Whatever you think of ChatGPT, it shows that a neural network can't be beaten for linguistic representation.


> It might work for highly technical, unambiguous, simple content

I mean, the goal is basically Wikipedia lite - so they are targeting technical, unambiguous, simple content.

My understanding is that the goal is to target small languages where it is unlikely anyone will ever put in the effort (or have a big enough corpus) to use statistical translation methods. Sort of a this-will-be-better-than-nothing approach.


The original paper [0] envisages a much wider scope. Vrandečić literally quotes "a world in which every single human being can freely share in the sum of all knowledge".

It also makes the task of the editor much, much more difficult than it is now.

[0] https://arxiv.org/pdf/2004.04733.pdf


Tbf, that quote gets thrown around Wikimedia every 10 seconds. I wouldn't take it too literally.


But it seems like a huge amount of work to achieve that goal.

I suspect a large proportion of the realistic target audience are bilingual.


Reminds me of this section of Cryptonomicon:

"""

RIST 9E03 is the RIST that RIST 11A4 denotes by the arbitrarily chosen bit-pattern that, construed as an integer, is 9E03 (in hexadecimal notation). Click here for more about the system of bit-pattern designators used by RIST 11A4 to replace the obsolescent nomenclature systems of "natural languages." Click here if you would like the designator RIST 9E03 to be automatically replaced by a conventional designator (name) as you browse this web site.

Click.

From now on, the expression RIST 9E03 will be replaced by the expression Andrew Loeb. Warning: we consider such nomenclature fundamentally invalid, and do not recommend its use, but have provided it as a service to first-time visitors to this Web site who are not accustomed to thinking in terms of RISTs.

... Click.

RIST stands for Relatively Independent Sub-Totality.

... Click.

A hive mind is a social organization of RISTs that are capable of processing semantic memes ("thinking"). These could be either carbon-based or silicon-based. RISTs who enter a hive mind surrender their independent identities (which are mere illusions anyway). For purposes of convenience, the constituents of the hive mind are assigned bit-pattern designators.

Click.

A bit-pattern designator is a random series of bits used to uniquely identify a RIST.

"""


Feels a lot like RDF, especially in terms of how I expect the underlying utopian dream to play out.



Vrandečić was Google's consultant on the old Freebase's RDF export. Wikidata, which he helped create, succeeded it. It's the same people pushing the same solution under different names.


My takeaway from this is that Wikimedia clearly has way, way too much money.


Reads even worse than Ulillillia literature; at least he doesn't fully yield to scientific measurements.


This is the kind of unhinged make-work scheme all those Wikipedia beg banners are funding.


This particular work is mostly funded through a set of large restricted donations, not through the general funds.



