That's fascinating to read, thanks for sharing. Did it ever do something genuine...

stakhanov · on Sept 6, 2023

One of the people from Cyc gave a talk at the research group I was in once and mentioned an idea that kind of stuck with me.

...sorry, it takes some building-up to this: At the time, a lot of work in NLP was focused on building parsers that drew constituency trees from sentences or extracted syntactic dependency structures, but did so in a way that either abstracted away from semantics entirely or treated semantics as an extension of syntax, without venturing into the territory of inference and common sense. So a sentence like "Green ideas sleep furiously" (to borrow from Chomsky's example) was just as good a research object for someone doing that kind of work as a sentence that actually makes sense and is composed of words of the same lexical categories, like "Absolute power corrupts absolutely". I suspect that line of research is still going strong, so the past tense may not be quite appropriate here; I'm using it because I have been out of the loop since leaving academia.

The major problem these folks face is a combinatorially exploding space of ambiguity, both at the grammatical level ("I saw a man with a telescope" can be bracketed "I saw (a man) with a telescope" or "I saw a (man with a telescope)") and at the semantic level ("Every man loves a woman" can mean "For every man M there exists a woman W, such that M loves W" or it can mean "There exists a woman W, such that for every man M it is true that M loves W"). Even if you could completely solve the parsing problem, the ambiguity problem would remain.
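
To make the semantic half of that concrete, here's a minimal sketch that checks both readings of "Every man loves a woman" against a tiny made-up world. The names and the `loves` relation are invented purely for illustration; the point is just that the two readings can come apart on the same facts.

```python
# Toy world (hypothetical data): two men, two women, and who loves whom.
men = {"al", "bo"}
women = {"cy", "di"}
loves = {("al", "cy"), ("bo", "di")}

# Reading 1: for every man M there exists a woman W such that M loves W.
forall_exists = all(any((m, w) in loves for w in women) for m in men)

# Reading 2: there exists a woman W such that every man M loves W.
exists_forall = any(all((m, w) in loves for m in men) for w in women)

print(forall_exists, exists_forall)  # True False
```

In this world every man loves some woman, but there is no single woman loved by all of them, so a system that can't choose between the readings can't say whether the sentence is true.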

Now this guy from the Cyc group said: Forget about parsing. If you give me the words that are in the sentence and you're not even giving me any clue about how the words were used in the sentence, I can already look into my ontology and tell you how the ontology would be most likely to connect the words.

Now, the sentence "The cat chased the dog" obviously means something different from "The dog chased the cat" despite using the same words. But in most text genres, you're likely to only encounter sentences that say things that are commonly held to be true. So if you have an ontology that tells you what's commonly held to be true, that gives you a statistical prior that enables you to understand language. In fact, you probably can't hope to understand language without it, and it's probably the key to "disambiguation".
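
Here's roughly how I picture that, as a toy sketch. To be clear, this is not how Cyc works or what its API looks like: the `PLAUSIBILITY` table is a hypothetical stand-in for "what the ontology considers commonly true", and the scores are made up.

```python
from itertools import permutations

# Hypothetical plausibility scores for (agent, verb, patient) triples.
# A stand-in for an ontology of commonly held beliefs; numbers are invented.
PLAUSIBILITY = {
    ("dog", "chase", "cat"): 0.9,
    ("cat", "chase", "dog"): 0.1,
    ("cat", "chase", "mouse"): 0.95,
    ("mouse", "chase", "cat"): 0.05,
}


def most_plausible_reading(nouns, verb, default=0.01):
    """Given only the words (no word order), pick the likeliest role assignment."""
    candidates = [(agent, verb, patient) for agent, patient in permutations(nouns, 2)]
    return max(candidates, key=lambda triple: PLAUSIBILITY.get(triple, default))


# Same bag of words either way: the prior decides who is chasing whom.
print(most_plausible_reading(["cat", "dog"], "chase"))
print(most_plausible_reading(["dog", "cat"], "chase"))
```

Both calls return `("dog", "chase", "cat")`, which is exactly the limitation mentioned above: the prior alone gets "The cat chased the dog" wrong, but it is right most of the time in text that reports ordinary things.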

This thought kind of flipped my worldview upside down. I had always thought of it as a "pipelined architecture", where you first need to parse the text before it even makes sense to think about what to do with the parser's output. But that was unnecessarily limiting. You can look at it as a joint-decoding problem instead: it may very well be that the lion's share of the entropy comes from elsewhere, and it may be foolish to go around building parsers if you haven't yet hooked your system up to the information source that provides the lion's share of it, namely common-sense knowledge.
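
A toy sketch of the difference between the pipeline view and the joint view, using the telescope sentence from above. All the probabilities are invented, and `joint_decode` is just a name I'm making up for "maximize parser score times common-sense prior".

```python
import math

# Hypothetical scores for the two readings of "I saw a man with a telescope".
# "instrument": I used the telescope to see him; "possession": he had the telescope.
parser_score = {"instrument": 0.45, "possession": 0.55}  # syntax alone, nearly a coin flip
common_sense = {"instrument": 0.80, "possession": 0.20}  # prior from world knowledge


def pipeline_decode(parser_score):
    """Pipeline view: commit to the parser's best guess, consult nothing else."""
    return max(parser_score, key=parser_score.get)


def joint_decode(parser_score, prior):
    """Joint view: maximize log P_parser(reading) + log P_prior(reading)."""
    return max(
        parser_score,
        key=lambda reading: math.log(parser_score[reading]) + math.log(prior[reading]),
    )


print(pipeline_decode(parser_score))             # possession
print(joint_decode(parser_score, common_sense))  # instrument
```

With these numbers the parser's slight preference is overruled by the prior, which is the sense in which most of the useful information may be coming from somewhere other than the parser.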

Now, I don't think that Cyc had gotten particularly close to solving that problem either, and, in fact, it was a bit uncharacteristic for a "Cycler" to talk about statistical priors at all, as their work hadn't even gotten into the territory of collecting those kinds of statistics. But, as a theoretical point, I thought it was very valid.