
A lot of AI is about vector space these days.

https://www.kaggle.com/c/word2vec-nlp-tutorial/forums/t/1234...

"That interim also saw dedicated attempts on the part of Google’s competitors to catch up. (As Le told me about his close collaboration with Tomas Mikolov, he kept repeating Mikolov’s name over and over, in an incantatory way that sounded poignant."

"Just as the chip-design process was nearly complete, Le and two colleagues finally demonstrated that neural networks might be configured to handle the structure of language. He drew upon an idea, called “word embeddings,” that had been around for more than 10 years. When you summarize images, you can divine a picture of what each stage of the summary looks like — an edge, a circle, etc. When you summarize language in a similar way, you essentially produce multidimensional maps of the distances, based on common usage, between one word and every single other word in the language. The machine is not “analyzing” the data the way that we might, with linguistic rules that identify some of them as nouns and others as verbs. Instead, it is shifting and twisting and warping the words around in the map. In two dimensions, you cannot make this map useful. You want, for example, “cat” to be in the rough vicinity of “dog,” but you also want “cat” to be near “tail” and near “supercilious” and near “meme,” because you want to try to capture all of the different relationships — both strong and weak — that the word “cat” has to other words. It can be related to all these other words simultaneously only if it is related to each of them in a different dimension. You can’t easily make a 160,000-dimensional map, but it turns out you can represent a language pretty well in a mere thousand or so dimensions — in other words, a universe in which each word is designated by a list of a thousand numbers. Le gave me a good-natured hard time for my continual requests for a mental picture of these maps. “Gideon,” he would say, with the blunt regular demurral of Bartleby, “I do not generally like trying to visualize thousand-dimensional vectors in three-dimensional space.”




but it turns out you can represent a language pretty well in a mere thousand or so dimensions

That makes sense, considering that Basic English has only about a thousand words yet can express most concepts given enough of them in combination.


>> When you summarize language in a similar way, you essentially produce multidimensional maps of the distances, based on common usage, between one word and every single other word in the language.

The problem with word embeddings, or any distance-based model really, is that language doesn't work that way.

Chomsky has a standard example he uses to make this point: "Instinctively, eagles that fly swim". He points out that in this phrase, "instinctively" goes with "swim" (as in "instinctively, they swim") even though the phrase, and the attachment, mean nothing (the phrase is nonsensical by design).

If the relation was really based on distance, we would expect "instinctively" to attach to "fly". The fact that it doesn't suggests that there is something else that makes us pick the correct association out of all the possible interpretations in that sentence.

Word vectors in their original form also have trouble with homonyms, "faux amis", etc.: for instance, the word "cat" - is it referring to the animal, or to the Linux command? In vector space there wouldn't be any difference, so the animal would end up associated with the symbol ">" and the Linux command with "small" and "furry".
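A quick way to see the single-vector limitation in code (a sketch, assuming gensim 4.x; "vectors.kv" is a hypothetical pretrained-embeddings file): whatever corpus the vectors were trained on, there is exactly one entry for the surface form "cat", so both senses have to share it.

    # One vector per surface form: "cat" the animal and "cat" the Unix
    # command collapse onto the same point in the embedding space.
    from gensim.models import KeyedVectors

    # "vectors.kv" is a hypothetical file of pretrained embeddings
    wv = KeyedVectors.load("vectors.kv")

    v = wv["cat"]                            # the only vector available for "cat"
    print(wv.most_similar("cat", topn=10))   # neighbours of both senses, mixed together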


The "distance" referenced in your quote is not distance in a sentence, it's the distance between points in this abstract embedding space. Two completely different things. The Chomsky argument isn't really relevant here.


Meh. You're totally right, of course. What the hell was I thinking? :/


Language is something we intuitively understand. To get a sense of weak AI's potential, consider how many less intuitive problems might be similarly addressable.


Word embeddings are actually quite neat. You get to the point where you can do QUEEN - OLD - PRESIDENT = GIRL. Or take the new Google Translate as a very practical example. But yes, it's not quite the groundbreaking progress that has been achieved in image and video.
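For what it's worth, in gensim (my choice of library, not named above; "vectors.kv" is again a hypothetical pretrained-embeddings file) that kind of arithmetic is just a nearest-neighbour query with positive and negative word lists:

    # Word-vector arithmetic as a nearest-neighbour query:
    # positive vectors are added, negative vectors are subtracted,
    # and the closest remaining words are returned.
    from gensim.models import KeyedVectors

    wv = KeyedVectors.load("vectors.kv")  # hypothetical pretrained vectors

    # the QUEEN - OLD - PRESIDENT query above, expressed in gensim terms
    print(wv.most_similar(positive=["queen"], negative=["old", "president"], topn=5))

    # the textbook analogy: king - man + woman ~ queen
    print(wv.most_similar(positive=["king", "woman"], negative=["man"], topn=1))

What actually comes back depends entirely on the vectors you load, which is part of the point being debated below.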


>You get to the point where you can do QUEEN - OLD - PRESIDENT = GIRL.

That's very nice, but it seems to miss the difference between a monarchy and a presidential republic.



