In short, manual labelling rarely works for natural language processing (and neither does a top-down mathematical approach). See:

Peter Norvig, On Chomsky and the Two Cultures of Statistical Learning, http://norvig.com/chomsky.html

Also, we want something that is automatic. That means being easily adjustable to other contexts (and languages), inferring information about neologisms (e.g. the semantic meaning of emoji), etc.

Whether it's symbolic or NN-based or some other representation, the main thing people are missing here is symbol grounding. That is why open 3D environments based on virtual senses and motor outputs are the most likely to ultimately move things like NLP forward. See http://courses.media.mit.edu/2004spring/mas966/Harnad%20symb... or http://www.goertzel.org/papers/PostEmbodiedAI_June7.htm

What are non edge-case examples of the symbolic approach not working?

If my recollection is correct, Peter Norvig changed his views around the time he joined Google. Google have a particular way of doing things which doesn't include symbolic processing.

It should be possible to infer using a symbolic approach, or more simply just provide a new definition.

> What are non edge-case examples of the symbolic approach not working? ... It should be possible to infer using a symbolic approach, or more simply just provide a new definition.

I've met some really, really brilliant people who've been banging their heads against that particular wall since the 1980s. The MIT AI Lab crew, for example, poured untold brainpower into symbolic inference. There was the whole "expert systems" movement. This all failed miserably, in disgrace, because nobody could ever get it to work, and "AI" became a dirty word. Later, there was Cyc, which was hyped on and off throughout the 90s. http://www.cyc.com/ After that, there were people who tried to reason over RDF tuples, which didn't work either: http://www.shirky.com/writings/herecomeseverybody/semantic_s...

This idea pops up every 5 or 10 years and wastes a generation of brainpower. It never works. And let me be clear: There have been some terrifyingly brilliant people who were convinced that it ought to work, and who spent years of their life on it.

Meanwhile, any joker who can code up Bayes' theorem or singular value decomposition can get some results in a couple of weeks. Probability and statistics get results (as do more advanced numeric techniques). Logical deduction fails. I'm not even sure I could explain why. But I encourage you to think long and deeply on what Norvig has written on this subject. Or just buy Norvig's two AI textbooks (written before and after he discovered the joys of probability, basically), do some exercises, and compare the results you get.
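To give a sense of how little code "coding up Bayes' theorem" takes, here is a minimal sketch of a naive Bayes text classifier with add-one smoothing. The training sentences, the labels and the function names `train` and `classify` are all made up for illustration; this is not anything from Norvig's books.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def classify(text, model):
    """Return the label maximising log P(label) + sum of log P(word | label)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        # Add-one (Laplace) smoothing so unseen words do not zero out a label.
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy, invented training data (an assumption, not a real corpus).
EXAMPLES = [
    ("great film loved it", "pos"),
    ("wonderful acting great story", "pos"),
    ("terrible film hated it", "neg"),
    ("boring story awful acting", "neg"),
]
MODEL = train(EXAMPLES)
```

With this toy data, `classify("great story", MODEL)` picks `"pos"` and `classify("awful film", MODEL)` picks `"neg"`. Nothing here encodes a rule about what "great" means; the behaviour comes entirely from counts.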

It's clear in hindsight. Natural language is the serialisation of human thought. Humans don't think symbolically in terms of rigorous logical statements. We stretch and play with definitions, sometimes making up a new definition on the spot, and relying on context and our shared experiences for the other person to figure out the meaning.

Statistics isn't sufficient for fully understanding natural language (for that, a computer would have to go out and experience the world in the first person). But it is necessary.

> I've met some really really brilliant people who've been banging their head against that particular wall since the 1980s.

The idea of breaking meaning into an inventory of discrete components like this is at least as old as Hjelmslev; I think even Saussure touches on it.

Part of it seems to me to be that you're breaking down words into words. Even if you write it in uppercase and call it a primitive, you don't have to spend too much time on cognitive linguistics to know that there's really nothing primitive about MAN or MONARCH...

At that point, you do need some sort of grounding, e.g. a human reading the text, or a neural network embedded in a robot. It isn't symbols all the way down.

Well, AFAIK all practical systems for translation use a data-centric approach, rather than any top-down one.

See also: Andrej Karpathy, The Unreasonable Effectiveness of Recurrent Neural Networks, http://karpathy.github.io/2015/05/21/rnn-effectiveness/, and try to replicate it with any formal semantics (good luck!).

Additionally, formal systems rarely account for actual language use, where some things are technically correct but sound weird, and others are incorrect yet prevalent (and a root of language evolution). See also: char2char translation (which accommodates e.g. neologisms or typos).
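A rough illustration of why working below the word level tolerates typos: the sketch below is not char2char translation (that is a neural model) but a much simpler stand-in, Jaccard overlap of character trigrams. The function names and the `#` padding character are assumptions made for this example.

```python
def char_ngrams(word, n=3):
    """Return the set of character n-grams of a word, padded with '#'."""
    padded = f"#{word.lower()}#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}

def similarity(a, b, n=3):
    """Jaccard overlap between the character n-gram sets of two words."""
    ga, gb = char_ngrams(a, n), char_ngrams(b, n)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)
```

A word-level lookup treats "translaton" as an unknown token, while `similarity("translation", "translaton")` stays high because most trigrams survive the typo. An unrelated word such as "banana" shares no trigrams with "translation" at all.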

But does the translation use word2vec or anything similar?

The original Altavista Babelfish (which used SYSTRAN) used rule-based machine translation. It has been replaced by Bing Translator, but my recollection of the Babelfish was that it was accurate enough to be usable, and better than Google Translate was when it was first released. Google Translate has improved a lot recently. My only problem with it is that it doesn't understand the meaning of the words it's translating.

Neural networks are perfectly suitable for perception, e.g. image recognition. No argument there.

> My only problem with it is that it doesn't understand the meaning of the words it's translating.

I think that's a fallacy that will haunt AI forever (or, more likely, will be the definitive civil rights struggle ca. March 25, 2035 6:25:45am to March 25, 2035 6:25:48am)

We tend to move the goalposts whenever AI makes advances. Where many people would have considered chess a pretty good measure of at least some aspect of intelligence, it seems like mundane number crunching once you know how it works.

It may be that we really mean consciousness when we say "intelligence", although if we ever find an easy formulation that creates a perfect "illusion" of consciousness, it may end up having some strong effects on people's conception of themselves that I don't necessarily want to witness.

Thanks for sharing this... stirring thoughts. Did my best to paraphrase here (https://twitter.com/iamtrask/status/818090203990659072) but I really think we should spend more time on this as a society.

I think a vector of weights in different dimensions constitutes meaning. Words don't have a single meaning; they're a shared implicit web of allusions and hints. We only understand one another to the degree that our allusions are shared - it's why nonnative speakers miss nuance, and it's why poetry is a thing.

If there are symbols, they're in the dimensions of the vector; but words only probabilistically suggest meaning, they don't categorically denote it.

Words with multiple meanings will have multiple vectors each depending on context, and if you train your system without taking that into account the vectors will be averaged and the individual meanings will be lost.
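A toy sketch of that averaging effect, using hand-picked two-dimensional vectors (the dimensions, values and names are all hypothetical, not output from any trained model):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def average(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

# Hypothetical sense vectors; dimensions are loosely "finance" and "river".
BANK_FINANCE = [0.9, 0.1]
BANK_RIVER = [0.1, 0.9]

# What a single vector for "bank" tends towards if senses are not separated.
BANK_AVERAGED = average([BANK_FINANCE, BANK_RIVER])

MONEY = [1.0, 0.0]
WATER = [0.0, 1.0]
```

Here `BANK_FINANCE` is close to `MONEY` and `BANK_RIVER` is close to `WATER`, but `BANK_AVERAGED` is equally similar to both, so the distinction between the two senses is gone.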

With a symbolic approach, the right thing is to use a different symbol for each meaning, and disambiguate based on context (which you could identify either statistically or by using rules).
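As a minimal sketch of what that could look like, the snippet below gives each sense its own symbol and picks one by counting overlap between hand-written cue words and the sentence (a simplified Lesk-style rule). The sense inventory, the cue words and the `disambiguate` function are all invented for illustration.

```python
# Hypothetical hand-written sense inventory: one symbol per meaning,
# each with a set of cue words expected to appear nearby.
SENSES = {
    "bank": {
        "BANK_FINANCE": {"money", "loan", "account", "deposit"},
        "BANK_RIVER": {"river", "water", "shore", "fishing"},
    },
}

def disambiguate(word, sentence, senses=SENSES):
    """Return the sense symbol whose cue words overlap most with the context."""
    candidates = senses.get(word)
    if not candidates:
        return None  # word is not in the inventory
    context = set(sentence.lower().split()) - {word}
    # On a tie (including zero overlap) the first listed sense wins.
    return max(candidates, key=lambda s: len(candidates[s] & context))
```

For example, `disambiguate("bank", "she opened an account at the bank")` returns `"BANK_FINANCE"`, while a sentence mentioning "river" or "fishing" returns `"BANK_RIVER"`. The cue-word sets could equally be learned statistically rather than written by hand.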

I feel that you don't appreciate how plastic language is, and how indirectly meaning is conveyed. New words and meanings become cultural shared knowledge on a weekly if not daily basis, and shades of meanings are added to existing words and phrases simply by being used in a milieu or by a sufficiently famous person. Symbols might be used as hidden variables representing concepts, but to fully represent how vague and allusive language is, you've got no choice but to make everything fractional. I don't see the result being anything other than isomorphic with a vector.

In particular, words are not repositories of meaning. They allude to concepts; new concepts are created and get forgotten on a regular basis. The connection between words and concepts waxes and wanes over time, and even the very timeline of a connection's strength can be used for allusion: using language to represent concepts that can only be coherently mapped by using previously-stronger allusions conveys a sense of being old-fashioned, while the reverse conveys future-thinking. Using allusions that are stronger within a milieu conveys social signalling information about group membership. Etc.

There's no way a human-maintained database is going to capture the subtlety here on anything like a timely basis. There's no universal truth, everyone's map is a little bit different, and the map is changing all the time.

Yet we've evolved to use symbols, which are short sequences of phonemes or glyphs, and to connect these together to form sentences, which are lists of symbols.

People who speak the same language are able to communicate perfectly well, almost all of the time, across continents and centuries. New words are rare compared to existing vocabulary and often soon disappear from use. New concepts can readily be mapped onto existing vocabulary. People are able to learn other languages and improve their own with the help of dictionaries and grammar books. Things like humour, cryptic crosswords, and social signalling are edge cases. And things like deception are unrelated to language understanding at the semantic level.

We're discussing the best way for computers to understand natural language and communicate with people, and in practice that's either going to use unambiguous language or it's going to need human help or verification.

We've evolved to use symbols to communicate (to transfer information), not to think using symbols. We use symbols as a way to transfer information over slow, lossy channels. Inevitably some information is going to get lost along the way, either because the symbols don't represent the meaning fully, or because if they did, they would be too long and cumbersome to use. Speed won over precision.

People who speak the same language are still prone to misunderstandings.

Take the word "soon" in your example. How many unambiguous meanings does it have?

We may be overestimating our success in communication. Heck, I'm not even sure if my understanding of "soon" equals yours. Is it a century, a few decades, a few years? And why is it so different from the meaning when I use it to answer when lunch will be ready?

> in practice that's either going to use unambiguous language or it's going to need human help or verification.

The former doesn't exist in human languages (we'd otherwise have gotten rid of the lawyers long ago), and the latter is infeasible. There is another way.

> My only problem with it is that it doesn't understand the meaning of the words it's translating.

Wouldn't that mean it was an actual full AI?

