Neuroscientists find new support for Chomsky’s “internal grammar” thesis (nyu.edu)
182 points by tompark on Dec 9, 2015 | 91 comments



I think we also have visual grammars. We are surprised if heavy and large objects are on top of light and small objects. We have priors on lighting and shadows, depth and occlusion, inside and outside.

We solve these statistically with complex priors and likelihoods. How much you can shove into the prior is the question of universal grammar. That you are born with linguistic priors is, I think, not so surprising. If they were of the old-fashioned symbolic variety that Chomsky proposes, that would be surprising. It's way too "clean" for my liking; personal opinion!


I'm not sure about that. A newborn first learns to recognize shapes, then starts exploring its environment using hand-eye coordination. It learns the rules of gravity by watching and feeling. It has a few inborn reflexes, like looking for the source of milk right after birth, but the physics engine seems very much a learnt thing. Also it seems the fundamental physics laws can easily be learnt by a system like the brain (we can even do it with artificial neural nets). Visual cortex seems to be similarly simple. Only for understanding language we have been missing a couple of pieces to the puzzle. That seems still like a miracle. Disclaimer: anecdotal evidence.


Since most babies tend to follow the steps of development according to a timeline (i.e. a, then b, then c, and never out of order), and typically on time, isn't this rather evidence that such neural systems grow, rather than that the physics system is learned?

The idea that some complex systems can simply grow without inputs is not unfounded in the biological world -- witness, for example, grazing animals such as cows and giraffes that plop out of the birth canal, shakily stand up, and begin trotting around. They couldn't have trained that system in the womb.

Similarly, human babies who are growing complex neural systems over years might look like they are learning, but instead they are just growing. That's not to say that we don't learn things later on-- certainly we do-- but that's not evidence that babies are necessarily learning all the skills they become capable of in toddlerhood.


The way you formulated that is fascinating and made me understand the whole debate a little better. The strict order in which we grasp things does indeed indicate that these basic skills are perhaps not left to chance waiting until the correct input is provided, but are hardwired into the process that develops our brains.

I guess the Chomsky debate is whether grammar is one of those basic skills. Would two children left unsupervised by adults through their formative years construct their own language that surpasses the communication exhibited by gorillas for example?


This has actually happened-- caveat, with deaf people (but sign language does have grammar)-- check out Nicaraguan Sign Language. Pinker has a good chapter on it in The Language Instinct.


Deaf people? I read somewhere that some kids develop a sign language themselves, without their parents' help...


Artificial neural networks also go through predictable and repeatable stages of development as they train on a corpus.


In experiments, a newborn of a few hours will reflexively dodge/cower from a quickly growing circle on a screen (something is approaching and about to collide!). It has no experience at that point of being struck by any moving object or anything like that.


Newborns can barely hold their heads up.

Shine a light in their eyes, and yes they react. As does everything else with eyes.


Couldn't you control for that by showing a shrinking circle on a screen? I'd be amazed if the experiment didn't do something like that.


Some animals (and people) with eyes are blind and so probably don't react when you shine a light at them.


Or the experimenter interprets any reflex according to his theory.


https://en.m.wikipedia.org/wiki/Infant_vision

I think it's quite a nice analogy. There are structures present, but they have to be trained as well. Suppose you have a hierarchical Dirichlet process as a prior: you still need data to be able to define what clusters with what. And of course I won't postulate such a simple process as representing our newborn brains.
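
To make that concrete: below is a toy sketch (mine, not from the paper) of the Chinese restaurant process that underlies Dirichlet process priors. The prior only says "expect clusters"; which items actually cluster together is determined entirely by the draws, i.e. by data.

    import random

    def crp(n_customers, alpha=1.0, seed=0):
        # Chinese restaurant process: customer n starts a new table with
        # probability alpha / (n + alpha), otherwise joins an existing
        # table with probability proportional to its current size.
        random.seed(seed)
        tables = []       # tables[k] = number of customers at table k
        assignment = []   # which table each customer ended up at
        for n in range(n_customers):
            weights = tables + [alpha]
            k = random.choices(range(len(weights)), weights=weights)[0]
            if k == len(tables):
                tables.append(0)  # a brand-new cluster
            tables[k] += 1
            assignment.append(k)
        return assignment

    # The prior guarantees that clusters form; *what* clusters is in the data.
    print(crp(20))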

From Wikipedia: "At birth, visual structures are fully present yet immature in their potentials. From the first moment of life, there are a few innate components of an infant's visual system. Newborns can detect changes in brightness, distinguish between stationary and kinetic objects, as well as follow kinetic objects in their visual fields. However, many of these areas are very poorly developed."

Your extreme case would be that there is a priori no reason to have language processed in certain parts of the brain, like Broca's or Wernicke's regions.

It is of course possible that these are just strategically located. However, newborns with damage to these regions from oxygen deprivation at birth can already suffer from linguistic impairments.


I wouldn't find it too surprising if the brain abstracted actions and actors, perhaps even the tense, from thoughts and sensory input and mapped them to auditory features and motor programs for (sub)vocalization and the internal monologue. I think visual thinking is different from that. Mental rotation and visualizing objects have no actors and actions in them. It is just recall of movement, shape and texture patterns that have been previously encountered. However, I think visual thinking is often enhanced by thinking in language. The auditory memory can be leveraged as an additional buffer and as a means of verifying and constructing visual thoughts.


> We solve these statistically with complex priors and likelihoods

"we interpret experimental results up to some order of approximation with..."


And I believe in gravity influenced perception. Verticality isn't wired the same way as horizontality in our minds.


> Their results showed that the subjects’ brains distinctly tracked three components of the phrases they heard, reflecting a hierarchy in our neural processing of linguistic structures: words, phrases, and then sentences—at the same time.

How are they distinguishing between grammar that's encoded in the brain due to "nature" vs grammar that's encoded in the brain via "nurture"? I think we all knew that the brain has some mechanism to detect grammar.


Your assumption is that all information necessary and sufficient to decode grammar into its constituents is detectable from the spoken words that a child hears. According to Chomsky there is a distinction between the I-Language and the E-Language [1] (internal/external), namely because grammar in most human languages is not the most computationally efficient way of structuring sentences, hence the need for a uniquely human/innate ability [2].

This is also why monkeys can never be trained to learn human language [3].

[1] https://en.wikipedia.org/wiki/Transformational_grammar#.22I-...

[2] http://www.theatlantic.com/technology/archive/2012/11/noam-c...

[3] https://en.wikipedia.org/wiki/Nim_Chimpsky


I don't think any assumption was made here. It is as much an assumption to say that all information necessary and sufficient to decode grammar into its constituents is not detectable from spoken words that a child is exposed to. That's a theoretical argument. Just because we can't think of a way this could be done, doesn't mean it can't be done. At the moment, we simply don't know.

I think what the comment you're responding to meant is that this study shows no evidence for an innate internal language or grammar, which I thought was Chomsky's most controversial claim.


>grammar in most human languages is not the most computationally efficient way of structuring sentences

Grammar in most human languages is structured to be information-theoretically efficient, which often winds up being vague, ambiguous, and context-dependent. It's basically a time-space trade-off for statistical inference.
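
A toy way to see that trade-off (my own sketch, with a made-up corpus): the uncertainty about the next word drops sharply once you condition on context, which is what lets a language get away with short, ambiguous forms.

    import math
    from collections import Counter

    corpus = ("the dog chased the cat . the cat chased the mouse . "
              "the mouse feared the cat .").split()

    def entropy(counts):
        total = sum(counts.values())
        return -sum(c / total * math.log2(c / total) for c in counts.values())

    # Uncertainty about the next word with no context at all:
    print("H(word)      =", round(entropy(Counter(corpus)), 2))

    # Average uncertainty once the previous word is known:
    ctx = {}
    for prev, nxt in zip(corpus, corpus[1:]):
        ctx.setdefault(prev, Counter())[nxt] += 1
    pairs = len(corpus) - 1
    cond = sum(sum(c.values()) * entropy(c) for c in ctx.values()) / pairs
    print("H(word|prev) =", round(cond, 2))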


> How are they distinguishing between grammar that's encoded in the brain due to "nature" vs grammar that's encoded in the brain via "nurture"?

They do not.

> I think we all knew that the brain has some mechanism to detect grammar.

The press release is spinning the paper to make it sound much more controversial than it is.


Pattern detection. Pattern happens to be grammar.


However, patterns are language-dependent. They don't mention what happened when an English-only speaker heard a Chinese phrase.


If you listen long enough to something that is noise the first time you hear it -- Chinese, for example -- you will eventually distinguish patterns. The brain is a pattern recognition machine.


Wait, I think that's a good comment. What if that "internal grammar" they describe is the result of the statistical process? That would mean it already emerges in childhood.


This appears to be the paper in question:

http://psych.nyu.edu/clash/dp_papers/Ding_nn2015.pdf

Also, for more background on the idea of the "Universal Grammar":

https://en.wikipedia.org/wiki/Universal_grammar


Not a neuroscientist, but I do do NLP, and I only lightly skimmed the paper.

This doesn't really speak to UG.

First, you can believe in the structures they purport to show without accepting the existence of UG, by appealing to the existence of general mechanisms in the brain for assembling hierarchical structures, which is equally validated by this experiment.

Second, they looked at two languages with sentences of up to ~7 syllables each with at most two constituents (Noun Phrase Verb Phrase). You can't show any evidence for any hierarchy of interest in 7 syllables. They demonstrated that phrases exist and phrase boundaries exist, but it's entirely possible to have "flat" grammars without interesting hierarchy, especially in simple sentences. If they want to show interesting hierarchy, they should conduct experiments with more interesting structure (say, some internal PPs and some limited center embedding) and show something that correlates with multiple levels of the "stack" getting popped, or something.
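
To illustrate the "flat grammar" point: both toy grammars below (written by me with NLTK; they are not the grammars from the study) generate the same short sentence, one with NP/VP constituents and one completely flat, so sentences this simple can't distinguish them.

    import nltk

    hierarchical = nltk.CFG.fromstring("""
        S -> NP VP
        NP -> Det N
        VP -> V NP
        Det -> 'the' | 'a'
        N -> 'dog' | 'bone'
        V -> 'ate'
    """)
    flat = nltk.CFG.fromstring("""
        S -> Det N V Det N
        Det -> 'the' | 'a'
        N -> 'dog' | 'bone'
        V -> 'ate'
    """)

    sentence = "the dog ate a bone".split()
    for grammar in (hierarchical, flat):
        for tree in nltk.ChartParser(grammar).parse(sentence):
            print(tree)  # same string, two very different structures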

It's still interesting work, but as usual oversold by the university press office.


> First, you can believe in the structures they purport to show without accepting the existence of UG, by appealing to the existence of general mechanisms in the brain for assembling hierarchical structures, which is equally validated by this experiment.

That was kinda my impression as well, but I don't want to say much more as I'm so far from an expert on this and I'll probably just make an idiot out of myself. Still, as you say, it is interesting work in its own right.


I suspect we have a "grammar" which is based on massively parallel pattern matching over a sequence of symbols of bounded length. (I.e. why we deal with ambiguities well, but not long sentences.)


if it's so massively parallel, why do I read your comment left to right one word at a time? why garden path sentences that mislead us as we build up an erroneous parse? sure there's some fuzzy pattern matching, but lexical parsing doesn't seem so massively parallel to me - more like a huge lookup table of fuzzy matches (including whole phrases, I don't have to say much about how I'm never gonna give you up, before that clicks for you), while we're also be parsing lexically with quite a formal grammar.

I put a grammar error into the last part of the last sentence purposefully, after I wrote it so you can read it and notice the error explicitly. (Whereas I would expect you to still figure out what I meant.) If it were just pattern matching that grammatical error wouldn't bother you at all even if you notice it, you would just go with the most likely fuzzy interpretation.


The thing is, people don't speak the way they are taught to write, in complete sentences. Instead, they speak _in actual conversation_ in a series of fragments, which often don't end properly (we trail off, let the other complete our thought, interject, etc), and would be ungrammatical if sequenced together. Remember, language evolved to communicate with others in conversation, not to deliver speeches and text monologues.

Consider this text: "Well the whole point of it was... I mean, when I was in school, we often-- sometimes we would go hunting after school, and I mean, you know -- look, nothing like this ever-- we could never have imagined it."

That's 5 or 6 sentence fragments, one after another. If you processed it in a naively serial way, you'd just end up with one big ungrammatical sentence. I'm only able to clarify it with punctuation (em dashes and commas) because I could parse it correctly.

A parallel system would benefit not from parsing such a string, but from creating a set of candidate interpretations (all the words together, this is a series of three phrases, this is a series of five phrases, etc) and determining the most meaningful match.
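
A minimal sketch of that idea (mine; the scoring function is a deliberately dumb stand-in for a real plausibility model): enumerate every way of cutting the word stream into fragments and keep the best-scoring candidate.

    import itertools

    words = "when i was in school we often sometimes we would go hunting".split()

    def score(fragment):
        # Stand-in for a plausibility model: prefer roughly
        # sentence-length fragments. A real system would use an LM here.
        return -abs(len(fragment) - 5)

    best, best_score = None, float("-inf")
    # Every subset of the len(words)-1 gaps is a candidate segmentation.
    for cuts in itertools.product([False, True], repeat=len(words) - 1):
        frags, start = [], 0
        for i, cut in enumerate(cuts, start=1):
            if cut:
                frags.append(words[start:i])
                start = i
        frags.append(words[start:])
        s = sum(score(f) for f in frags)
        if s > best_score:
            best, best_score = frags, s
    print(best)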


> If it were just pattern matching that grammatical error wouldn't bother you at all even if you notice it, you would just go with the most likely fuzzy interpretation.

I actually had to read the sentence about 3 times, and force myself to read it slowly word by word, before I spotted it.

We often don't read one word at a time.

I agree with you that "massively parallel" is probably drastically overstating it, though.


I have a sense of what common words are. I can enter a mode where it's as if I can't see them. I just see extremely specific words. Names. Places. Things. If the meaning of these objects is obvious, then I can mostly ignore ambiguities in the construction of the sentence. I just move on. When ambiguity explodes I backtrack and read sequentially. I think it's basically just thorough skimming. Also that the continuous resolution of ambiguity is a trained (massively parallel?) skill.


> "massively parallel" is probably drastically overstating it, though.

Think of a single-core processor executing instructions one at a time. Now think of the ten billion neurons in your brain, a large proportion of which are active at any time. That is definitely massively parallel.


At that level, yes. But those neurons are not doing high level symbol recognition tied to a grammar.

This is like pointing to the billions of transistors in a typical modern CPU and claiming it is massively parallel despite being single core.

For many things the brain clearly can do multiple things in parallel (I can write one thing while talking about something else, for example). But it most certainly can't usually carry out many high level tasks at the same time.

When it comes to things like recognising words/symbols the question is how high level that actually is, and whether or not the brain has capacity for recognising more than one at the same time, and if so how many.


> why do I read your comment left to right one word at a time?

Because in English, word order is important (in fact, it is in most languages).

It would be fun to learn a language with completely free word order (I suspect Latin is like that), but I suspect this is one of the reasons this aspect died out in the Romance languages: it's harder to understand, and relying on word order makes things easier.


So I speak such a language, but I still don't read it in parallel. Also, this parallel pattern-matching idea doesn't explain sentences that, when parsed carefully, are clearly ungrammatical: I had inserted an error at the end of the first paragraph of my previous comment to illustrate this.


>also this parallel pattern-matching idea doesn't explain sentences that when parsed carefully are clearly ungrammatical:

Your confidence threshold for the meaning of the sentence was reached, so you didn't wait for the rest of the results to come back before moving on to the next sentence.

Had you needed to understand the sentence sequentially, you would be less likely to miss the grammatical error because you wouldn't be able to skip the word.


Yeah I agree it's not 100% parallel (that's why I took it out of your quote)

We kinda ignore mistakes when they don't clash with our heuristics (nobody reads every word, especially stop words)

You can see how people make mistakes by writing the way the words sound rather than constructing the phrase (and this happens accidentally as well, not because the person doesn't know the right words)


Text is serial because it is based upon speech which, being sound, is necessarily serial. Grammars structure serial representations.

I wonder if the serial limitation is purely historical, or if our intelligence system is somehow fundamentally unable to process massively parallel transmission, even if such had been available during our evolution?

Certainly, we recall associations in a seemingly massively parallel fashion, but our conscious thought is apparently serial. We think this, then that... wait, isn't this really that? There may be non-deterministic threads going in the background, but they are more about association or "fit" than working something out.

We also model things narratively, a serial representation which is also a nice match for events over time - also serial.

BTW: didn't spot the error at first. But I've seen typos in books on second or later readings, which (evidently) also eluded the author and copy editor.


Sound is serial, and it isn't. Even for the purpose of raw analysis of things like the frequency spectrum, you need a certain sized frame of audio. One or two instantaneous samples won't give you anything. Conversely, we could argue that images are also serial, because we can rasterize them as pixels on a grid, from upper left to lower right. Anything that can be encoded or approximated by 1's and 0's is serial.

How do we really hear? How do we hear speech? Though the audio is constantly in oscillating motion, tracing an amplitude graph in time, we perceive audio as snapshots: images are formed in our mind which persist. We chunk the audio into frames. Speech understanding probably doesn't take place until a whole frame of audio is assembled and delimited into phonemes. That speech frame appears in the mind as a unit which is randomly accessed; we do not imagine we have a "read write head" which can only retrieve one phoneme at a time, and only in one direction.
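
Incidentally, framing is exactly how speech software handles this too: the first step of any speech pipeline is slicing the waveform into overlapping frames. A quick numpy sketch (the 25 ms / 10 ms sizes are the conventional choices, my assumption, not anything from this thread):

    import numpy as np

    def frames(signal, frame_len, hop):
        # Slice a 1-D signal into overlapping frames; each frame would
        # then get windowed and Fourier-transformed in a real pipeline.
        n = 1 + (len(signal) - frame_len) // hop
        idx = np.arange(frame_len)[None, :] + hop * np.arange(n)[:, None]
        return signal[idx]

    audio = np.random.randn(16000)        # 1 s of fake audio at 16 kHz
    print(frames(audio, 400, 160).shape)  # 25 ms frames, 10 ms hop -> (98, 400)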


>> if it's so massively parallel, why do I read your comment left to right one word at a time?

For fun, I once had two friends speak to me simultaneously, one in each ear. We swapped places and did that to each other a few times. Result: nope, our language understanding unit is not parallel. You can either take in a wall of sound, OR focus on a string of words and understand its meaning. It might feel like you can almost do both at the same time and pick up two different threads of conversation at the same time, but it's always just beyond your reach. At best you can swap between threads quickly, if the conversations permit it (so forget it if both/all conversations are rapid-fire and heavy with nuance etc).

I also note that our language generation units are not parallel. We can utter one syllable at a time. So maybe there's not much point in understanding multiple syllables at a time?


I am not so sure about this; I suspect it is perhaps more of a learned skill. Professional DJs come to mind. To match two disparate musical tracks with differing tempos and align them correctly requires following two musical tracks simultaneously. Initially it is very much as you say, but over time you learn to hear them both separately and together without strain. My experience with Twitch is another anecdotal counterpoint. After a lot of practice I now regularly listen to multiple audio streams at the same time, and these days I no longer have any problem following up to three streams simultaneously, though it was impossible for me at first. Another issue is: at what point does very fast attention switching become effectively multiprocessing? If we don't notice the context switching, is it still "happening"?


>> Another issue is: at what point does very fast attention switching become effectively multiprocessing? If we don't notice the context switching, is it still "happening"?

Well, OK, that's one big question.

However, I can process music and language at the same time. I think I can process multiple pieces of music at the same time; for example I can follow a complex piece of classical music or jazz with multiple instruments playing in various time signatures and so on (full disclosure: I'm a drummer). I can't quite do that with speech though. So maybe music and language are just different things and they're processed in different ways?


This rings true to me. As a trained musician, I know that I hear music differently from most untrained people. I notice the bass line, or the way the melody inverts, or anticipate the end of a phrase.

Several years ago Kathy Sierra wrote a great post about how app developers should try to educate users from the first time they use the product and "raise the resolution" on how they think about the problem the app is designed to solve.

http://headrush.typepad.com/creating_passionate_users/2005/1...


> our language understanding unit is not parallel

Your experiment seems rather to point to the intuition that we don't have a plurality of these units which can operate independently, not that a single unit doesn't have a parallel architecture.

The parallel talk is being superimposed into effectively a single input going into the same unit. That unit doesn't have the processing layer to unravel this superimposed input into two streams of speech.


> why do I read your comment left to right one word at a time?

I'd say because the probabilistic structure of language is such that it must be processed sequentially to be decoded correctly. The interpretation of what comes next depends on what came before. Information-theoretically, this happens to be an efficient and compact way of encoding information in the auditory medium (where input unfolds over time, rather than being presented all at once like in images).
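
A toy illustration of that sequential dependence (all numbers hand-picked by me, not from any model): track two readings of the garden-path sentence "the old man the boats" and update them chunk by chunk. The preferred reading flips only when the late evidence arrives.

    # Two competing readings, updated incrementally.
    readings = {"'man' is a noun": 1.0, "'man' is a verb": 1.0}
    evidence = [
        ("the old",   {"'man' is a noun": 0.6,  "'man' is a verb": 0.4}),
        ("man",       {"'man' is a noun": 0.9,  "'man' is a verb": 0.1}),
        ("the boats", {"'man' is a noun": 0.01, "'man' is a verb": 0.99}),
    ]
    for chunk, likelihood in evidence:
        for r in readings:
            readings[r] *= likelihood[r]
        best = max(readings, key=readings.get)
        print(f"after '{chunk}': best reading so far -> {best}")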

Whether processing is "massively parallel" depends on what level of analysis you're assuming. At some level, processing in the brain is obviously massively parallel, since that's simply how the brain is laid out.

I don't see how sequential processing has any bearing on whether or not language is or can be learned through statistical pattern matching, but I'd love to hear your angle on this.


For one, both our auditory and visual senses have limited bandwidth. Reading is done in saccades, so already at the source the information comes in chunks at a time.

The opportunity for massive-parallelness is in disambiguating a huge number of possible interpretations. Before you can construct the possible candidates, you first need to have processed enough of the sentence. But since you can process new sensory input at the same time as disambiguating previous input, you can get the garden path effect. What I find interesting about the garden path effect though is that it's so artificial, so the way language is used normally seems to purposely avoid ambiguities that will trip up listeners/readers.


> why do I read your comment left to right one word at a time?

I don't read one word at a time, but scan larger chunks. For instance, I processed "if it's so massively parallel" simultaneously as one "picture".

Do you really read one word at a time? Try making a sheet of paper with a cutout that is approximately word-sized, and move that peephole over the text. Then you're reading one word at a time.

If that makes no difference to your reading efficiency, you could have some cognitive disability affecting reading.


Kudos for getting a 'rickroll' into a conversation about linguistics!


> Neuroscientists and psychologists predominantly reject this viewpoint, contending that our comprehension does not result from an internal grammar; rather, it is based on both statistical calculations between words and sound cues to structure.

Wait, is this true? There are people who seriously suggest that we don't have an innate notion of grammar?


Yes, there are. Or at least, that we don't have a special-purpose module of our brain devoted just to that. The common counter to Chomsky's version of innate grammar is that language is the result of the interaction of a bunch of different cognitive modules which all evolved for different purposes, but which when combined happen to make language possible, and have since been under selection pressure to make language work better.

The article doesn't actually provide any support for the "innate" or "universal grammar" hypothesis as it is typically understood, though. This is really about whether the de-facto grammar that emerges from whatever the "language faculty" is has a particular hierarchical and recursive structure. And yes, there are people who seriously argue against that, too. The most common alternative model that I have been exposed to is analogical template matching, which actually does a pretty good job at a lot of NLP tasks.


Well, OK, I could understand a position saying "We don't have a special-purpose module of our brain devoted to grammar," which is (I think) a direct opposite of Chomsky's position. I'm not a neuroscientist, so I don't really know how much we're close to answering this.

However, that is very different from the quoted passage. It sounds like these "neuroscientists and psychologists" don't think our brain has any notion of "grammar" at all, and that everything can be explained better with statistical relations. I find that very hard to accept, and I wonder if a more reasonable position (= there is no special part of the brain reserved for learning grammar) was summarized poorly into a much stronger hypothesis.

Well, of course, if we reduce everything as much as we like, then we could say everything is about the statistical relations of a gazillion variables (or quantum mechanics, if you go far enough). But that's not useful, is it? Even if we cannot pinpoint a neuron and say "This neuron will fire if it sees a past perfect!", if the brain as a whole acts like it has an underlying notion of grammar, then I think it's fair to say that it "knows" grammar.

And, forgive me if I'm wrong, but I'm pretty sure that my brain acts like it has some notion of English grammar. :P

Edit: Hmm, I think I caused confusion by using the word "innate". Apparently it can mean (1) existing from the time a person or animal is born, and (2) existing as part of the basic nature of something.

Chomsky proposes (1), and many disagree. (I don't find Chomsky's arguments particularly convincing, either.) But the sentence I quoted (which didn't use the word "innate") sounds like it was against (2).


Let's say that you are very good at basketball, enough so that you can almost always make a shot from anywhere on the court. Your mind is computing, based off of visual feedback, how to fire your muscles in such a way so that the basketball makes a near-perfect parabola and makes it into the net.

Does your brain "know" the formulas that describe projectile motion? Is it running a physics model in your brain?
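
For reference, "the formulas" in question are just standard projectile motion; a quick sketch solving for the release speed that hits a target (the free-throw numbers are my rough guesses):

    import math

    def release_speed(distance, height_diff, angle_deg, g=9.81):
        # Solve y = x*tan(a) - g*x^2 / (2*v^2*cos(a)^2) for the speed v
        # that passes through the point (distance, height_diff).
        a = math.radians(angle_deg)
        denom = 2 * math.cos(a) ** 2 * (distance * math.tan(a) - height_diff)
        if denom <= 0:
            return None  # target unreachable at this launch angle
        return math.sqrt(g * distance ** 2 / denom)

    # Free-throw-ish numbers: ~4.2 m to the rim, rim ~0.55 m above release.
    print(release_speed(4.2, 0.55, 52))  # roughly 7 m/s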


It doesn't "know the formulas". However, it has accumulated a life time worth of statistics, which it uses to predict what's about to happen, all the time. So I can be immediately surprised when something unexpected happens. In that sense, there's a running physics simulator.

As for throwing the basketball, most people know (#) what the arc should be, and can immediately after the throw tell if a throw will hit or miss with reasonable accuracy, assuming a viewpoint that allows them to extract the movement accurately. The difficulty there is learning to execute that throw.

(#) edit, know as in "have learned", not as an innate ability.


I think that's a very good analogy, and no, I wouldn't say my brain would "know" Newton's formulas. However, I will still say my brain "knows" grammar. The difference is complexity.

Even if you consider spin, throwing a ball has only about 9 degrees of freedom. That is too small to make good predictions about how the brain works, so Occam's razor dictates that it probably doesn't solve a differential equation. (Also, I could try throwing a baseball, and I'd find that I have to re-learn everything, so it's reasonable to assume that my brain was "optimized" to just throw one particular kind of ball to a particular height.)

Human language is so mind-bogglingly complex that even if you restrict yourself to just a dozen words you can easily utter sentences that were never spoken in the entire history of English. Yet my brain has no problem dealing with that.

Sure, we could say that it's all statistical relation, but those relations have to stack (almost) recursively and connect phrases that are a dozen words apart. At what point do we stop calling them a heap of relations and instead call it what it is, i.e., grammar?


Let's make it complicated, then: if you're great at walking a tightrope on a windy day, is your brain storing a model of your muscular+skeletal structure and live-computing a kinematics physics model + wind simulation to simulate how you should be walking across that tightrope?

My point was that Newton's formulas are simplified models of how the world works. As a result, just because you know how to throw a ball doesn't mean you "know" Newton's formulas. Similarly, just because you know how to speak doesn't mean you "know" grammar.

The question to ask to help us answer this question would be: is grammar a simplified model of language?[1]

[1] This depends on whether you're a linguistic prescriptivist or descriptivist.


I don't think that question has that much to do with prescriptivism and descriptivism. Prescriptivists want to use the study of grammar to uphold certain linguistic norms; descriptivists find that language works fine without expert supervision; neither is a stance on the scientific origin of linguistic structure. You could be a prescriptivist and believe that grammar is a simplified model: you would just want that model to carry normative weight (e.g. for conservative reasons, or to promote standardized national language, or whatever reason).


A very important difference: someone who is good at throwing a basketball (or any sport, really) will "switch off" their brain while doing so. There are various words for this, e.g. "being in the zone". Ask the person to think about something while throwing, and they'll likely miss. It takes years of training to teach the brain to do the correct motions while "switched off".

Using language, on the other hand, requires us to really think.


Language doesn't seem to require conscious thought either. Philosophers like Heidegger and Wittgenstein talked about that a lot. The postulating of an internal thought process "behind" everything is basically what Heidegger spent his life refuting, from what I've read of Dreyfus's book... And Wittgenstein explains language as a tool that people use to get things done, more or less. You can talk without "really thinking", people do it all the time. Grammar is natural in that sense. Pinker's "The Language Instinct" mentions studies that show higher educated people in some situations tend to make more grammatical blunders and to be slightly less "fluent," maybe because of second guessing their language instincts.

(Any errors in this comment due to the fact that I wrote it without thinking.)


I must admit I was thinking more about the mental process of constructing the content of e.g. an argument than of the process of putting the content into grammatically correct sentences. I completely agree you can talk, and even be grammatically correct, without really thinking.


Actually if you look at research done on freestyle rappers, you'll find similar mechanisms as those found in great basketball players. http://www.theatlantic.com/health/archive/2012/11/the-neuroa...

Being in a flow state seems very similar to being in the zone.


And then consider that freestyle rapping is a formally demanding way of speaking, and that speaking is something people do all the time without much mental effort. Impressive rappers can produce rhyming and alliterating verses in "flow" but a normal dinner conversation is in flow by default (depending on how neurotic your dinner guests are).


Do the more neurotic dinner guests rhyme a lot?


> Using language, on the other hand, requires us to really think.

Not consciously, no. I mean, do you really think about what you say when you speak in your native language, or do you just say it? It's actually a milestone in learning a foreign language - switching from thinking in native and translating into foreign to just thinking in foreign. If you find yourself running your internal monologue in a foreign language then you know your brain groks it.

And generally, mastering a task means switching it to unconscious mode. A good driver doesn't consciously think about turning the steering wheel or pressing pedals; they think about the car (or rather, the car+driver system) moving faster or slower, going here or there. A soldier is trained until most of his tasks are muscle memory. A programmer doing what they really know how to do will find themselves in a state of flow, etc. Mastery of a task means mentally abstracting it into just another basic capacity, just like breathing or moving your hands.


> Using language, on the other hand, requires us to really think.

I'm not sure I agree, not entirely at least. "Thinking while speaking" is something I associate with speaking a second language that you don't really master. Then you really think in one language, think about the translation, and speak. If I speak in my native language or in English, I don't think about the words I'm saying or about how I'm using the language; there's a much bigger component of instinct, and thoughts can be much more impulsive.


I am not aware of Chomsky ever making the argument that language has a specific physical location within the brain, or that it is the achievement of a single organ which evolved for exactly this purpose. You may be confusing Chomsky with Jerry Fodor, whose "Modularity of Mind" was inspired by Chomsky's ideas (which it helped popularize).


Thanks for the second point. It's frustrating when they write titles like that for the public.

I always think that "human is born with some miracle thing" is quite suspicious. How can orderliness be born out of nowhere (no work spent, no energy consumed)? The probability of orderliness spontaneously assembling itself must be infinitesimally small.

Grammar might evolve along the way (due to statistical learning), but it seems too good to be true if it's born out of nowhere.


Giving birth to a human child is not trivial. Lots of assembly required. DNA is pretty complex, and the process of motherhood is not just waiting for nine months and delivering a miracle.


Not only are there people who seriously suggest it, but most neuroscientists think Chomsky is completely wrong, and have quite a bit of data to back it up. The notion of innate grammars has been pretty much replaced with evidence that you get much better and more human-like performance (and the ability to actually learn the bloody things) out of RNNs and the like.

The interesting question is why RNNs are able to produce things that resemble formal languages at first pass.


There's a misconception here: deep neural networks are used to learn grammar models, or the features for grammar models and so on, but the predominant model of natural language is still based on grammars.

Google "Stanford parser". This is a state of the art natural language parser and like all parsers it relies on a grammar. The grammar it uses is a dependency grammar, a different grammar than the phrase structure grammars proposed by Chomsky and also one that is built in a probabilistic manner, but a grammar nonetheless.

Also, it's really not the case that you get better performance out of neural network models of language, let alone "human like performance". We're very far from that still.


The Stanford parser does both constituency (phrase-structure) and dependency parsing. Actually in the model that I know of, the dependency parses are derived from the constituency parses.

The base of the parser relies on a grammar to describe possible sentence structures, but actually most of the work in disambiguation is done using statistics, or with neural networks. The question of what is and what is not grammar is rather arbitrary; there can be a continuum between simple rule-like and statistical regularities.

Ultimately the models in NLP are rarely making any kind of cognitive/neuroscientific claim about being plausible models, just effective ones. There's a specific field of computational psycholinguistics which does investigate those things.

> Also, it's really not the case that you get better performance out of neural network models of language, let alone "human like performance". We're very far from that still.

I wouldn't make such claims these days because deep learning methods are gaining ground very fast. There are many tasks at which the deep learning model is better than traditional models. Furthermore, there are already deep learning models which are better than humans at image labeling.


>> I wouldn't make such claims these days because deep learning methods are gaining ground very fast. There are many tasks at which the deep learning model is better than traditional models. Furthermore, there are already deep learning models which are better than humans at image labeling.

We've been here before. Back in the Olden Days of GOFAI, expert systems used to routinely outperform human experts at all sorts of cognitive tasks (medical diagnosis being a typical example). There was a huge amount of excitement and people promising wild things were just around the corner. A few years down the line, there's the same excitement around a completely different technology and expert systems are nowhere to be seen. So I'll keep my expectations at about mid-range and wait for another ten years before I say I know exactly what's going on.


> The notion of innate grammers has been pretty much replaced with evidence that you get much better and more human like performance (and the ability to actually learn the bloody things) out of RNNs and the like.

No not really. Barring concrete physiological evidence, innateness is simply an unfalsifiable position. I do think you can say that consensus among scientists might have shifted against innateness, but that is definitely not because of some computational model such as RNN with some passing resemblance to observed behavior of human neurons. The way these networks are configured and trained has nothing to do with real brains, nor does the abstraction in neural nets leave room for any electrochemical effects.

> The interesting question is why RNNs are able to produce things that resemble formal languages at first pass.

Is it? A neural network can achieve that by simulating a Turing machine or simpler automaton. I don't see what that brings to the table? That's just doing something we could already do, but less efficiently. I think the interesting thing is the opposite: if the RNN can do something which couldn't be done before with other models.
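
To make the "simulating a simpler automaton" point concrete, here's a hand-wired recurrent "network" (my sketch; thresholds chosen by hand, nothing learned) that recognizes a^n b^n by counting. A trained RNN would have to discover weights playing the same roles.

    def step(z):
        # hard-threshold activation
        return 1.0 if z > 0 else 0.0

    def accepts_anbn(string):
        # Recurrent state: (count, seen_b, invalid).
        count = seen_b = invalid = 0.0
        for ch in string:
            is_a, is_b = float(ch == 'a'), float(ch == 'b')
            invalid = step(2 * invalid + seen_b + is_a - 1.5)  # 'a' after any 'b' is fatal
            seen_b = step(seen_b + is_b - 0.5)                 # latches once a 'b' appears
            count = count + is_a - is_b                        # push on 'a', pop on 'b'
        return bool(string) and count == 0 and invalid == 0

    for s in ["ab", "aabb", "aaabbb", "abab", "ba", "aab"]:
        print(s, accepts_anbn(s))  # True only for the first three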


Dammit, why do I always get carried away discussing details and forget the main point?

That, what you say- that's the business. An RNN (any ANN) is not a human brain. It is one model of the human brain, based on our very limited understanding of human brains. It doesn't matter what ANNs can and cannot do, how well or bad they perform, it tells us nothing much about how we do language; or vision, motion, decision-making, anything.


> It is one model of the human brain

No it isn't. It's loosely inspired, but the goal is to make something that works, not to simulate what we know of the human brain. There's a different field, computational neuroscience, which does try to make faithful models of the human brain.


An RNN's success is related to, and determined by, the programmer's model of the space.


Yes, sort of, for example: http://deevybee.blogspot.com/2012/09/what-chomsky-didnt-get-... It depends a lot too on what exactly you mean by innate.


I think this is somewhat visible if you look at any kind of artistic writing. Or realize that grammar is made after the fact. The language itself never follows the grammar perfectly; grammar is a scaffolding we invent to help others learn the language.

Hell, you can communicate pretty well by just dropping nouns in sequence. That we have no problems understanding information presented in such way, or maths, or music notation, or programming languages, suggests that we don't have "internal grammars".


Yes, (most) linguists seem to be unique in treating this as dogma. It's not the only possible explanation consistent with current evidence.

Edit: by "innate", people mean that it's something you're born with, rather than learned through experience.


I am having a frustrating time finding a good summary of the position, but the branch that studies this is informally known as West Coast Linguistics.


That's just one example, construction grammar (not limited to US west coast). The innateness of "universal grammar" can be argued against on evolutionary grounds (how could one universal grammar evolve and end up in the DNA of all humans?), neuroscientifically (how can you encode a fixed grammar when everyone's brain is different and continually changing?), from comparative linguistics (there are many languages and they can be radically different), etc.

If it's not about innateness but about the claims of "internal grammar" from the press release, then it becomes less clear that there would be many people who would reject such a position. I think this is misleading on the part of the press release. Whether something was innate or learned, I think most people would agree that knowledge of language is internalized. So then the question remains whether you call it grammar or something else, and whether statistics plays a part should have no bearing on that, if you ask me.


pdf link, because somehow University press releases never contain references to the actual paper: http://psych.nyu.edu/clash/dp_papers/Ding_nn2015.pdf


HN discussion from 6 months ago upon the 50th anniversary of the theory:

https://news.ycombinator.com/item?id=9762001


Well, sooner or later it'll turn out it's all mostly that we're wired to learn grammars; and that we're biased towards learning some types of grammars better than others. (I think evidence has been plenty, even before this.)

That's not quite the same as how some people seem to understand Chomsky, a kind of grab-bag of hardwired grammar rules tucked away in some strand of DNA. But still, there it is.


As someone who is a fan of the generative grammar tradition, that's pretty cool!

Going to have to read the full study later, but it sounds like they have some decent results from that press release (though not total evidence for the thesis).

Hopefully this finding can contribute toward pushing studies in neuroscience, psychology, and linguistics further.

Has some implications for a computational theory of mind too.


"Neuroscientists and psychologists predominantly reject this viewpoint, contending that our comprehension does not result from an internal grammar;"

Fascinating article, highly recommended. Not trying to be nitpicky but I'm going to need a citation or few to support some of the claims made about "neuroscientists and psychologists"... just saying.



No point in asking for a citation on a press release statement, go directly to the paper (which seems to be much more careful in what it claims).


Math, computation, grammars, are just models, syntax, that we sometimes use to consciously express things about the world. We could assign syntax to a system composed of whatever thing and say that is Turing-complete or that follows the productions A->B and B->BB of G_1, an imaginary grammar, just because we can assign to it certain symbols as inputs and others as outputs. Numbers, symbols, don't exist in the real world. Math entities only exist as syntax, as human artifacts. They're language entities. That's why they're based on axioms.

I don't see the point in saying that syntax is physical - I mean, the neurons firing in our brains because of photons getting into our eyes explain how I recognized the shape of a car, not that a 'hypothetical 2D array' gets 'processed' with some 'gradient-based algorithm' unconsciously in our brain. It doesn't make biological sense. Turing never even defined what a symbol or a computation is in terms of physics; he just made a (very cool and beautiful, by the way) model.

Chomsky is simply wrong. He's just thinking here that language is suddenly physical, that we have a grammar-processing CPU in our brains, a form of a Universal Turing Machine! Can this number, '1', be physically somewhere in the world, taking into account modern physics and not some weird Platonism? Or do we just use it as language to refer to physical entities, to talk about them? I do think the latter is the truth. Grammars are math entities which we sometimes use to refer to certain things written in books or other spoken things that just happen to have some physical shape in the form of sound waves (words), or even to refer to an imaginary entity we've created with math, just like abstract machines. I don't think there's some weird computational thing going on in my head; that's just an oxymoron. Just neurons firing, blood being pumped, neurotransmitters being sent, in all its glorious and mysterious nature. The mind, on the other hand, exists as a high-level feature of some parts of the bigger brain system; it's caused by real things: that mind that lets us do conscious calculations. But that 'Universal Grammar' wouldn't be inside my mind anyway, because it's supposed to be unconscious. It wouldn't be a feature of the mind-spanning parts of the brain anyway.

In the end, it's just the whole Nietzschean critique of mistaking language with reality all over again.

I cannot recommend John Searle's work on the subject enough, like the book 'The Rediscovery Of The Mind'. He explains all of this way way better than me.

P.S: Yes, if you think hard about it, this logically implies that a Strong AI won't ever exist. Sorry, Skynet!

P.S.2: Sorry for the long post and possible grammatical mistakes, I'm not a native speaker. And I sincerely hope you enjoyed this post even if you don't agree with me. Have a nice day!


The study relies on isochronously spoken words!

Inside joke: I recently published a network protocol that relies on sending words isochronously between switches.

I'll take this as validation that I'm on to something :-)



