Hacker News

Many of us believe that Vision/NLP research falls into the category of AI applications, and does not inform insights into how to achieve generally intelligent agents.

Hmm... I can agree that vision and NLP could be seen as "applications", from one point of view. But I can see another position where each simply represents a different aspect of underlying cognition. Language, in particular, seems to be closely tied up in how we (humans) think. And without proposing a strong version of the Sapir-Whorf hypothesis, I can't help but believe that a lot of human cognition is carried out in our primary language. Now to be fair, this belief comes from not much more than obsessively trying to introspect on my own thinking and "observe" my own mental processes.

In any case, it leads me to suspect that building generally intelligent AIs will be tightly bound up with understanding how language works in the brain, the extent to which there is a "mentalese", and how - if at all - a language like English (or Mandarin or Tamil, whatever) maps onto "mentalese". Vision also seems crucial to the way humans learn, given our status as embodied agents that learn from the environment using sight, smell, sound, kinesthetic awareness, proprioception, etc.

Quite likely I'm wrong, but I have a hunch that building a truly intelligent agent may well require creating an agent that can see, hear, touch, smell, balance, etc. At least to the extent that humans serve as our model of how to construct intelligence.

On the other hand, as the old saying goes "we didn't build flying machines by creating mechanical birds with flapping wings". :-)




I'm not saying NLP/Vision is not important, but we approach these modalities in a very specific way.

When we "work on NLP" it looks something like https://blog.openai.com/learning-to-communicate/ or http://www.foldl.me/2016/situated-language-learning/, as an emergent phenomenon in the service of a greater objective - not an end but a means to an end. It does not look like doing sentiment analysis or machine translation.

When we "work on Computer Vision" it looks like incorporating a ConvNet into an agent architecture or a robot; it doesn't look like trying to achieve higher object detection scores.


an emergent phenomenon in the service of a greater objective - not an end but a means to an end.

Ah, sounds like we are on the same page then.


I have to disagree with you on language and cognition. If you went around asking famous mathematicians how they think, the last thing they would tell you would be "words".

Creative and inventive thought is highly pictorial and non-linear: https://www.amazon.com/Psychology-Invention-Mathematical-Fie...


Also, thanks for the book recommendation. I ordered a copy. Looking forward to digging into it.


FWIW, I don't intend to suggest that all cognition is done in terms of "word language", just an important bit of it.



