Actually, that algorithm is the result of many iterations of a simple algorithm written by a human. It is itself the output of an algorithm, not too different from the machine code generated by a compiler after optimizations. The author says as much and names the algorithm-producing algorithm:
But at the core the approach we use is also really quite profoundly dumb (though I understand it’s easy to make such claims in retrospect). Anyway, I’d like to walk you through Policy Gradients (PG), our favorite default choice for attacking RL problems at the moment.
What the human is doing is identifying the problem, writing a solution, and simply letting the computer work out the parameters through various statistical methods. But the resulting algorithm itself is pretty much in the narrow class of algorithms that were already described by the human. The computer just executed a dumb and straightforward search through a space of parameters, which itself was a simple preprogrammed algorithm.
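To make that concrete, here is a minimal sketch of what I mean (my own toy example, not code from the article, and plain gradient descent rather than Policy Gradients). The human picks the model family, here a line y = w*x + b, plus the loss and the update rule; the computer only searches for w and b. The names fit_line, steps and lr are made up for illustration.

```python
def fit_line(points, steps=2000, lr=0.01):
    """Search for w and b in the human-chosen model y = w*x + b.

    Uses plain gradient descent on mean squared error. The model family,
    the loss and the update rule are all fixed in advance by the programmer.
    """
    w, b = 0.0, 0.0
    n = len(points)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in points) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in points) / n
        # Step downhill; this is the entire "learning" procedure.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy data generated from y = 3x + 1, so the search should recover w=3, b=1.
points = [(x, 3.0 * x + 1.0) for x in range(-5, 6)]
w, b = fit_line(points)
print(w, b)
```

Whatever the search returns, it is still a line. The computer never steps outside the family of solutions the human wrote down.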
Look, most of our science is also pretty much parametrized models, often with smoothness assumptions for calculus. Now with Deep Learning we may indeed find more interesting parametrized models. But that is a far cry from understanding abstract logical concepts and manipulating them to come up with entire algorithms from scratch to solve problems.
You seem to put a lot of weight into the notion that all these things can be reduced to essentially a search for parameters within a particular (not all-encompassing) solution space.
Do we have any good reason to suppose that a parametrized model isn't enough for everything we'd want, including a system that has human-level or higher intelligence and creativity? (assuming adequate structure that we don't yet know, that allows for a sufficiently large solution space)
We have good reasons to assume that we ourselves, the sum of a particular person's memories, skills, identity and intelligence, are contained within a particular set of parameters encoded by different biochemical means in our brains, and the process of how we learn skills, facts and habits is literally a search through that space of parameters.
"far cry from understanding abstract logical concepts" is more related to the types of problems we're tackling. Symbolic manipulation and reasoning is a valid but very distinct field of AI, but it's not particularly useful for these problems, any more than it's useful to have a human programmer craft explicit algorithms for computer vision.
You would expect a computer system "to come up with entire algorithms from scratch to solve problems" if you were making a computer system to solve the general problem of machine learning, i.e., a program to replace the research scientist making learning systems, not the human who currently solves the particular problem. We aren't trying to do that; it is a bit of a different direction, isn't it?
We have good reasons to assume that we ourselves, the sum of a particular person's memories, skills, identity and intelligence, are contained within a particular set of parameters encoded by different biochemical means in our brains, and the process of how we learn skills, facts and habits is literally a search through that space of parameters.
Actually, I have better reasons to assume that a parametrized model would poorly describe a human brain. We grow organically out of cells replicating in an environment for which they have been adapted. These cells make trillions of neural connections. Each cell has its own DNA etc.
We have already tried understanding just the DNA using straightforward parametric models, and they are too organic to be described that way.
It is far more likely that human brains are specifically adapted to the world they live in, and can operate with abstract concepts which are encoded in fuzzy ways (like a progressive JPEG, for example) that allow us to apply concepts to situations and search for concepts that fit situations. The concepts themselves are the hard part. It's not really a parametric model. Each concept represents experience that is stored between neurons.
Yes, we can teach these concepts to a computer eventually, but we would have to figure out a language to express this info and data structures to store it. We'd still be designing the computer to mimic what we think we do. Ultimately, for the computer to truly replicate what humans do, it might need to simulate a gut brain, neurotransmitters etc. And even then it would be only a simulation.
I think computer intelligence is just of a very different sort than human intelligence. Less organic, far less ability to come up with new concepts or reprogram itself to "understand" concepts. It is fed parametrized models and does a brute force search or iterative statistical approximations, and then saves the precomputed results, that's all. That's why humans can recognize a cat with a brain that fits inside your head and consumes low energy, and computers need a huge data center which consumes a lot of energy.
We aren't replicating human intelligence. We are building huge number crunchers, and the algorithms are still written by humans.
Even our languages are so tied up in organic experience acquired over the years (references to current events, puns, emotions, fear of some animals vs dominance over others, inside jokes of each community etc) that language recognition is currently quite dumb and has trouble with context. Once again we solve this by dumbing down the human input, making people talk to computers differently than they would if the computer had "understood" anything they say the way a human with similar experience would.
When computers write algorithms to solve arbitrary problems the way groups of humans do, then I'd admit we made a huge leap forward. As it is, AlphaGo and self-driving cars are the result primarily of human work and refinement of the algorithms. The result just seems amazingly smart because computers crunch numbers fast and consistently, and replicate what works across all the instances.
Computer AI does raise philosophical questions of identity and uniqueness, but current systems are not capable of true abstract thought.
And once again the rules "we all know" were fed to it by humans through a language and data structures and code devised by humans and now we will judge whether it does well and replicate the result to millions of machines. We are still doing nearly all the actual design.