1. You've made an off-hand comment in one of your videos that a sequential dense network is a generalization of every other neural network architecture: in theory you could re-create an RNN or a CNN using just Dense layers. But obviously that's not practical.
Why isn't it practical? Is it because the network would have to be too deep, or too wide? Would the optimizer get stuck in a local minimum, or would overfitting be inevitable? Or is it some combination of issues?
What do you think is the best hope for a generalized network architecture, most similar to our brain?
2. On a somewhat related note, do you have strong enough faith in the machine learning algorithms and architectures currently in use (RNNs, CNNs, capsule networks) that, given infinite resources (training time and network size), we would be able to create a meaningful general AI? Or do you think our current approach is merely incremental, and a truly different approach would be required to achieve meaningful AI?
> Why isn't it practical? Is it because the network would have to be too deep, or too wide? Would the optimizer get stuck in a local minimum, or would overfitting be inevitable? Or is it some combination of issues?
Schmidhuber published a paper a few years ago showing near-SoTA performance on computer vision using just a fully connected net. One of our students showed how a convolution is just a weight-tied matrix multiply here: https://medium.com/impactai/cnns-from-different-viewpoints-f...
The issue is that without the weight tying you've got more parameters to regularize (which can decrease performance) and to train (which takes longer). So you should use weight tying where you can, e.g. by using convolutions. The sketch below makes the equivalence concrete.
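Here's a minimal NumPy sketch of that equivalence (my own illustration, not from the Medium post; the helper name build_conv_matrix and the toy 3-tap kernel are made up for the example). It unrolls a 1D convolution into a dense matrix whose rows all contain the same three weights, just shifted along - exactly the tying a Dense layer would have to discover on its own:

    import numpy as np

    def build_conv_matrix(kernel, input_len):
        # Unroll a 1D "valid" convolution into a dense matrix: every row
        # holds the same k weights, shifted one position to the right.
        k = len(kernel)
        out_len = input_len - k + 1
        W = np.zeros((out_len, input_len))
        for i in range(out_len):
            W[i, i:i + k] = kernel
        return W

    x = np.arange(8.0)                    # toy input signal
    kernel = np.array([1.0, 0.0, -1.0])   # toy 3-tap filter

    dense_out = build_conv_matrix(kernel, len(x)) @ x
    # np.convolve flips its kernel, so flip it back to get cross-correlation,
    # which is what the matrix above (and conv layers in practice) compute.
    conv_out = np.convolve(x, kernel[::-1], mode="valid")

    assert np.allclose(dense_out, conv_out)
    print(dense_out)  # 3 tied weights reproduce what a 6x8 dense matrix does

The dense version has 48 free parameters doing the work of 3; scale that up to real image sizes and channel counts and you can see why the untied version is so much harder to regularize and slower to train.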
In general, domain-specific architectures try to find structure in the underlying data and problem, and use that structure to decrease the number of parameters we need. The use of implicit factorizations in the Inception and Xception architectures is a good example.
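For a feel of the savings, here's a back-of-the-envelope comparison (my numbers, chosen purely for illustration) between a full 3x3 convolution and the depthwise-separable factorization Xception uses, i.e. a depthwise 3x3 followed by a pointwise 1x1:

    c_in, c_out, k = 256, 256, 3

    full_conv = k * k * c_in * c_out   # every filter sees every input channel
    depthwise = k * k * c_in           # one spatial filter per channel
    pointwise = c_in * c_out           # 1x1 conv that mixes channels
    separable = depthwise + pointwise

    print(f"full 3x3 conv:       {full_conv:,}")               # 589,824
    print(f"depthwise separable: {separable:,}")                # 67,840
    print(f"reduction:           {full_conv / separable:.1f}x") # ~8.7x

Same receptive field, roughly an order of magnitude fewer parameters - that's the kind of structure these architectures are exploiting.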