
On reading this, my first question is about the properties of the "random" feedback matrix. They illustrate what is happening with a tiny width-1 network whose "random" matrix is just the scalar 1. It seems some analysis is needed on what kind of "random" is most appropriate when replacing the gradient update in larger networks. There could be something really interesting going on, such that you could derive an optimal non-random B from the network topology.
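
To make the question concrete, here is a minimal sketch, assuming the article describes a feedback-alignment-style scheme in which a fixed matrix B stands in for the transposed forward weights in the backward pass. Everything in it (the layer sizes, the learning rate, the synthetic linear target, and the names `train`, `uniform_B`, and `gaussian_B`) is a hypothetical choice for illustration, not something taken from the article. The `make_B` argument is the knob the question is about: swap in a different distribution, or a hand-built non-random B, and compare the loss curves.

```python
import numpy as np

def uniform_B(rng, shape):
    # Hypothetical choice: entries drawn uniformly from [-0.5, 0.5).
    return rng.uniform(-0.5, 0.5, size=shape)

def gaussian_B(rng, shape):
    # Hypothetical alternative: zero-mean Gaussian with a small scale.
    return rng.normal(scale=0.3, size=shape)

def train(make_B=None, steps=500, lr=0.02, seed=0):
    """Train a one-hidden-layer tanh network on a synthetic linear target.

    If make_B is None, the hidden-layer error is W2.T @ E (ordinary backprop).
    Otherwise a fixed matrix B = make_B(rng, shape) replaces W2.T.
    Returns the list of mean squared errors, one per step.
    """
    rng = np.random.default_rng(seed)
    n_in, n_hid, n_out, n_samples = 10, 20, 5, 256

    # Synthetic regression task (an assumption of this sketch).
    T = rng.normal(size=(n_out, n_in))
    X = rng.normal(size=(n_in, n_samples))
    Y = T @ X

    W1 = rng.normal(scale=0.1, size=(n_hid, n_in))
    W2 = rng.normal(scale=0.1, size=(n_out, n_hid))
    B = None if make_B is None else make_B(rng, (n_hid, n_out))

    losses = []
    for _ in range(steps):
        H = np.tanh(W1 @ X)           # hidden activations
        E = W2 @ H - Y                # output error
        losses.append(float(np.mean(E ** 2)))

        # The only difference between the two schemes is this line.
        feedback = W2.T if B is None else B
        dH = (feedback @ E) * (1.0 - H ** 2)

        W2 -= lr * (E @ H.T) / n_samples
        W1 -= lr * (dH @ X.T) / n_samples
    return losses

if __name__ == "__main__":
    for name, make_B in [("backprop", None),
                         ("uniform B", uniform_B),
                         ("gaussian B", gaussian_B)]:
        losses = train(make_B)
        print(f"{name}: first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")
```

Note that the update for W2 is the ordinary gradient in both cases; only the signal sent back to the hidden layer changes. Comparing curves across different `make_B` functions is one crude way to probe whether the distribution of B matters on a given task, though it says nothing about optimality.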


