Attention and Augmented Recurrent Neural Networks

legel · on Sept 12, 2016

This is a deeply appreciated update on the state of the art from Christopher Olah and the Google Brain team, including great insights into the nature of engineering attention. I'd be curious to understand more about how making so many parameters differentiable suddenly opens up so many pathways... In any case, as always, the elegant visualizations are matched by a cohesive set of simple and strong insights. Here are a few lovely bits from their reflections on the big picture:

"In general, it seems like a lot of interesting forms of intelligence are an interaction between the creative heuristic intuition of humans and some more crisp and careful media, like language or equations. Sometimes, the medium is something that physically exists, and stores information for us, prevents us from making mistakes, or does computational heavy lifting. In other cases, the medium is a model in our head that we manipulate. Either way, it seems deeply fundamental to intelligence."

ThePhysicist · on Sept 9, 2016

Great article! Nicely sums up recent results and has some great diagrams to go with the text.

visarga · on Sept 9, 2016

Great interactive visualizations.

dharma1 · on Sept 9, 2016

Source here: https://github.com/distillpub/post--augmented-rnns

habitue · on Sept 9, 2016

This is a great survey. Does anyone know the back story behind distill.pub?