This is a deeply appreciated update on the state of the art from Christopher Olah and the Google Brain team, including great insights into the nature of engineering attention. I'd be curious to understand more about how making so many parameters differentiable suddenly opens up so many pathways... In any case, as always, the elegant visualizations are matched with a cohesive set of simple and strong insights. Here's a lovely bit from their reflections on the big picture:
"In general, it seems like a lot of interesting forms of intelligence are an interaction between the creative heuristic intuition of humans and some more crisp and careful media, like language or equations. Sometimes, the medium is something that physically exists, and stores information for us, prevents us from making mistakes, or does computational heavy lifting. In other cases, the medium is a model in our head that we manipulate. Either way, it seems deeply fundamental to intelligence."