
I agree that in this case the animated parts of the graphics were not needed; it's an easy pitfall to be distracted by the beautiful aspects of visualisations when crafting them.

I feel the need to defend the author, though: it's hard to make research accessible while still distilling valuable insight. I think his post on transformer networks [1] did a good job, for example, and you'll appreciate the lack of animations.

[1] https://jalammar.github.io/illustrated-transformer/




Yes, this seems like an early work in progress compared to Jay's previous Transformer articles.

In addition to your link, I found a really good Transformer explanation here (backed by a GitHub repo with lively discussion in its Issues): http://www.peterbloem.nl/blog/transformers

Additionally, there's a paper on visualizing self-attention: https://arxiv.org/pdf/1904.02679.pdf
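For anyone wondering what those visualizations actually plot: it's usually the softmax attention matrix itself, where each row shows how much one token attends to every other token. A minimal single-head sketch in NumPy (function name and shapes are my own, not from the paper):

    import numpy as np

    def self_attention(x, w_q, w_k, w_v):
        # x: (seq_len, d_model); w_q/w_k/w_v are plain projection matrices.
        q, k, v = x @ w_q, x @ w_k, x @ w_v
        d_k = q.shape[-1]
        scores = q @ k.T / np.sqrt(d_k)            # (seq_len, seq_len)
        weights = np.exp(scores - scores.max(-1, keepdims=True))
        weights /= weights.sum(-1, keepdims=True)  # softmax over the keys
        return weights @ v, weights                # `weights` is what gets plotted

    rng = np.random.default_rng(0)
    x = rng.normal(size=(5, 16))
    w_q, w_k, w_v = (rng.normal(size=(16, 16)) for _ in range(3))
    out, attn = self_attention(x, w_q, w_k, w_v)
    print(attn.round(2))  # each row sums to 1; a heatmap of this matrix
                          # is the typical attention visualization
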


Can't edit the post anymore, so adding it here. Further reading on improving the current attention model: https://www.reddit.com/r/MachineLearning/comments/hxvts0/d_b...


That's a good complement; thank you for the links.



