Hopfield Networks Is All You Need (ml-jku.github.io)
184 points by meiji163 on May 1, 2021 | 40 comments



Brief abstract for the lay person (like me):

1. Hopfield Networks are also known as "associative memory networks", a neural network model developed decades ago by a guy named Hopfield.

2. It's useful to plug these in somehow as layers in Deep Neural Networks today (particularly, in PyTorch).
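For a concrete sense of what "associative memory" means here, a minimal sketch of the classical binary Hopfield network in plain NumPy (made-up patterns and sizes; the paper's version is a modern, continuous generalization of this):

    import numpy as np

    def store(patterns):
        # Hebbian rule: the weight matrix is the sum of outer products
        # of the stored (+1/-1) patterns, with the diagonal zeroed out.
        d = patterns.shape[1]
        W = sum(np.outer(p, p) for p in patterns) / d
        np.fill_diagonal(W, 0)
        return W

    def retrieve(W, state, steps=10):
        # Repeatedly update the state toward a stored pattern
        # (a fixed point / local minimum of the network's energy).
        for _ in range(steps):
            state = np.sign(W @ state)
        return state

    # Three random +1/-1 patterns of dimension 100 (made-up example data).
    rng = np.random.default_rng(0)
    patterns = rng.choice([-1.0, 1.0], size=(3, 100))
    W = store(patterns)

    # Corrupt the first pattern, then recover it from the noisy cue.
    noisy = patterns[0].copy()
    noisy[:20] *= -1
    print(np.array_equal(retrieve(W, noisy), patterns[0]))  # usually True

So "storing" a pattern just means adding its outer product into the weight matrix, and "retrieving" means letting a noisy cue settle toward the closest stored pattern.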

I hate non-informative titles!


The title is a reference to the famous machine learning paper "Attention Is All You Need" which introduced the concept of transformers. Transformers have revolutionized how we process sequential data (i.e. natural language processing).


And recently, a paper titled Attention Is Not All You Need has made the rounds arguing that some of the claims made in the AIAYN paper may have been overstated. https://arxiv.org/abs/2103.03404


If you read the title, it only refers to the multi-head-attention part of BERT, excluding the feed-forward and skip connections, hence the name "Pure Attention".

> Attention is Not All You Need: Pure Attention Loses Rank Doubly Exponentially with Depth

This does not prove the original title was wrong, and this paper is not a counter, but an analysis of a submodule that helps us better understand transformers.


> famous machine learning paper "Attention Is All You Need"

1. It's a paper from 2017. Unless you follow academic ML research, you will not have heard of it.

2. That paper's title is also inscrutable unless you've gone and read at least the abstract.


Which itself is a reference to the 1967 Beatles song All You Need is Love (which also includes the line "Love is all you need").


Also... While cute, I found the examples of storing and retrieving images of The Simpsons characters not very informative about what goes on in that weight matrix that stores patterns.

Edit: the linked pytorch implementation looks interesting, these layer types promise pretty incredible things https://github.com/ml-jku/hopfield-layers


Not to mention grammatically incorrect ones :/


I think it's correct, in the same way that you can say "Rolling Stones is a great band". It's about the tech called "Hopfield Networks", not about any particular number of networks that are all you need.


Not doubting what is officially grammatically correct, but that still sounds really weird to me. Like with sports teams I would only ever say "The Patriots are a good team" or "New England is a good team". Not "The Patriots is a good team".

In any event, the authors definitely chose that title as a callback to the well known paper "Attention is all you need", which introduced Transformers. So that probably influenced their decision to use "is" instead of "are".


Consider ‘My team is a good team’ vs. ‘My team are a good team’.

I bet ‘is’ sounds better to you in this context, though ‘my team’ and ‘The Patriots’ are similar noun phrases that could refer to exactly the same thing.

The difference is that ‘Patriots’ is plural. Replace it with Manchester United and ‘is’ sounds good again.


Yeah it's definitely caused by the team name being plural, or at least sounding plural - I've never heard anyone say "The Red Sox is good" either. Regardless of what is technically grammatically correct I think real life usage has pretty much settled on that convention, at least in the US.


Something odd: while "The Red Sox are John's favorite team" seems more natural than "The Red Sox is John's favorite team", phrasing it in the opposite order, "John's favorite team is the Red Sox" seems more natural than "John's favorite team are the Red Sox".

This seems like a strange discrepancy. Why is this the case? Maybe it is because "favorite team" is clearly singular, and is closer in the sentence to the "is"/"are" than the plural-indicating sound in "Red Sox". Or maybe it is just whichever comes first that determines how "to be" is conjugated?

Hm, but what if instead of connecting a noun phrase (determiner phrase?) like "The Red Sox" to another noun phrase (determiner phrase) "John's favorite team", we instead connect it to an adjective?

"The Red Sox are singular.", "The Red Sox is singular.", "Singular is The Red Sox." "Singular are The Red Sox." . Well, the "[Adjective] is [noun]" is kind of an unusual thing to say unless one is trying to sound like one is quoting poetry or yoda or something, but to the degree that either of them sound ok, I think "Singular are The Red Sox." sounds better than "Singular is The Red Sox." . Though, in this case, there doesn't seem to be anything grammatically suggested by the adjective that the thing be in the singular case (maybe I shouldn't have used "singular" as the adjective..) .

Hm, what if instead of "John's favorite team [is/are] the Red Sox.", we instead look at "John's favorite [is/are] the Red Sox."? In this case, it seems less clear which is more natural? They seem about the same to me (but that might just be me, idk).

Anyway: weird!


You're totally right, I would definitely always say "my favorite team is", and probably also would say "my favorite is". I think the subject of the sentence is what determines it grammatically; also, since that's what you say first, it makes sense that it would affect your choice of verb more.

I actually think this could extend to a lot of situations where the object is referring to a single group, not just plural-sounding proper nouns. Like if asked "what was your favorite zoo exhibit?", I would probably respond "my favorite was the giraffes" not "my favorite were the giraffes". I'm actually not even sure what the correct response would be technically though. "My favorites were the giraffes" implies multiple favorites, and "my favorite was the giraffe" makes it sound like the exhibit had a single giraffe. So it feels like subject/object have to mismatch then.


This is language dependent.

In English: It's only five minutes to the bus stop.

In German: Es sind nur fünf Minuten zur Bushaltestelle. (It are only five minutes to the bus stop.)

I think it's a question of whether the verb is supposed to agree with the subject or the complement.


> phrasing it in the opposite order, "John's favorite team is the Red Sox" seems more natural than "John's favorite team are the Red Sox"

Interesting. To my (non-native) ears, the second sounds more natural. I wonder how common the preference for each of those is.


IME people in the US almost exclusively use "is" in this context, but it definitely is weird once you actually think about it.


Trending as John Hopfield is scheduled to present his "biologically plausible" response to the Modern Hopfield Network at ICLR next week:

Large Associative Memory Problem in Neurobiology and Machine Learning

https://arxiv.org/abs/2008.06996

MHN seem ideal for prediction problems based purely on data, such as chemical reactions and drug discovery:

Modern Hopfield Networks for Few- and Zero-Shot Reaction Prediction

https://arxiv.org/abs/2104.03279


Krotov (Hopfield's co-author on this set of papers) has a tweet tutorial for the paper in your first link:

https://twitter.com/DimaKrotov/status/1387770672542269449


“Sooner or later, everything old is new again.” -Stephen King


"I’m fashionable once every 15 years, for about three months" - John Cooper Clarke


Quoting: "We introduce a new energy function and a corresponding new update rule which is guaranteed to converge to a local minimum of the energy function."

Is this a minimum in a local area, or local in the range of some function? I could see that perhaps being an advantage if you happen to know that local part of the range.

In contrast, we're usually looking for a global min/max, say with annealing algorithms. How is local better than global in the context of this paper?


They mean local minimum as in an attractor state. Each "memory" is an attractor state stored in the Hopfield network.
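A minimal sketch of the paper's new update rule, as I read it (plain NumPy, made-up data): the state is repeatedly replaced by a softmax-weighted average of the stored patterns, and iterating settles into such an attractor, usually in a single step:

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    def update(X, xi, beta=8.0):
        # Modern (continuous) Hopfield update:
        #   xi_new = X @ softmax(beta * X.T @ xi)
        # Each step moves the state toward an attractor (a local energy minimum).
        return X @ softmax(beta * (X.T @ xi))

    rng = np.random.default_rng(0)
    X = rng.normal(size=(64, 5))               # 5 stored patterns of dimension 64 (made up)
    xi = X[:, 2] + 0.3 * rng.normal(size=64)   # noisy query near pattern 2

    for _ in range(3):                         # usually converges in one step
        xi = update(X, xi)

    print(np.argmax(X.T @ xi))                 # retrieves pattern 2 (with high probability)

So the guaranteed convergence to a "local minimum" is the feature, not a limitation: each local minimum is one of the stored memories (or a blend of very similar ones).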


I’ve seen a lot of efforts to add a notion of associative memory into neural networks. Have any exciting applications of such architectures been publicised?


Just a few days ago, researchers from Peking U and Microsoft published a paper[0] saying they can access "knowledge neurons" in pretrained transformers, which would enable "fact editing"[1].

[0]: https://arxiv.org/pdf/2104.08696.pdf

[1]: https://medium.com/syncedreview/microsoft-peking-u-researche...


I thought that Transformers were a type of associative memory.
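The paper's central result, if I'm reading it right, is exactly that connection: the new Hopfield update rule has the same form as transformer attention when the keys/values are the stored patterns. A toy NumPy check of the correspondence with made-up matrices (simplified, with values tied to keys):

    import numpy as np

    def softmax_rows(Z):
        Z = Z - Z.max(axis=1, keepdims=True)
        E = np.exp(Z)
        return E / E.sum(axis=1, keepdims=True)

    rng = np.random.default_rng(0)
    d = 16
    Q = rng.normal(size=(3, d))   # queries = states to update
    K = rng.normal(size=(5, d))   # keys    = stored patterns
    V = K                         # values  = the patterns themselves

    beta = 1.0 / np.sqrt(d)       # the usual 1/sqrt(d_k) scaling plays the role of beta
    attn_out = softmax_rows(beta * Q @ K.T) @ V

    # Hopfield view: the same computation, read as "each query state is updated
    # toward a softmax-weighted combination of the stored patterns".
    hopfield_out = np.stack([K.T @ softmax_rows(beta * (q @ K.T)[None, :])[0] for q in Q])

    print(np.allclose(attn_out, hopfield_out))  # True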



Relevant paper from Misha Belkin's group https://arxiv.org/abs/1909.12362


I looked at the paper but it was way over my head.

Can anyone explain it in simpler terms to a person who barely understands attention models and has no idea what associative memory means here?


Nice paper! I used Hopfield networks in the 1980s. I hope that I can clear a few hours of time this week to work through this. I admit that for machine learning I have fallen into the "deep learning for everything" pit in the last six or seven years. Probably because DL is what I usually get paid for.


Off-topic, but does anyone know what Jekyll theme this is? Absolutely beautiful formatting and color scheme.


Further off-topic, but do people actually consider this to be beautiful design? Looks like a rendered markdown document with MathJax and green headers. Perfectly appropriate for the content of the post, but beautiful isn't the first word that comes to mind for me.


I don't think it's awful but I don't like it.

I really wish I could literally just dump LaTeX onto the web and be done with it. Everything I've tried either doesn't work properly / isn't 1:1 (Pandoc is cute), or does work but yields enormous amounts of HTML (pdf2htmlex).

I am fairly happy with [insert MD->Book tool of your choice], but sometimes I want citations and things like that.


Beauty is in the eye of the beholder, isn’t it? I like the font, as well as the greens, blues, and header gradient. Green is my favorite color.

I also like dark themes (although I wouldn’t force those on my viewership).


No, I very much dislike it.



This reminded me of a fun old side project of mine [1] that made me look at neural networks from a different perspective.

[1] https://github.com/milosgajdos/gopfield


If I understood them correctly, they store all the training samples and then select the one most similar to a given input.
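That roughly matches the large-beta limit, where the softmax retrieval sharpens into picking the single most similar stored pattern; at smaller beta the result is a weighted blend of samples instead. A toy sketch (made-up data):

    import numpy as np

    def softmax(z):
        e = np.exp(z - z.max())
        return e / e.sum()

    rng = np.random.default_rng(1)
    X = rng.normal(size=(32, 4))                 # 4 stored samples as columns (made up)
    query = X[:, 1] + 0.2 * rng.normal(size=32)

    blend   = X @ softmax(0.05 * X.T @ query)    # small beta: a weighted mixture of samples
    nearest = X @ softmax(5.0  * X.T @ query)    # large beta: essentially the closest sample
    print(np.argmax(X.T @ nearest))              # 1 (with high probability)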


Are*


Amazing



