Referenced in this paper: "Overall, while approaches such as FNet, Performer, an...

light_hue_1 · 2025-02-26T16:21:17 1740586877

Except that the paper is written as if they discovered that you can use an FFT for attention. They even have a "proof". It's in the title. Then you discover everyone already knew this and all they do is add some extra learnable parameters.

Pretty lame.

hinkley · 2025-02-26T22:29:11 1740608951

Search engines don't always turn up prior art the way you'd like. Simple jargon discrepancies can cause a lot of mischief. Though I'm sure a case could be made about it being confirmation bias. It's hard to get people to search in earnest for bad news. If it's not in their face, they treat absence of evidence as evidence of absence.