
The whole show in ML these days is the infra around massively distributed, redundant, tunable training of the same fucking transformer we’ve all been using since 2017.
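For a sense of what the smallest version of that infra layer looks like, here is a minimal data-parallel sketch with PyTorch's DistributedDataParallel; the Linear "model", sizes, and hyperparameters are placeholders, and real setups stack sharding, parallelism strategies, checkpointing, and fault tolerance on top of this loop.

    import os
    import torch
    import torch.distributed as dist
    import torch.nn as nn
    from torch.nn.parallel import DistributedDataParallel as DDP

    def main():
        # Expects to be launched with torchrun, which sets RANK / LOCAL_RANK / WORLD_SIZE.
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        model = nn.Linear(1024, 1024).cuda(local_rank)   # stand-in for "the same transformer"
        model = DDP(model, device_ids=[local_rank])
        opt = torch.optim.AdamW(model.parameters(), lr=1e-4)

        for _ in range(10):
            x = torch.randn(32, 1024, device=f"cuda:{local_rank}")
            loss = model(x).pow(2).mean()   # dummy objective; DDP all-reduces the gradients
            opt.zero_grad()
            loss.backward()
            opt.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()

Launch with something like torchrun --nproc_per_node=8 train.py. The hard part is everything layered on top of this loop (sharded optimizer state, pipeline/tensor parallelism, restarts), not the loop itself.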



What about diffusion models? (And all the other, less-publicized approaches that aren’t transformers?)


Sure, I’m obviously being a bit glib.

So it’s transformers, plus “let’s force some loss into the super-resolution pipeline.”

“Residual” sounds a lot fancier than x = x + f(x), and “Diffusion” sounds a lot fancier than “let’s do deconv repeatedly and jam a loss term in there”.
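To make that caricature concrete, here is a minimal PyTorch sketch of both tricks; the MLP stand-ins, shapes, and the single hard-coded noise level are illustrative only, not how any particular paper does it.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class ResidualBlock(nn.Module):
        # The "residual" trick really is just output = x + f(x).
        def __init__(self, dim):
            super().__init__()
            self.f = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

        def forward(self, x):
            return x + self.f(x)

    # The diffusion caricature: corrupt data with noise, train a network to
    # predict that noise, and take an MSE loss on the prediction.
    denoiser = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 64))
    opt = torch.optim.Adam(denoiser.parameters(), lr=1e-3)

    x0 = torch.randn(32, 64)                          # stand-in for clean data
    noise = torch.randn_like(x0)
    alpha = 0.7                                       # stand-in for the noise schedule at one timestep
    xt = alpha**0.5 * x0 + (1 - alpha)**0.5 * noise   # noised sample

    loss = F.mse_loss(denoiser(xt), noise)            # "jam a loss term in there"
    opt.zero_grad()
    loss.backward()
    opt.step()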

My point is all of this stuff is on Huggingface. You/I/We just don’t have 5k A100s for 6 months, so we can’t play.
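And the Huggingface point is pretty literal: a pretrained checkpoint is a few lines away with the transformers library (the checkpoint name here is just an example).

    from transformers import AutoModelForCausalLM, AutoTokenizer

    name = "gpt2"  # example checkpoint; swap in whatever release you want to poke at
    tokenizer = AutoTokenizer.from_pretrained(name)
    model = AutoModelForCausalLM.from_pretrained(name)

    inputs = tokenizer("The whole show in ML these days is", return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(out[0], skip_special_tokens=True))

Inference and small-scale fine-tuning are fine like this; it’s pretraining from scratch that needs the 5k A100s.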


If you can find a more efficient or effective approach than this “redundant” one, you’ll be rich. There are a lot of smart people working in this space.


I’m not the person to break this logjam of “moar GPUs”. But I agree that someone will, and I agree that person will do quite well.



