So it’s transformers plus let’s force some loss into the super-resolution pipeline.
“Residual” sounds a lot fancier than X = fn + X, and “Diffusion” sounds a lot fancier than “let’s do deconv repeatedly and jam a loss term in there”.
My point is all of this stuff is on Huggingface. You/I/We just don’t have 5k A100s for 6 months, so we can’t play.