Yes, the training process looks at pixels, because that’s all it has. That’s the point. Humans don’t look at pixels; they learn ideas. It’s not in the least bit surprising that AI models shown a bunch of examples sometimes replicate their example inputs: examples are all they have, and they are built specifically to reproduce images similar to what they see. I’m not sure why you consider that idea “loaded”.
Again, naming Rutkowski invokes an impression of his style, but it copies none of his paintings.
Read the paper. What I found is that a random sampling of the dataset naturally surfaced a small subset of images that are highly duplicated within it. The researchers were able to derive methods that produce strong impressions of certain images, such as a map of the United States, Van Gogh's Starry Night, and the cover of Bloodborne :P, with some models and not at all with others. The researchers caution against extrapolating from their results.
> We speculate that replication behavior in Stable Diffusion arises from a complex interaction of factors, which include that it is text (rather than class) conditioned, it has a highly skewed distribution of image repetitions in the training set, and the number of gradient updates during training is large enough to overfit on a subset of the data.
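For a concrete sense of what a "highly skewed distribution of image repetitions" looks like, here is a minimal sketch (not the paper's method) that counts near-duplicate images in a sample of a training set using a simple average hash. The `sample_paths` list, file names, and hash size are illustrative assumptions.

    # Rough sketch: estimate how skewed image repetition is in a sample of a
    # training set by counting near-duplicates with a simple average hash.
    from collections import Counter
    from PIL import Image

    def average_hash(path, size=8):
        """Tiny perceptual hash: downscale, grayscale, threshold at the mean."""
        img = Image.open(path).convert("L").resize((size, size))
        pixels = list(img.getdata())
        mean = sum(pixels) / len(pixels)
        return "".join("1" if p > mean else "0" for p in pixels)

    def repetition_histogram(sample_paths):
        """Count how many times each (approximate) image appears in the sample."""
        counts = Counter(average_hash(p) for p in sample_paths)
        # A heavily skewed result means a small subset of images is repeated
        # many times -- the regime the paper links to replication/overfitting.
        return counts.most_common(10)

    if __name__ == "__main__":
        sample_paths = ["img_0001.jpg", "img_0002.jpg"]  # placeholder sample
        for h, n in repetition_histogram(sample_paths):
            print(f"{n}x  {h}")

If the top few hashes account for a large share of the sample, that is the kind of repetition the quoted speculation is pointing at.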
Here are the examples you requested: https://techcrunch.com/2022/12/13/image-generating-ai-can-co...