> Secondly, there is still a big human selection process going on. Only the most interesting and coherent images will find their way onto the public internet.
That's rather optimistic. Plenty of low-quality images are generated and posted to the public Internet.
What is true is that most of that generated junk doesn't have much of a shelf life, which reduces the odds of it making it into the training data.