I mean, I respect that, but it makes me uncomfortable that you have to prompt-engineer this. It burns context on a lot of boilerplate. Why can't we correct for it in the training data? Too hard?
I think this is the right way to handle it. Not all cultures are diverse, and not all images with groups of people need to represent every race. I understand that OpenAI, being an American company, wishes to showcase the general diversity of US demographics, but that isn't appropriate for all cultures, nor for all images generated by Americans. The prompt is the right place to handle this kind of output massaging. I don't want this built into the model.
Edit: On the other hand, as I think about it more, maybe it should be built into the model? Since the idea is to train the model on all of humanity rather than a single culture, maybe by default it should generate race-blind images.
Race-blind is like sex-blind. If you mixed up "she" and "he" at random in ordinary conversation, people would think you'd suffered a stroke.
If a Japanese company wanted an image for an ad running in Japan, featuring Japanese people, they'd be surprised to get a random mix of Chinese, Latino, and Black people no matter what they asked for.
I'm telling the computer "A+A+A" and it insists on "A+B+C", because I must be wrong and insufficiently inclusive of the rest of the alphabet.