I am curious: how do these AIs source their data? Is it the free floating image...

Cthulhu_ · on Aug 24, 2022

Bit of column A, bit of column B; I'm confident that Facebook and Google have huge datasets of people's personal information and photos, but they will not use them in anything public-facing, because they know this scares people and would cause huge lawsuits.

So the public image generators will use public images; I believe places like ArtStation and Flickr will be used a lot, partly because they include metadata, either in the descriptions or in the photos' own metadata, about the location, scene, setting, camera, etc.

Then (and I'm mainly thinking of DALL-E here) there are other open caches of data, like collections of artwork along with intricate descriptions of what they depict.

tarunmuvvala · on Sept 1, 2022

Thanks for answering. If it's publicly sourced data, there is a huge opportunity for someone to create a data DAO where the data is better labelled and people can use their own data to apply the AI to their use cases.

What challenges do you see for such use cases?