
I am curious: how do these AI models source their data?

Is it the free-floating images on the internet, or is it the data that we keep on Big Tech's servers?




Bit of column A, bit of column B; I'm confident that Facebook and Google have huge datasets of people's personal information and photos, but they will not use them in anything public facing, because they know this scares people and will cause huge lawsuits.

So the public image generators will use public images; I believe places like ArtStation and Flickr will be used a lot, partly because they include metadata, either in the descriptions or embedded in the photos, about the location, scene, setting, camera, etc.

Then (and I'm mainly thinking of DALL-E here) there are other open caches of data, like collections of artwork along with intricate descriptions of what they depict.


Thanks for answering. If it's publicly sourced data, there is a huge opportunity for someone to create a data DAO where data is better labelled and people can use their own data to apply the AI to their use cases.

What do you think the challenges are for such use cases?



