
Onawa on Nov 23, 2023 | on: ChatGPT generates fake data set to support scienti...

I wouldn't immediately call creating synthetic data 'poisoning the well' unless it is actually distributed as if it were real data. When you only have a small amount of quality data to train on, generating synthetic data is a viable way to get more training data and improve the resulting models. But any legit organization will obviously label synthetic data as such.