
I have never seen a “dev” instance of a DB that wasn’t just a snapshot of the prod DB from earlier. I admit I haven’t seen many, but I have seen zero of any other kind (e.g. anonymized or synthetic).



Just going to throw out there that I’ve never seen a dev database that was anything other than fake data, or internal dogfood data. Have worked at major public tech companies and late-stage startups.


I think one reason might be that this was never sensitive personal data. Phone numbers, emails and addresses were mostly corporate. But real passwords (hashes) from real users, on 50+ laptops with unencrypted drives, were pretty normal.

I think culturally there may be a difference since I'm in a place where some data (addresses, phone numbers, ...) is public info, i.e. given your name I can get your address and phone number from a public DB anyway.


I've done this plenty of times, but you can anonymize the data fairly easily and get the benefits of both.
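For what it's worth, here is a minimal sketch of what "anonymize fairly easily" could look like, assuming rows arrive as plain dicts. The column names, `SECRET_KEY`, `pseudonym` and `anonymize_user` are all made up for illustration, not taken from any particular schema or tool. The idea is to replace identifying fields with a keyed hash, so the same input always maps to the same token and joins still line up, and to drop password hashes entirely.

```python
import hashlib
import hmac

# Hypothetical secret: keep the real one out of the dev environment.
SECRET_KEY = b"replace-me"


def pseudonym(value: str, length: int = 12) -> str:
    """Return a deterministic keyed-hash token for a sensitive value."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:length]


def anonymize_user(row: dict) -> dict:
    """Copy a user row, replacing identifying fields (assumed column names)."""
    token = pseudonym(row["email"])
    return {
        "id": row["id"],                         # keep keys so relations survive
        "name": f"user_{token}",
        "email": f"{token}@example.invalid",     # reserved TLD, never deliverable
        "phone": "555-0100",                     # fixed placeholder
        "password_hash": "!",                    # never copy real hashes
        "created_at": row["created_at"],         # non-identifying, kept as-is
    }
```

Run it over the snapshot before it ever reaches a laptop. Anything free-text (notes, support tickets) needs separate handling, since a per-column swap like this won't catch names buried in prose.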

The flip side of the coin is dev databases not being representative of production, which can cause performance issues. "It works with 10 rows on localhost, why doesn't it work with a million in production?"
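One rough way to catch that without real data is to fill the dev DB with synthetic rows at something like production scale and look at the query plan. A sketch using Python's built-in `sqlite3`, with a made-up `users` table and hypothetical helper names (`build_db`, `query_plan`); your real engine's planner will differ, so treat this as an illustration only.

```python
import sqlite3


def build_db(n_rows: int) -> sqlite3.Connection:
    """Create an in-memory DB with n_rows synthetic users (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        ((f"user{i}@example.invalid",) for i in range(n_rows)),
    )
    conn.commit()
    return conn


def query_plan(conn: sqlite3.Connection, email: str) -> str:
    """Return SQLite's plan text for a lookup by email."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT id FROM users WHERE email = ?", (email,)
    ).fetchall()
    # The last column of each row holds the human-readable detail.
    return " ".join(str(r[-1]) for r in rows)


if __name__ == "__main__":
    conn = build_db(100_000)
    print(query_plan(conn, "user5@example.invalid"))   # full table scan
    conn.execute("CREATE INDEX idx_users_email ON users (email)")
    print(query_plan(conn, "user5@example.invalid"))   # index lookup
```

A full scan is invisible at 10 rows and painful at a million, so checking the plan against a realistically sized table tends to surface the missing index before production does.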



