I disagree. They were using real data as sample data, when they should have used synthetic sample data. It's totally sane to check in a JSON (or similar) file with a few hundred [edit: non-confidential] samples so that you can run integration tests without needing to set anything else up.
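For concreteness, here is a minimal sketch of what generating such a synthetic fixture could look like; the field names, path, and sample count are made up for illustration:

    # generate a small synthetic fixture to commit alongside the tests
    # (fields, path, and count are hypothetical examples)
    import json
    import os
    import random

    random.seed(42)  # deterministic output keeps the committed file stable

    samples = [
        {
            "id": i,
            "name": f"user_{i}",               # fake, not real customer data
            "email": f"user_{i}@example.com",
            "balance": round(random.uniform(0, 1000), 2),
        }
        for i in range(200)
    ]

    os.makedirs("tests/fixtures", exist_ok=True)
    with open("tests/fixtures/accounts.json", "w") as f:
        json.dump(samples, f, indent=2)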

Similarly, I keep small PDF manuals in git. I add them in their own commit, which doesn't have a useful diff, but in exchange they're always there and I don't need to spin up some special one-off system.

Fixture data (for testing) is fine. Sometimes a few lines is the minimum needed to test; sometimes it's a hundred or so lines of JSON. That's fine as long as the data is kept small and purpose-built for testing, and you aren't just storing a GB of it because that's easier.
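As a rough sketch of the test side, a check against a committed fixture like the one above might look something like this (the path, field names, and size bound are assumptions, not from the thread):

    # integration-style test that reads the checked-in fixture
    # (path and field names are assumed for illustration)
    import json

    def test_accounts_fixture_is_small_and_synthetic():
        with open("tests/fixtures/accounts.json") as f:
            accounts = json.load(f)
        # a few hundred rows is enough to exercise the code path
        assert 0 < len(accounts) <= 500
        assert all(a["email"].endswith("@example.com") for a in accounts)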

I was replying to a comment asking why PII shouldn't be stored in git rather than a database, and I was just trying to give a general rundown of how to think about the tools.

Totally agreed that the data should have been synthetic/scrubbed and that a small sample set (large enough to validate) would be ok.

Sometimes a PDF or a few pictures are necessary (or just much simpler), and I get that. But I have seen some repos with an unreasonable number of PDFs/docs/pictures/etc., and that's when a script that copies them into a gitignored directory from someplace else (Dropbox/S3/etc.) is a better fit.
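As a sketch of that kind of fetch script, assuming the assets live in S3 and boto3 is available (the bucket, prefix, and destination directory are hypothetical):

    # copy large binary assets from S3 into a gitignored directory
    # instead of committing them (bucket/prefix/paths are hypothetical)
    import os
    import boto3

    BUCKET = "example-team-assets"
    PREFIX = "manuals/"
    DEST = "assets/"  # listed in .gitignore

    s3 = boto3.client("s3")
    os.makedirs(DEST, exist_ok=True)

    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
        for obj in page.get("Contents", []):
            key = obj["Key"]
            if key.endswith("/"):  # skip folder placeholder keys
                continue
            local_path = os.path.join(DEST, os.path.basename(key))
            print(f"fetching s3://{BUCKET}/{key} -> {local_path}")
            s3.download_file(BUCKET, key, local_path)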
