I agree about the part of not accessing information from production. But I am wo...

Kalium · on June 18, 2021

> But I am wondering how could we debug or test something which happens only on production? I ask this because there are some bugs that can appear at the intersection of code and data.

I've found that your strategy depends greatly on the kind of bug and the kind of service:

* If you're implementing a DNS server, you can copy live queries and compare good-to-bad. Then you can notify when something bad crops up. But odds are you aren't implementing a DNS server.

* If you're working on something whose behavior potentially changes under load, you need to find a way to replicate load. Some companies run entire production-grade environments that release candidates are sent to, with no reduction in security. Cloudflare has some of these - I implemented one of the early versions.

* If you're dealing with weird logic tied to edge cases in the database, you need to work to identify those. Having live data often makes it only marginally easier.
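The "copy live queries and compare good-to-bad" idea from the first bullet can be sketched roughly like this. It is only an illustration: `compare_responses`, `known_good` and `candidate` are made-up names, and a real setup would replay captured traffic against two running services rather than two Python callables.

```python
from typing import Callable, Hashable, Iterable, List, Tuple


def compare_responses(
    queries: Iterable[Hashable],
    known_good: Callable[[Hashable], object],
    candidate: Callable[[Hashable], object],
) -> List[Tuple[Hashable, object, object]]:
    """Replay each query against both handlers and collect disagreements.

    Returns a list of (query, expected, actual) tuples, one for every
    query where the candidate's answer differs from the known-good one.
    """
    mismatches = []
    for query in queries:
        expected = known_good(query)
        actual = candidate(query)
        if expected != actual:
            mismatches.append((query, expected, actual))
    return mismatches
```

Anything that comes back in the list is a query worth alerting on or looking at by hand. Note that this only works well when responses are deterministic, which is a big part of why DNS is the easy case.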

There are products out there that will synthesize large amounts of production-like data based on the patterns in your database. I've used tonic.ai, and I know there are others. As you say, this is a touchy process with nasty error cases. Having someone else implement it might be desirable.

eru · on June 18, 2021

Use a copy of production (perhaps anonymized) for debugging, and delete the copy afterwards.

> But then if there are bugs suddenly we have a live internal DB with customer data which is not wanted.

Don't let the production-copy touch your normal development environment. Make sure it's deleted in time.
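"Deleted in time" is easier to enforce if something checks for it. A minimal sketch, assuming you keep a record of when each copy was created (the `expired_copies` helper and the shape of `copies` are hypothetical, not any particular tool's API):

```python
from datetime import datetime, timedelta
from typing import Dict, List


def expired_copies(
    copies: Dict[str, datetime],
    max_age: timedelta,
    now: datetime,
) -> List[str]:
    """Return the names of copies that are older than max_age.

    `copies` maps a copy's name to its creation time. `now` is passed in
    explicitly so the check is easy to test and to run from a scheduler.
    """
    return [name for name, created in copies.items() if now - created > max_age]
```

Run something like this on a schedule and either alert on or drop whatever it returns, instead of relying on someone remembering.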

onion2k · on June 18, 2021

> Use a copy of production (perhaps anonymized) for debugging, and delete the copy afterwards.

This way of debugging assumes a lot of things:

- You're assuming that your anonymization script works. What if some data isn't removed?

- What if the system you're using for debugging sends an email, connects to a webhook, attaches to a remote volume, pushes to a cloud service, and so on? Did your anonymization step really work?

- What if someone has connected the system you're debugging on to a production service by mistake? That would mean you're not even using the anonymized database. You're really on production.

- What if you forget to delete the database afterwards? Or forget to purge a cache? Or you fail to delete a container? Or you do delete the container, but not the container volumes? That production data is still there. Oops.
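To make the first point concrete: you can scan an "anonymized" copy for things that still look like personal data, but a check like this only finds the patterns it knows about. A rough sketch, where `find_residual_emails` and the row format (a list of dicts) are assumptions for illustration:

```python
import re
from typing import Dict, Iterable, List, Tuple

# Deliberately loose pattern: anything that looks like an email address.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_residual_emails(rows: Iterable[Dict[str, object]]) -> List[Tuple[int, str]]:
    """Return (row_index, column) for every string value containing an email-like token."""
    hits = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            if isinstance(value, str) and EMAIL_RE.search(value):
                hits.append((index, column))
    return hits
```

An empty result tells you there are no obvious email addresses left. It tells you nothing about names in free-text fields, phone numbers, or anything else the pattern doesn't cover.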

It's much simpler to just not use production data for debugging. It makes debugging harder, which is annoying, but you can't go wrong and accidentally leak your users' data. I'd prefer to just spend more time on debugging than have my users' data be put at risk.

eru · on June 18, 2021

Yes, obviously you'd try to debug as much as possible without touching production data.

Of course, different businesses also have different requirements on how sensitive production data is.