It is a very tough problem to solve. Especially when you consider the richness of the datasets you use to put the pieces together.
In my experience the only effective means is to poison the data, in addition to the common sense steps mentioned here.
Poisoning a dataset means seeding a wide variety of the datasets you use to discover PII with fictional look alikes that resist debunking.
Additionally you can poison the core set if you are very clever about it.
It is a very tough problem to solve. Especially when you consider the richness of the datasets you use to put the pieces together.
In my experience the only effective means is to poison the data, in addition to the common sense steps mentioned here.
Poisoning a dataset means seeding a wide variety of the datasets you use to discover PII with fictional look alikes that resist debunking.
Additionally you can poison the core set if you are very clever about it.