
At the same time, I wouldn't want anyone to be able to enforce certain parts of that. For example, to make sure that data was only kept around as long as needed, you'd need to be able to monitor the contents of all the computers that contained that data. This creates problems of its own, much larger than the original one. To a certain extent, we just have to trust researchers with sensitive data, and severely punish gross violations of that trust.

To be honest, I've heard of many more examples of organizations that put overly strict controls on their data. This happens because researchers are trying to walk a line between a requirement that they share their data and their (understandable) desire to keep their work to themselves as long as possible, so that competing researchers can't publish on it first. A bad data governance committee fails much more often by allowing data contributors to be too strict with their data, even though I agree that a data breach is a worse outcome, and avoiding it should be the highest priority.




Sharing data between researchers defeats the purpose of replication.

Suppose two people conduct the same experiment on the same medical data using the same code. If the sample was biased then what’s the point?


Reusability of data is an important part of research. It helps collaboration between researchers, and enables secondary research to take place. Using the same data is important for reproducibility in many cases, because the research isn't about creating the dataset, it's about doing analysis on the data. A lot of original research relies on existing datasets.

Having "good" data is obviously crucial, but it's a separate matter.


It’s not a question of “good” data. Slice and dice perfectly random data and sometimes you get spurious correlations. The only way to separate them from real results is to have completely new data.

It’s not even a question of p-hacking or bad design. Perform enough experiments and you always get false positives.
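
To put rough numbers on that, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration, not something from a real study: the function names, the 0.05 threshold, the sample size, and the choice of a two-sample z-test on standard-normal noise. Both groups are drawn from the same distribution, so every "significant" result it counts is a false positive by construction.

```python
import math
import random


def family_wise_error_rate(num_tests, alpha=0.05):
    """Chance of at least one false positive across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


def count_false_positives(num_experiments=1000, sample_size=50,
                          alpha=0.05, seed=0):
    """Run many experiments on pure noise and count 'significant' ones.

    Both groups come from the same N(0, 1) distribution, so there is
    no real effect to find; any p < alpha is a false positive.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_experiments):
        a = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        b = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        diff = sum(a) / sample_size - sum(b) / sample_size
        # Standard error of the difference in means, known variance 1.
        z = diff / math.sqrt(2.0 / sample_size)
        # Two-sided p-value for a standard normal statistic.
        p_value = math.erfc(abs(z) / math.sqrt(2.0))
        if p_value < alpha:
            hits += 1
    return hits


if __name__ == "__main__":
    print("false positives in 1000 null experiments:",
          count_false_positives())
    print("P(at least one false positive in 20 tests):",
          round(family_wise_error_rate(20), 4))
```

With these assumed settings, roughly 5% of the null experiments should come out "significant", and the chance of at least one false positive across 20 independent tests is about 64%. Rerunning the same analysis on the same shared dataset reproduces the same false positive, which is why fresh data matters for separating these from real effects.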



