Let us be a little bit fair to the researchers who adopted the p=.001 or p=.0001 uncorrected approaches. Their approach wasn't completely unreasoned, and was even justifiable at one time given available methods.
There were mainly two approaches to multiple comparison corrections: Bonferroni and setting an uncorrected threshold. People here might say, well yeah, use Bonferroni.
However, Bonferroni is really only appropriate when the comparisons are independent. Adjacent voxels (3D pixels) are highly dependent, and indeed activity across the brain is generally correlated. This dependency makes Bonferroni correction (very) inappropriately conservative. Given the average dependence between voxels, some researchers estimated that the effective number of independent comparisons might be on the order of hundreds to a few thousand, rather than the full voxel count. In practice, researchers who corrected with Bonferroni either found a really strong effect or fell back to an uncorrected threshold. Some reported results using both, and readers interpreted them that way too: Bonferroni = reliable, uncorrected = provisional.
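To make the arithmetic concrete, here is a minimal Python sketch of the thresholds involved. The ~50,000-voxel mask and the 1,000 "effective" comparisons are illustrative assumptions, not figures from any particular study:

    # Bonferroni divides alpha by the number of comparisons.
    alpha = 0.05

    n_voxels = 50_000             # assumed raw voxel count in a whole-brain mask
    print(alpha / n_voxels)       # 1e-06: per-voxel threshold if voxels were independent

    n_effective = 1_000           # assumed effective comparisons given spatial smoothness
    print(alpha / n_effective)    # 5e-05: orders of magnitude less punishing

Against that 1e-06 figure, it is easy to see why an uncorrected p=.001 or p=.0001 looked like a pragmatic middle ground.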
The contribution of the salmon study and related papers is that they demonstrated convincingly that the typical uncorrected thresholds then in use were insufficient to control false positives.
You're right. I definitely don't mean to sound like I was an enlightened graduate student. Nothing ever passed FWE correction using Bonferroni, so we almost always resorted to uncorrected p-values with cluster thresholding, with the cluster and voxel thresholds set using alphasim (which estimates the probability of getting a cluster of a given size from a random dataset with the smoothness of your actual images).
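For anyone who hasn't used it, the idea behind alphasim is Monte Carlo simulation. Here is a toy sketch of that idea in Python, not AFNI's actual implementation; the grid size, smoothness, thresholds, and iteration count are all made-up values:

    import numpy as np
    from scipy.ndimage import gaussian_filter, label

    rng = np.random.default_rng(0)
    shape = (32, 32, 32)               # assumed image grid
    sigma = 2.0 / 2.355                # assumed smoothness: FWHM of 2 voxels -> Gaussian sigma
    z_thresh = 3.09                    # one-sided z for voxelwise p < .001
    n_sims = 200

    max_cluster_sizes = []
    for _ in range(n_sims):
        # Simulate smooth null data, then count supra-threshold cluster sizes.
        noise = gaussian_filter(rng.standard_normal(shape), sigma)
        noise /= noise.std()           # re-standardize after smoothing
        labels, n_clusters = label(noise > z_thresh)
        sizes = np.bincount(labels.ravel())[1:] if n_clusters else [0]
        max_cluster_sizes.append(max(sizes))

    # Extent threshold: the cluster size exceeded by chance in only 5% of null maps.
    k = np.percentile(max_cluster_sizes, 95)
    print(f"cluster extent for FWE .05 at voxelwise p < .001: {k:.0f} voxels")

Any real cluster larger than k voxels is then unlikely to have arisen from smooth noise alone.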
If I recall correctly, all the major neuroimaging packages (AFNI, SPM, FSL) had options for cluster-size thresholding at the time, along with tools like alphasim to estimate cluster-level false-positive rates (though I think that ultimately had issues with its algorithm, discovered only a few years later...).
I just remember thinking that if the salmon paper had applied a reasonable cluster threshold, none of the spurious voxels would have survived into the final analysis.
Sorry, I was writing to HN more than responding to you particularly. It is sometimes easy for non-scientists to underestimate scientists and think of them as fools, when in fact the problems are frequently hard.
I believe you are correct that, around the time of the salmon poster, other methods for multiple comparison correction were available. The work in the early to mid 2000s, however, was much more "wild west".
Indeed, cluster correction may have its own issues, per your link. I think a good approach these days is to eschew whole-brain approaches in favor of theory-driven, a priori ROIs, then supplement those analyses with an exploratory whole-brain analysis.
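The statistical payoff of the ROI-first approach is that you only correct over a handful of planned tests. A toy sketch, with invented ROI names and random stand-in data in place of real per-subject ROI means:

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    n_subjects = 20
    roi_names = ["ROI_A", "ROI_B", "ROI_C"]    # placeholder a priori regions
    alpha = 0.05 / len(roi_names)              # Bonferroni over 3 tests, not ~50,000 voxels

    for name in roi_names:
        betas = rng.standard_normal(n_subjects)    # stand-in for per-subject ROI means
        t, p = stats.ttest_1samp(betas, 0.0)       # one planned test per ROI
        print(f"{name}: t = {t:.2f}, p = {p:.4f}, significant = {p < alpha}")

With three planned tests, Bonferroni is no longer punishingly conservative, and the whole-brain pass can be labeled explicitly as exploratory.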