I was doing fMRI work around the time this paper was published. It astonished me that people would simply set an uncorrected voxel-level threshold and call it a day. No FWE correction, no cluster threshold - just a 0.001 uncorrected threshold. It was sad that this paper needed to be published to get researchers to start paying attention to that.
I'll be honest - when the paper was published I was thinking "no shit - why do we need a paper to tell us what we all learned in stats 101 about multiple comparisons??" And then I realized just how many fMRI papers were using uncorrected thresholds.
Very similar feeling when the "Voodoo Correlations" paper came out. Except I was admittedly guilty of having presented correlation coefficients from clusters that had already been identified using thresholding. So that paper really did make me take a closer look at some of my figures/conclusions.
Let us be a little bit fair to the researchers who adopted the p=.001 or p=.0001 uncorrected approaches. Their approach wasn't completely unreasoned, and was even justifiable at one time given available methods.
There were mainly two approaches to multiple comparison corrections: Bonferroni and setting an uncorrected threshold. People here might say, well yeah, use Bonferroni.
However, Bonferroni is really only appropriate when comparisons are independent. Adjacent voxels (3D pixels) are highly dependent, and indeed signal across the brain is generally correlated. This dependency makes Bonferroni correction (very) inappropriately conservative. Given the average dependence of voxels, some researchers estimated that the true number of independent comparisons might be on the order of hundreds to a few thousand. In practice, researchers corrected with Bonferroni and either found a really strong effect or fell back to an uncorrected threshold. Some reported results using both, and readers interpreted results that way too: Bonferroni = reliable, uncorrected = provisional.
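To make that trade-off concrete, here's a quick back-of-the-envelope simulation (my own toy numbers, nothing from the salmon paper): with ~100,000 pure-noise voxels, an uncorrected p < .001 threshold still hands you on the order of a hundred "active" voxels, while a Bonferroni cutoff that assumes independent tests is so strict that almost nothing survives.

    # Toy simulation (illustrative numbers, not real data): a pure-noise "brain" with no true effects.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_voxels = 100_000      # rough whole-brain voxel count (assumption)
    n_subjects = 20

    data = rng.normal(size=(n_subjects, n_voxels))        # null data, no real signal anywhere
    t, p = stats.ttest_1samp(data, popmean=0.0, axis=0)   # one t-test per voxel

    print("uncorrected p < .001 hits:", (p < 0.001).sum())            # expect ~100 false positives
    print("Bonferroni .05/N hits:", (p < 0.05 / n_voxels).sum())      # essentially always 0

    # Bonferroni assumes independent tests; smoothed, spatially correlated voxels mean far
    # fewer effective comparisons, so the same cutoff becomes very conservative on real data.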
The contribution of the salmon study and other research papers is that they truly demonstrated that the typical uncorrected thresholds in use were insufficient to control false positives.
You're right. I definitely don't mean to sound like I was an enlightened graduate student. Nothing ever passed FWE using Bonferroni, so we almost always resorted to using uncorrected p-values with cluster thresholding, with the cluster and voxel thresholds set from using alphasim (which gets the probability of having a cluster of that size significant from a random dataset, given the smoothness of your actual images).
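For anyone who never used alphasim, the rough idea is a Monte Carlo loop: generate null volumes with roughly your images' smoothness, apply the voxel-level threshold, record the biggest cluster that survives, and then only keep real clusters larger than, say, the 95th percentile of that null distribution. Here's a toy sketch of that idea (definitely not AFNI's actual implementation; shape and smoothness values are made up):

    # Toy Monte Carlo cluster-extent estimate, in the spirit of alphasim (not AFNI's algorithm).
    import numpy as np
    from scipy import ndimage, stats

    rng = np.random.default_rng(0)
    shape = (40, 48, 40)     # toy volume (assumption)
    fwhm_vox = 2.0           # assumed image smoothness, in voxels
    voxel_p = 0.001          # voxel-level threshold
    n_sims = 200

    z_cut = stats.norm.isf(voxel_p)              # one-sided z threshold for p < .001
    max_sizes = []
    for _ in range(n_sims):
        noise = rng.normal(size=shape)
        smooth = ndimage.gaussian_filter(noise, sigma=fwhm_vox / 2.355)
        smooth /= smooth.std()                   # re-standardize after smoothing
        labels, n = ndimage.label(smooth > z_cut)
        sizes = np.bincount(labels.ravel())[1:]  # voxel count per surviving cluster
        max_sizes.append(sizes.max() if n else 0)

    # Clusters smaller than this arise by chance in more than 5% of null volumes.
    print("cluster-extent threshold:", np.percentile(max_sizes, 95), "voxels")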
If I recall correctly, all the major neuroimaging packages (AFNI, SPM, FSL) had options for cluster-size thresholding at the time, along with tools like alphasim to estimate cluster-level false positive rates (though I think that ultimately had issues with its algorithm, discovered only a few years later...).
I just remember thinking that if the salmon paper had a reasonable cluster-threshold, none of the spurious voxels would have been considered in the final analysis.
Sorry, I was writing to HN more than responding to you particularly. It is sometimes easy for non-scientists to underestimate scientists and think of them as fools, when in fact the problems are frequently hard.
I believe that you are correct that about the time of the salmon poster there were other methods available for multiple comparison correction. The work in the early- to mid-2000's was much more "wild-west" however.
Indeed cluster correction may have its own issues, re your link. I think a good approach these days is to eschew whole-brain approaches in favor of theory-driven, a priori ROIs, then supplement those analyses with an exploratory whole-brain analysis.
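The appeal of the a priori ROI approach is that it sidesteps the multiple-comparisons problem almost entirely: you average the contrast values inside one pre-specified mask and run a single test. A minimal sketch, with made-up arrays and a hypothetical mask just to show the shape of it:

    # Toy a-priori-ROI analysis: one mask, one mean per subject, one test.
    # Arrays and the mask location are illustrative assumptions, not real data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_subjects = 20
    vol_shape = (40, 48, 40)

    contrast_maps = rng.normal(size=(n_subjects, *vol_shape))  # per-subject contrast images
    roi_mask = np.zeros(vol_shape, dtype=bool)
    roi_mask[18:22, 20:26, 18:22] = True                       # hypothetical pre-registered ROI

    roi_means = contrast_maps[:, roi_mask].mean(axis=1)        # one value per subject
    t, p = stats.ttest_1samp(roi_means, popmean=0.0)
    print(f"ROI test: t = {t:.2f}, p = {p:.3f}")               # a single comparison, no correction needed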
I was unaware of the salmon paper, but I remember a bit later being very puzzled by a study of comatose/vegetative patients that included brain-dead subjects as controls (and of course healthy subjects as well).
I suppose it was meant to convince statistically naive readers that the dead salmon thing didn't apply to their methodology.
We were doing testing for an fMRI experiment that we planned to run with humans later. So, yes, all of the stimuli were presented to the salmon. It only felt a bit ridiculous at the time...
We were the inaugural article in what was to be an entire journal dedicated to surprising/odd results. The salmon paper went through a pretty solid peer review as part of publication in JSUR, so we felt good sending it there.
Our paper came out and got a lot of attention, which was good press for the journal. After that the JSUR founders found that they didn't have much time available and the journal folded a few years later. In the end, we were the only paper it published.
Yeah, actually, we feel like it did serve as a solid illustration as to why proper statistical correction is necessary. We had a lot of emails saying what a useful tool it was for lab meetings and classes.
We did a review of the literature as part of our paper. In 2008 something like 30% of papers in major journals used uncorrected stats. In 2012 it was under 10%. The field was certainly already moving in the right direction, but I think we managed to help things along.
My grad school advisor and I were always scanning interesting things when we had sequence testing to do. We scanned other objects like pumpkins and such as well. We scanned the salmon since we thought it would look interesting on a high resolution T1 scan. It looked really great in the end:
https://www.wired.com/images_blogs/wiredscience/2009/09/fmri...
Sorry man - the karma train is probably done. We wrote a few other papers on fMRI reliability, but nothing with the popularity that the salmon paper had.
There are a huge number of fMRI papers that have significant results, statistically and in terms of impact. I have been out of the field for about five years now, so I am not up-to-date with the latest work. I am sure there is some amazing stuff going on.
"Task: The task administered to the salmon involved completing an open-ended mentalizing task. The salmon was shown a series of photographs depicting human individuals in social situations with a specified emotional valence. The salmon was asked to determine what emotion the individual in the photo must have been experiencing."
"A body is found in the frozen North Dakota woods. The cops say the dead Japanese woman was looking for the $1m she saw buried in the film Fargo. But the story didn't end there."
This is and remains one of my all time favourite scientific research papers.
Understandable, obvious, and not just relevant - it actually made a meaningful contribution to the field by quantifying that the existing techniques were not sufficient to fully eliminate spurious data from the 'noise floor' inherent in fMRI data.
Huh! When I click "Cancel", the PDF disappears and then there's just a centered box saying "This file is password-protected. Bennett-Salmon-2010.pdf · 0.94 MB". However, the Download link works just fine and lets me download and read the PDF. Dropbox is weird.