One of the things that is latent in this article is that in the US you are supposed to have done a power analysis in order to justify the number of animals you are using in your study. Almost no one does this, and it is not surprising -- if you are doing cutting edge research you are in unknown unknown territory and any power analysis is likely to be no better than a guess. In a sense it is farcical that exploratory research needs to pretend that it is always successful.
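For what it's worth, the analysis itself is a one-liner once you commit to a guessed effect size, which is exactly the problem. A minimal sketch in Python (the effect size of 0.8 is a pure assumption, standing in for the unknown unknowns):

    # Hypothetical power analysis for a two-group animal study.
    # The effect size is a guess, which is the whole problem with
    # requiring this for exploratory work.
    from statsmodels.stats.power import TTestIndPower

    n_per_group = TTestIndPower().solve_power(
        effect_size=0.8,   # guessed Cohen's d
        alpha=0.05,        # conventional significance level
        power=0.8,         # conventional target power
        alternative="two-sided",
    )
    print(f"animals needed per group: {n_per_group:.1f}")  # roughly 26

Change the guessed effect size from 0.8 to 0.5 and the required n more than doubles, which is why these numbers justify very little.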
Countless animal experiments are uninterpretable simply because they were poorly executed by first year graduate students. There is no other way to learn. Write them off as animals used for training if we want full accounting, but researchers' time is scarce enough as it is so asking them to publish uninterpretable results is a non-starter.
On the other hand I think it is important to publish as many results as possible to avoid file drawer effects etc. Data publishing might be one way around this, but at the moment few labs anywhere in any field have the know-how to publish raw data for negative results, even if it is just sticking files in a git repo and getting a DOI from Zenodo.
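For reference, the Zenodo flow can be scripted against its REST deposit API; the sketch below is my rough understanding of it (token, filenames, and metadata are placeholders, so check developers.zenodo.org before relying on the details):

    # Rough sketch of depositing a data file on Zenodo and minting a DOI.
    # Token, filename, and metadata are placeholders; consult the official
    # API docs at developers.zenodo.org, as details may differ.
    import requests

    TOKEN = "YOUR_ZENODO_TOKEN"  # placeholder personal access token
    BASE = "https://zenodo.org/api/deposit/depositions"

    # 1. Create an empty deposition.
    dep = requests.post(BASE, params={"access_token": TOKEN}, json={}).json()

    # 2. Upload the raw data file to the deposition's file bucket.
    with open("negative_results.csv", "rb") as fh:
        requests.put(f"{dep['links']['bucket']}/negative_results.csv",
                     data=fh, params={"access_token": TOKEN})

    # 3. Attach minimal descriptive metadata.
    meta = {"metadata": {
        "title": "Unpublished pilot data",
        "upload_type": "dataset",
        "description": "Raw data from pilot experiments that did not make it into a paper.",
        "creators": [{"name": "Doe, Jane"}],
    }}
    requests.put(f"{BASE}/{dep['id']}", params={"access_token": TOKEN}, json=meta)

    # 4. Publish; this step is what mints the DOI.
    requests.post(f"{BASE}/{dep['id']}/actions/publish",
                  params={"access_token": TOKEN})

The git-repo route is even simpler in practice: Zenodo can watch a GitHub repository and archive each release with a DOI, with no code at all.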
"Countless animal experiments are uninterpretable simply because they were poorly executed by first year graduate students. There is no other way to learn. Write them off as animals used for training if we want full accounting, but researchers' time is scarce enough as it is so asking them to publish uninterpretable results is a non-starter."
I'm sorry, but this simply isn't true. During my master's I did in vivo surgery on 30 rodents and tested how many the procedure was successful in (10). I had to record how many the procedure was successful in and write this up in my methods. I'm not sure why you see keeping a record of this as something so time-consuming that you wouldn't bother to do it.
In fact I'm pretty sure (at least for the UK) that keeping records like this is essential under the Home Office rules for animal safety in scientific procedures. These rules are there to support the 3Rs (replacement, reduction, refinement). This guidance aims to cut down on the use of animals in research, or at least to ensure quality experiments are being run. If you don't measure, how can you improve?
I think a common misunderstanding in this thread, and something not explained in the article, is the difference between internal accounting of animals and publication-level accounting. Labs track animal usage internally in the way you mention, and typically share that information with the university vet, IACUC, etc., but that is simply not useful information to publish.
Despite the possibility of differing opinions in this thread, I think the article has not done a good job of explaining the realities of the process, so anyone here who is not familiar with how animal work gets done is coming in with a purist misunderstanding. Everything is reported (in the US), but not in publications.
Looking at publications alone for animal accounting would be like if I looked at the checking accounts for everyone in a country and wondered where all the money went. Of course it's in savings, investments, cash under the bed... but I only looked in one place. I cannot conclude money is unaccounted for when my search was incomplete by design.
Yes. I did not mention this in my original comment, but internal accounting and oversight by IACUCs is pretty good. They know within a couple of cages how many rodents are on campus at any given time (modulo reproduction, etc.). If we want the public record to be able to account for this, then it would likely have to be in another venue, because animals whose data does not go directly into a paper could be "involved" in the exploratory work for tens of papers. How do you prevent double counting? How do you know which animal whose data was not used was reported in which paper? Mostly you don't. The IACUC has it, it is buried in lab notebooks, etc., and if someone needs to be disciplined for misuse it is on the IACUC.
This is different from the UK where the 3Rs are much more strongly enforced, to the point where I remember asking a question back in 2015 to the then head of UK animal research about reproducibility, and getting the answer that he wouldn't approve the use of animals just to replicate an already completed study. In the US in some fields animals will be used just to replicate a result because another lab needs to know for sure that it is real before expending even more animals for a potentially useless follow up study. The way the numbers play out in practice, we would be much better off doubling if not 10xing the number of animals used in initial publications to avoid the 10x replication studies that will be done inside other labs to make sure that the result is real. Of course if we did this then the publication rate in many fields would be cut in half, or decreased by an order of magnitude.
Yes - animals absolutely do not go missing within university tracking systems with IACUC oversight.
I disagree with the parent comments that the remaining animals are even primarily used for training purposes. There are countless ways that experiments may fail with uninteresting results that do not count as null results.
> In a sense it is farcical that exploratory research needs to pretend that it is always successful.
Did you interpret the article as saying that exploratory research must be successful? I read the opposite, that "unsuccessful" (defined here by me as negative or inconclusive) research should be published more. What am I missing?
The original article is saying that more unsuccessful research should be published, and I agree. The sentence quoted in the GP is a bit hard to parse, but it is just another way to say that unsuccessful research shouldn't be hidden and ignored. The context for the sentence is a bit more from the funding side, where I usually only half-jokingly say "99% of all funded grants are successful!", which is the complete opposite of reality, where 99% of experiments fail.
"Inconclusive" can mean several things. You can get inconclusive results that don't strongly support or refute a particular hypothesis. These should be published. However, a lot of experiments end with "no/bad data" and publishing that is, IMO, often a waste of time.
Suppose you want to see how different types of neurons are distributed in the brain. You hypothesize that two specific subtypes of neurons are always found in close proximity in one condition (brain area, developmental stage, disease vs health, etc), but not another. There are a lot of ways to do this, so you pick one and start.
If things go well, your antibodies selectively label each neuron type. You count the pairs of neurons that are neighbors (or not) in condition A, those that are neighbors (or not) in condition B, and do some stats. If you get this far, I agree it ought to be possible to publish something, regardless of whether the proportions are wildly different, exactly the same, or somewhere in between.
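If you do get that far, the statistics themselves are usually just a contingency-table comparison. A toy sketch with made-up counts (the numbers are purely illustrative):

    # Compare the proportion of neighboring neuron pairs between two
    # conditions with a 2x2 contingency table. Counts are made up.
    from scipy.stats import fisher_exact

    #               neighbors, not neighbors
    condition_a = [42, 158]
    condition_b = [17, 183]

    odds_ratio, p_value = fisher_exact([condition_a, condition_b])
    print(f"odds ratio = {odds_ratio:.2f}, p = {p_value:.4f}")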
However, things often go wrong. These protocols have a lot of free parameters and it's often not feasible to calculate the best ones from first principles. As a result, you try something and notice that the result is wildly implausible: maybe everything is labelled as one of your cell types, even stuff that isn't neurons. You tweak the protocol, and now nothing is labelled. This is also implausible--the tissue is from a normal animal--so you make some more adjustments and try again. Perhaps you even change techniques altogether and use FISH or a viral vector instead of immunohistochemistry.
The final protocol (if successful) is always included in a paper, but these intermediate failures are usually not and I'm not sure it makes sense to. Suppose the solution was to use a better antibody from a different company. The pilot experiments where we varied the incubation time, sample prep, etc using a dud antibody are fantastically uninteresting. Furthermore, people often change multiple parameters at the same time; going back and convincingly demonstrating which one "matters" would require a lot more work for a fairly limited payoff.
Finally, people also adapt their research question based on the data they can obtain. Maybe you can reliably label one type of neuron, but not the other, so you decide to focus on how those cells' locations vary during development. If so, it'd be weird to report a bunch of failures of an unrelated technique in the resulting paper.
There are many ways for something to be unsuccessful without being interesting. We do not need to reduce the signal to noise ratio of scientific publication further by requiring all exploratory research efforts to be published.
Yes, it's important for work to see the light of day, but we do not want to disincentivise risky or exploratory work.
> But none of these is a valid excuse to not publish your findings in the scientific record
Are journals even going to publish most no-result studies? (I.e. a result of no significant correlation found.) Journals pick and choose the best articles to peer-review and publish out of all submissions.
It feels like there ought to be some other system for tracking no-result and unpublished studies, something easy and low-friction -- not the trouble of an entire paper, but maybe something more like the length of an abstract?
Then if you think about running a study, you can run a search to see if others have already done something similar and found nothing... and reach out to them personally if you need more info.
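Even a hypothetical entry for such a registry would not need to be much more than this (the field names are entirely made up, not from any existing system):

    # Hypothetical record for a lightweight null-result registry.
    # Field names are invented for illustration only.
    from dataclasses import dataclass, field

    @dataclass
    class NullResultEntry:
        title: str
        hypothesis: str
        species_and_n: str
        outcome: str            # "no effect", "inconclusive", "technical failure", ...
        contact: str
        keywords: list = field(default_factory=list)

    entry = NullResultEntry(
        title="Neuron subtype proximity in developing cortex",
        hypothesis="Subtypes X and Y are preferential neighbors early in development",
        species_and_n="mouse, n=12",
        outcome="inconclusive: antibody labeling was unreliable",
        contact="lab@example.edu",
        keywords=["immunohistochemistry", "cortex", "null result"],
    )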
Journals are only one part of the equation. Before you can publish you need someone who is going to collect all the data, do a comprehensive analysis, then write up the results.
That's not a trivial amount of work. If you're a professor trying to establish your lab, would you use a few months of your post-doc's time writing up experiments that didn't work, or working on new ideas that might?
Indeed. I think people don't understand that precariousness doesn't just apply to the private sector. If you want to advance in your career, you have to be focused on things that aren't complete dead-ends.
> Are journals even going to publish most no-result studies? (I.e. a result of no significant correlation found.)
Yes they are. Not all journals of course, but some are very explicit that they will publish based on quality and not based on outcome, e.g. PLOS ONE. And you always have preprint servers as a backup publishing option.
There are many reasons for publication bias. That you can't find a place to publish your findings is not one of them.
The Center for Open Science actually has a web platform built around this concept, which they call pre-registration. Check out their website at cos.io; the site to preregister is osf.io.
Speaking of which, why is there no central database of all articles with metadata? A "journal" would simply consist of the articles tagged by a particular set of people as peer reviewed.
It depends on the country but an approximation for US biomedical research is https://pubmed.ncbi.nlm.nih.gov/
although that's only journal-published information.
Oops. Yes. What I meant to say was that US granting agencies often require published results to be catalogued through PubMed. Not that PubMed only catalogues US-based research.
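For anyone curious, PubMed's metadata is already queryable programmatically through NCBI's E-utilities; a minimal sketch (the search term is just an example I picked):

    # Minimal sketch: query PubMed metadata via NCBI E-utilities.
    # The search term is only an example.
    import requests

    EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

    # Find PubMed IDs matching a query.
    search = requests.get(f"{EUTILS}/esearch.fcgi", params={
        "db": "pubmed",
        "term": "publication bias animal studies",
        "retmode": "json",
        "retmax": 5,
    }).json()
    pmids = search["esearchresult"]["idlist"]

    # Pull summary metadata (titles, journals, authors) for those IDs.
    summary = requests.get(f"{EUTILS}/esummary.fcgi", params={
        "db": "pubmed",
        "id": ",".join(pmids),
        "retmode": "json",
    }).json()
    for pmid in pmids:
        print(pmid, summary["result"][pmid]["title"])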
Exactly. I could see a PI saying "well, you'll finish your PhD in 5 years if you're lucky and most of your experiments are successful. In the meantime, could I get you to spend 6 months writing up all these unsuccessful experiments? You'll get zero credit for it and it will delay your PhD. Thanks!"
The general principle seems to be that any effort which doesn't yield a result (it failed for methodological reasons, there was no effect, there was not a significant effect, they were developing a protocol, etc.) should be published so that other researchers can learn from the effort. But that principle doesn't sound specifically related to animals (or mammals, as this article mentions mice, rats and rabbits, but not fruit flies etc.).
Should the same apply to bacterial or fungal cultures? What about outside of biology? Should someone studying the psychology of personality who gets a non-result from a survey of psych-department undergrads publish it? Should a materials scientist who builds a bad battery describe it in detail? Should I create a PR for every branch which I don't want to merge?
At some point, any new endeavor can only start by choosing to not spend 5 years reviewing all the one-off failures.
"The most common reasons they gave were that the studies didn’t achieve statistical significance, a controversial but commonly used threshold for publication; that the data were part of a pilot project; and that there were technical issues with the animal models. But none of these is a valid excuse to not publish your findings in the scientific record, says study co-author Kimberley Wever, a metascientist at Radboud University Medical Center. “All animal studies should be published, and all studies are valuable for the research community.”
I hard disagree that pilot projects and results from data with technical issues should be published. There is a real cost to write-ups (time). Pilot projects are usually done just to prove the feasibility of a protocol, and data retrieved from experiments with technical issues is likely noise. This could include inaccurate measurements, cross-contamination, etc. Taking the time to write up and publish all the junk is, frankly, prohibitive. I do think null results should be publishable, however.
The title is a bit of clickbait, too: the animals are not missing. Make a request to the relevant Institutional Review Board.
Are any of you voting my comment down actually scientists?
Pilot projects are usually a small sample that demonstrates a protocol can be achieved, not a meaningful set of data for testing a hypothesis. So no, it is probably a waste of time.
I think the thing that non-experimentalists don't realize is how many experiments fail for reasons that are not understood. For example - "the neurons in the part of the brain failed to express the transgene I wanted to use to study them." To actually write a paper about that null result requires substantially more experiments to form hypotheses and test them. Moreover, my lab is far from being an expert in the esoteric ways that this can happen, so when I try to submit the paper, the experts ask for even 10x more experiments. So I've gone farther and farther from the questions that are "important" (reflect my interests, expertise, and funding).
But this explains the value of scientific meetings and poster sessions where I can just tell this random tidbit to interested people without the burden of peer review.
An interesting recent example I observed was a tweet describing the fact that a dye (Texas Red) labels brain vasculature even when injected subcutaneously. This random fact is not in the literature but is quite helpful from a procedural perspective. I think that science Twitter has a potentially super valuable role to play in reporting unexpected or otherwise difficult-to-publish findings.
Most "failed" pilots don't demonstrate that an approach is fundamentally and irredeemably flawed, which could be an interesting paper.
Instead, they demonstrate that doing something this specific way is too much of a hassle, too unreliable, or too expensive to answer a particular question in a particular context: skills, budget, timeline, resources, alternative ideas, etc.
I don't see how you could write pilot results up ("Trust me, I'm usually pretty good at these sorts of things but this didn't work well"?). Meanwhile, disentangling these factors in a rigorous, generalizable way would turn it into an entirely different project.
I think that is a fair point. It is partially answered by whether a grant was received, but only partially.
On the other hand, data sharing protocols mean the data is going to be made available, so I think that probably addresses that issue.
The problem with pilot data is that lots of things can change in the course of running the pilot; tweaks of the experimental protocol, bug fixes to code, etc.
Pilot data is not what you think it is. Pilot data is about finding parameters and testing procedures. It is neither a consistent nor a full replica of an actual experiment.
Actual scientist here. I believe these records should be kept; I'm not saying they justify a full publication, but they should be kept. If you go on to publish, then these records should be acknowledged in the publication. They can be included in the appendices / supplementary information even if they're not in the main article.
This is really a non-issue. There are numerous preliminary experiments and trivially failed experiments done in the course of figuring out a research direction and optimizing protocols. This work rarely makes it into publications in any field, and it would muddy the science to do so. I'm not a fan of most animal work in general, but the idea that there are "missing" animals is just a misapprehension about the process.
This is the most spot-on comment. This is field- and technique-specific, but the animals approved in a grant and the animals that end up in the paper are, in my experience, extremely unlikely to be the same number. Even in an experienced lab with well-oiled technical expertise, there is nearly always a fraction of animals that go to pilot experiments, have some confound, or just didn't work. And by "didn't work" I mean didn't even get far enough to count as negative data, e.g. if you're trying something very technically demanding or new.
Jumping to the conclusion that negative results aren't being reported (which is indeed its own big problem) from a difference in animal numbers suggests a lack of understanding of the process, but perhaps the formal accounting of research animals differs by country.
Based on the success rate of typical research, I'm surprised such a high percentage of the animals ends up in publications. That probably means the review board is doing a pretty good job, because they must be denying studies they believe won't answer any interesting, conclusive question?
In the US, almost all animals should be accounted for by the IACUC at the research organization. In practice that includes rats, mice and fish, even though the Animal Welfare Act excludes them, because the Public Health Service does not. It does not surprise me that they do not always appear in papers. How often does an experiment end without any result at all? Pretty often.
> The researchers also surveyed the scientists involved to find out why so many animals were missing. The most common reasons they gave were that the studies didn’t achieve statistical significance, a controversial but commonly used threshold for publication
I do not come from academia so I may be mistaken, but why would you publish a non-statistically-significant result? It seems to me there's a very distinct difference between "this is most likely true" and "this might be true, we can't say yet". The first is useful; the second adds to the chatter needlessly. Some may say that it promotes further research, but what if a weakly supported false result misleads instead?
Because, say, there are 20 studies trying to prove that the Earth is flat. The current system only rewards the one study that "proved" it; the 19 other studies, which were inconclusive or demonstrated otherwise, were not published.
Now, if you had 19 studies which were inconclusive and 1 study that claimed to have conclusions, you would look at it very differently than one study that claimed to have conclusions.
This is true, but at least in drug development (where I have some experience), no two studies are alike. Even when a drug company runs two phase 3 trials (like they are supposed to in order to secure FDA approval), the design or execution varies enough that if one fails, it doesn't necessarily suggest that the other one is a false positive.
Imagine if a drug company ran a study 100 times before finding a "significant" result that supported the claims they wanted to make about their product. Without the other 99 studies, the single study is highly misleading.
That's why you should implement (and require) pre-registration of clinical studies - however, that still just means that everyone knows that there are the other 99 unsuccessful studies, not that they will get written up in detail and published.
Depends on what methodologies were changed between studies. If only people with a particular gene were affected, but that gene was very rare, then you might have a huge number of studies that show no effect, even though the drug was highly effective in that one case.
Exactly, so if you only published the one study where this rare gene was present in a participant, you'd get a wildly distorted picture of the general effectiveness of your intervention.
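A quick simulation makes that concrete: with no real effect at all, roughly 5 of 100 small trials will still cross the conventional p < 0.05 line (the numbers below are simulated, not from any real drug):

    # Simulate 100 small trials of a drug with NO real effect and count how
    # many cross p < 0.05 anyway. Purely illustrative.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(0)
    false_positives = 0
    for _ in range(100):
        placebo = rng.normal(0.0, 1.0, size=30)
        drug = rng.normal(0.0, 1.0, size=30)  # identical distribution: no effect
        _, p = ttest_ind(drug, placebo)
        if p < 0.05:
            false_positives += 1

    print(f"'significant' trials out of 100: {false_positives}")  # expect about 5

Publish only those few and bury the rest, and the literature looks like the drug works.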
> > the studies didn’t achieve statistical significance, a controversial but commonly used threshold for publication
> why would you publish a non statistically significant result
There are actually two separate things going on here.
The first is that researchers (at least in the biomedical sciences) often decide not to waste their time writing up results that are weak, inconclusive, or negative. This is only mildly controversial: everyone involved recognizes the inherent time constraints, but knowing what didn't work can help to inform the field at large in its own way.
The second is that the term "statistically significant" itself has become _highly_ controversial in recent years. The issue is that when you perform a statistical test, it (oversimplifying) spits out a p-value: roughly, how often you would see a difference at least as large as the one you measured purely by chance, if there were no real effect.
In concrete terms, say you have some graphs that show a slight improvement in some condition when a drug is used. Is the drug actually effective, or is the "improvement" just random measurement error? So you run a statistical test on your data and it gives you a p-value of 0.2: if the drug did nothing, you'd still see an apparent improvement this large about 20% of the time. That's suggestive, but far from conclusive.
If you had unlimited time and money you could just keep collecting data, and the evidence would eventually become decisive one way or the other. But this is the real world, where we don't have unlimited time and money (particularly academic researchers).
So when is your result worth publishing? How do you decide when to throw in the towel? And if you're reviewing papers for a journal, how weak can the evidence be before you vote to reject the paper on the grounds that the conclusions are unreliable and not worth looking at?
Enter the term "statistical significance". At some point, people started classifying results on one side of an arbitrary threshold (conventionally p < 0.05) as "significant" and those on the other side as "not significant". But there's an obvious problem here: a p-value ticking from 0.051 to 0.049 doesn't do anything magical. Worse, what a future reader intends to do with the results will determine how much weight any given p-value deserves in that particular case. Clearly, results and their associated uncertainty need to be interpreted in context instead of blindly. Using a term such as "statistical significance" flies in the face of that by actively encouraging lazy thinking.
So the controversy being referred to in that specific sentence you quoted isn't the decision not to publish but rather the usage of the term itself. (Which is confusing, because the article at large is addressing the controversy surrounding not publishing.)
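To make the arbitrariness concrete, here is a toy example in Python (the data are simulated; the exact p-value depends on the random draw):

    # Toy illustration: the binary "significant" label comes from an
    # arbitrary cutoff, not from anything in the data. Data are simulated.
    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(7)
    control = rng.normal(0.0, 1.0, size=30)
    treated = rng.normal(0.5, 1.0, size=30)  # modest simulated effect

    _, p_value = ttest_ind(treated, control)
    label = "significant" if p_value < 0.05 else "not significant"
    print(f"p = {p_value:.3f} -> {label} at the conventional 0.05 cutoff")

    # Nothing magical happens at 0.05: p = 0.049 and p = 0.051 describe
    # nearly identical evidence, yet get opposite labels under the convention.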
I don't believe publication of non-results or negative results would be helpful, at least not publication in the traditional sense. Apart from the work of writing them up, it is already hard enough to keep up with the volume of publications coming out in most fields at the moment. Non-result publications would just not be read and would make finding the right literature so much more difficult.
What is really needed is publication of the data. The big problem with that is that there are no good systems for data publication that don't require significant extra work. Ideally the data would be made available directly from some electronic lab notebook system, with the relevant metadata attached to it. At the moment all systems are miles away from that (even before we consider how bad record keeping is in many research labs).
If funding agencies really wanted to do something about open data, they would invest significantly in a good data system and make data entry mandatory, and in particular not just at the end but during the study (private at first, then made available at the discretion of the PI or after some time).
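Even without such a system, the minimum viable version is just the raw files plus a machine-readable metadata sidecar; a rough sketch (the field names are my own, not from any standard):

    # Rough sketch: dump a measurement table alongside a machine-readable
    # metadata sidecar. Field names are illustrative, not from any standard.
    import csv
    import json
    from datetime import date

    rows = [
        {"animal_id": "m001", "group": "control", "measurement": 1.32},
        {"animal_id": "m002", "group": "treated", "measurement": 1.10},
    ]

    with open("measurements.csv", "w", newline="") as fh:
        writer = csv.DictWriter(fh, fieldnames=["animal_id", "group", "measurement"])
        writer.writeheader()
        writer.writerows(rows)

    metadata = {
        "title": "Pilot measurements, unpublished",
        "collected": str(date.today()),
        "species": "mouse",
        "protocol": "see protocol.pdf alongside this file",
        "outcome": "inconclusive; shared for completeness",
    }
    with open("measurements.metadata.json", "w") as fh:
        json.dump(metadata, fh, indent=2)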
Wouldn't this open the door to even more p-hacking than today - unscrupulous researchers taking others' data and finding some kind of signal in it?
My understanding was that the gold standard for research is to come up with a hypothesis, design an experiment to test that hypothesis, collect data in such a way that you have statistical independence between the variables you wish to measure, and finally analyze the data and check whether you have confirmed your hypothesis or whether it remains unconfirmed.
But data collected correctly to test one hypothesis is not necessarily good for testing other hypotheses - e.g. perhaps it was irrelevant for your research if all the animals were female and might have had birth defects, but that doesn't mean your data is useful for someone studying the whole population.
This would mean that not only do you have to publish the raw data, but also describe your data collection methods, your assumptions, things you didn't check etc. How far are you from publishing your whole paper at that point?
>Wouldn't this open the door to even more p-hacking than today - unscrupulous researchers taking others' data and finding some kind of signal in it?
I would argue that the more data is out there, the harder it is to "p-hack": it's much more difficult to find some weird correlation if you have a huge population. Also, I would not call researchers who use others' data "unscrupulous". In fact that has been done in meta-studies already and is exactly the point of the exercise: increase the amount of data published so that if some "unscrupulous" or "erroneous" researcher publishes a study with some spurious correlation, we can look at a lot more data to see whether it also exists there.
>My understanding was that the gold standard for research is to come up with a hypothesis, design an experiment to test that hypothesis, collect data in such a way that you have statistical independence between the variables you wish to measure, and finally analyze the data and check whether you have confirmed your hypothesis or whether it remains unconfirmed.
>But data collected correctly to test one hypothesis is not necessarily good for testing other hypotheses - e.g. perhaps it was irrelevant for your research if all the animals were female and might have had birth defects, but that doesn't mean your data is useful for someone studying the whole population.
It is true that sometimes data collected for one purpose is not necessarily good for another purpose; however, very often it is, and many discoveries have been made by looking for something new in data collected for a completely different purpose.
>This would mean that not only do you have to publish the raw data, but also describe your data collection methods, your assumptions, things you didn't check etc. How far are you from publishing your whole paper at that point?
Good reproducible science should do that anyway. You should always keep a lab notebook that clearly documents what you are doing so that someone else (or your future self) can reproduce the results. So the requirement I'm asking for just amounts to forcing people to do good science. I can tell you from experience that this documentation is still very far from a whole paper.
Your points are correct, I think. Just wanted to comment that I didn't mean to call anyone who uses data published by others unscrupulous; I was just talking about researchers who would go p-hacking in others' data.
Actually, it may not be necessary for journals to publish null results. The important thing is that the data be made available. Funding agencies like the NIH are making this mandatory. I think the NIH is, unfortunately, taking the wrong approach in implementing its data archives; it places an interest in data harmonization above common sense and data intelligibility, in my opinion. Still, I think it's a move in the right direction.
Many scientific studies fail to publish the total number of animals used in their statistics. This would represent millions of animal subjects over time.
Many scientific studies obscure how many input subjects are rejected before final assessment. This would represent millions of experimental subjects over time.