Endonuclease fingerprint indicates a synthetic origin of SARS-CoV-2? (biorxiv.org)
274 points by johnwdefeo on Oct 20, 2022 | 229 comments



Popular science summary of paper from one of the authors: https://alexwasburne.substack.com/p/a-synthetic-origin-of-sa...


"To avoid losing power with multiple comparisons, we focus our analysis on the BsaI/BsmBI sites in SARS-CoV-2 and compare the BsaI/BsmBI map in SARS-CoV-2 to all other restriction maps of all other CoVs used in our analysis."

It's important to know whether or not the authors picked BsaI & BsmBI blind, before looking at the genome. If they picked BsaI & BsmBI with knowledge of the SARS-CoV-2 genome, that doesn't dodge the multiple comparisons problem and the p-values aren't reliable. I guess it depends on how many other commonly used type IIS endonucleases there are. The authors use 214 to generate their null hypothesis distribution for the CoV restriction maps but say only 6 are specifically amenable to BAC cloning.
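To put rough numbers on the multiple comparisons worry, here is a minimal sketch. It assumes every enzyme pair is tested independently at a 0.05 threshold, which real restriction maps would not satisfy, so treat it as an upper-bound style illustration rather than a correction to the paper. The function name is made up for this example; the 6 and 214 are the counts mentioned above.

```python
from math import comb

def familywise_error(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests

# Assumption: "looking around" means trying every unordered pair of enzymes.
pairs_of_six = comb(6, 2)      # pairs among the 6 enzymes said to suit BAC cloning
pairs_of_214 = comb(214, 2)    # pairs among the 214 used for the null distribution

scenarios = [
    ("one pair chosen blind", 1),
    ("best pair out of 6 enzymes", pairs_of_six),
    ("best pair out of 214 enzymes", pairs_of_214),
]
for label, n_tests in scenarios:
    fwer = familywise_error(0.05, n_tests)
    print(f"{label:>30}: {n_tests:>6} tests, P(>=1 false positive) = {fwer:.3f}")
```

The point of the comparison is only that the same nominal p-value means very different things depending on how many pairs were effectively in play before BsaI/BsmBI was singled out.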

The "wild type distribution" null distribution for fragment length (Figure 3C) being a simulation (permutation of known CoV genomes, split at randomly selected restriction sites) bothers me. On the first read I thought it was a distribution of fragment lengths in real viruses. Does synthesizing virtual genomes by permutation produce a realistic distribution of fragment lengths?


Bioinformatician here. BsaI and BsmBI are Type IIS restriction enzymes, which means they are unusual in that they cleave DNA at a defined distance outside of their recognition sequence.
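For anyone who wants to see what "cleaves outside the recognition sequence" looks like in practice, here is a small sketch. It assumes the standard geometry for both enzymes (recognition site GGTCTC for BsaI and CGTCTC for BsmBI, cutting 1 nt downstream on the top strand and 5 nt downstream on the bottom strand, leaving a 4-nt overhang). The function names and the demo sequence are made up for illustration.

```python
import re

# Recognition sequences, written 5'->3'.
ENZYMES = {"BsaI": "GGTCTC", "BsmBI": "CGTCTC"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(seq: str):
    """Return sorted (overhang_start, enzyme, strand, overhang) tuples.

    overhang_start is the 0-based top-strand index of the first base of the
    4-nt overhang; the overhang is reported as it reads on the top strand.
    """
    seq = seq.upper()
    hits = []
    for name, site in ENZYMES.items():
        # Site on the top strand: overhang lies 1 nt past the end of the site.
        for m in re.finditer(f"(?={site})", seq):
            start = m.start() + len(site) + 1
            overhang = seq[start:start + 4]
            if len(overhang) == 4:
                hits.append((start, name, "+", overhang))
        # Site on the bottom strand: it shows up as the reverse complement on
        # the top strand, and the overhang lies 1 nt before the site.
        for m in re.finditer(f"(?={revcomp(site)})", seq):
            start = m.start() - 1 - 4
            if start >= 0:
                hits.append((start, name, "-", seq[start:start + 4]))
    return sorted(hits)

demo = "AAAAGGTCTCATGCATTTT"  # made-up sequence with one BsaI site
for hit in find_sites(demo):
    print(hit)
```

Note that the overhang (TGCA in the demo) is whatever sequence happens to sit next to the site, which is why these enzymes are convenient for designing custom junctions.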

BsaI has been used in high throughput assembly techniques such as Golden Gate assembly and Golden Braid assembly.

Golden Gate assembly is an extremely robust method for building modular genetic components. For example, one can create plasmids (circular pieces of DNA) with billions of variants of the spike proteins, each carrying a different combination of mutations. Then, those plasmids are transfected into coronaviruses and incubated in a tissue culture. Now, one usually lets natural selection do its thing, and the most infective variants replicate in the tissue much faster and take over the population.

Having said all that, Type IIS restriction sites are usually cut out during assembly, so I'm not sure how having them is good evidence for engineering. Actually, the opposite is true: having none at all is the kind of evidence that would be very suspicious.


They weren't picked blind, they were picked since they are commercially available and conveniently split the genome into 6 similarly sized segments with the spike protein entirely in one. It's precisely what a bioengineer would have elected to use and why the paper concentrates on it.


> It's important to know whether or not the authors picked BsaI & BsmBI blind, before looking at the genome.

You could never convince me that these restriction enzymes were picked blindly, no bioinformatician I have ever met does science this way. There is a preliminary period of exploratory data analysis which is done before any hypothesis is put forward, and data dredging and leakage are rampant in the literature.

That's not to say that the spacing of BsaI/BsmBI restriction sites isn't noteworthy, just something to keep in mind.

To your point however, could someone comment on the suitability of BsaI/BsmBI for the in vitro assembly of synthetic coronaviruses? Is it all just about finding sites in the genome at the right locations which can be turned into restriction sites without disrupting any existing functional genetic elements? or is there more to it than that. If a research team were to come along and decide they wanted to engineer their own coronavirus, how likely would it be that they would choose these restriction enzymes?


> To your point however, could someone comment on the suitability of BsaI/BsmBI for the in vitro assembly of synthetic coronaviruses?

These are very common enzymes. Perhaps the most common today.

The GP comment is sort of misleading...you wouldn't just pick enzymes at random to do this analysis. You'd pick the enzymes in common use. These count.

> Is it all just about finding sites in the genome at the right locations which can be turned into restriction sites without disrupting any existing functional genetic elements? or is there more to it than that.

You can add or remove sites using different techniques, such as PCR mutagenesis.

> If a research team were to come along and decide they wanted to engineer their own coronavirus, how likely would it be that they would choose these restriction enzymes?

Highly likely.


> The GP comment is sort of misleading...you wouldn't just pick enzymes at random to do this analysis.

I said pick blind, not at random. I recommend reading https://info.umkc.edu/drbanderson/p-hacking-and-the-problem-...


I know what p-hacking is. There's no reason to believe that they've done that here. The choice of enzymes was motivated by the logic they outlined in the paper, the enzymes chosen are some of the most popular today, and the authors are completely forthright that the choice might affect the outcome.

To fairly make a critique like that, you need to have at least some evidence that a selection bias was applied for no other reason than to affect the p-value. Otherwise, literally every study can be accused of "p-hacking". Here, there's a very good, obvious explanation for the choice that they made, and therefore all you can really say is that the results might be different if you looked at a different set of enzymes.


I realized why this being based on simulation bothered me: this is a machine learning classifier that classifies viral genomes as synthetic or natural. The training set n = 72 (all negative, which is justifiable if you're ok with null hypothesis significance testing) the validation set n = 6 (only synthetic examples, which is less fine), and there's no test set. No effort was made to estimate true positive rate, false positive rate, etc. If this was published as a machine learning paper instead of a biology paper it would probably be held to a higher standard.


Nothing stops you from gathering a set of wild viral genomes and known engineered ones (unseen in this analysis) and generating your own test set, but be sure to preregister your study and document every search query so that you can prove to the rest of the world that you hold yourself to the same standards you hold others to.


There are at least a couple of suspicious points in this study:

First and foremost the central claim is that 5 potential restriction binding sites versus 2 means that SARS-CoV2 is non natural. That does not necessarily follow. Just as SARS-CoV2 is unusually infectious and damaging to humans it could just happen to have an additional 3 restriction binding sites. So there is nothing inconsistent with natural selection of viral characteristics, only a comparison between wild and lab viruses.

Second, the evidence for the wet market origin is trivialized. That argument points out that genetic drift is well characterized and the presence of two closely related SARS-CoV2 variants cultured from the wet market is extremely strong evidence that is where the virus initially appeared. Both arguments make use of detailed genetic evidence, but the wet market argument based on genetic drift is quite robust while this alternative theory merely presents similarities while not ruling out natural selection.

Thirdly, this paper emphasizes the strong impact of the COVID pandemic and asserts that understanding the origins of the virus would necessarily aid in preventing future pandemics. This does not clearly follow. Especially if the virus had natural selection origins there is no clear and obvious way of systematically reducing risk. Simply living or traveling where host populations like bats live could be enough to generate exposures and it is not simple to clear people off of rural habitations.

These second and third criticisms are not directed against the evidence and logic presented, but show a dangerous level of sloppiness in the research that makes this paper appear more like slanted analysis from someone with an agenda than a critical thinking scientist genuinely interested in the truth and therefore needing to consider alternatives and potential falsification of the hypothesis.


> First and foremost the central claim is that 5 potential restriction binding sites versus 2 means that SARS-CoV2 is non natural. That does not necessarily follow. Just as SARS-CoV2 is unusually infectious and damaging to humans it could just happen to have an additional 3 restriction binding sites. So there is nothing inconsistent with natural selection of viral characteristics, only a comparison between wild and lab viruses.

No, you've completely misunderstood the analysis. The number of restriction sites is not what is important. It's the location of the sites, and the spacing between them. This is suspicious, and has a high degree of variability, as is shown in Figure 3c. They also generated 100,000 random mutations to RaTG13 and BANAL52, and found that only ~1.2% and 0.1% of these, respectively, had restriction maps as deviant as the one found in SARS CoV2 (Figure 4).
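A toy version of that kind of in silico experiment, for anyone who wants to poke at the idea. This is a sketch under loud assumptions, not the paper's method: the "genome" is a random 3,000-nt string rather than RaTG13 or BANAL52, substitutions are uniform rather than drawn from observed substitution frequencies, the statistic is the longest fragment as a fraction of genome length rather than the paper's z-score, sites are marked at the start of the recognition sequence rather than the exact cut position, and the 0.25 threshold is arbitrary.

```python
import random
import re

# BsaI and BsmBI recognition sites as they appear on either strand.
SITES = ["GGTCTC", "GAGACC", "CGTCTC", "GAGACG"]
SITE_RE = re.compile("(?=(" + "|".join(SITES) + "))")

def max_fragment_fraction(seq: str) -> float:
    """Longest fragment of a linear genome cut at every site, as a fraction of its length."""
    cuts = [m.start() for m in SITE_RE.finditer(seq)]
    edges = [0] + cuts + [len(seq)]
    return max(b - a for a, b in zip(edges, edges[1:])) / len(seq)

def mutate(seq: str, n_subs: int, rng: random.Random) -> str:
    """Substitute n_subs distinct random positions, each with a different base."""
    bases = list(seq)
    for pos in rng.sample(range(len(bases)), n_subs):
        bases[pos] = rng.choice([b for b in "ACGT" if b != bases[pos]])
    return "".join(bases)

def fraction_as_extreme(parent: str, observed: float, n_mutants: int,
                        n_subs: int, rng: random.Random) -> float:
    """Share of random mutants whose longest fragment is no longer than `observed`."""
    hits = sum(
        max_fragment_fraction(mutate(parent, n_subs, rng)) <= observed
        for _ in range(n_mutants)
    )
    return hits / n_mutants

rng = random.Random(0)
toy_parent = "".join(rng.choice("ACGT") for _ in range(3000))  # not a real genome
print(fraction_as_extreme(toy_parent, observed=0.25, n_mutants=200, n_subs=30, rng=rng))
```

The number this prints says nothing about SARS-CoV-2. It only shows the shape of the calculation: mutate a relative many times, recompute the restriction map each time, and ask how often the result is at least as "regular" as the observed one.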

The spacing here alone is suspicious, but coupled with the number of synonymous (silent) mutations, you're looking at an outcome extremely unlikely to be found in nature.

https://www.biorxiv.org/content/10.1101/2022.10.18.512756v1....


> They also generated 100,000 random mutations to RaTG13 and BANAL52, and found that only ~1.2% and 0.1% of these

This could also be selection pressure, right? I.e., imagine 99,000 yield viruses that are non-viable… you’d see the same behavior of rare traits being common.


It was an in silico experiment.

But regarding your broader question: there's no reason to believe that these viruses experience any selective pressure for the number or location of cutting sites of the particular enzymes being investigated here. They're bacterial enzymes, entirely unrelated to coronaviruses.


We've seen horizontal gene transfer in fish, was it aliens or natural selection?

Extremely unlikely events happen when an extreme amount of attempts are made, and natural selection amplifies them if they increase fitness


Your reply after reading the explanation is like you seeing a farm of branded cows and wondering if it was due to extreme natural selection that caused that.

From the layperson article:

> In wild viruses, these cutting/pasting sites are randomly distributed because there's no evolutionary pressure for the virus to be thusly cut and pasted in nature. In infectious clones, however, the humans behind the screen tend to modify restriction sites in a regular way. For any given restriction enzyme or set of enzymes, the set of all cutting sites is called the “restriction map”, and looking at these restriction maps helps us see the fingerprint of infectious clones.

> It turns out, the sticky ends produced by BsaI/BsmBI digestion of SARS-CoV-2 are all unique, non-palindromic, and all contain at least one A or T - all criteria either required or recommended for in vitro genome assembly.
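The three criteria in that last quote (unique, non-palindromic, at least one A or T) are easy to check mechanically. A minimal sketch follows; the function names are made up, and the example overhangs are invented for illustration, not the actual SARS-CoV-2 overhangs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def check_overhangs(overhangs):
    """Evaluate the three sticky-end criteria quoted above for a list of 4-nt overhangs."""
    overhangs = [o.upper() for o in overhangs]
    return {
        # No two junctions share the same overhang.
        "all_unique": len(set(overhangs)) == len(overhangs),
        # A palindromic overhang equals its own reverse complement and can self-ligate.
        "non_palindromic": all(o != revcomp(o) for o in overhangs),
        # Every overhang contains at least one A or T.
        "has_a_or_t": all("A" in o or "T" in o for o in overhangs),
    }

# Hypothetical overhang sets, for illustration only.
print(check_overhangs(["AATG", "GCTA", "TTAC"]))
print(check_overhangs(["ACGT", "GGCC"]))
```

How surprising it is for a given genome to pass all three depends on how many sites there are, which is a separate question from whether the check itself passes.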


The explanation given is based on some premises that I'm not qualified to assess, and others I am.

One of these premises is that their work properly models reality, there seems to be a lot of well informed doubt by subject-matter experts.

Another is that an event with a probability of 0.1-1% is exceptionally rare, its occurrence thus being most likely artificial, and with that I disagree, by looking at endless counter-examples nature provides.

Is the fact that humans found some optimization a proof that any occurrence of it is man-made? I believe most people would say it isn't.


> One of these premises is that their work properly models reality, there seems to be a lot of well informed doubt by subject-matter experts.

The model is fine. There's no more "well informed doubt" than for any other paper. You can certainly debate the details of what they did, but none of this debate is substantial enough to invalidate the work.

What you're seeing is a group of people who have largely pre-judged the outcome, inventing reasons to reject an experiment that disagrees with their prior conclusions. This always happens, in any scientific domain. Nonetheless, there are also a large number of well-informed people who see this as an interesting result. If you don't listen to both groups, you will be misled.


Cows don't randomly develop brands on the order of 0.1% to 1% probability. There isn't a great analogy for this and your example, as well as the hyena one in another comment, are way overstating the case.


Your summary of the paper is unfair. From one of the author's blog:

> Recap: BsaI/BsmBI are particularly useful restriction enzymes to use if you wanted to study a bunch of chimeric coronaviruses like the close relatives of SARS-CoV-2. The SARS-CoV-2 BsaI/BsmBI cutting sites look regularly-spaced (ish). The maximum fragment length is in the bottom percentile of all CoVs digestions in the idealized fragment-number range, the bottom 0.07% for all type IIS digestions within the idealized range, and the number of fragments is also in the idealized range. The SARS-CoV-2 BsaI/BsmBI restriction map looks a lot more like known pre-COVID infectious clones than a wild coronaviruses. All sticky ends are unique & meet other nice criteria for good assembly. All mutations separating these sites from close relatives are silent, and there’s a significantly higher rate per nucleotide of silent mutations within BsaI/BsmBI recognition sites than the rest of the viral genome.

> The odds of meeting any one of these criteria vary, from 1%-0.07% of having such a small maximum fragment length to 1/250 to 1/100 million odds of having such high concentration of silent mutations within BsaI/BsmBI recognition sites. The odds of meeting every single one of these criteria are even smaller. Much smaller.

https://alexwasburne.substack.com/p/a-synthetic-origin-of-sa...

So it's not just "5 vs 2" sites, it's the spacing of the sites, the fact that the site mutations are silent, and the fact that the "sticky ends" are unique.

This paper should move your needle toward "synthetic origin". Similar to how some other papers published in the last few months (the wet market paper among them) should have moved your needle toward "natural origin".


> That argument points out that genetic drift is well characterized and the presence of two closely related SARS-CoV2 variants cultured from the wet market is extremely strong evidence that is where the virus initially appeared.

I assume you're referring to Pekar et al. here? The two lineages are literally just two SNPs apart, so it's near-impossible to distinguish whether they arose from two separate introductions, or just from two super-spreading events after cryptic evolution in humans from a single earlier introduction. Pekar builds an epidemiological model that purports to find that evolution in humans is p ~ 0.5% unlikely; but that result is highly sensitive to the assumptions in that model, most notably their choice of a scale-free infection network (and thus power-law distribution of number of other people each patient infects). Robustness to that infection network isn't studied.

The author of this endonuclease fingerprint preprint also has a preprint on Pekar's model,

https://www.biorxiv.org/content/10.1101/2022.10.10.511625v1

Note that I'm criticizing Pekar here, not endorsing the endonuclease preprint. I don't have a great sense of the correct Bonferroni correction (to borrow Prof. Balloux's framing) to apply to the latter's probabilities.


Asserting that understanding the origins of a virus helps control its adverse impacts is completely bog-standard fare in the literature. Finding it suspicious in this case smacks of an isolated demand for rigor and unfamiliarity with the field. Even a cursory look at papers making a case for a zoonotic origin is likely to reveal statements to the same effect, here's just two for example:

https://www.sciencedirect.com/science/article/pii/S009286742...

>Failure to comprehensively investigate the zoonotic origin through collaborative and carefully coordinated studies would leave the world vulnerable to future pandemics arising from the same human activities that have repeatedly put us on a collision course with novel viruses.

https://www.nature.com/articles/s41591-020-0820-9

> Detailed understanding of how an animal virus jumped species boundaries to infect humans so productively will help in the prevention of future zoonotic events. For example, if SARS-CoV-2 pre-adapted in another animal species, then there is the risk of future re-emergence events.


Right - it should never really be suspicious for a research paper to include text rationalizing why it's a good idea to invest in researching that topic. That's just basic self advocacy.


Their claims around Type IIS assembly are also suspect. eg in Golden Gate assembly, you choose Type IIS that reach over and cut, so the restriction site is absent from the final assembled product.

"Additionally, because the final product does not have a Type IIS restriction enzyme recognition site, the correctly-ligated product cannot be cut again by the restriction enzyme, meaning the reaction is essentially irreversible"

https://en.wikipedia.org/wiki/Golden_Gate_Cloning

The choice of focusing on a particular RE pair also smells of p-hacking. Their claim that BsaI/BsmBI makes for easy mixing/matching genomes doesn't make sense in this day and age, when you can use other techniques to make hybrids more effectively (eg, you are not restricted to the natural location of those restriction enzyme sites)


The argument is about the negative space of the RE. Regardless of what the article says, to do Golden Gate/Gibson/etc., best practice is to cut the template first (with RE) and then assemble against the open ends, so to do this you must ablate the existing RE sites. The alternative is to linearize by round-the-horn PCR. At 33kb, it's not impossible, but why bother with the pain when it's much easier to snip?


Their description of the assembly strategy as "Golden Gate" seems like incorrect terminology. The WIV has at least published papers using BsaI and BsmBI, though.

https://twitter.com/jbkinney/status/1583267221047869441

https://twitter.com/jbkinney/status/1583248052969562112

https://journals.plos.org/plospathogens/article?id=10.1371/j...

EDIT: Kinney also asserts here that they left the sites in a final assembled genome (i.e., the genome of a replication-competent virus). I'm still trying to figure out if that's true, though.


Just because there are more modern techniques doesn't mean there haven't been older ones used, or techniques used incorrectly.


>Thirdly, this paper emphasizes the strong impact of the COVID pandemic and asserts that understanding the origins of the virus would necessarily aid in preventing future pandemics.

I strongly disagree. If of natural origin, there are a plethora of simple controls that could be implemented. Control doesn't necessarily need to be perfect. Some simple controls could be restrictions or bans on commercial trade or transport of high-risk animals. If not of natural origin, it obviously indicates that BSL-4 controls are inadequate or inconsistently applied. A simple but perhaps costly solution might be to not certify BSL-4 laboratories in dense urban settings.


I generally agree, but would note that the WIV worked with novel natural or synthetic bat-origin viruses at BSL-2 or -3, mostly not BSL-4. From an interview with Dr. Shi:

> A: The coronavirus research in our laboratory is conducted in BSL-2 or BSL-3 laboratories. [...]

https://web.archive.org/web/20210727042832/https://www.scien...


I believe the research on the bat viruses was being done under BSL-2 conditions, which basically means normal lab practices, an autoclave for waste, and some of the work needs to be done in a biosafety cabinet. Of course, that assumes the procedures are strictly followed.


BSL-2 biosafety cabinets only have laminar flow hoods. They're effectively open air and depend upon proper protocols being followed.

Typically you're more concerned about what's getting into your BSL-2 hood than out.


"the central claim is that 5 potential restriction binding sites versus 2 means that SARS-CoV2 is non natural"

That claim is not in the paper at all. The argument is based on the length of longest fragment and could apply to any number of restriction binding sites.

"the evidence for the wet market origin is trivialized."

It's a statistical analysis of viral genomes. The wet market evidence is unrelated.

Your third point is arguing in favor of ignorance and defeatism. "there is no clear and obvious way of systematically reducing risk" because we do not understand the origin. If we did, there may very well be; e.g. restrict trade in civet cats. Simple actions like this have controlled pandemics in the past.


I really don't see how saying that the pandemic was a bad thing and that there is value in understanding its origin is "a dangerous level of sloppiness".


Well considering that's not what the OP said at all I guess it's ok you don't see that.


>These second and third criticisms are not direct against the evidence and logic presented, but show a dangerous level of sloppiness in the research

OP _literally_ said that.


Neither the "second [or] third criticisms" were "saying that the pandemic was a bad thing and that there is value in understanding its origin".

Just because "show a dangerous level of sloppiness in the research" was used in both sentences does not mean they are saying the same thing at all.

The OP's second and third criticisms were:

"the evidence for the wet market origin is trivialized."

and

"Thirdly, this paper emphasizes the strong impact of the COVID pandemic and asserts that understanding the origins of the virus would necessarily aid in preventing future pandemics. This does not clearly follow."

While "saying that the pandemic was a bad thing and that there is value in understanding its origin" might approach a summary of the third criticism it doesn't address the second - more serious - criticism at all.


> third criticisms...show a dangerous level of sloppiness


and the second criticism?


The accusation of sloppiness is based on two claims, and you're only responding to a bastardization of one of them.


You're suggesting that the species jump is likely to have happened in the wet market, but also claiming that there's no clear & obvious way to reduce risk in the future? That's inconsistent, isn't it?


This comment appears more like slanted analysis from someone with an agenda than a critical thinking scientist genuinely interested in the truth.


It’s amazing how long this farce has gone on. People who had every incentive to lie lied. That’s all.


>Thirdly, this paper emphasizes the strong impact of the COVID pandemic and asserts that understanding the origins of the virus would necessarily aid in preventing future pandemics. This does not clearly follow. Especially if the virus had natural selection origins there is no clear and obvious way of systematically reducing risk. Simply living or traveling where host populations like bats live could be enough to generate exposures and it is not simple to clear people off of rural habitations.

Exterminating all the bats is an option. It would be our first purposeful extinction and would no doubt cause many problems. We need to at least weigh up the possibility. Then we can run the risk benefit analysis.


> first purposeful extinction

Smallpox? And extinction of Guinea worm is in process.


Fair point


People could just try and stop eating them first instead, that's my humble idea.


The opening line: "To prevent future pandemics, it is important that we understand whether SARS-CoV-2 spilled over directly from animals to people, or indirectly in a laboratory accident."

Is that true? I mean I understand why people would be curious, but does it really matter, in terms of what we need to do? We know that viruses _can_ spillover from animals to people, and we know that viruses in a laboratory setting _could_ plausibly get released in an accident. In terms of what we need to do to prevent (as far as possible) it happening in the future, I don't know that I believe that the answer to which one happened in the case of covid-19, is all that important.

Again, I see why people would care. But as long as we know that both are realistic possibilities (and we do), it doesn't matter moving forward.


Yes, this is true.

If we knew that this pandemic came from a lab, that would have massive implications for how tightly we control this type of lab research - both in terms of whether we do the gof research, and also in how we run and regulate the labs. Surely you can imagine the reaction if, hypothetically, the world knew that this coronavirus was made in a lab and then escaped?

I'm amazed that you think otherwise - it's one of the nice things about hn that I occasionally encounter views that are so different from my own.


This is a case of cognitive biases, and an unfortunate reflection on people's inability to reason about risk.

Obviously if it is possible that COVID came from a lab then we should act as though it did - the best case is these labs are catastrophes still waiting to happen. And I suspect so is the web of international travel we've built up over the last few decades.

But at a population level, the human race can't process the implications of that. People are profoundly evidence based - we won't see action until there is evidence that a risk is manifesting. Which is possibly why people care about whether COVID originated in a lab or outside one.


> I don't know that I believe that the answer to which one happened in the case of covid-19, is all that important

It is when literally this week a version with an 80% kill rate but with the spread of the most transmissible was made in a lab for no good reason.

How many would die if that leaked out? And the only reason it’s been made is so some academic can get their name on a paper and earn some grant, presumably.

The idea is that we just have to trust that these labs are secure and competent enough, when the fact that they’re even making something so deadly raises questions about their competence. We need to start judging the people making these things with the potential to kill millions the same as we would anyone else building something that had the potential to kill millions.

I don’t see why playing with Weapons of Mass Destruction in a random lab is just considered fine when the weapon is invisible and can live inside your body or a mouse. At least Fat Man and Little Boy couldn’t be carried off base by a single person.


> a version with an 80% kill rate

An 80% kill rate of mice that are genetically engineered to express the human ACE2 gene. For comparison, the original SARS-CoV-2 strain killed 100% of those humanised mice. Those rates aren't the same among human populations.


These kinds of experiments are going to be done either way - for example gain-of-function research on animal smallpox (which usually doesn't kill and can't infect humans) to make it lethal.

Experts on the matter (I read a book recounting smallpox research, not a domain expert myself [0]) were concerned more about the publishing of that research than about the research being done, since proliferation is deemed very easy in the field. You don't need resources as with nuclear weapons to reproduce research like that.

(Edit: typo)

[0] The Demon in the Freezer by Richard Preston


it can't be that easy, there's plenty of groups that would routinely use something like that for terrorism etc. It hasn't happened yet, so it must be more difficult than it looks?


I'm only citing experts on the matter (to paraphrase a quote: "all that is needed is access to a modern university's biology lab") but I think the reason we're not seeing attacks like that is that you cannot control a bioweapon - and everyone who is able to make one knows that.


Stopping public funding of that would reduce risk.


I don't know whether the research should have been done, but it wasn't done for no reason, and information it obtained seems quite useful. The hope, and general assumption, has been the same mutations that led omicron to be so transmissible also reduced its lethality. This research suggests otherwise (that it was happy coincidence). That would be very concerning (and makes me want to get the omicron variant vax)


> It is when literally this week a version with an 80% kill rate but with the spread of the most transmissible was made in a lab for no good reason.

It was a modified version bringing the kill rate down from 100% to 80% in lab mice specifically bred to be weak to this virus.


It could be a random lab in Cambodia, run by a professor with dreams and ambitions for fame and fortune. The point is you can't really regulate gain of function research. Pandemic preparation has to keep that in mind. That includes isolating countries immediately after a disease of unknown origin is found.


You can, in the same way as we regulate murder.

It will not be 100% efficient, but reduction can be achieved.


Then those countries will just not tell anyone.


Good point. Arguing about who or what to point the finger at for SARS-CoV-2 seems more like politics than epidemiology. The evidence of a laboratory source is unclear enough, and the chance of a natural source plausible enough, that I'm not sure how the cost/benefit is supposed to work out here.

Not to mention if your goal is just to point out the ethical lapses of the Chinese government, you don't really need any extra ammunition.


Epidemiology doesn't work without taking into account politics.

You can't let specialists go unpunished for conspiring (or at best, just unconsciously covering their asses) when they're the ones who are supposed to be on the front lines preventing and fighting epidemics!

P.S.: Wow, I just remembered... this lampooning hits quite differently now!

https://youtu.be/9WfZuNceFDM?t=504


I think you have a reasonable point here.

I saw Marc Lipsitch give a talk with a back-of-napkin "Expected mortality from gain of function research" estimate in...I want to say 2013.

We know there's zoonotic spillover events of a number of scary pathogens, including the last two epidemic coronaviruses.

In a global sense of preventing future pandemics, an either/or framing is immensely flawed.


I think the way in which we prevent lab-made viruses from breaking free is fundamentally different from how prevention of a spillover would need to be done, because the two cause problems at the exact opposite ends of the chain of events. The first couldn't be more detached and isolated from nature, while the other happens if hygiene and mixing of species is sloppily disregarded. This is a challenge if both sites are right next to each other, of course.


> happens if hygiene and mixing of species is sloppily disregarded

Evidence of lab negligence should not become a reason to neglect or defund research and prevention about animal spillover risks. Unfortunately, the two problem classes are synergic rather than mutually exclusive.


If it is true that it made its rounds in a lab, we might want to consider putting a moratorium on such research out of security concerns. Or at least put such labs under more scrutiny. Also in regard to funding and people involved, have some be responsible for the safety.


This is brilliant: it looks at the negative space in restriction enzyme cut patterns to determine the likelihood that these sites have been engineered out. I don't see details on why they picked BsmBI to analyze, but the one caveat is that if they looked at several RE sites and only reported the interesting one, that undermines what the statistics say about the hypothesis. (I happen to believe the lab leak hypothesis -- there are receipts if you search hard, but I think we should be careful about our evidence)


Looks interesting:

> "The evidence we find is independent of other genomic evidence suggestive of a lab origin of SARS-CoV-2, such as the furin cleavage site (FCS) found in SARS-CoV-2 yet missing from all other known sarbecoviruses. However, the BsaI sites in SARS-COV-2 flank the S1 gene and S1/S2 junction, and a similar design has been used before for substitutions in this region. The restriction map alone also does not indicate the lab of origin."


Coincidentally, Alex Washburne is also trying to get his startup going: https://selvasci.substack.com/p/coming-soon


There's a line in this paper which has been cited a few times in this thread which really stands out to me:

> 100,000 random in silico mutants were generated for both RaTG13 and BANAL-20-52...Only 1.2% of RaTG13 mutants resulted in a BsaI/BsmBI restriction map with a larger z-score than SARS-CoV-2. BANAL52 is the closer relative to SARS-CoV-2 by over 200 nucleotides, yet only 0.1% of mutants yielded z-scores as great or greater than SARS-CoV-2

1.2% of 100,000 is a weird way to express an occurrence rate, because 1.2% is a little over 1 in 100. That means random perturbation generated about 1,200 candidates which would also match their conclusions. The 0.1% mutant number would still be about 100, by sheer random chance.
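A quick sketch of that conversion from percentages to absolute counts, assuming the 100,000 mutants per genome quoted above (the function and variable names are just illustrative, not from the paper's code):

```python
def mutants_at_or_above(total_mutants: int, percent: float) -> int:
    """Convert a reported percentage into an absolute count of simulated mutants."""
    return round(total_mutants * percent / 100)


TOTAL = 100_000  # in silico mutants per genome, as quoted from the paper

ratg13_count = mutants_at_or_above(TOTAL, 1.2)   # RaTG13: 1.2% of mutants
banal52_count = mutants_at_or_above(TOTAL, 0.1)  # BANAL-20-52: 0.1% of mutants

print(ratg13_count, banal52_count)
```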

They also don't support this argument with any reference to the observed mutation rates of any of their candidates in the environment. How often, in nature, do we expect a new mutation to arise? They claim to use data for nucleotide substitution frequencies, so they're addressing the fact that mutation is a process with a temporal component, but not what the time span is.

If 1 in 100 commercial jetliners crashed every year, we'd regard that as so common as to make commercial aviation unsafe.

They then conclude:

> It’s unlikely such an idealized reverse genetic system would evolve by chance from the close relatives of SARS-CoV-2

From 1 in 100 occurrence rates? Of a virus?

Now we do in fact have data on how frequently viral mutation happens. In cell culture it appears to occur at a rate of about 9e-7 substitutions per nucleotide per 12-hour replication cycle for RaTG13 (reproduced from culture in animal tissue too[2]). With a ~29,800 bp genome, that means about 0.027 nucleotide substitutions on average per replication cycle, so roughly 19 days for a single virus lineage, serially replicating, to substitute 1 base pair.
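The back-of-envelope arithmetic can be checked by plugging the quoted figures in directly. This is only a sketch: it assumes a single serially replicating lineage, and the constant names are mine. The result is sensitive to how the per-cycle rate is rounded.

```python
SUBS_PER_NT_PER_CYCLE = 9e-7  # substitutions per nucleotide per cycle, as quoted above
GENOME_LENGTH_NT = 29_800     # approximate genome length, as quoted above
CYCLE_HOURS = 12              # replication cycle length, as quoted above

# Expected substitutions anywhere in the genome during one replication cycle.
subs_per_cycle = SUBS_PER_NT_PER_CYCLE * GENOME_LENGTH_NT

# Expected number of cycles, and days, until one substitution in a single lineage.
cycles_per_sub = 1 / subs_per_cycle
days_per_sub = cycles_per_sub * CYCLE_HOURS / 24

print(f"{subs_per_cycle:.4f} substitutions per cycle")
print(f"{days_per_sub:.1f} days per substitution in a single lineage")
```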

Of course, this is all ignoring the fact that viral mutation is highly parallel, which they also do not mention in reference to this conclusion. How many mutations are necessary to generate a z-score match which would fool their detection method? 1, 2, 10? How many potential restriction enzyme cleavage sites exist which would become cleavage sites by a single base-pair flip? What's the dynamic addition/removal rate of restriction enzyme sites expected to be? We have per-nucleotide estimates for this for RaTG13, so it's also not an independent variable: to achieve the distribution of cleavage sites they propose, what is the mean number of mutations for it to happen in the 1 in 100 candidates which achieved it? A virus doesn't explore its mutation space serially; it explores it in a massively parallel way every single replication cycle.

This paper leads with some very specific claims, is based purely on simulation, and fails to ask obvious control questions based on its own methods.

[1] https://doi.org/10.1371/journal.ppat.1000896 [2] https://doi.org/10.1371/journal.ppat.0030005


But the thing they're evaluating does not have an evolutionary benefit, so the fact that many possible mutated viruses exist does not mean it's any more likely that this specific pattern would be observed in a specific pandemic virus.


But the actual question is not how likely it is for those naturally occurring mutations (which also happen to be exactly like the commercial ones used for cutting genomes) to occur at the exact positions you would expect them to be in engineered genomes.

The real question is how probable it is that such a naturally mutated virus actually develops into a pandemic. First it has to spawn from nature, and then it also has to be lucky enough to cause a pandemic.

You have to multiply those two probabilities as well. Only then do you have a model that can be compared to what happened with SARS-CoV-2. Is it still that likely to be a natural spillover?


So a likely extremely controversial paper being shared publicly to a non-expert audience prior to any peer-review.

Is this going to be yet one more of those “will be withdrawn after peer scrutiny but by then it is too late because the false meme has been injected into the public consciousness” things?


They did share it with scientists. Here’s Francois Balloux saying he replicated the results, tried to find holes, couldn’t

https://mobile.twitter.com/BallouxFrancois/status/1583165259...


Here's another blue check immunologist on twitter with a substantially different take:

https://twitter.com/K_G_Andersen/status/1583252866394771456


The authors of this paper are confused about the technology they're using... it's not a good sign about their conclusions.

https://twitter.com/matias_kaplan/status/1583235087067336704

https://twitter.com/NoahOlsman/status/1583275862442807299


Oh dear... Yeah, the author affiliations on the title page made me wonder if they knew what they were doing at all.


Their terminology would indeed seem wrong, but the WIV has apparently used a similar assembly strategy (leaving the sites in the final assembly) before:

https://twitter.com/jbkinney/status/1583267221047869441

I'd worry about their false discovery rate, for the same reason I worry about the large number of parameters in Pekar's epi model. It's still an interesting result, though.


As far as I've been able to read, those sites aren't in the final assembly in Shi's papers - just in the primers. And even more worrying than the false discovery rate are all of the missing genomes they should be comparing. I fear they've left them off because they punch big ole holes in the theory... e.g.

https://twitter.com/zhihuachen/status/1583258714340892672


Another good twitter thread on it this morning:

https://twitter.com/acritschristoph/status/15834864034169692...

This is the real kicker to me:

> What about missing sites? The authors propose that someone made a bizarre combination of additions and deletions of cut sites. RecCA matches SARS-CoV-2 at all missing sites because other viruses do. E.g. this one: similar to RpYN06 not just at the mutation, but the entire region.

Clear evidence of recombination across the whole region and not just mutations to manipulate the cut site.

And Francois Balloux seems to have deleted his twitter account this morning.


As far as I can tell, WIV (and other) researchers have left other restriction sites in final genomes, but not BsaI and BsmBI sites. That makes sense given the typical use of those two enzymes. So I'm inclined to agree there's nothing obviously special about that combination of restriction sites and spacings, which increases my concern that this is just a false discovery.

Jesse Bloom said he'd try reproducing with a wider range of natural viruses and potential synthetic assembly strategies. Unless and until that still gets an interesting p value, I'd agree this is oversold.


> As far as I've been able to read, those sites aren't in the final assembly in Shi's papers - just in the primers.

Where did you read that? Kinney explicitly asserts the opposite, but I can't figure out what he's referring to. Someone else linked to Figure S9 from their 2017 PLOS Pathogens paper, which indeed seems not to contain the site in the final assembly. I've edited my other comment here to reflect that.


Nope - the restriction sites aren't in the final assembly in that paper. These authors just don't understand the tools remotely well enough to have written their paper. It will never be published.

https://twitter.com/alchemytoday/status/1583361758903013376


Yeah, I think Kinney is wrong here. Someone else found a survey paper noting the theoretical possibility of using BsaI and BsmBI in a way that leaves the sites in the final genome, but I haven't seen anyone who actually did that. So I'm leaning to a false discovery here.



It feels telling that when I click that link, the associated tweets are the kookiest of conspiracy theories about Pfizer execs, vaccine mandates and people screeching about things they clearly don't understand. I suspect in a few days, with Bloom and others reviewing the preprint they're going to be forced to pull it down and "rework" it.


It's incredible that the only sensible comments on this thread are this far down. The HN audience really has a massive Dunning-Kruger problem when it comes to science topics (especially true for COVID conspiracy theories).


As a pharmacologist, reading the HN crowd arguing about hydroxychloroquine a year or so ago really highlighted that 90% of this site, on topics other than development or start-ups, is at the same level as YouTube comments, but with more arrogance.


Eh... That's why it's important to speak up and teach where you can. You never know who might learn something, and you may find yourself learning something in return.

Nobody gains anything in silence however.


I had the same experience a couple of years back when people started discussing AI. I try to keep this in mind but somehow I keep forgetting. It’s genuinely difficult to filter good from bad takes without expert level understanding of a topic unfortunately.


Looks like he deactivated his account. I would guess the response to these tweets might have something to do with it.


I sincerely hope not. From one of the authors:

"Scientists publish papers not because the paper is the end of science, but because it is a unit of research that is valuable to share with others so that others can use this brick of knowledge and either build with it… or find its weakness and break it down...We wrote our entire analysis in R and shared our code with the world. I tried SO hard to check every single line of code and make our pipeline clear & easy to reproduce. However, despite nearly giving myself stomach ulcers checking every line and stressing about these findings, it’s possible someone finds a mistake in our work. We don’t share this work happily - this is the saddest paper I’ve ever written. We’ve shared our code precisely for that reason: we want you to see exactly what we’ve done, and if we’ve done something wrong we are open to hearing it."

As to your original concern, it is a valid one. I wrote this in response to pre-prints popularized via the press earlier this year:

-> Make bold, unjustifiable claims in the preprint; -> Ensure widespread coverage in the science press; -> Walk back those claims during peer-review; -> Get published; and then -> Watch blue checks tout original claims as "Fact!"


Any publicity is good publicity. Sprinkle in some words about "this needs further study" and hope someone comes along to fund the next few years of your lab.


Lazy question. Is there a git url for the code?


To save you the two clicks, the code is at https://github.com/reptalex/SARS2_Reverse_Genetics


> Is this going to be yet one more of those “will be withdrawn after peer scrutiny but by then it is too late because the false meme has been injected into the public consciousness” things?

They only get withdrawn if they go against the narrative. Any kind of paper that says masks work, lockdowns work, or any paper suggesting Covid is worse than any virus ever… it’s totally cool to share publicly. Doesn’t even matter if it is poorly constructed or turns out to be false.


do I understand this correctly? that the paper is saying that covid 19 is highly likely a synthetic virus?


It's claiming that there are sequences on the viral code that are unlikely to have occurred naturally, but are really convenient for slicing the genetic sequence in a lab context.

Sort of like if you shaved the fur on a hyena and discovered a "THIS END UP" tattoo on its skin.


Or rather: you have many hyenas without fur, and on the skin you can see scars that suggest the skin has been cut at very specific places, places where you'd normally see operation scars. Now all the discussion comes down to is how likely it is that the positional patterns and the type of scar itself could occur in nature: for example, whether hyena bites and scratches make the same scars as a human scalpel.


oh damn! thats very interesting


On October 18, these three scientists presented an argument with evidence that Covid-19 was created in a laboratory. Their paper has not been peer reviewed, and it is self-published. The discussion here has a couple of counter-arguments. Non-experts would do well to wait for peer review before accepting these arguments.



I've seen several papers with the opposite conclusion. Why do we only see the lab leak hypothesis front paged on HN?


To my mind, there are a few reasons:

1) It makes the pandemic deterministic (bad lab security means an outbreak) instead of stochastic (wildlife spillover). That is, to be frank, even as an epidemiologist who is very skeptical of the lab leak hypothesis, a comforting thought.

2) It's a popular topic in the Substack/Medium set, because it moves the pandemic back into their wheelhouse of expertise, international relations, policy, etc. It becomes a human problem with human solutions.

3) It appeals to the contrarian mindset.

4) All of the lab leak papers at least attempt to show definitive proof. In contrast, actually finding the source of spillover events is the work of decades (and isn't always or even often successful). "Science is slow and uncertain" is a less compelling narrative.


> That is, to be frank, even as an epidemiologist who is very skeptical of the lab leak hypothesis, a comforting thought.

Yes, much more comforting to believe that the virus originated in a lab, that a successful (so far) conspiracy has been carried out to conceal its origins, and that there will be no transparency or accountability for any of the people involved, who are likely continuing similar research today.


There have been a huge number of knock on ramifications in the field to the idea that the lab leak hypothesis might be true already.


You're talking about the field of above-board academic research. Academics didn't create SARS-CoV-2 and cover up their own role in it.


It is not possible to be certain of this. You can't state that as an incontrovertible fact.

Is it beyond reasonable doubt that a virus created as part of a gain-of-function research programme escaped? We simply don't know definitively one way or the other, and it's rather unscientific to make blanket statements about these things one way or the other in the absence of evidence to validate the claim. (I'm an immunologist, by the way.)


In a way, the actual origin of Covid is a secondary matter at this point (because it's unlikely that China will give the required evidence to definitively disprove either the zoonotic or the lab leak hypothesis... am I wrong here ?).

The more pressing matter is the conspiracies between some top level specialists and some governments that seem to have effectively pushed public opinion away from the lab leak hypothesis (where they both can take a lot of blame) to the zoonotic hypothesis (less blame) - even though actual specialist opinion was (and still is) pretty split... (Not so much if you remove the specialists with a conflict of interest and the pressure they and the governments might have been able to exert ?)


I know very few specialists, including those without a conflict of interest, who think the lab leak hypothesis is true.

From my perspective inside the field, a small number of specialists and a very enthusiastic lay readership have pushed the lab leak hypothesis in a way that presents far more disagreement than there actually is.


Hmm, it's interesting that it's so different from my impression of specialist opinion...

Especially when the CNRS - (which would be partially guilty in the case of a lab leak !) - published the claim that the lab leak hypothesis could not be dismissed.

But you know how paradigms work, you have to be an anti-conformist to support what is going to become the new paradigm...

And sure, there most certainly is a very enthusiastic lay readership on the lab leak side... but why do you think there isn't one on the zoonosis side ?

And especially an even bigger fraction of specialists and lay readership that aren't rejecting either ?

(It bears repeating : I'm not anti-zoonosis, since I am not competent to judge either way anyway, I just find it very worrying that there seemed to be a campaign from quite early on, when social pressure wouldn't have had as much effect yet, to outright dismiss the lab leak theory, even though among the specialists gathered at the WHO in January 2020, there was no such consensus.)

I guess it only goes to show how bubbly the Internet (and your specific sub-field bubble ?) can be...


So one key is to note that not thinking the lab leak hypothesis is true =/= thinking it can be dismissed.

Because showing a zoonotic origin of a disease like SARS-CoV-2 is the work of years, if not decades, and with the disruption of the pandemic to research + the political ramifications of that work, it may well already be beyond us.

I think the reason there isn't an enthusiastic readership on the zoonotic side is simply because that's how people are conditioned to think about these diseases. Everything from movies like Outbreak to Contagion, the current fuss about the emu YouTube woman, etc. primes people to think about that. It's hard to foster an enthusiastic readership for what people think they already know.

It's also not particularly...sensational. There's no smoking gun. No shadowy conspiracy. The problems that emerge from zoonoses are hard and dispersed. It makes for very bad SubStack material.


One of the leading lab leak hypotheses (of which there are several) is that it was created as part of academic research - hence the fuss over PREDICT, the NIH and WIV.


> It makes the pandemic deterministic (bad lab security means an outbreak) instead of stochastic

"Bad lab security means an outbreak" is also stochastic though?


Not in the same way. It is admittedly still stochastic, but the product of conscious thought, decisions, etc. that have very obvious bad actors and process improvements. So semi-deterministic might be a fairer way to put it. But at the very least, "Stochastic with a much, much smaller threat surface".


> 1) It makes the pandemic deterministic (bad lab security means an outbreak) instead of stochastic (wildlife spillover). That is, to be frank, even as an epidemiologist who is very skeptical of the lab leak hypothesis, a comforting thought.

As if epidemiologists are the only profession studying stochastic phenomena. What does deterministic vs stochastic have to do with it? Every phenomenon has stochastic effects.

Bad lab security is stochastic too. Consider experimental security protocols and imagine them being set in year X. No protocol results in 100% security; we can only try to drive down the probability of a lapse. Say a certain step requires tubes to be UVC sterilized with a certain fluence (dose per cm^2). This was deemed suitable in year X, during which experiments happened at rate R_x. That implicitly corresponds to some unknown rate of lab leaks worldwide (perhaps once in a century, for example).

As education and automation progress, such experiments occur at higher and higher rates. A possible lesson from a conclusion that it was a lab leak, and that people again underestimated the exponential curve of scientific growth, could be that security protocols should be formulated in terms of the global rate of research. If there's 100x the rate of experiments compared to 20 years ago, perhaps the protocols should be tighter so that the security lapse rate decreases 100x as well, in order to maintain a constant tolerable global rate of lab leaks. The electrons in a flip-flop may individually behave stochastically, but collectively they behave very deterministically. The misconception that it is safe enough if lab security protocols meet a constant per-execution safety bar will deterministically be violated with constant growth in the rate of experiments.

(EDIT/ADDITION: In a world that accounts for the global rate of experiments, a current researcher reading the security protocols of the past will find the current protocols unfairly absurd when he realizes they are 100x more stringent, but that's because there were perhaps 10x fewer researchers in the past and throughput of experiments is now 10x higher; similarly, a researcher at the end of his career will find the security measures to be, say, 100x more stringent compared to the start of his career.)
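A minimal sketch of that scaling argument, assuming leaks are independent events so the expected number of leaks per year is simply experiments per year times the per-experiment leak probability. The baseline of 1,000 experiments per year is a made-up number for illustration; only the "once in a century" target and the 100x growth come from the comment.

```python
def required_leak_probability(tolerable_leaks_per_year: float,
                              experiments_per_year: float) -> float:
    """Per-experiment leak probability that keeps the expected global leak rate at the target."""
    return tolerable_leaks_per_year / experiments_per_year


TOLERABLE_LEAKS_PER_YEAR = 1 / 100  # "once in a century", from the comment
BASELINE_EXPERIMENTS = 1_000        # hypothetical experiments per year in year X
GROWTH_FACTOR = 100                 # 100x the rate of experiments

p_then = required_leak_probability(TOLERABLE_LEAKS_PER_YEAR, BASELINE_EXPERIMENTS)
p_now = required_leak_probability(TOLERABLE_LEAKS_PER_YEAR,
                                  BASELINE_EXPERIMENTS * GROWTH_FACTOR)

# If the year-X protocol is left unchanged, expected leaks grow with the experiment count.
leaks_if_unchanged = p_then * BASELINE_EXPERIMENTS * GROWTH_FACTOR

print(f"protocol must be {p_then / p_now:.0f}x stricter per experiment")
print(f"expected leaks per year with the old protocol: {leaks_if_unchanged:.2f}")
```

The point of the sketch is only that a fixed per-experiment safety bar implies a global leak rate that grows linearly with research volume.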

> 2) It's a popular topic in the Substack/Medium set, because it moves the pandemic back into their wheelhouse of expertise, international relations, policy, etc. It becomes a human problem with human solutions.

Why does humanity like to install lightning rods? Because it moves the 'act of god' lightning phenomena (and resulting fires etc.) back into their wheelhouse of expertise, policy, etc. It becomes a human problem with human solutions.

> 3) It appeals to the contrarian mindset.

There exists no such thing as the contrarian mindset. That's an epidemic idea used to silence criticism.

> 4) All of the lab leak papers at least attempt to show definitive proof. In contrast, actually finding the source of spillover events is the work of decades (and isn't always or even often successful). "Science is slow and uncertain" is a less compelling narrative.

For the average cold coronavirus (not SARS-CoV), it's very easy to detect and trace the path of a virus; it's just that nobody bothers.

When a new potential epidemic was detected (just a few deaths and a few sick people, as the likelihood punctured the noise floor), it was quickly traced to the cooling towers near the city where I live. It turned out there was negligence which allowed a microbe to prosper and infect people. Initially there was doubt about which cooling tower of which company caused the infections, but genetic sequencing quickly identified which tower caused the issues. It didn't take decades, just weeks.


Zoonotic spillover is likely increasing in China due to climate change, more human-animal contact, and animal farming, so it probably has a large "deterministic" background.

Of course if you go down that rabbithole then you might wind up concluding that the vegans have been right all along, which certain groups might find an unpalatable result.


I don't think you have to attribute it to some interest at stake: if it is zoonotic, people see it as the cost of living. Of course, that has to be changed, but it's a boring conclusion, and unlikely to cause a change quickly, because we are slow as molasses. A lab leak is more exciting, should not happen, gives more to discuss, and can be outlawed more easily. I think that explains why there's more interest.

I'm still on the fence, BTW.


Cynical engineers are required to believe in it because it's 1. "the opposite of the current thing" 2. contains more rationality than the other leading hypothesis (i.e. more of what happened was planned, rather than an accident) 3. and lets them blame a person rather than a bat.

An ironically fitting end for them would be that it is true, they get all similar research banned, and then the next virus is zoonotic and nobody is prepared for it due to not doing any research.

edit: added "ironically"


Unfalsifiable speculation about the psychological motives of people proposing a hypothesis, a retribution fantasy on the "fitting end" for such cynical deviants, plus an inversion of what side of the investigation was subjected to censorship? Nice combo.


That's pretty dark.

A nicer thing to wish for is that it leads to better safety and accountability protocols.


None of the gain-of-function research has been any help at all in the COVID pandemic, nor in the SARS outbreaks. I also fail to see how it could have helped. What particularly helpful insight or technique came from that particular lab in Wuhan?


Now that you mention it, I wonder if there’s any correlation between the startup-mindset (believing that success comes from hard work more than good luck), and the conspiracy-mindset (believing that disaster comes from hard work more than bad luck)...


Honestly, I think part of it is a wider failure of people not engaging in stochastic thinking.


"Dog bites man" is not a story because it's just a thing that happens. "Man bites dog" is a story because it is surprising and not normal.


I have trouble understanding your point. Do you mean that HN front page should only show articles which are in line with the most common papers?


Wasn't this a conspiracy theory at the start of this?

I'm sure this kind of talk might have been squashed and suppressed, and some people might even have been banned for saying such things?


It is a conspiracy theory until you have good data and rigorous analysis. If you cannot see that people resisting jumping on an emotionally appealing bandwagon with racist undertones before careful evidence is collected on it are being reasonable, then I honestly don't know what to do for you.


>If you cannot see that people resisting jumping on an emotionally appealing bandwagon with racist undertones

It wasn't that back then, especially if you were following trends in that vein of research. We just had a cheeto in office, so people reached for the first shoe to dismiss it/discredit the cheeto instead of actually grappling with an intrinsically frightening, destabilizing possibility.

Also, treating anything as a conspiracy theory (a phrase coined, popularized, and cultivated during the Cold War/Vietnam era to discredit activists/leakers) by default is perhaps not the best starting point for a proper investigator. You should be going in with as few notions as can be supported by facts you can prove, or that can't be otherwise immediately disproven. The disprovable ones should only be thrown out if you are handed evidence that disproves them. Something you can't prove, but can't get cooperation to disprove, is still on the table.

Once you have eliminated the impossible, whatever remains, no matter how improbable, must within it contain the truth. And you won't ever get there if you're putting out your eyes in the beginning.


It is still a conspiracy theory, because it theorizes the existence of a conspiracy designed to cover up the true origins of SARS-CoV-2, it just happens to be a theory that is almost certainly true. The conflation of the term "conspiracy theory" with an entirely separate meaning of "untrue claim" is a huge problem, exactly because it is used as an excuse to censor people who are pointing out where conspiracies are likely to be hiding - a highly valuable and useful activity that society should encourage!

At any rate, this particular conspiracy theory was obviously very likely to be true right from the moment it was first proposed. The analysis and data was always rigorous. This new genetic evidence is good, but the extraordinarily tiny probability of a very unusual new virus emerging right next to one of the only labs in the world doing research on that exact type of thing, already rendered it irrational to adopt the natural origin theory right from day one. You don't even need to study the RNA at all to know that SARS-CoV-2 is a very unusual event, having CoV virus labs in a city is also very unusual, that lab leaks are common, that public databases being pulled offline due to "hacking" right before such a virus emerges is very unusual, and that these things intersecting followed by suspiciously immediate and vehement denial on the back of no evidence at all is also very unusual. Estimate those probabilities and then multiply them together. The number you get is astronomically low.


If this turns out to be true, how many cumulative man years of media coverage on this issue can we say was complete garbage?


.


I'm pretty sure the way they did science in the 24th century of a fictional TV show is slightly different from how it's done in reality.


I honestly thought it was common belief that corona came from that lab in China.

But now after reading this, I searched a bit and read up on it and I guess it is still a somewhat honest debate on the topic.

Even if it was lab made, it would be sort of stupid to dig into it, due to the political nature of the matter. What happened happened, and most likely the release would have been accidental, so why play blame games?


I don't think it's all about blame, knowing it came from a lab would also pull into question the practice of engineering viruses and the safety and security standards required.


True, it would be helpful in that kind of way. The risks do seem to outweigh the benefits.

That said, the blame game would still be there, and people would pull that card...


If it is possible that viruses of this type could be engineered, then lab safety needs to be upgraded anyway (or GoF research banned, or both) regardless of the origin of this specific virus.

It’s all about blame. Blame is a useful geopolitical tool.


If it was released from a lab, no matter whether or not it was accidental, Covid would be the mother of all torts. The entire planet can show harm and will be very interested in recovering their losses from the entity that mis-handled a lethal virus.


It'd be much like trying to recover money stolen by a drug addict though. Instant bankruptcy and proportionally nothing recovered, to the point that it's not even really worth it.


A drug addict with a $14T GDP.

With ~7M deaths and a ~$1-10M value of human life [1], that's $7-70T in losses in lives alone, before lost productivity and economic value.

[1] https://en.wikipedia.org/wiki/Value_of_life#:~:text=In%20Wes....
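The range above is just deaths times value per life; a minimal check using only the figures quoted in this comment (the constant names are illustrative):

```python
DEATHS = 7_000_000               # ~7M deaths, as quoted above
VALUE_OF_LIFE_USD = (1e6, 10e6)  # ~$1-10M per life, as quoted above

# Lower and upper bound on losses in lives alone, in USD.
low, high = (DEATHS * value for value in VALUE_OF_LIFE_USD)

print(f"${low / 1e12:.0f}T to ${high / 1e12:.0f}T")
```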


Only if liability transfers to the state. Even if state-owned, plenty of state-owned enterprises have limited liability.


Even if that state has nukes?


The poor handling of the pandemic in many countries likely greatly increased the losses in those countries and also made it harder for other countries to get it under control. The tortfeasor would likely only be responsible for the losses that would have likely occurred if the pandemic had been handled competently.

A lot of countries would probably really not want to have the kind of public in-depth examination of their pandemic responses that would be necessary to figure damages.


i wonder what % of damage it caused to economy, compared to WW2 reparations.


Yup, I think that's the real worry of CCP officials and other world leaders: that not only will China lose face and lose influence, but also the world will demand reparations. That might cause a collapse of the already unstable Chinese system with ripple effects all over the world.


> why play blame games

Because of the question of liability: If it was lab made and accidentally released, was it due to recklessness or criminal negligence? Is someone guilty of involuntary mass-manslaughter? Or if this was state-sponsored research, could they be found liable for the damage caused?


What court would usefully find this? What authority would enforce it?


I think you'll find the wheels are already in motion (if the hypothesis is true).


US court?


Against China?


Even the big paper publishers say "No conclusive evidence for either theory."

https://www.science.org/content/article/do-three-new-studies...

>> Still, Worobey and his co-authors concede, even that evidence might not be enough to end this polarizing debate. “With the way that people have been able to just push aside any and all evidence that points away from a lab leak, I do fear that even if there were evidence from one of these samples that was full of red fox DNA and SARS-CoV-2 that people might say, ‘We still think it actually came from the handler of that red fox,’” Worobey says.


> Even the big paper publishers say "No conclusive evidence for either theory."

They initially claimed their evidence was "dispositive" but their peer reviewers (correctly, IMHO) made them take it out.

https://ayjchan.medium.com/evidence-for-a-natural-origin-of-...


I originally didn't think that it came from a lab. Primarily because flu epidemics have happened before and they were of natural origin and there hadn't been good evidence that it came from a lab. But my opinion changed as more good evidence started to turn up that suggested engineering.


There's literally no evidence suggesting engineering. This is a deeply flawed paper (several coronaviruses of natural origin "rate" higher as artificial using their methodology) and essentially every other paper claiming to find some signal of engineering has been disproven or retracted.


Genuinely surprises me that there are people out there who think or have been persuaded the virus is of natural origin. The lab right next to the market was literally studying and experimenting with the exact same type of virus. How can someone think that that's just a coincidence?

Add that to the fact that the funding for that research lab was approved by the same guy who became the de facto thought leader on the virus in the U.S., AND funded by the foundation of one of the most recognizable American billionaires. To put the cherry on top, even suggesting a synthetic origin resulted in bans on most social platforms!

This stuff is common sense...Occam's razor comes to mind. No wonder there were so many "conspiracy theories".


> Genuinely surprises me that there are people out there who think or have been persuaded the virus is of natural origin. The lab right next to the market was literally studying and experimenting with the exact same type of virus. How can someone think that that's just a coincidence?

It seems plausible, at least, that it leaked from the lab, in the sense that labs aren't magically impenetrable and leaks could happen.

> Add that to the fact that the funding for that research lab was approved by the same guy who became the de facto thought leader on the virus in the U.S., AND funded by the foundation of one of the most recognizable American billionaires. To put the cherry on top, even suggesting a synthetic origin resulted in bans on most social platforms!

I don't understand what this is supposed to mean. How does the fact that this guy (Fauci?) approved some funding make COVID seem more likely to have been leaked from a lab? It seems natural that somebody who's been working in government on medical topics at a high level for a long time would have approved funding on lots of things, and also likely that they'd become a figurehead in a pandemic, but I don't see any deeper links.


Absolutely incredible to think that the National Institute of Allergy and Infectious Diseases might fund research into infectious disease.


I think you’re committing a motte-and-bailey here. We are talking about research deemed so dangerous Obama banned it, not infectious disease research in general.


It looks like the US government stopped funding Gain of Function research for 3 years while working on handling guidelines, is that what you are referencing? That seems more like working out some details than burying the forbidden tomes in the desert.


My Occam's razor says that every other virus has a natural origin, so why wouldn't this one too? Maybe your razor needs sharpening?


Seeing how the previous SARS-CoV was from natural origins, most likely the following ones will be, too.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7113851/

>> In most bat families, both alpha- and betacoronaviruses are known, and these detections have originated from both frugivorous and insectivorous bat hosts. Lack of detection in the remaining bat families is likely due to non-exhaustive sampling of the almost 1200 extant bat species (Schipper et al., 2008, Simmons, 2005, Teeling et al., 2005). This void may be filled in future studies.


> My Occam's razor says that every other virus has a natural origin, so why wouldn't this one too?

That's simply not true. There's an entire branch of virology dedicated to synthetic viruses; the first was made over 20 years ago.


[flagged]


Does your razor include a previous pandemic originating from a Soviet lab leak?

https://en.wikipedia.org/wiki/1977_Russian_flu


Updated so that now all viruses both past and present are of synthetic origin. Thanks!


Virus that escaped from a lab != intentionally engineered synthetic virus.


Obviously. I’m just pointing out his razor isn’t well informed.


You seem to think a natural origin virus can't be A) examined in a lab, and leaked; B) cultured in a lab, and leaked; C) tweaked in a lab, and leaked.

All three of these are possible. And all are compatible with your Occam's razor (and mine).

My bet is with option C.

Lab leak of a natural-origin, but tweaked, virus.

Yes, it can be both natural origin and tweaked. When you tweak something, you have to start with a something, and that something can have a natural origin.

The evidence against the lab leak appears cooked up based on data provided by interested authorities in China, and was assembled and presented by compromised people (people in the field with an active interest in protecting their own field) who actively promoted the false claim that they had no conflict of interest.


What do you think "gain of function" research is even doing? Starting with natural origin viruses, and tweaking them.


How can you prove that natural origin is the hypothesis selected by the razor in this particular case?


The point of Occam's Razor is you can't prove the things the Razor leans you to. Not in a way sufficient to remove the need to invoke the Razor. But you can say that one explanation is simpler than another (such as a pandemic virus being more closely patterned to every other pandemic in human history than to a novel mechanism that has never become a pandemic before).

I see a bright glow on the eastern horizon about 7AM and it's probably the sun coming up. It could be the first strike in a world-ending nuclear exchange. I can't prove it isn't.

... but it's probably not.


I wasn't asking for a proof of the phenomenon, but a proof that the razor points to that phenomenon.


That's going to come down to an individual observer's priors on probabilities of "pandemic virus being more closely patterned to every other pandemic in human history" vs. "pandemic introduced via a novel mechanism that has never become a pandemic before."
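To make the dependence on priors concrete, here is a minimal sketch of Bayes' rule in odds form. Every number in it is a made-up placeholder, not an estimate of anything about SARS-CoV-2, and the function name `posterior_probability` is only for illustration.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio.

    prior            -- an observer's prior probability for some hypothesis H
    likelihood_ratio -- P(evidence | H) / P(evidence | not H)
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Two hypothetical observers weigh the same evidence (placeholder likelihood
# ratio of 3) but start from different placeholder priors for H.
EVIDENCE_LR = 3.0
for prior in (0.05, 0.50):
    post = posterior_probability(prior, EVIDENCE_LR)
    print(f"prior={prior:.2f} -> posterior={post:.3f}")
```

The same evidence moves both observers in the same direction, but they can still end up far apart, which is why the disagreement reduces to the priors.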


Wasn't the lab working with coronaviruses? So maybe some of it escaped. I really don't see how that's an unnecessary multiplication of entities. Is the objection simply that pandemics have emerged from markets, but not labs? But we know that an escape of a coronavirus could lead to a pandemic.

I see nothing needlessly complex here, and certainly not extraordinary.


I don’t believe anyone is saying it’s needlessly complex, but Occam’s Razor isn’t “the second simplest explanation is usually the correct one.”

On one hand you have the market, which housed animals, some of which have carried and spread different coronaviruses naturally throughout its history. Every prior pandemic is believed to have come from animal-to-human transmission, so it’s seemingly the simplest explanation.

The lab leak theory of course assumes a lab accident. Biolab accidents are fairly rare, some years have no recorded accidents, others a handful. They’ve all been successfully contained with no more than a couple fatalities each.

It’s not that it’s needlessly complex. It’s that you have a market with animals who have coronaviruses and you have a lab that studies coronaviruses. Both are reasonable explanations, but a pandemic similar to all other prior pandemics and lots of humans being exposed to lots of potential virus carriers with no protection _is_ a simpler explanation.

Having said that, I’ve remained open minded and willing to consider both reasonable and unproven. Some pandemic will be the first from a lab leak, this may have been it. We will probably never definitively know.


> It’s not that it’s needlessly complex. It’s that you have a market with animals who have coronaviruses and you have a lab that studies coronaviruses. Both are reasonable explanations, but a pandemic similar to all other prior pandemics and lots of humans being exposed to lots of potential virus carriers with no protection _is_ a simpler explanation.

The likelihood of a lab accident is independent of "all other prior pandemics", because it is not adjusted by whether there have been two natural pandemics, or two million of them. Rather, it's adjusted by the safety practices at the lab.

It's still not obvious that the wet market explanation is simpler than the lab leak. I don't see Occam's Razor making a selection here.


When you hear hooves, think horses not zebras.

All throughout history viruses have spread from close animal contact with humans. Including—and I cannot stress this enough—a novel coronavirus circa 2002 that spread from bat to intermediary host to humans in a wet market in the very same country.

In comparison, for the lab leak you’re assuming a lab accident (rare), that it wasn’t noticed and quarantined immediately (even rarer), and that the wet market still randomly became the epicenter of early cases despite at that point the transmission being human to human (rare, there’d be no reason for it to be any more significant than any other place people gather).

So when we’ve never seen a significant outbreak from a lab accident (zebras) and throughout all of human history we’ve seen viruses spread to humans from close contact with animals (horses)… yes when there is ambiguity to what the cause is, defaulting to what we’ve always seen before—including literally just 20 years ago—is simpler.


Lab accidents may be rare in general, but the lax safety at Wuhan had already [0] been noted two years prior.

The horse/zebra analogy is not so convincing within miles of a zebra breeding farm, with fences reported to be weak. Near this place, such assumptions don't carry their normal weight.

[0] https://www.voanews.com/a/covid-19-pandemic_chinese-lab-chec...


Entities should not be multiplied without necessity.

Put another way, the simplest explanation that fits the available evidence is most likely to be correct.

At this point it seems to me that a lab origin that superspreads at a market looks rather simpler than a market origin with double zoonosis and no trace of an intermediate host.


To be fair, there’s no trace of it in the lab either /s

Double zoonosis? The virus jumped to us and then to dogs, cats, tigers, hippos, pigs, etc. 29 known species! An unfathomable triple zoonosis that happened many, many times (thousands?) over the years. I’d say double zoonosis isn’t too uncommon.

The initial SARS presumably passed from bat to intermediate host to humans in a wet market. This exact type of event happened in China just 20 years ago.

And there isn’t “no trace” of an intermediate host. There are suspected intermediate hosts but none confirmed. Similarly SARS had a couple potential intermediate hosts, and MERS had a potential intermediate host but unanswered questions around that, too. Scientists had more confidence in those theories, but still uncertainties.

Again, I think both theories are plausible. But being a repeat of what we’ve seen before _is_ simpler. When you hear hooves, think horses not zebras.


https://en.m.wikipedia.org/wiki/Sagan_standard

Extraordinary claims require extraordinary evidence. This is why people are skeptical.


Great attempts are made to prevent lab leaks, so a failure of procedure can't be considered extraordinary.


It depends on Bayesian priors: how many coronaviruses were studied, how frequently a new virus is found by chance next to a biolab studying that virus, etc. The extraordinary claim could actually be that it is of natural origin.


Fun fact! There are just 9,110 named virus species as of 2021. But there are estimates that mammals and birds may have as many as 1.7 million undiscovered viruses. An older source says there were ~200 known viruses that could infect humans as of 2013. They estimate there are 800k unknown ones which could infect us. Basically every time they look for viruses, they find new ones.
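As a rough back-of-the-envelope check, the figures quoted above can be turned into "fraction known" estimates. This is only a sketch: the counts are the ones from this comment (not independently verified), the first pair compares named species across all hosts against an estimate for mammals and birds only, and `known_fraction` is a hypothetical helper name.

```python
# Figures as quoted in the comment above; treat them as rough estimates.
NAMED_VIRUS_SPECIES = 9_110                # named virus species as of 2021
EST_UNDISCOVERED_VIRUSES = 1_700_000       # estimated undiscovered in mammals and birds
KNOWN_HUMAN_INFECTING = 200                # known human-infecting viruses as of 2013
EST_UNKNOWN_HUMAN_INFECTING = 800_000      # estimated unknown human-infecting viruses


def known_fraction(known: int, unknown: int) -> float:
    """Share of the estimated total (known + unknown) that is already known."""
    return known / (known + unknown)


overall = known_fraction(NAMED_VIRUS_SPECIES, EST_UNDISCOVERED_VIRUSES)
human = known_fraction(KNOWN_HUMAN_INFECTING, EST_UNKNOWN_HUMAN_INFECTING)

print(f"named vs. estimated undiscovered: {overall:.2%} known")
print(f"human-infecting:                  {human:.3%} known")
```

Under these quoted estimates, both fractions come out well below 1%.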


However, the evidence that Ecohealth Alliance refuses to share their Wuhan lab notebooks “explains away” other beliefs.

If you’re at home and you hear loud booms, then look at the calendar and see it’s July 4, the date “explains away” the thousands of other explanations for why you are hearing loud booms.


How many bat coronaviruses existed in the vicinity of the Wuhan market, including the specimens from the laboratory? How many bats were freely roaming in the neighborhood? That's a much better probability analysis than a flu virus in Antarctica.


Right, what I’m saying is we know <1% of virus types, and it’s reasonable to believe we know <1% of coronavirus strains.

The bats don’t need to be at all near the market, the theory is an intermediate host was in the market. A bat could infect the host population days prior or years prior. A long potential span of time.

It might be illustrative to look at MERS. They found camel populations >3,000 miles away from the outbreak with antibodies to MERS. One potential explanation is the virus doesn’t sicken camels and had been spreading in their population for a long time prior to transmission to humans.

Or we look to SARS. They did not find SARS in a bat population, but after 5 years of searching they did find a cave where all the building blocks existed. It was a couple provinces away from the outbreak. >500 miles.

So despite your beliefs, how many bats freely roaming the neighborhood may not actually be a much better probability analysis than flu viruses in Antarctica. /s


Because the theory of an intermediate host hasn't led anywhere after 3 years, but we know for a fact that experiments with bat coronaviruses did happen 1km from the epicenter, I don't get why people insist so much. I think it's more of an "I hope it's not true" thing.


No experiments with bat coronaviruses happened "1 km" from the epicenter. WIV is 15km away, so even if you were to believe it originated at the WIV, it's still odd that it took off at a seafood market 15km away instead of at the very busy campus or any market much closer to their research center.


It's very clear that you really don't want to believe in the probability. You know nothing is 100%. If you assign 100% for no leak then your bias level is 100%. If you assign some probability then we just disagree on the probability level. Btw, people from the lab live there and shop there.


Of course there's the possibility - but the probability surely changes if the "market right next to the research lab" actually turns out to be a 20-minute drive away, no?

For people in the Bay Area, this is roughly equivalent distance-wise to a conspiracy theory about an outbreak at Twitter's HQ started by an "adjacent" lab in the Upper Mission that actually turned out to be in Jack London Square in Oakland instead.


Again, Bayesian priors. If there are only 4 labs worldwide messing with exactly these viruses, the probability of this happening 20km from one of those 4 by chance is infinitesimal.


I'm bored of this debate that happens every time one of these charlatans releases a paper that grabs a bunch of headlines and is disproven two days later (as already has happened with the original in this thread).

But "This happening" needs to be defined for Bayes to be of any use. State clearly what you think happened and it can be judged, because as has been proven over and over again, the virus is of natural origin. So the proximity to a lab studying Coronaviruses isn't nearly as interesting given China's history with these viruses.

This is long but in my opinion, worth reading regarding the Bayesian probabilities that actually matter here: https://protagonistfuture.substack.com/p/natures-neglected-g...


See how your wording is super biased? "Charlatans"


What would you call non-virologists writing virology papers, releasing them as non-reviewed preprints with a huge PR push and then weeks later quietly admitting that their science was bad and none of their conclusions held up?

What if they’d done that multiple times?

“Bias” isn’t relevant when it comes to calling out bad actors.


How is it an extraordinary claim?

It is by far the simplest and most obvious explanation.


Add to that the fact that China forbade investigations and, furthermore, that these regimes have a long history of obscurity.


Studied and then escaped from a lab does not equal synthetic.

Exactly nothing in your post supports a synthetic origin over a sample from nature that got studied in a lab.

Jumping to these kinds of one sided conclusions should be a red flag.


It could also be studied and altered in a lab, and then escape.


> It could also be studied and altered in a lab, and then escape.

Of course. But saying "It was studied in a lab, therefore it must be synthetic" is just idiotic and pretty much derails the entire conversation.

It derails the conversation because it conflates a number of issues that have to be assessed separately: lab safety, mucking about with deadly diseases, and the intent to set deadly diseases free.

If you start conflating any one of these, it becomes easier to deride the entire conversation as tinfoil hattery. If just one of those components is an easy target for derision or is seen as unlikely, everyone who dares to pick up any of the other points gets painted in the same colour in the public eye. There doesn't even need to be any evil intent behind it. It's just how people are.

And this is precisely what happened when people started pointing out the existence of a lab next to ground zero. A lot of scientists retracted their speculation on this because they were immediately put into the same category as the crowd shouting "tHe cHiNeSe made iT!!".


Exactly. And so the crowd shouting "tHe cHiNeSe dIdN't mAkE iT" has been dominating the conversation by disingenuously pretending that the bar to be met is 100% total from-scratch lab synthesis or nothing.


I thought it was common sense to not believe in a thing if there is no evidence of a thing. The only way I am going to believe it was a lab leak is if credible evidence shows up. Until then it is an unknown.


Exactly this. I'm completely stunned how everybody seems to be ignoring the fact that the closest relatives of the Covid virus are in Yunnan, which is 1500 km away from Wuhan, or even further, in Laos. So a zoonotic origin of the outbreak which started in the Wuhan market would have involved animals transported to the market over 1500km or more. Doesn't anybody ask themselves, why Wuhan? Why was there no outbreak in any other city in China that was closer to Yunnan? Or even in some market in Yunnan itself?


Everyone agrees that SARS1 was of natural origin, right? All of the same points you're making about Covid19 apply equally to the origin of SARS1.

First outbreak of SARS1: Foshan

Distance from cave containing progenitor virus in Yunnan to Foshan: 1,400 km

Doesn't anyone ask themselves why Foshan? Why not Qujing or Hanoi?

People don't take the location as some dispositive point proving the conspiracy because 15 years prior, a bat coronavirus from nearly 1,500km away infected an animal in the wildlife trade that started a pandemic in a busy metropolitan area. So it doesn't seem especially unlikely for that to have happened again.


I have the opposite Occam's razor thoughts. My opinion is we are not capable of developing in a lab a virus that is so transmissible and survivable in human species only. I think the complexity of the virus machinery and its interactions inside of our bodies and immune system is beyond astronomical in complexity. It's laughable to suggest that we are so intelligent as to invent a better version of the machinery that is hypothesized as the very machinery responsible for creation of multi cellular life itself.


Well, SARS-CoV-2 isn't transmissible in humans only; it survives just fine in many other mammals, including pets.

This is basically the reason we can't get rid of it. Even if we had a perfect human vaccine that blocked transmission, it wouldn't help.


Speaking as an artist, many (most?) of my enduring works were the result of an accident of some kind. I call them "happy accidents" because I recognized that the mistake was better than whatever the vision was that I had at the time.

As a corollary, there are unhappy accidents, and with respect to life forms in a chaotic system, such accidents can perpetuate and endure without human recognition.


The lab in question was doing gain of function research.

The existence of gain of function research shows that you are wrong.


> My opinion is we are not capable of developing in a lab a virus that is so transmissible and survivable in human species only. I think the complexity of the virus machinery and its interactions inside of our bodies and immune system is beyond astronomical in complexity. It's laughable to suggest that we are so intelligent as to invent a better version of the machinery that is hypothesized as the very machinery responsible for creation of multi cellular life itself.

The very first synthetic virus was created in 2002 and was modeled after polio, which is fairly transmissible and affects humans. That virus was made 20 years ago; synthetic biology has come a very long way since then.

Does that fact alter your opinion?


Sorry, it does not. Was that virus more deadly, effective, or in any other measure better than the original polio? Or was it "polio" with a spike protein glued to its head?


You can trigger a pandemic by making a virus less deadly too.


There is a huge variety of viruses, just because someone wrote the equivalent of "Hello World" doesn't mean you can write a complicated CMS anytime soon.

Synthetic biology (the actual synthesis of DNA) has come a long way, we don't understand all the components yet though.


> The lab right next to the market was literally studying and experimenting with the exact same type of virus. How can a someone think that that's just a coincidence?

If they built the lab next to the viruses that were already there, then it would be reverse causation, which isn't a coincidence, but also kind of is one.

That's the meaning of "correlation is not causation".


Except they didn't. The WIV was founded decades before coronaviruses were a subject of interest, and the bat coronaviruses they were researching that SARS-CoV-2 is related to all come from caves in Yunnan, which is about 1,500km away.


I see dang no longer bothers to comment when he editorializes a title with a "?" to indicate a non-narrative headline.


Sometimes I do and sometimes not. That's nothing new.

I don't think it's so unclear why we'd put that up there.


[flagged]


More likely it's academic laboratories doing reckless gain of function research because it keeps the grant money rolling in, and in Wuhan Lab's case, they also half ass the safety precautions.


Aren't there enough infections already to sell tons of drugs? Why design a new one for this?


Well, if you own the patent on the only approved drug for the condition, it would make financial sense.

NOTE: I am emphatically not asserting that this is the case with SARS-CoV-2. I am only responding to the parent comment's criticism of a possible financial motive for hypothetical biotech companies to purposefully engineer pathogens for profit.


>So biotech companies secretly manufacturing illnesses....

They are not, but they profit from lab leaks like Covid.


The lab leak hypothesis is the “jet fuel can’t melt steel beams” of our day.


Or the "Iraq doesn't have WMD" of our day.


Of course the lab leak origin is probably accurate, it appeared right in the vicinity of the lab and is the exact thing they were studying. This shouldn't be controversial, it shouldn't have been politicized to the point where the truth matters far less than whose narrative it supports. We should have taken it as an accident, learned lessons, improved processes, and moved on. Instead we tore ourselves apart, and now we're back to playing nuclear Russian roulette, with maybe half the chambers loaded this time. Good job humans.


True story: in early January of 2020, on an academic virology forum, which I won't link, it was known the virus was synthetic. Before the story hit the mainstream media, professors were sharing data about the virus fingerprint. There were concerns about the integrity of the data shared by the Chinese. One specific comment I will never forget, by a Harvard professor when discussing the implications:

"Should we turn on the bat signal"

I always interpreted that to mean "should we alert the authorities?" A week later all posts were deleted and nothing could be found.


> which I won't link

Why not? HN promotes logical evidence-backed discussion. The least you can do is link the forum.


He's talking about virological and making up threads that never existed. There was a lot of discussion in the early days about how it seemed possible it was synthetic based on the binding site affinity but as scientists do, they researched the literature, conferred with their colleagues and almost all came around to the virus arising via natural origin.

The early threads are super interesting in an anthropological sense:

https://virological.org/t/tackling-rumors-of-a-suspicious-or...


You can claim whatever you want, I'm just sharing what I discovered


This is super interesting but without any sort of substantiation it is indistinguishable from rumor-mongering.

I would truly love to hear more.


Is there a wayback machine archive of that conversation?


“It was known” counts as weasel words on Wikipedia.

Known by whom? Why? Based on what evidence? Can we know it, too?



