On August 27, 2021, the Office of the Director of National Intelligence released a summary of the US Intelligence Community’s assessment on the origins of COVID-19.1 Four of the agencies involved and the National Intelligence Council assessed “with low confidence that the initial SARS-CoV-2 infection was most likely caused by natural exposure to an animal infected with it or a close progenitor virus.”2
One of the agencies—later reported as the FBI3—assessed “with moderate confidence that the first human infection with SARS-CoV-2 most likely was the result of a laboratory-associated incident, probably involving experimentation, animal handling, or sampling by the Wuhan Institute of Virology.”
“These analysts,” the summary continued, “give weight to the inherently risky nature of work on coronaviruses.”4
According to the World Health Organization (WHO), there have now been more than 360 million confirmed cases of COVID-19, resulting in over 5.6 million deaths worldwide.5
Questions about the origins of COVID-19 are of more than academic interest.
From Animal Hosts
Zoonosis is considered the default explanation for the outbreak of any new infectious disease. A number of pandemics occurred during the twentieth century, almost all of them of zoonotic origin. The one known exception is the 1977 H1N1 flu pandemic, which was caused by an insufficiently attenuated vaccine candidate that escaped either from a laboratory or from clinical trials.6
A number of disease outbreaks began in Southeast Asia following zoonotic jumps: the Asian flu pandemic (1957), which originated in China; the Hong Kong flu pandemic (1968); and the avian flu outbreak (2005), which was first reported in Vietnam. The first SARS (severe acute respiratory syndrome) coronavirus outbreak began in China during 2002 and infected more than 8,000 people worldwide between 2002 and 2003, as well as dozens more people in 2004 after several laboratory leaks.
In a 2007 paper for Clinical Microbiology Reviews, a team of virologists from the University of Hong Kong issued a clear warning:
The presence of a large reservoir of SARS-CoV-like viruses in horseshoe bats, together with the culture of eating exotic mammals in southern China, is a time bomb. The possibility of the reemergence of SARS and other novel viruses from animals or laboratories and therefore the need for preparedness should not be ignored.7
Horseshoe bats, the genus Rhinolophus, are the natural reservoir for hundreds of coronavirus strains closely related to the SARS virus.8
Once the SARS-CoV-2 outbreak had begun, virologists quickly reached the conclusion that the pandemic was almost certainly of natural origin. In February of 2020, barely a month after the SARS-CoV-2 genome was released, a team led by Kristian Andersen, an immunologist at the Scripps Research Institute in California, published a preprint and then a paper in Nature Medicine entitled “The Proximal Origin of SARS-CoV-2.”9 If SARS-CoV-2 had been designed, they argued, it could have been designed better, and since it was not designed better, it most likely was not designed. “While the analyses … suggest that SARS-CoV-2 may bind human ACE2 with high affinity,” the Nature Medicine paper noted, “computational analyses predict that the interaction is not ideal and that the RBD [receptor-binding domain] sequence is different from those shown in SARS-CoV to be optimal for receptor binding.” “The high-affinity binding of the SARS-CoV-2 spike protein to human ACE2,” the authors concluded,
is most likely the result of natural selection on a human or human-like ACE2 that permits another optimal binding solution to arise. This is strong evidence that SARS-CoV-2 is not the product of purposeful manipulation [emphasis added].10
A month before Nature Medicine issued the paper by Andersen et al., The Lancet published a letter signed by 27 leading virologists dismissing the hypothesis that the virus originated in a laboratory:
The rapid, open, and transparent sharing of data on this outbreak is now being threatened by rumors and misinformation around its origins. We stand together to strongly condemn conspiracy theories suggesting that COVID-19 does not have a natural origin.11
One of the authors of the letter was Peter Daszak, the president of EcoHealth Alliance, a US-based nonprofit NGO. Since 2004, EcoHealth had been collaborating with the Wuhan Institute of Virology (WIV) on studies of coronaviruses in bats.12 The relationship between EcoHealth and the WIV was close. A specialist in the transmission of infectious diseases among animals, Daszak was frequently listed as a coauthor on their papers, often alongside the director of the WIV’s Center for Emerging Infectious Diseases, Shi Zhengli.13
The authors of the letter that appeared in The Lancet, Daszak among them, declared that they had reached their conclusions while holding no competing interests. It was not until 16 months later that the journal issued a demurral with respect to Daszak’s declaration. He updated his statement to clarify his employment at EcoHealth and the nature of EcoHealth’s research in China, and to affirm that their “work in China was previously funded by the US National Institutes of Health (NIH) and the United States Agency for International Development (USAID).”14 Daszak’s updated disclosure does not include any mention of the WIV, instead referring to EcoHealth’s “collaboration with a range of universities and governmental health and environmental science organisations.”
On January 14, 2021, a multidisciplinary team of international experts, Daszak among them, traveled to Wuhan to investigate the origins of the virus on behalf of the WHO.15 The study lasted 28 days. The WHO team was given a guided tour of the WIV facilities and they were able to interview some of its scientists. The “introduction [of the virus] through a laboratory incident,” the WHO concluded, “was considered to be an extremely unlikely pathway.”16 Instead, they argued, “introduction through an intermediate host is considered to be a likely to very likely pathway.”17 Elsewhere in their report, the WHO team repeated assurances they had received during their time in China:
The Wuhan CDC [Center for Disease Control and Prevention] lab which moved on 2nd December 2019 [to a new location near the Huanan market] reported no disruptions or incidents caused by the move. They also reported no storage nor laboratory activities on CoVs or other bat viruses preceding the outbreak.18
If, in February of 2021, the WHO’s team of experts were prepared to take the WIV scientists at their word, by August of 2021, some of them confessed to having had reservations all along. In an interview for a Danish television documentary, Peter Ben Embarek, the leader of the WHO team, admitted that Chinese officials had pressured them to drop the laboratory leak hypothesis. “In the beginning, they didn’t want anything about the lab [in the WHO report], because it was impossible, so there was no need to waste time on that,” Ben Embarek remarked. “We insisted on including it,” he continued, “because it was part of the whole issue about where the virus originated.”19 Ben Embarek added that there were scenarios under which the laboratory leak hypothesis could be consistent with the assumption that COVID-19 had an animal origin:
A lab employee infected in the field while collecting samples in a bat cave—such a scenario belongs both [emphasis added] as a lab-leak hypothesis and as our first hypothesis of direct infection from bat to human. We’ve seen that hypothesis as a likely hypothesis.20
When questioned about the interview by the Washington Post, Ben Embarek initially claimed his remarks had been mistranslated before declining to comment further.21 But Ben Embarek was not the only one expressing reservations. A month earlier, the WHO’s director-general, Tedros Adhanom Ghebreyesus, conceded during a press conference that there had been a “premature push” to rule out the laboratory leak hypothesis—comments that contradicted the conclusions of the WHO’s own report, released just a few months beforehand.22 He called on China to allow a full audit of the Wuhan laboratories.23 “I was a lab technician myself, I’m an immunologist, and I have worked in the lab, and lab accidents happen,” Tedros remarked. “It’s common.”24
As it turned out, Tedros had every reason to express caution. To date, nearly 82,000 animal samples have been tested in China for SARS-CoV-2. No intermediate animal host has been identified in Wuhan or anywhere else in the country.25
When Mandarins Command
Anthony Fauci has been the director of the US National Institute of Allergy and Infectious Diseases (NIAID) since 1984. Over the last few decades, he has expressed his support for gain-of-function research on numerous occasions. In a 2011 op-ed for the Washington Post coauthored with Francis Collins, the director of the NIH between 2009 and 2021, Fauci made the case for viruses “engineered in isolated biocontainment laboratories” as a means to identify “genetic pathways by which such a virus could better adapt to transmission among people.”26 The benefits were not elaborated in detail, the authors simply noting that “important information and insights can come from generating a potentially dangerous virus in the laboratory.” The op-ed concludes with a brief consideration of the risks involved.
The following year Fauci published a paper entitled “Research on Highly Pathogenic H5N1 Influenza Virus: The Way Forward,” again making the case for gain-of-function research.27 In his commentary, Fauci acknowledges the question of whether “knowledge obtained from these experiments could inadvertently affect public health in an adverse way, even in nations multiple time zones away.”28 He then invites the reader to consider a hypothetical scenario concerning “an important gain-of-function experiment involving a virus with serious pandemic potential … performed in a well-regulated, world-class laboratory by experienced investigators.” The information gleaned from the study is then “used by another scientist who does not have the same training and facilities and is not subject to the same regulations.”
In an unlikely but conceivable turn of events, what if that scientist becomes infected with the virus, which leads to an outbreak and ultimately triggers a pandemic? Many ask reasonable questions: given the possibility of such a scenario—however remote—should the initial experiments have been performed and/or published in the first place, and what were the processes involved in this decision?
Fauci’s answer is unequivocal:
Scientists working in this field might say—as indeed I have said—that the benefits of such experiments and the resulting knowledge outweigh the risks [emphasis added]. It is more likely that a pandemic would occur in nature, and the need to stay ahead of such a threat is a primary reason for performing an experiment that might appear to be risky.
In his conclusion, Fauci acknowledges “genuine and legitimate concerns about this type of research,” but his message remains clear: the research is worthwhile and important.
Of course, no amount of gain-of-function research has helped the world to “stay ahead” of the COVID-19 pandemic, nor can any advocate of virological gain-of-function research explain exactly how one can stay ahead of nature.
At the end of 2012, Fauci spoke at a workshop on gain-of-function research on HPAI H5N1 viruses hosted by the NIH. “There’s disagreements to the scientific and/or public health value of these experiments,” he remarked in a section of his presentation that discussed funding guidelines, “but I believe people who feel they shouldn’t be conducted are in the minority.”29
During Fauci’s tenure at NIAID, the NIH funded numerous studies involving coronaviruses and gain-of-function research. In 2015, the NIH supported a study led by Ralph Baric, a virologist from the University of North Carolina at Chapel Hill, and the WIV’s Shi. Published in Nature Medicine, their paper described the creation of a chimera, the result of a spike protein gene from a bat coronavirus being pasted into a mouse-adapted SARS virus.30
The completion of this study was only possible after Baric received an exemption for his research from NIH officials.31 In October 2014, the White House Office of Science and Technology Policy instituted a pause in new funding for gain-of-function research after a series of “biosafety incidents at Federal research facilities.”32 The office also encouraged “those currently conducting this type of work, whether federally funded or not, to voluntarily pause their research while risks and benefits are being reassessed.” Baric wrote to the NIH’s biosecurity board to plead his case and an exemption was granted.33
Three years later, following the election of Donald Trump, Fauci played a key role in the NIH’s decision to resume gain-of-function research.34 The NIH funded a new study that expanded on the WIV’s 2015 work with Baric, creating eight novel chimeric coronaviruses.35 When the 2019 SARS-CoV-2 outbreak occurred, work at the WIV was underway on further research, under yet another round of funding.36
In his May 2021 US Senate hearing, Fauci claimed that the NIH-funded research at the WIV did not constitute gain-of-function research.37 He was emphatic in his denial because his memory was defective in its scope. In a February 2020 email that Fauci sent to his subordinates, obtained under the Freedom of Information Act (FOIA), an attached PDF of the Baric and Shi paper was labeled “SARS Gain of Function.”38
A Perfect Call
Fauci had at his command virologists willing to offer him their advice. Kristian Andersen was among them. Having consulted with his colleagues, Andersen sent Fauci an email on February 1, 2020—also obtained under the FOIA—in which he claimed that the SARS-CoV-2 genome looked engineered, and, what is more, that its genome was “inconsistent with expectations from evolutionary theory.”39 Within hours, Fauci held a teleconference with Andersen, Sir Jeremy Farrar, director of the Wellcome Trust, Collins, and several other virologists.40
A June 2021 article by USA Today reported that “details of what was said in the meeting, including extensive notes taken by one participant and further thoughts shared by others, were blacked out by the NIH before the emails were made public.”41 Interviewed for the same article, Fauci recalled:
“It was a very productive back-and-forth conversation where some on the call felt it could possibly be an engineered virus” … Others, [Fauci] said, felt the evidence was “heavily weighted” toward the virus emerging from an animal host.42
Although the details of the conversation remain opaque, when the preprint of Andersen’s “Proximal Origin” paper appeared several weeks later, what had before looked engineered now looked natural.43
When the Fauci emails were published in June 2021,44 the shifts in Andersen’s views were greeted with consternation. It had been the WIV’s release of the genome for a viral strain called RaTG13, Andersen later explained, that had changed his mind.45 Curiously enough, Andersen had tweeted about RaTG13 a week before writing his initial email to Fauci.46 Rather than attempting to resolve all these inconsistencies when they were pointed out to him, Andersen instead chose to first delete the offending tweets, and then to delete his Twitter account altogether.47
According to Andersen’s senior colleague, Farrar, other coauthors of the “Proximal Origin” paper were initially even more convinced the virus originated in a laboratory. Farrar later described the events surrounding the meeting with Fauci, Collins, Andersen, et al., in his book Spike: The Virus vs. The People.48 That account was the subject of a mid-2021 article by Unherd:
Before the call on 1 February, Farrar says Andersen was “60 to 70%” convinced the virus came from a lab, while Australian virologist Eddie Holmes was “80% sure this thing had come out of a lab.” Patrick Vallance, Britain’s chief scientific officer who joined the call, tipped off intelligence agencies about their concerns. But others on the hour-long call argued the new virus “was more convincingly explained, scientifically, as a natural spillover than a laboratory event.” Afterwards, the participants swapped notes but Farrar remained torn on the origins. “On a spectrum if 0 is nature and 100 is release I am honestly at 50,” he emailed Fauci. “My guess is this will remain grey unless there is access to the Wuhan lab—and I suspect that is unlikely.”49
The emails obtained under the FOIA revealed that, three days after the call with Fauci, Andersen and Baric assisted Daszak in drafting the letter that subsequently appeared in The Lancet denouncing what, in an email, Andersen would call the “crackpot” and “fringe” hypotheses that SARS-CoV-2 was engineered.50
The following day, Farrar emailed Fauci and Collins again.51 In his message, Farrar reported having convinced the WHO to form a group that would look at the origins of SARS-CoV-2. He also informed Fauci and Collins that the WHO had asked for “names to sit on that Group” and requested that the pair “please do send any names.” Farrar proposed a subsequent meeting to “frame the work of the group” and suggested there would be “pressure on this group from your and our teams next week.”
The emails also reveal that having helped draft the Lancet letter, Baric and Daszak—initially at least—opted not to sign it.52 Baric expressed concern that if he were to sign the letter it might look “self-serving, and we lose impact.” Daszak, on the other hand, sought to downplay his own involvement, along with that of Baric and another virologist, Linfa Wang. “You, me, and him should not sign this statement,” Daszak suggested to Baric and Wang, “so that it has some distance from us and therefore doesn’t work in a counterproductive way.”
“We’ll then put it out in a way that doesn’t link it back to our collaboration so we maximize an independent voice.”
Hot Spot
Whatever the origins of SARS-CoV-2, it was first observed in Wuhan, the initial outbreak occurring between October and December of 2019. The hypothesis that SARS-CoV-2 originated elsewhere and traveled undetected until it reached Wuhan is implausible. Earlier transmission would have led to earlier outbreaks in other locations, or would have produced viral lineages at earlier spots on the SARS-CoV-2 phylogenetic tree. The virus phylogeny is strongly rooted in Wuhan.53
While there is little doubt that SARS-CoV-2 originated in Wuhan, questions remain about where in Wuhan it originated. After the 2002 SARS outbreak in Guangdong, the first SARS patients had almost immediately been traced to restaurant workers handling exotic animals: palm civets sold at a local market were, within weeks, identified as an intermediate host.54
In November of 2021, the virologist Michael Worobey, writing in Science, argued that the SARS-CoV-2 outbreak originated in the Huanan Seafood Market in Wuhan.55 In an interview with University of Arizona News, Worobey remarked that the evidence was like a “flashing red arrow pointing to the Huanan market as by far the most likely site of origin, with a failure to put a stop to sales of illegal wildlife in markets like Huanan as the reason.”56
Worobey’s article, it should be noted, provided no new evidence for zoonosis and his conclusion was based solely on a reanalysis of Wuhan patient data from December 2019. The data were subsequently shown to be erroneous.57 The “strong evidence” for zoonosis cited by Worobey in his article for Science amounted to nothing more than conjecture: “[T]hat most early symptomatic cases were linked to Huanan Market—specifically to the western section (1) where raccoon dogs were caged (2)—provides strong evidence of a live-animal market origin of the pandemic.”58 Not a single raccoon dog has yet been found carrying a progenitor of SARS-CoV-2, nor has any other animal been found infected by such a progenitor. Around 82,000 animal samples have now been analyzed in China, including 1,700 recent samples from wildlife traded in wet markets.59 All were negative for any SARS-like virus.
Whether the outbreak originated in a female seafood vendor at the Huanan Market remains unclear. But the market itself clearly served as an epidemiological hot spot, harboring what Worobey described as a “genuine preponderance of early COVID-19 cases.”60 While some early human cases were, indeed, linked to the Huanan Market, many cases predated the market outbreak.61 Moreover, the SARS-CoV-2 strains circulating in the market were not ancestral, all of them carrying three novel mutations not seen in earlier patients.62 Nor is Wuhan home to the horseshoe bats known to carry SARS-like viruses. Indeed, the likelihood of a bat virus outbreak in Wuhan was deemed so small that in 2018 the city was used as a negative control for a study by the WIV that assessed the risk of zoonotic jumps of SARS-like viruses in Yunnan from bats to people who lived within one to six kilometers of such bats.63 The study found that six of 218 farmers carried antibodies to the bat SARS-like virus called Rp3, in contrast to none of the 240 residents of Wuhan. Both Daszak and Shi are listed among the sixteen coauthors for the study.
After the SARS-CoV-2 outbreak began, Daszak cited this study in a tweet to estimate the general incidence of coronavirus zoonotic spillovers:
These jumps occur every day. We conducted sero-surveys in SE Asia & found 3% of rural people have antibodies to bat CoVs. That means 1-7 million people per year exposed to bat origin SARS-related CoVs. It’s utterly illogical to think that this did not lead to the current outbreak.64
If 218 residents of rural Yunnan living in proximity to bat caves showed a 3% rate of seropositivity, then by extrapolation, he argued, one to seven million people in rural Southeast Asia should be exposed to some SARS-related coronavirus every year. It was certainly a curious argument for someone in Daszak’s position to make.
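The 3% figure can be checked against the counts reported for the study. The sketch below is illustrative only: it recomputes the crude seropositivity rates from the 6 of 218 and 0 of 240 counts given above and attaches an approximate 95% Wilson score interval. The Wilson interval is a method chosen here for illustration, not one attributed to the study's authors, and the function and variable names are hypothetical.

```python
import math

def wilson_interval(positives, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = positives / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to absorb floating-point noise at the boundaries.
    return max(0.0, center - half), min(1.0, center + half)

# Counts as reported in the text above.
yunnan_positive, yunnan_total = 6, 218
wuhan_positive, wuhan_total = 0, 240

yunnan_rate = yunnan_positive / yunnan_total
wuhan_rate = wuhan_positive / wuhan_total
yunnan_ci = wilson_interval(yunnan_positive, yunnan_total)
wuhan_ci = wilson_interval(wuhan_positive, wuhan_total)

print(f"Rural Yunnan: {yunnan_rate:.1%}, interval {yunnan_ci[0]:.1%} to {yunnan_ci[1]:.1%}")
print(f"Wuhan:        {wuhan_rate:.1%}, interval {wuhan_ci[0]:.1%} to {wuhan_ci[1]:.1%}")
```

With only six positive samples, the interval around the rural rate is wide, which is worth keeping in mind whenever a single survey of this size is scaled up to a regional population.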
By contrast, Shi acknowledged that Wuhan is an unlikely place for a SARS-like virus to emerge. She addressed the topic in a 2020 interview with Scientific American:
“I had never expected this kind of thing to happen in Wuhan, in central China,” [Shi] remarked. Her studies had shown that the southern, subtropical provinces of Guangdong, Guangxi, and Yunnan have the greatest risk of coronaviruses jumping to humans from animals—particularly bats, a known reservoir. If coronaviruses were the culprit, she remembers thinking, “Could they have come from our lab?”65
The Wuhan Laboratory
In 2019, EcoHealth was scheduled to receive another round of funding from the NIH for project 2R01AI110964-06, “Understanding the Risk of Bat Coronavirus Emergence.”66 This grant, the umbrella project that had funded EcoHealth’s collaboration with the WIV since 2014, had been initiated with three broad aims. The first was to “[c]haracterize the diversity and distribution of high spillover-risk SARSr-CoVs in bats in southern China,” while the second involved “[c]ommunity, and clinic-based syndromic, surveillance to capture SARSr-CoV spillover, routes of exposure and potential public health consequences.” The third aim was much more explicit about what the researchers had in mind:
In vitro and in vivo characterization of SARSr-CoV spillover risk, coupled with spatial and phylogenetic analyses to identify the regions and viruses of public health concern. We will use S protein sequence data, infectious clone technology, in vitro and in vivo infection experiments and analysis of receptor binding to test the hypothesis that % divergence thresholds in S protein sequences predict spillover potential.67
Prior to the cancellation of the NIH grant in April 2020,68 EcoHealth received US$3.1M in funding for the project.69 Of that amount, US$600,000 was passed on to the WIV.70
In a December 2018 paper for Nature Reviews Microbiology, researchers from the WIV outlined their vision for the next stages of the project:
[F]uture work should be focused on the biological properties of [SARS-like and MERS (Middle East Respiratory Syndrome)-like] viruses using virus isolation, reverse genetics and in vitro and in vivo infection assays. The resulting data would help the prevention and control of emerging SARS-like or MERS-like diseases in the future.71
The ultimate goal of such work may have been to create a pan-coronavirus vaccine. Research focused on SARS-like and MERS-like viruses was a stated goal not just for the WIV, but for EcoHealth as well. Daszak said as much publicly in a November 2019 interview:
You can manipulate [coronaviruses] in the lab pretty easily, it is the spike protein drives a lot of what happens with the coronavirus zoonotic risk. You can get the sequence, you can build the protein. We worked with Ralph Baric at UNC who did this, insert into a backbone of another virus and do some work in the lab. So, you can get more predictive when you find a sequence. … The logical progression for vaccines — if you are going to develop a vaccine for SARS, people are going to use pandemic SARS, but let’s try to insert some of these [other spike genes] and get a better vaccine.72
In addition to sub-grants from EcoHealth, research at the WIV was supported by Chinese funding. Ben Hu, a researcher at the WIV, was awarded a three-year grant from the Youth Science Fund for a project to investigate “Pathogenicity of Two New Bat SARS-Related Coronaviruses to Transgenic Mice Expressing Human ACE2 Receptor.”73 Hu has been a member of Shi’s group at the WIV since 2015.74
The WIV undertook its work for the best of reasons. Prior to the emergence of SARS-CoV-2, it was widely held among researchers that a future epidemic, or Disease X as the WHO termed it, might be caused by a coronavirus.75 In June 2020, Shi and her colleague Shibo Jiang published a paper entitled “The First Disease X Is Caused by a Highly Transmissible Acute Respiratory Syndrome Coronavirus.”76 “Disease X,” Shi and Jiang observed, “would be a new disease with an epidemic or pandemic potential caused by an unknown pathogen.” Unknown? Not quite. “[T]he first Disease X,” they wrote, “could be a transmissible infectious disease caused by a novel coronavirus originated from bats.”
The Tell-Tale Genome
SARS-CoV-2 contains a number of curious genomic features—its novel furin cleavage site most obviously. No other known SARS-related coronavirus has a furin cleavage site. To enter human cells, SARS-CoV-2 uses a spike protein that attaches to human ACE2 receptors. The protein must then be cut by an enzyme in order to fuse with the cell membrane and penetrate the cell. The spike protein consists of two parts, S1 and S2. S1 is responsible for primary contact with the receptor, and S2, for fusion and penetration. For S2 to initiate fusion, the S1/S2 junction must be cut by a host enzyme like furin or TMPRSS2. This junction is where the novel furin cleavage site is found in SARS-CoV-2. Furin is a very efficient enzyme, found both on the surface and in the interior of many human cells, most notably in the airway epithelium. It is furin’s presence in the interior of the cell that allows newly formed virions to emerge in a pre-cut conformation, enhancing their infectivity.
The furin cleavage site in SARS-CoV-2 was created by a peculiar 12-nucleotide insertion—so peculiar, in fact, that the genomic locus in SARS-CoV-2 enveloping its furin cleavage site is, at least, twelve nucleotides longer than any of its relatives.77 Virologists have created novel furin cleavage sites in coronaviruses repeatedly.78 It is obvious why.79 Furin cleavage sites greatly expand both the tissue and species tropism of a virus.80 And furin cleavage sites enhance the adaptation of a viral strain to certain cell lines.
The WIV failed to mention the novel furin insertion in its first two papers on SARS-CoV-2,81 even though the WIV had in its possession the closest relative of SARS-CoV-2 at that time—the strain RaTG13.82 Genomic comparison made the furin cleavage site obvious. In their diagram comparing the two genomes, the WIV cut off the comparison just before the novel insertion. In the paper that first mentioned RaTG13, the WIV researchers did not explain where RaTG13 came from or how they came to possess it.
The novel insertion consists of the nucleotides T CCT CGG CGG GC; the corresponding amino acids are proline (CCT), arginine (CGG), arginine (CGG), and alanine (GCA, whose final A comes from the split serine codon)—or PRRA in one-letter amino acid notation. The nucleotide insertion is odd because it is not completely in frame, the insertion splitting the ancestral serine codon TCA while preserving the downstream frame.83 Odd as well are the two repeating CGG arginine codons. CGG is the rarest of the six codons that code for arginine in bat coronaviruses, and the SARS-CoV-2 insertion is the only example in which two CGG codons are consecutive. In fact, the CGG-CGG doublet is the only one coding for two arginines in all 255 SARS-like viruses with protein annotations listed in the NIH Genetic Sequence Database (GenBank).84
In contrast to bat coronaviruses, CGG is the most frequent arginine codon in humans.
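The reading-frame point about the insertion can be made concrete with a short sketch. It assumes the 12-nucleotide insertion lands after the second nucleotide of the ancestral serine codon TCA, which is the placement implied by the codon grouping T CCT CGG CGG GC. The variable names and the partial codon table are illustrative and are not taken from any cited analysis.

```python
# Partial standard genetic code: only the codons needed for this example.
CODON_TABLE = {
    "TCA": "S", "TCT": "S",  # serine
    "CCT": "P",              # proline
    "CGG": "R",              # arginine
    "GCA": "A",              # alanine
}

def codons(seq):
    """Split a nucleotide string into consecutive three-letter codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def translate(seq):
    """Translate a nucleotide string using the partial table above."""
    return "".join(CODON_TABLE[c] for c in codons(seq))

ANCESTRAL_CODON = "TCA"        # serine codon named in the text
INSERTION = "TCCTCGGCGGGC"     # T CCT CGG CGG GC, 12 nucleotides
SPLIT_AT = 2                   # assumption: insertion falls between TC and A

with_insertion = ANCESTRAL_CODON[:SPLIT_AT] + INSERTION + ANCESTRAL_CODON[SPLIT_AT:]

print(codons(with_insertion))
print(translate(ANCESTRAL_CODON), "->", translate(with_insertion))
```

Under this assumption the serine is retained, now encoded as TCT rather than TCA, and is followed by PRRA. Because the insertion length is a multiple of three, everything downstream stays in frame even though the insertion itself starts mid-codon.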
RaTG13
The virus RaTG13 is SARS-like and belongs to the genus of beta-coronaviruses. It is a close relative of SARS-CoV-2. Having obtained the SARS-CoV-2 genome on December 27, 2019,85 the WIV would have been in a position to see that it matched RaTG13 by 96.2%. The WIV announced they had RaTG13 in their possession in a preprint uploaded to bioRxiv on January 23, 2020, and shortly thereafter published in Nature.86 Their explanation was terse:
We then found that a short region of RNA-dependent RNA polymerase (RdRp) from a bat coronavirus (BatCoV RaTG13)—which was previously detected in Rhinolophus affinis from Yunnan province—showed high sequence identity to 2019-nCoV. We carried out full-length sequencing on this RNA sample.87
This suggests that WIV researchers first detected a match between SARS-CoV-2 and a short RdRp fragment of RaTG13. With the match in hand, they were then led to fully sequence RaTG13. After the WIV was forced to release raw sequencing data, it was noted that they contained amplicons dated 2017 and 2018.88
When had RaTG13 been sequenced?
In 2018, as the WIV later admitted.89
It was a compromising admission. No match between the RdRp fragment and SARS-CoV-2 was needed in order to establish a match between RaTG13 and SARS-CoV-2. The WIV already had the full RaTG13 genome: it would have shown up as the top match to SARS-CoV-2.
But there was another important aspect of RaTG13’s history that the WIV failed to disclose: the fact that it had been collected in 2012 from a mine in Mojiang, a county in the south of Yunnan province. That year six miners contracted viral pneumonia while working in the mine, and three of them later died.90 The WIV was subsequently invited to analyze tissue samples from the miners. They found SARS-reacting IgG antibodies.91 Over the next several years, researchers from the WIV visited the Mojiang mine several times looking for new viruses. The WIV eventually acknowledged these details in an addendum published nine months after the Nature paper.92 In the same addendum, the WIV claimed that RaTG13 is identical to a sample labelled Ra4991, which was first mentioned in a 2016 paper,93 and whose 370-nt RdRp fragment was deposited with GenBank at the time.94
The WIV also neglected to mention the novel furin cleavage site in SARS-CoV-2: it would have been immediately apparent to any trained coronavirologist looking at the alignment of spike proteins in SARS-CoV-2 and RaTG13. In their paper disclosing RaTG13,95 the WIV chose to cut off that alignment just before the novel furin cleavage site. Days before coauthoring that paper, Shi coauthored another paper, this time with Jiang, on SARS-CoV-2 that correctly identified the site of SARS-CoV-2’s S1/S2 cleavage at the RRAR|S novel cleavage site.96
It is hard to believe that experts such as Shi or Jiang could have missed the novel furin cleavage site at the S1/S2 cleavage junction—while specifically performing their alignment in the search for the S1/S2 cleavage site of SARS-CoV-2. It seems Shi missed it twice. The Nature alignment used the corrected amino acid numbering of SARS-CoV-2’s spike, whereas the Jiang paper used the uncorrected numbering: the WIV initially erroneously included nine extra amino acids in SARS-CoV-2’s spike protein sequence they uploaded to GenBank.97 Thus, the proper S1/S2 SARS-CoV-2 cleavage locus is R685/S686 and not R694/S695. Another researcher who presumably missed the novel furin cleavage site was Ben Hu, who was acknowledged in the Jiang and Shi paper for his work on “phylogenetic analysis of 2019-nCoV S gene.”98
RaTG13 itself remains somewhat mysterious. Its receptor-binding domain does not bind to any bat ACE2 receptor studied. A recent study tested the ACE2 receptor from the very bat species RaTG13 was allegedly sampled from, R. affinis.99 It found that RaTG13 binds poorly to R. affinis ACE2. Even the T403R spike mutation, which was observed to make it bind well to human ACE2, did nothing to improve its binding to R. affinis ACE2.
By contrast, RaTG13 binds very well to human ACE2, and binds best of all to rat and mouse ACE2 receptors. Using the cited study’s metric of the number of infected cells per well, RaTG13 was only about half as effective as SARS-CoV-2 at binding to the human ACE2 receptor (100k cells/well), and about eight times better than the effectiveness of SARS-CoV-2 using the R. affinis bat ACE2 (12k cells/well).
These findings suggest that RaTG13 might not be the original bat virus but could instead be the result of significant serial passaging of a bat virus in human cells or in mice100—which is where it could have encountered selective pressure to optimize its binding to both human and rodent ACE2 receptors. The WIV definitely sampled some SARS-like coronavirus from a mine in Mojiang, which they originally called Ra4991. This name first appeared in print in a 2014 master’s thesis by Ning Wang, written under Shi’s supervision.101 As part of his thesis, Wang amplified the N gene for a number of bat coronaviruses, Ra4991 being among them. Ra4991 was then briefly mentioned in a 2016 WIV paper as a novel SARS-related strain.102 A 370-nucleotide fragment of its RdRp gene was deposited with GenBank.103 In 2019, a WIV master’s thesis by Yu Ping, co-supervised by Shi, described Ra4991 as having been fully sequenced, along with three other SARS-like coronaviruses.104 Those genomes were never made public.
It is unclear why Ra4991 had to be renamed RaTG13 in early 2020 if it was completely acceptable to keep calling it Ra4991 in 2019. Renaming viral sequences is quite rare in coronavirology, and renaming something without referencing its previously published name is unheard of. In a Q&A published by Science in July 2020,105 Shi provided the following explanation:
Ra4991 is the ID for a bat sample while RaTG13 is the ID for the coronavirus detected in the sample. We changed the name as we wanted it to reflect the time and location for the sample collection. 13 means it was collected in 2013, and TG is the abbreviation of Tongguan town, the location where the sample was collected.106
For a sample attributed to a bat fecal swab, the metagenome of RaTG13’s sequencing data contains an uncharacteristically low number of bacterial reads.107 Just 0.65% of the total reads belong to bacteria. By comparison, another WIV fecal swab sample from R. affinis (SRR11085736), which was uploaded to GenBank on the same day as RaTG13, contained 91% bacterial reads. The metagenomic profile of RaTG13 raw data is more consistent with a cultured sample.
In the same Q&A with Science, Shi claimed that the original RaTG13 sample is no longer available for external verification.
As the sample [RaTG13] was used many times for the purpose of viral nucleic acid extraction, there was no more sample after we finished genome sequencing, and we did not do virus isolation and other studies on it.108
This claim is not only extremely troubling given all of RaTG13’s peculiarities, but is plainly inconsistent with a cultured sample, that is, one that scientists have managed to get to self-propagate in a cell culture indefinitely.
The World Wide Web
Several lab leaks are known to have occurred over the past forty years. In November 2019, just prior to the current pandemic, an outbreak of brucellosis was traced to two labs in Lanzhou in northwest China.109 Around 100 students and staff were initially infected, that number eventually growing to 10,528 confirmed infections.
The deadliest pandemic of past years was the so-called Russian flu outbreak of 1977, which was first detected among children in China.110 Today, the scientific consensus is that the outbreak came about through either a lab leak or a clinical trial of an insufficiently attenuated vaccine.111
The ensuing pandemic killed 700,000 people.112
In 1979, there was an anthrax leak from a laboratory in Sverdlovsk, Russia, which killed 66 people.113 The first SARS virus has also escaped from laboratories on at least four occasions: in 2003 in Singapore, in December 2003 in Taiwan, and twice in the spring of 2004 in China.114
Outside auditors raised concerns about safety at the WIV as early as 2018.115 That year, US Embassy officials visited the institute and conducted several interviews with researchers, including Shi. After their visit, the diplomats dispatched cables to Washington outlining their concerns about inadequate safety controls. “During interactions with scientists at the WIV laboratory,” one of the cables reported, “[the officials] noted the new lab has a serious shortage of appropriately trained technicians and investigators needed to safely operate this high-containment laboratory.”116
Concerns about the risks associated with operating research laboratories were shared by the Chinese government. In January 2019, China’s state news agency Xinhua reported that the Ministry of Education had ordered “a nationwide safety overhaul at higher education institutions’ laboratories”:
Universities were asked to have around-the-clock and all-around control over laboratory hazards and risks during procurement, transportation, storage and use of dangerous goods and hazardous substances and waste disposal, according to a notice issued by the ministry.117
Soon after the COVID-19 outbreak, in February 2020, the novel SARS-CoV-2 virus was reported to have infected lab personnel in China,118 although these reports were subsequently denied. In November 2021, a confirmed SARS-CoV-2 lab leak in Taiwan led to 110 people being exposed to the virus by a single infected BSL-3 lab worker.119
Among the numerous changes observed on the WIV website since the COVID-19 outbreak, one of the most noticeable was the removal of a page that listed bat coronaviruses as BSL-2 pathogens.120 The biosafety level (BSL) designation signifies compliance with one of four levels of “standard microbiological practices, special practices, safety equipment, and laboratory facilities” for “activities involving infectious microorganisms, toxins, and laboratory animals,” defined by the Centers for Disease Control and Prevention.121 As part of her Science Q&A, Shi confirmed that “coronavirus research in our laboratory is conducted in BSL-2 or BSL-3 laboratories.”122 The critical differences between BSL-2 and BSL-3 were outlined in an article published by the MIT Technology Review:
BSL-2 is for moderately hazardous pathogens … and relatively mild interventions are indicated: close the door, wear eye protection, dispose of waste materials in an autoclave. BSL-3 is for pathogens that can cause serious disease through respiratory transmission, such as influenza and SARS, and the associated protocols include multiple barriers to escape. Labs are walled off by two sets of self-closing, locking doors; air is filtered; personnel use full PPE [personal protective equipment] and N95 masks and are under medical surveillance.123
In sharp contrast to the WIV, Baric’s research on constructing novel chimeric coronaviruses was undertaken in enhanced BSL-3 conditions with “additional steps like Tyvek suits, double gloves, and powered-air respirators for all workers.”124 The precautions did not stop there. “All workers,” the MIT Technology Review reported, “were monitored for infections, and local hospitals had procedures in place to handle incoming scientists. It was probably one of the safest BSL-3 facilities in the world.”125 But even with all these precautions in place, the risks were unavoidable: “That still wasn’t enough to prevent a handful of errors over the years: some scientists were even bitten by virus-carrying mice. But no infections resulted.”126
In May 2021, the Wall Street Journal broke the story that, according to a previously undisclosed US intelligence report, three WIV researchers were hospitalized in November 2019, “with symptoms consistent with both COVID-19 and common seasonal illness.”127 While Chinese authorities maintain that the first cases of SARS-CoV-2 are only known to have occurred in December, there is at least one report that the first case was recorded on November 17, 2019.128
If
In March 2018, EcoHealth and the WIV submitted a grant proposal to the Defense Advanced Research Projects Agency (DARPA) for their Preventing Emerging Pathogenic Threats program.129 The proposal was entitled “Project DEFUSE: Defusing the Threat of Bat-Borne Coronaviruses.” It outlined a massive US$14 million research program that included collecting thousands of viral samples in bat caves in Yunnan to identify high-risk strains with the ultimate goal of immunizing bats against them. Most intriguingly, the proposal revealed intentions to genetically engineer novel cleavage sites in the spike gene of SARS-like coronaviruses:
After receptor binding, a variety of cell surface or endosomal proteases cleave the SARS-CoV S glycoprotein causing massive changes in S structure and activating fusion-mediated entry. We will analyze all SARSr-CoV S gene sequences for appropriately conserved proteolytic cleavage sites in S2 and for the presence of potential furin cleavage sites. … Where clear mismatches occur, we will introduce appropriate human-specific cleavage sites and evaluate growth potential in Vero cells and HAE cultures. … We will also review deep sequence data for low abundant high risk SARSr-CoV that encode functional proteolytic cleavage sites, and if so, introduce these changes into the appropriate high abundant, low risk parental strain.130
It is clear that the researchers planned to look for the presence of furin cleavage sites at evolutionarily conserved cleavage locations in the spike gene, and if, for some reason, there was a mismatch at such conserved locations, they would introduce a human-specific cleavage site into such viruses. They also proposed to look for “functional proteolytic cleavage sites” in other high risk SARSr-CoVs and then genetically engineer such cleavage sites into low risk strains, in order to evaluate their growth potential in human airway epithelial (HAE) cell cultures.
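The kind of screening described in the proposal can be pictured with a short Python sketch. It is only an illustration under stated assumptions: it uses the minimal furin consensus R-X-X-R (arginine, any two residues, arginine), which is a simplification of how cleavage sites are actually predicted, and it runs on two short illustrative peptide fragments built around the PRRAR|SV motif discussed in this essay rather than on full spike sequences. The function name `find_furin_like_motifs` is hypothetical and does not come from the proposal.

```python
import re

# Assumption: the minimal furin consensus R-X-X-R. The lookahead lets
# overlapping matches be reported as separate hits.
FURIN_MINIMAL = re.compile(r"(?=(R..R))")


def find_furin_like_motifs(peptide: str) -> list[tuple[int, str]]:
    """Return (0-based start, motif) for every R-X-X-R match in a peptide."""
    return [(m.start(), m.group(1)) for m in FURIN_MINIMAL.finditer(peptide.upper())]


# Illustrative fragments only: one carrying the PRRA insertion, one without it.
with_insertion = "NSPRRARSVAS"
without_insertion = "NSRSVAS"

for name, fragment in (("with insertion", with_insertion),
                       ("without insertion", without_insertion)):
    print(name, find_furin_like_motifs(fragment))
```

On these two fragments the scan reports a single RRAR hit in the first and none in the second, which is the sort of mismatch the proposal describes looking for.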
We are traveling in all the old familiar circles. The PRRA insertion into SARS-CoV-2 created a furin cleavage site at the evolutionarily conserved S1/S2 cleavage junction. It is there that many other coronaviruses have functional furin cleavage sites, including a rodent coronavirus with an RRAR furin cleavage site, collected by Shi’s team from a cave in Yunnan during 2017.131
The PRRA insertion to create the PRRAR|SV cleavage site might have been inspired by the PAAR fragment found at the S1/S2 junction in another SARS-like virus from Yunnan. The strain RmYN02 was extracted from R. malayanus bats in 2019—the same bat species that harbored the BANAL-52 strain discovered in Laos in September 2021.132 BANAL-52 is noteworthy as the first bat strain found to have an RBD that is nearly identical to the RBD found in SARS-CoV-2. Before the discovery of BANAL-52, only a pangolin-derived strain was known to harbor that particular RBD.
BANAL-52 has one further distinction. Once it had been evaluated across all of its genome, BANAL-52 displaced RaTG13 as the closest relative of SARS-CoV-2.133
As part of their collaborative arrangements, EcoHealth dispatched bat samples to the WIV for analysis.134 The WIV also gathered their own samples during field trips in Laos and from locations in Yunnan province near the Chinese border with Laos.135 The research involving these samples is discussed in a 2020 paper by Alice Latinne et al.:
Our phylogenetic analysis shows a high diversity of CoVs from bats sampled in China, with most bat genera included in this study (10/16) infected by both α- and β-CoVs. In our phylogenetic analysis that includes all known bat-CoVs from China, we found that SARS-CoV-2 is likely derived from a clade of viruses originating in horseshoe bats (Rhinolophus spp.). The geographic location of this origin appears to be Yunnan province. However, it is important to note that: (1) our study collected and analyzed samples solely from China; (2) many sampling sites were close to the borders of Myanmar and Lao PDR; and (3) most of the bats sampled in Yunnan also occur in these countries, including R. affinis and R. malayanus, the species harboring the CoVs with highest RdRp sequence identity to SARS-CoV-2. For these reasons, we cannot rule out an origin for the clade of viruses that are progenitors of SARS-CoV-2 that is outside China, and within Myanmar, Lao PDR, Vietnam, or another Southeast Asian country. Additionally, our analysis shows that the virus RmYN02 from R. malayanus, which is characterized by the insertion of multiple amino acids at the junction site of the S1 and S2 subunits of the Spike (S) protein, belongs to the same clade as both RaTG13 and SARS-CoV-2, providing further support for the natural origin of SARS-CoV-2 in Rhinolophus spp. bats in the region.136
Laotian BANAL viruses include strains designated BANAL-116 and BANAL-247. Both strains are identical to RmYN02 at their PAA locus at the S1/S2 junction, but differ in their RBDs.
If the WIV was gathering samples inside or near Laos before the pandemic began, they may well have encountered a BANAL-52-like bat virus in co-circulation with an RmYN02-like strain exhibiting a non-functional PAAR cleavage site at the S1/S2 junction. The discovery might have prompted them to carry out an experiment along the lines suggested in the DEFUSE proposal: an experiment to turn PAAR into PRRAR and create a fully functional RRAR polybasic cleavage site.
If this counts as a conjecture, it is by no means lacking in plausibility. The PAA fragment in RmYN02 and BANAL-116 and -247 is coded by CCT GCA GCG codons; the PRRA insertion in SARS-CoV-2 is coded by CCT CGG CGG GCA—i.e., the codons in the SARS2 insertion coding for proline (CCT) and alanine (GCA) are identical to those found in RmYN02 and Laotian strains.
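The codon comparison can be checked in a few lines of Python. The sketch below is an illustration rather than anyone's published analysis: it translates only the codons quoted above, using their assignments in the standard genetic code, and lists which codons of the SARS-CoV-2 insertion also occur in the PAA fragment. The names `CODON_TO_AA` and `translate` are hypothetical.

```python
# Only the four codons quoted in the text are needed; the amino acid
# assignments are those of the standard genetic code.
CODON_TO_AA = {"CCT": "P", "GCA": "A", "GCG": "A", "CGG": "R"}


def translate(codons: list[str]) -> str:
    """Translate a list of codons into a one-letter amino acid string."""
    return "".join(CODON_TO_AA[codon] for codon in codons)


paa_codons = "CCT GCA GCG".split()       # RmYN02, BANAL-116, BANAL-247
prra_codons = "CCT CGG CGG GCA".split()  # SARS-CoV-2 insertion

# Codons of the insertion that also appear in the PAA fragment.
shared = [codon for codon in prra_codons if codon in paa_codons]

print(translate(paa_codons), translate(prra_codons), shared)
```

The two fragments translate to PAA and PRRA, and the shared codons are CCT (proline) and GCA (alanine), while the two arginines are both encoded by CGG.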
The idea behind such work is obvious and clearly spelled out in the DARPA proposal: investigate what effect the novel furin cleavage site might have on human cells—e.g., HAE cells—or humanized mice to assess the risk of human emergence the novel bat strains might pose. Such experiments would have been a good fit for the 2019 grant from the Youth Science Fund awarded to Ben Hu at the WIV for investigations of the “Pathogenicity of Two New Bat SARS-Related Coronaviruses to Transgenic Mice Expressing Human ACE2 Receptor.”137
The decision to use CGG-CGG codons for the two arginines might have been informed by the desire to incorporate a FauI tracking beacon in the newly created furin cleavage site that would enable quick screening of whether the insertion is still present or has mutated away.138 Virologists make use of various restriction enzymes designed to recognize certain genetic sequences and cut nucleotide chains on recognition. The restriction enzyme FauI recognizes
5' CCCGC
3' GGGCG
and cuts
5' —CATG— 3'
3' —GTAC— 5'.
The method using restriction enzymes for the purposes of screening for presence or absence of a particular genomic feature is termed restriction fragment length polymorphism (RFLP),139 and it has been in use for decades.140 Examples of FauI being used for RFLP analysis are well-documented in the scientific literature,141 and the WIV is known to have employed the RFLP technique in the past.142 If a researcher at the WIV had chosen to insert a novel furin cleavage site into a coronavirus, they might have also chosen to equip their insertion with a tracking beacon that could assert its continued presence via the RFLP technique. The furin cleavage site has a tendency to mutate away in vitro or in certain lab animals.143
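Whether the insertion actually carries a FauI recognition site can be checked with a small Python sketch. It is a plain string search and not a model of RFLP itself: it looks for the recognition sequence CCCGC given above on the strand as written and, through the reverse complement, on the opposite strand. The 12-nucleotide insertion is taken from this essay; the alternative sequence, which encodes the same amino acids with AGA for both arginines, is a hypothetical comparison. The function names are illustrative.

```python
FAUI_SITE = "CCCGC"  # FauI recognition sequence as given in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_site(seq: str, site: str = FAUI_SITE) -> list[tuple[int, str]]:
    """Return (0-based start, strand) for each occurrence of site on either strand."""
    seq = seq.upper()
    hits = []
    for pattern, strand in ((site, "+"), (reverse_complement(site), "-")):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


insertion = "CCTCGGCGGGCA"    # CCT CGG CGG GCA, the PRRA insertion
alternative = "CCTAGAAGAGCA"  # hypothetical: PRRA encoded with AGA arginines

print(find_site(insertion), find_site(alternative))
```

With CGG-CGG the search finds one site on the complementary strand, beginning at offset 5 of the insertion; with the hypothetical AGA-AGA encoding it finds none. That is consistent with the suggestion that the codon choice could double as a screening marker, though it does not show that this was the intent.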
The WIV’s burgeoning interest in spike cleavage during 2019 may have been motivated by the work being undertaken by Baric’s group at that time.144 In 2015, Baric and Shi published a paper on the critical importance of the furin cleavage site in MERS as a catalyst for its jump from bats to human beings.145 One of the coauthors on their paper was Shibo Jiang. Two years earlier, Jiang had reported the creation of a novel RIRR cleavage site via a 12-nucleotide insertion (CGG ATC AGG CGC), although not in a coronavirus.146 In 2020, he collaborated with Shi to develop a pan-coronavirus therapeutic, a fusion inhibitor peptide.147 Work on this project seemed to have been ongoing in late 2019.148 The cleavage of the spike protein is what activates fusion-mediated entry.
These observations indicate a suggestive, or even suspicious, pattern of performed or planned research at the WIV, and one that could well have produced SARS-CoV-2 with its novel furin cleavage site so uncharacteristic of SARS-like bat coronaviruses.
EcoHealth and the WIV carried out gain-of-function research both on SARS-like viruses and the vastly more deadly MERS-like viruses. The MERS outbreak in 2012 killed approximately 35% of everyone who contracted the virus.149 Between 2016 and 2019, EcoHealth and the WIV were engaged in creating novel chimeric MERS-like viruses with different RBDs spliced from other MERS-like bat viruses.150
EcoHealth’s fifth-year progress report disclosed the creation of twelve novel chimeras.151 The resulting novel viruses were then tested in humanized mice and exhibited greatly increased pathogenesis.
The WIV seems to have been engaged in MERS gain-of-function research not just in collaboration with EcoHealth but separately, as well. Unpublished MERS-like reverse genetic backbones have been found in agricultural datasets from Wuhan that do not seem connected to the EcoHealth grant.152
Conclusion
The current SARS-CoV-2 pandemic has been, and continues to be, a public health catastrophe—the most serious in a century. Questions about the origins of COVID-19 are, at once, matters of legal, financial, and moral concern. For the moment, researchers can do no better than to hope for an inference to the best explanation; and, for the moment, the best explanation seems to be that the virus escaped from the WIV.
The WIV was the biggest transporter of viruses to Wuhan from all over Asia, including many SARS-like viruses from Laos and Yunnan. Phylogenetic analysis shows that the SARS-CoV-2 outbreak was perfectly localized in Wuhan, as all strains that have been found in other locations are descendants of the Wuhan strain. Had the virus been circulating undetected in other parts of China, virologists would have eventually noted those pre-Wuhan strains and their descendants in the phylogenetic tree. Even after sequencing over six million SARS-CoV-2 genomes, no evidence has been found of pre-Wuhan SARS-CoV-2.
Not only was the WIV the biggest reservoir of SARS-like viruses in Wuhan, if not the world, its scientists were engaged in creating novel SARS-like and MERS-like chimeras and potentially supercharging their transmissibility and pathogenicity. With these circumstances in mind, consider the following facts:
- Shi and Jiang were experts in spike protein cleavage and were working on a pan-coronavirus therapeutic to inhibit post-cleavage fusion of the virus with cell membranes.
- Jiang had previously created a novel furin cleavage site via a 12-nucleotide insertion, though not in a coronavirus.
- In a joint grant proposal submitted to DARPA, the WIV and EcoHealth suggested creating novel human-specific cleavage sites.
Taken together, these points make the 12-nucleotide insertion that has created a novel furin cleavage site in SARS-CoV-2—so uncharacteristic of SARS-like viruses—look extremely suspicious.
The behavior of the WIV and its scientists also raises any number of troubling questions. The viral strain RaTG13 is a case in point. First collected by the WIV in 2013, RaTG13 was sequenced in 2018, but not disclosed until after the SARS-CoV-2 outbreak. In their initial disclosure, the WIV failed to mention how or when they came to possess RaTG13, failed to indicate that it was previously called Ra4991, failed to cite their own 2016 paper first mentioning it, and seemed to imply that they only sequenced the sample after the outbreak. This does not seem like the behavior of scientists trying their utmost to establish how a Laotian or Yunnan virus came to cause an outbreak in Wuhan.
None of these points is in itself conclusive, but the circumstantial evidence is more suggestive of a lab leak than an act of nature.
There is an additional reason to take seriously the question at hand. It is prophylactic. Knowing at last that COVID-19 had its origins in the WIV would go some way toward enforcing a worldwide ban on gain-of-function research—research that is almost as useless as it is dangerous.