In 1881, Louis Pasteur realized that some diseases may be caused by entities that, unlike bacteria, remain invisible under an optical microscope and cannot be isolated on artificial cultures. After the invention of the Chamberland filter in 1884, it became clear that diseases can be caused by both bacteria and viruses. There are about 10^31 viruses on earth and 4–6 × 10^30 bacteria.1
Viruses are divided into those that are encapsulated and those that are not. The non-encapsulated viruses are composed of a capsid protein and a nucleic acid, either DNA or RNA; encapsulated viruses are surrounded by an envelope derived from their host and a nucleocapsid. The non-encapsulated viruses are cell killers, because their only way of exiting their host is by inducing lysis. Enveloped viruses leave their host without killing it.
To perpetuate itself, a virus has to infect a living cell. The cell does not enter the world unprotected. Its first barrier against infection is a membrane made of phospholipids. These are studded with proteins and glycoproteins embedded in lipophilic domains throughout the cell membrane. In the presence of water, their amino acids cluster together, extruding a large number of water molecules. The extrusion of water gives rise to a net force of attraction—whence the clustering.
Viral infection begins at the boundary of the cell’s membrane. The extracellular part of a transmembrane protein is studded with oligosaccharide chains covalently attached to amino acid side chains. When a virus approaches a possible host, the glycoproteins of the virus stick to the glycoproteins on the cell membrane. No sticking, no infection. Next comes molecular docking between the amino acids of the viral protein and the amino acids of the transmembrane proteins. No docking, no infection. The rest of the story hinges on the type of virus. In non-encapsulated viruses, penetration into the cytoplasm occurs in four steps.2 The first is initiated by receptor-mediated endocytosis. In the second, the virus exposes its lipophilic residues. This allows the virion to be inserted into the cell membrane. The virus then triggers the release of a lytic factor that disrupts the lipid bilayer, enabling viral particles to cross the membrane. This is the third step. Finally, after replication, viral particles are transported across the limiting membrane and released into the cytosol. Encapsulated viruses are the subject of a much simpler story. Virally encoded lipophilic peptides promote the fusion of the viral and cellular membranes; penetration of the virus into the cytosol follows at once. No penetration, no infection. After penetration, the cell’s machinery begins the replication and transcription of the viral genome.
The RNA-virus genome is generally small because single-stranded RNA is more susceptible to rupture than double-stranded DNA. What is more, RNA-dependent enzymes lack proofreading and repair capabilities. If they occur at all, large genomes in RNA-viruses are often segmented. The DNA-virus genome may be larger than the RNA-virus genome because the virus relies on the DNA-polymerase of its host cell. DNA polymerases can check their work, fixing most incorrectly paired bases through proofreading and mismatch repair. Consequently, the average mutation rate is low. Nevertheless, mutations do occur, a circumstance that makes vaccine development difficult.
The subfamily Coronavirinae consists of four genera: genus α-coronavirus, genus β-coronavirus, genus γ-coronavirus, and genus δ-coronavirus.3 Coronaviruses (CoVs) are encapsulated viruses that can infect humans and many animal species, including swine, cattle, horses, camels, cats, dogs, rodents, birds, bats, rabbits, ferrets, mink, and snakes. Generally, γ- and δ-coronaviruses are involved in the infection of birds, although some may also cause infection in mammals; α- and β-coronaviruses harm both animals and humans. Under the electron microscope, CoV virions have diameters ranging from 50 to 200 nm. From the envelope, some spike proteins project 15 to 20 nm and are mechanically bound by means of a footlike extension 10 nm long. These viruses are endemic in human populations, causing 15–30% of respiratory tract infections each year.4 They infect the human airway from the luminal side; progeny viruses are released from the same side, facilitating spread through coughing and sneezing. Seven coronaviruses are known to cause disease in human beings.5 Four commonly circulate in the human population and usually cause mild respiratory illness. Among them, HCoV-229E is the only one that infects non-ciliated cells as it interacts with the human aminopeptidase N (hAPN) receptor, a peptidase predominantly expressed on non-ciliated cells in the bronchus. The other three have caused epidemics of severe acute respiratory diseases. Two of them use the angiotensin-converting enzyme 2 (ACE2) for cellular binding, a receptor expressed on ciliated bronchial cells along with endothelial cells and both type I and II alveolar cells. MERS-CoV can actively replicate in both bronchial and alveolar tissue, contributing to its high mortality rate of 37%, relative to the SARS-CoV mortality rate of 10% and the SARS-CoV-2 mortality rate of 5%.6 Like influenza, these viruses can cause more severe disease in the immunocompromised and the elderly.
The mortality rate from seasonal flu is typically around 0.1% in the US. The three coronaviruses kill in much greater numbers.
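The comparison can be made concrete with a few lines of arithmetic. The following Python sketch uses only the percentages quoted above; the dictionary and function names are illustrative, and the ratios compare the quoted figures only, without adjusting for how each rate was measured.

```python
# Mortality rates in percent, as quoted in the text. Names are illustrative.
MORTALITY_PERCENT = {
    "MERS-CoV": 37.0,
    "SARS-CoV": 10.0,
    "SARS-CoV-2": 5.0,
}
SEASONAL_FLU_PERCENT = 0.1  # typical US figure quoted in the text


def fold_over_flu(rate_percent, flu_percent=SEASONAL_FLU_PERCENT):
    """Return how many times higher a mortality rate is than the flu rate."""
    return rate_percent / flu_percent


for virus, rate in MORTALITY_PERCENT.items():
    print(f"{virus}: about {fold_over_flu(rate):.0f} times the seasonal-flu rate")
```

On these figures, the three epidemic coronaviruses are roughly 370, 100, and 50 times as lethal as seasonal flu.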
All coronaviruses develop in the cytoplasm of infected cells, budding into cytoplasmic vesicles from the endoplasmic reticulum. These vesicles are either extruded or released from the cell, and then the cell is destroyed. Coronaviruses encapsulate a positive single-stranded polyadenylated RNA molecule with the largest genome of all known RNA-viruses. They are characterized by a common genomic organization. A single replicase gene stretches over two-thirds of the genome. This gene comprises two overlapping open reading frames, ORF1a and ORF1b, that encode up to 16 nonstructural proteins in a large polyprotein. The structural gene region, which covers the remaining third of the genome, encodes the canonical set of structural protein genes. The structural gene region also harbors several ORFs interspersed among the structural protein coding genes. The number and location of these accessory ORFs vary among the coronavirus species, and most of them have an unclear role. In addition, coronaviruses are capable of genetic recombination if two different viruses infect the same cell at the same time.7
Phylogenetic analysis of 272 coronavirus genomes revealed that the sister lineage of SARS-CoV-2 originated from bats in the city of Nanjing, China, between 2015 and 2017, with two major parents: bat-SL-CoVZC45 and bat-SL-CoVZXC21.8 Further sequencing of ten viral genomes isolated from bronchoalveolar lavage fluids of nine patients in Wuhan revealed a sequence identity above 99.98%, indicative of a very recent emergence into the human population of Wuhan.9 This suggests that SARS-CoV-2 originated from a single source. Phylogenetic analysis of these ten genomes revealed that SARS-CoV-2 is genetically distinct from SARS-CoV and should be considered a new human-infecting β-coronavirus belonging to the Sarbecovirus subgenus, with a relatively long branch length compared to its closest relatives, bat-SL-CoVZC45 and bat-SL-CoVZXC21. The probable bat origin of SARS-CoV-2 was quickly confirmed with the publication of the complete genome of a new bat coronavirus RaTG13 collected on July 24, 2013.10 Bat-CoV-RaTG13 is the closest relative of SARS-CoV-2, and together they form a distinct lineage. Despite not being directly related to SARS-CoV, SARS-CoV-2 uses the same ACE2 cell entry receptor. But SARS-CoV and MERS-CoV usually pass into intermediate hosts, such as civets or camels, before leaping to humans. Three weeks after publication of the bat-CoV-RaTG13 genome, a reinvestigation of SARS-CoV-like coronaviruses detected in the lungs of dead Malayan pangolins revealed that a pangolin-CoV was, at the whole-genome level, 91.02% and 90.55% identical to SARS-CoV-2 and bat-CoV-RaTG13, respectively.11 With the publication of the two genomes of pangolin-CoV and bat-RaTG13-CoV, it is clear that switching pangolin and bat amino acid sequences would result in a sequence almost identical to SARS-CoV-2.
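The identity percentages quoted above are easier to appreciate when translated into counts of differing nucleotides. The Python sketch below is an illustration, not the method used in the cited studies: it computes percent identity for two sequences that are assumed to be already aligned to the same length, and it assumes a round genome length of 30,000 nucleotides, a typical order of magnitude for a coronavirus genome rather than a figure taken from the text. The function names and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percent of matching positions in two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def expected_differences(genome_length, identity_percent):
    """Number of differing positions implied by a given percent identity."""
    return genome_length * (100.0 - identity_percent) / 100.0


# Toy aligned sequences (hypothetical): one mismatch in ten positions.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))

ASSUMED_GENOME_LENGTH = 30_000  # assumption: round figure, not from the text

for label, identity in [
    ("Wuhan patient genomes", 99.98),
    ("pangolin-CoV vs SARS-CoV-2", 91.02),
]:
    diffs = expected_differences(ASSUMED_GENOME_LENGTH, identity)
    print(f"{label}: about {diffs:.0f} differing positions")
```

Under that assumed length, 99.98% identity leaves room for only about six differing nucleotides across the whole genome, whereas 91.02% identity corresponds to roughly 2,700.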
Genetic data show that SARS-CoV-2 is not derived from any previously used virus backbone. Besides the scenario of a deliberately designed virus, which cannot be disproved with the currently available genetic data, two other scenarios are possible.12 The first involves natural mutations, insertions, and deletions in an animal host followed by a zoonotic transfer to human beings. The second involves a pangolin-progenitor jump into human beings, with the genomic features typical of SARS-CoV-2 developed through adaptation during undetected human-to-human transmission. Proteolytic cleavage of the S-glycoprotein seems to be a barrier to zoonotic coronavirus transmission across species. Thus, acquisition of something like a furin cleavage site might have enabled a bat-CoV to jump to human beings.
At a molecular scale, infection induces an immune response. The adaptor protein PYCARD is rapidly relocalized to the cytoplasm, perinuclear space, endoplasmic reticulum, and mitochondria of the infected cell. Interferon-inducible proteins are also activated as soon as genomic DNA is released into the cytosol. They act both as pathogen sensors and as guardians of cellular integrity. On infection by SARS-CoV, the immune system senses the cellular damage induced by proteins that alter membrane permeability.13 These proteins are embedded in the oily envelope of the virus. After fusion with the host membrane, an efflux of K+ ions is induced at the plasma membrane, activating an inflammasome protein known to mediate a cytokine cascade. Blood plasma analysis of COVID-19 patients reveals that, as expected, they have cytokine concentrations in excess of those found in healthy adults.14 As a general rule, the older the infected host, the greater the cytokine up-regulation and the worse the outcome. In some cases, the immune system cannot shut itself down, leading to a cytokine storm. Multiple cytokine storms occurring in the lungs may spread throughout the body via the blood. If nothing is done, irreversible damage occurs in the liver, the kidneys, and the cardiovascular system, organs that are directly irrigated by the blood. Eventually these organs shut down; the body dies.
Elevated C-reactive protein (CRP) levels are also an essential characteristic in COVID-19 infections.15 CRP binds to the phosphocholine moieties expressed on the surface of dead or dying cells and some bacteria. This activates the complement system, promoting phagocytosis by macrophages, which clears necrotic and apoptotic cells and bacteria. In healthy adults, the normal concentrations of CRP vary from 0.8 mg/L to 3.0 mg/L and can increase 10,000-fold from less than 50 μg/L to more than 500 mg/L in cases of inflammation, infection, trauma, necrosis, malignancy, and allergic reaction. Measurement of CRP level in blood has been recommended for detection of COVID-19 in afebrile patients without dyspnea.
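The 10,000-fold figure mixes two units, micrograms and milligrams per liter, so it is worth checking. A short Python sketch, using only the concentrations quoted above; the function and constant names are illustrative.

```python
UG_PER_MG = 1000.0  # micrograms in one milligram


def fold_change(low_ug_per_l, high_mg_per_l):
    """Ratio of a high concentration (mg/L) to a low concentration (ug/L)."""
    high_ug_per_l = high_mg_per_l * UG_PER_MG  # convert to a common unit
    return high_ug_per_l / low_ug_per_l


# From 50 ug/L to 500 mg/L, the extremes quoted in the text.
print(fold_change(50.0, 500.0))

# The same 500 mg/L peak measured against the quoted normal range, 0.8 to 3.0 mg/L.
for normal_mg_per_l in (0.8, 3.0):
    ratio = 500.0 / normal_mg_per_l
    print(f"500 mg/L is about {ratio:.0f} times {normal_mg_per_l} mg/L")
```

The conversion confirms the 10,000-fold range between the extremes. Measured against the normal adult range instead, the same peak is a rise of several hundred-fold.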