Scientists have always thought viruses much smaller than bacteria. And with good reason. Most bacteriophages are 100 times smaller than the bacteria that they infect. Bacteria can be viewed under an optical microscope; but an electron microscope is required in order to see a viral particle. When giant viruses were discovered in 2003, they came as a surprise. The giant mimivirus, for example, had actually been discovered in 1992, but misidentified as a bacterium—Bradfordcoccus.1 The confusion was understandable. Mimivirus particles are 750 nanometers long—easily visible with an optical microscope; and, what is more, the dye used to reveal bacterial cell walls also stained mimivirus particles.
A number of other monster viruses have been discovered in the last decade.2 Most of them have been isolated and described by Didier Raoult and Jean-Michel Claverie in Marseille. If Marseille is now the Mecca of giant virus research, Vancouver is something of a mini Mecca. It is there that Curtis Suttle and his team isolated and described the giant viruses that infect Cafeteria roenbergensis and Bodo saltans.3 Most giant viruses observed in the laboratory have been studied in amoebae,4 but giant viruses are found in extraordinarily diverse terrestrial and aquatic environments.5 Some infect algae, and there is some suspicion that the mimivirus infects human cells as well.6 All giant viruses infect eukaryotes.
Viruses closely related to the mimivirus have been grouped into the family Mimiviridae. The other giant viruses have been classified into three families: Molliviridae, Pandoraviridae, and Pithoviridae. Mimiviridae and Molliviridae produce virions, or viral particles, with a characteristically icosahedral shape. Pandoraviridae and Pithoviridae produce strange ovoid particles that have often been mistaken for intracellular protists.7 One of the most unusual of the giant viruses is a member of Mimiviridae. Discovered in Brazil, the Tupanvirus produces a virion featuring a gigantic head and an equally gigantic membranous tail. Such a shape is without precedent in the viral world.
Giant viruses contain linear or circular double-stranded DNA that encodes 500 to 2,500 proteins. The Pandoravirus encodes 2,000 genes, which is only 10 times fewer than a human cell, and, at roughly 2.5 million base pairs, its genome is the largest of any known virus. The mimivirus genome encodes about half that number. Produced by a Pithovirus, the largest known virion is an ovoid particle with a length of 1.5 micrometers and a width of 0.5 micrometers. The size of a virion and the size of its genome are not necessarily correlated. Neither is a reliable guide to the threshold beyond which a virus counts as giant.8
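The figures in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above, plus one assumption that the text does not state: a round figure of 20,000 protein-coding genes for a human cell. All variable names are illustrative.

```python
# Figures quoted in the text.
pandoravirus_genes = 2_000
pandoravirus_genome_bp = 2_500_000

# Assumption, not from the text: a round estimate of human protein-coding genes.
human_genes = 20_000

# "Only 10 times fewer than a human cell."
gene_ratio = human_genes / pandoravirus_genes

# Average stretch of genome per gene, a rough measure of how densely it is packed.
bp_per_gene = pandoravirus_genome_bp / pandoravirus_genes

# "The mimivirus genome encodes about half that number."
mimivirus_genes = pandoravirus_genes // 2

print(f"human / Pandoravirus gene ratio: {gene_ratio:.0f}")
print(f"base pairs per Pandoravirus gene: {bp_per_gene:.0f}")
print(f"approximate mimivirus gene count: {mimivirus_genes}")
```

Under these assumptions, a Pandoravirus gene occupies on average a little over a thousand base pairs, which suggests a genome with little room to spare between genes.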
Five years after giant viruses were discovered, researchers learned that giant viruses can themselves become infected by smaller viruses.9 The virophages that infect them have genomes that code for only about twenty genes. These virophages, unable to infect amoebae by themselves, are transported inside amoebae by their giant virus hosts.10 Once inside, the virophages transcribe and replicate their genes using the machinery of the giant virus, the giant virus then using the amoeba’s machinery to transcribe and replicate its own genes.11 The three known virophages—Mavirus, Sputnik, and Zamilon—happen to infect members of the Mimiviridae family, but virophages targeting other giant viruses are likely to be identified.
Frankenlike
The discovery of giant viruses and their virophages immediately reopened an old question: are viruses alive? Viruses had been excluded from the tree of life because they lacked the machinery needed either to reproduce or to synthesize proteins. A virus must hijack a cell before it can do either. But when scientists realized that viruses are more complex than originally presumed, with some encoding several thousand genes and becoming infected by other viruses, they began to suspect that viruses might be alive after all. When a virophage infects a member of the Mimiviridae, the giant virus seems to become ill, its virions manifesting an abnormal morphology.
How can something be ill if it is not alive?12
Viruses had been excluded from living systems for another reason. They did not seem to share proteins that are universal across the three cellular domains: Archaea, Bacteria, and Eukarya. Yet many giant viruses do encode universal proteins, including RNA polymerase, some aminoacyl tRNA synthetases, and a few proteins involved in protein synthesis or DNA replication. Some phylogenetic analysts now place giant viruses in a fourth monophyletic group somewhere between Archaea and Eukarya.13 For all that, the fact remains that giant viruses lack the capacity to synthesize their own proteins without parasitizing a cell. Purificación López-García and David Moreira have thus disputed the phylogenetic analyses behind this proposal, arguing that the giant viruses are nothing more than genetic pickpockets, their genes acquired from a cellular origin in yet another triumph of theft over honest toil.14
Chantal Abergel and Claverie have also argued for the cellular origin of viral genes. But they have noticed, in addition, that most of the genes that giant viruses encode lack homologues in both modern cellular organisms and giant viruses from other families. Giant viruses, they suggest, might have arisen by regressive evolution—features lost instead of gained—from cellular lineages that diverged from modern cellular organisms before the advent of the last universal common ancestor of Archaea, Bacteria, and Eukarya. Claverie predicts that, as new giants are discovered, the distinction between viruses and cells will blur even further.15
Virus, Virion, and Virocell
When in doubt, define. The existence of giant viruses prompted virologists to search for a definition that could encompass the whole range of viruses, from the smallest, with genomes encoding two genes, to the largest, encoding thousands. All viruses produce virions: viral particles consisting of a core of nucleic acid surrounded by a protein shell, the capsid.16 It is the capsid that distinguishes viruses from other mobile genetic elements, such as plasmids. The smallest virus and the smallest plasmid both have one gene coding for a replication protein. The virus has an additional gene that codes for a capsid.17
All virions have at least one capsid. For this reason, Raoult and I initially suggested defining viruses as capsid-encoding organisms.18 Some small virions are formed by one or more DNA- or RNA-binding proteins; others, by several capsid proteins, with a lipid membrane inside or outside the shell. The virions of giant viruses are elaborate structures involving hundreds of proteins and a lipid membrane that is often decorated with polysaccharide extensions. Virions and viruses are not the same thing. Confusion between the two is pervasive. The confusion is easy to understand. Virions can be easily isolated, they are infectious, and they can be photographed.
But they are not viruses.
Claverie was the first to emphasize the distinction.19 Within the cytoplasm of an infected cell, the mimivirus produces a large compartment called a viral factory, where the viral DNA, while being transcribed and replicated, is shielded from the cell’s defense mechanisms. Many RNA and DNA viruses produce viral factories.20 But in the mimivirus, the factory is huge—the size of the infected amoeba’s nucleus. Claverie suggested that the viral factory is the actual virus, and that virions are the equivalent of the spores or gametes of cellular organisms.21
After Claverie published this argument, I observed that bacterial and archaeal viruses do not produce an isolated viral factory inside the cytoplasm of the infected cell: they transform the entire cell into a factory.22 I suggested calling the infected cell a virocell.23 Adopting Claverie’s idea, I argued that the virocell is the active viral organism. The cell’s metabolism no longer belongs to the infected cell; it is entirely dedicated to the production of virions. The virus takes control of the metabolic network either indirectly, via viral encoded regulatory proteins that modify the activity of cellular enzymes, or directly, via the activity of metabolic enzymes encoded by the viral genome.24 In virocells controlled by giant viruses, the metabolic pathways are especially complex and involve hundreds of viral encoded metabolic proteins.
Some biologists, such as López-García and Moreira, have argued that viruses are not alive, because they have conflated viruses and virions.25 Virions are, indeed, devoid of metabolic activity, comprising so many inert macromolecular assemblages of proteins and nucleic acid. In thinking of a virus as only a virion, scientists have seen viruses as passive byproducts of cellular evolution. López-García and Moreira thus write that viruses do not evolve by themselves: they are, instead, evolved by cells.26 The virocell concept substantiates the definition of viruses as capsid-encoding organisms. It recognizes the virus as an active and metabolic organism and removes the main argument of biologists who contend that viruses are not alive.27
Eukaryote Evolution
The concept of a virocell allows virologists to ask whether new genes may arise in viral genomes in the same way that new genes originate in cellular genomes. The mechanism for new gene formation is well documented in eukaryotes.28 In closely related species, randomly acquired translated regions become new genes when they encode peptides offering a selective advantage. If such processes occur in viruses, this would explain why giant viruses contain genes that have no homologues. For several years, researchers have entertained the idea that giant viruses, with their large genomes, could have spawned many new genes throughout their evolution.29 In a recent comparative analysis of six species of Pandoravirus, Claverie et al. find indications that de novo gene creation contributed to the evolution of the giant viruses.30 Phenomena such as mutagenesis, recombination, and gene capture can produce significant variation in viral genomes. Viral genomes are also bound to double duty. Several cases have been documented in which large DNA viruses have been integrated into alien eukaryotic genomes.31 These viral genes can then be reused by the cell for its own benefit.
Giant viruses belong to a broad family of eukaryotic DNA viruses called nucleocytoplasmic large DNA viruses (NCLDVs).32 The smallest NCLDV has around 200,000 base pairs, and the largest has 2.5 million.33 All NCLDVs share three core genes, and most share eight. Phylogenetic analyses indicate that these genes have coevolved since the time of the last NCLDV common ancestor. They can be concatenated to produce a robust species tree that reflects their history.34 Early on, the NCLDVs divided into two major lineages, each containing giant viruses: one includes Pithoviridae, and the other includes Pandoraviridae and Mimiviridae, with the giant families separated by families of smaller viruses. The tree implies that gigantism originated independently several times during the evolution of NCLDVs. Its advent was likely triggered by the increasing size of viral hosts and the complexity of their interactions. Genome size increased via de novo gene creation, genome duplication, and the capture of host genes. Curiously, viruses producing ovoid-shaped virions also emerge at two different positions in the NCLDV tree. NCLDVs belong to a huge lineage of viruses that infect members of the three cellular domains of life. They share similar major capsid proteins and DNA packaging ATPases with several groups of small eukaryoviruses, such as the virophages, and with small viruses infecting archaea or bacteria.35 This suggests that the NCLDVs themselves probably evolved from small viruses that once infected ancient cellular lineages.
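The concatenation of core genes mentioned above can be illustrated with a minimal sketch. It assumes that each core gene has already been aligned separately, and that a virus missing a gene is padded with gap characters. The function name, the virus labels, and the toy sequences are all hypothetical; this is not the pipeline used in the cited analyses.

```python
from typing import Dict, List


def concatenate_alignments(
    alignments: List[Dict[str, str]], gap: str = "-"
) -> Dict[str, str]:
    """Join per-gene alignments into one long row per virus (a supermatrix)."""
    # Collect every virus that appears in at least one gene alignment.
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for aln in alignments:
        # All sequences in a single gene alignment must have the same width.
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences in one gene alignment must share a length")
        width = lengths.pop()
        for taxon in taxa:
            # A virus lacking this gene contributes a run of gaps instead.
            rows[taxon].append(aln.get(taxon, gap * width))

    return {taxon: "".join(parts) for taxon, parts in rows.items()}


# Toy alignments for two hypothetical core genes.
polymerase = {"virus_A": "MKT-L", "virus_B": "MKSAL", "virus_C": "MRTAL"}
capsid = {"virus_A": "GAV", "virus_C": "GSV"}

combined = concatenate_alignments([polymerase, capsid])
for taxon, row in combined.items():
    print(taxon, row)
```

The combined rows would then be passed to a tree-building program. The advantage of concatenation is that several short genes, each carrying a weak signal, are pooled into one longer alignment, provided the genes really do share a common history.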
Phylogenetic analysis of cellular and viral sequences reveals that NCLDVs appeared after the divergence between Archaea and Eukarya.36 NCLDVs, in their turn, diverged into two major superfamilies before the last eukaryotic common ancestor (LECA). All modern eukaryotes contain three versions of RNA polymerase: I, II, and III. It follows that these enzymes were present in the LECA. In the tree relating NCLDVs and cellular life, the LECA is located at three different positions, at the base of the three eukaryotic RNA polymerase clades. The three LECAs are separated in the RNA polymerase tree by several clusters of NCLDV families. These clusters must have separated from each other before the time of the LECA. Eukaryotic RNA polymerase III split off immediately after the separation of Archaea and Eukarya, the two others branching off within one of the two NCLDV superfamilies.
Two conclusions follow. First, NCLDVs and the ancestors of the LECA, the proto-eukaryotes, coevolved for a long time.37 And second, eukaryotic RNA polymerases I and II could have a viral origin.38 If so, then ancestors of modern NCLDV had an important role in the formation of modern eukaryotes. According to phylogenetic analyses, the major type II DNA topoisomerase (Topo II) present in eukaryotes,39 which is the target of important anticancer drugs, seems to have originated from NCLDVs.40 In Eukarya, Topo II belongs to the chromosome scaffold and plays a critical role in chromosome segregation during the cell cycle. It also interacts with RNA polymerase II, the enzyme that transcribes DNA into messenger RNA. This suggests that RNA polymerase II and Topo II were recruited together from ancestral NCLDVs. The capping of messenger RNA is another important eukaryotic feature that likely had an NCLDV origin. This phenomenon is widespread in eukaryotes, but unknown in archaea and bacteria. Several distinct and unrelated mechanisms for mRNA capping have been discovered in diverse viral lineages. Eukaryotic and NCLDV mechanisms are remarkably similar. One can imagine that mRNA capping was originally used by viruses to discriminate between their own mRNA and the mRNA of their hosts, and that it was stolen by eukaryotic cells in the arms race between cells and viruses.41
Several authors have proposed that ancestral NCLDVs played a critical role in the origin and evolution of the nucleus, the defining feature of eukaryotic cells.42 This scenario is sometimes called the viral eukaryogenesis hypothesis. After it was proposed in 2001, many biologists dismissed the idea, but it has gained more support since the discovery of the mimivirus. Adherents of the viral eukaryogenesis hypothesis point to similarities between viral factories and the eukaryotic nucleus. Both are formed from membranes of the endoplasmic reticulum, a network of tubular membranous structures. In the mimivirus, the viral factory’s membrane vesicles bud from the nuclear membrane. Pandoraviruses transform the nucleus itself into the viral factory.43 One possible explanation is that an ancestral giant virus adopted the nucleus from a proto-eukaryotic cell and incorporated its cellular genes into its own genome. Under another scenario, the ancient proto-eukaryotic cell might have acquired from giant viruses the ability to protect its chromosomes by means of viral factories.
The viral eukaryogenesis hypothesis for the origin of the nucleus has been boosted by the recent discovery of head–tail bacteriophages that produce viral factories capable of imitating a nucleus inside the bacteria they infect.44 Two proteins encoded by these bacteriophages build the structure: one forms the nuclear membrane, and the other forms a tracking filament used to localize the nucleus at the midpoint of the bacterial virocell. The filament is a homologue of the tubulin that eukaryotic cells use to form the spindle separating newly replicated chromosomes during mitotic and meiotic cell division. The analogies between the virocell nucleus and the eukaryotic nucleus are striking and corroborate the viral eukaryogenesis hypothesis.45