Biology / Book Review

Vol. 5, No. 1 / December 2019

The Yin and Yang of the Genome

Tyler Hampton

Discovering Retroviruses: Beacons in the Biosphere
by Anna Marie Skalka
Harvard University Press, 192 pp., $27.95.

The word gift has a bizarre duality. In German, it refers to poison, and in English, a present. The word virus is similarly interesting. It is the Latin word for poison. This sense is preserved in biology for obvious reasons. Viruses are microscopic predators. They replicate themselves and kill millions. Nonetheless, this replicating poison also embodies a strange duality: some viruses have offered animal life precious presents, genetic gifts. If viruses have killed their millions, they have blessed their millions, too. Humanity would not be quite the same without them: both for the better and for the worse.

In Discovering Retroviruses, Anna Marie Skalka has assembled a rich account of the creatures she has studied for decades. Retroviruses are a unique type of virus. They are retro because they dramatically reverse the flow of the central dogma of molecular biology.1 Instead of the great arrows of biological priority flowing from DNA → RNA → protein, in retroviruses the genetic information begins in the form of RNA and is directed backward, and then forward again: RNA → DNA → RNA → protein.

Humans are, all of us, part retrovirus. Roughly 8% of the DNA in human cells is retroviral in origin: the relics of ancient infections in the germ line of our ancestors. This compares to the figure of 1.5% of the genome that encodes all the proteins that function to keep us alive. It is sobering to think that the double helix is really their world, these little parasites. We are just living in it.

Still, the master can occasionally become the slave. Scientists have uncovered striking instances in which these selfish parasites have in the end become genetically domesticated, donating precious genetic cargo to the host’s benefit. “The pages that follow,” Skalka writes in the introduction, “describe … the fascinating ‘yin and yang’ of their existence.”2

The Historical Record

Virology emerged at roughly the same historical moment as the fields of genetics and molecular biology were undergoing metamorphosis. Skalka deftly handles the broad historical outlines.

With respect to the history of genetics, Skalka naturally begins with Gregor Mendel. As an Austrian monk, Mendel made a discovery of nearly unrivaled biological importance, only to be treated with skepticism and then buried in obscurity. Mendel’s work was rediscovered after his death. What followed was a golden age of molecular genetics. These years from 1900 to the 1960s brought numerous advances, including the development of a chromosomal theory of inheritance; the realization that DNA, and not protein, forms the hereditary material of the cell; the determination of the molecular structure of DNA; the uncovering of semiconservative DNA replication; the characterization of transfer and messenger RNAs; the discovery of the triplet genetic code; and the revelation of the central dogma of molecular biology. Skalka’s summary is crisp and capable.

While molecular genetics was buzzing with life, the fields of virology and retrovirology became animated by the same vital energy. The connection between viruses and disease was obvious from the first. Peyton Rous, about a decade after the turn of the century, proposed the possibility of a connection between viruses and cancer. Like Mendel, he was met with skepticism, but he lived to see his own vindication. He was awarded a Nobel Prize for his efforts, and the Rous sarcoma virus was named in his honor.

An alluring hope loomed large: if viruses could cause cancer, perhaps study of viruses would lead to its full characterization and possibly its cure. Ingenious methods were developed to investigate the numerous viral kinds. Oddly, some cancer-causing viruses were found that contained no trace of DNA. In these viruses, RNA acted as the repository of genetic information. Rous sarcoma virus was one of them.

This raised the question of how the Rous sarcoma virus replicated. Howard Temin made a conjecture. In typical cells, DNA is copied into RNA. But Temin imagined this pattern running backward. He set forth the daring idea that viral RNA might be copied into a DNA form, which is then somehow implanted into the host’s own DNA. Thereafter, this viral DNA copy, which Temin called a provirus, would be transcribed or translated, and the products then combined to form a new working virus. Jan Svoboda independently came to the same conclusion.

As usual, the scientific community was skeptical, but the hypothesis proved solid. David Baltimore and Satoshi Mizutani showed that Rous sarcoma virus can indeed form a DNA polymer from individual DNA nucleotides, and this through the action of a special enzyme. The enzyme was named reverse transcriptase, for the obvious reason: the standard flow of genetic information in transcription had been reversed.
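The reversal can be made concrete with a toy model. The sketch below is an illustration only, not anything taken from Skalka's book: it treats reverse transcription as simple base pairing, ignoring primers, strand transfers, and the enzyme's error rate. The function name `reverse_transcribe` is hypothetical.

```python
# Toy model of reverse transcription: an RNA template is read and a
# complementary DNA (cDNA) strand is assembled from DNA nucleotides.
# Assumption: plain Watson-Crick pairing, no primers or copying errors.
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA strand (written 5'->3') for an RNA template (given 5'->3')."""
    # The new strand is antiparallel to the template, so read the
    # template backward while pairing each base with its complement.
    return "".join(RNA_TO_DNA[base] for base in reversed(rna.upper()))


print(reverse_transcribe("AUGGCU"))  # AGCCAT
```

In ordinary transcription the template is DNA and the product is RNA; here the roles are swapped, which is all the name reverse transcriptase is meant to convey.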

In a chapter entitled “Amending the Central Dogma,” Skalka writes that this discovery “shatter[ed] the central dogma,” because it overturned “the unidirectionality of the central dogma [that] was etched into the collective scientific consciousness: DNA → RNA → protein.”3 This is not entirely accurate. The unidirectional sequence was, in fact, an interpretive spin on the central dogma, one offered by James Watson in the mid-1960s.4 The dogma in its original form traces to Watson’s partner, Francis Crick. In Crick’s expression, the nucleic acids as a group asymmetrically determine the exact sequence of either other nucleic acids or proteins. This is something that the proteins cannot do. The Watsonian version melted in the purifying fires of experimental progress, while Crick’s formulation was unscathed.

Viruses composed of RNA and relying on reverse transcriptase and a proviral DNA intermediate became known as retroviruses. Through collective effort, the general features of retroviruses were established. A provirus is divided into three basic regions, abbreviated gag-pol-env. The gag region encodes those internal structural proteins surrounding the viral core, the pol region is dedicated to viral enzymes, and the env region codes for proteins associated with activity at the viral surface. Three major enzymes are encoded by pol: reverse transcriptase (RT), retroviral integrase (IN), and retroviral protease (PR). RT, of course, copies RNA to DNA. IN integrates the resulting DNA provirus into the host. PR is a protein that cuts long viral amino acid chains at precise locations.

When RT sends RNA to DNA, it includes regulatory information in the form of repeated segments at either end of the provirus: LTRs, or long terminal repeats. Signals in an LTR will commandeer the host’s transcription machinery for viral ends. In addition, when a retrovirus attacks, it leaves a special mark on the genome. Retroviral integrase inserts a provirus into the DNA. The target site, between four and six base pairs in length, becomes duplicated in the process, and the duplicates lie directly outside the provirus on both ends. The overall molecular anatomy of the region of host DNA that has been invaded will look like this: TSD (target-site duplication)-LTR-gag-pol-env-LTR-TSD.
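The signature left by integration can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the host sequence is invented, the provirus is a symbolic placeholder string rather than real sequence, and the function name `integrate` is hypothetical. It shows only the bookkeeping by which a short target site ends up duplicated on both flanks.

```python
def integrate(host: str, position: int, provirus: str, tsd_len: int = 5) -> str:
    """Insert `provirus` into `host`, duplicating the target site on both sides.

    The `tsd_len` bases starting at `position` are the target site. After
    integration one copy lies directly upstream of the provirus and one
    directly downstream: TSD-provirus-TSD.
    """
    if not 4 <= tsd_len <= 6:
        raise ValueError("target site is described as four to six base pairs long")
    if position < 0 or position + tsd_len > len(host):
        raise ValueError("target site must lie inside the host sequence")
    target = host[position:position + tsd_len]
    return host[:position] + target + provirus + target + host[position + tsd_len:]


# Invented host DNA and a symbolic provirus (assumptions for illustration).
host_dna = "TTTTGACCATTTT"
provirus = "<LTR-gag-pol-env-LTR>"
print(integrate(host_dna, 4, provirus))
# TTTTGACCA<LTR-gag-pol-env-LTR>GACCATTTT
```

The repeated GACCA on either side of the insert is the target-site duplication; finding such flanking repeats around an LTR-bounded element is part of what marks a sequence as a genuine integration event.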

By the 1980s, the basic life cycle of the retrovirus was finally understood. A retrovirus attaches to a host cell by way of spikes on its surface: membranes fuse, the viral core is dumped into the cell, and reverse transcription occurs. The DNA provirus is then integrated, proviral promoter sequences hijack cellular transcription proteins, and mRNAs are produced and processed. Most mRNAs are translated into long amino acid chains; the requisite components self-assemble and bud off from the cell. After budding, PR cuts some of the long amino acid chains at the right spots; the freshly cut proteins condense, and the life cycle is completed.

Retroviral Origins

Viruses are nefarious and well-made. No invasion is haphazard. A retrovirus has just the right sequence of molecular keys fitting just the right sequence of cellular locks at just the right times. Half of a virus, or half of its proteins, would never make it. Where did viruses and retroviruses come from? How did they get there, step by step?

Skalka notes that “no clear explanation for the origin of all viruses exists.”5 Some theorists say they predated single-celled organisms; some, that they derived from them in a daring act of escape. The debate looks intriguing. Unfortunately, Skalka does not go into it. Rather than weighing any evidential pros and cons of these scenarios, she instead offers a general picture of the early epochs in evolution and pre-evolution.

As is standard for all such discussions, the famed Miller–Urey experiment is mentioned, in which organic compounds such as amino acids made a dramatic appearance, the result of electrical sparks flashing in a system of prebiotic gases. Skalka mentions that uracil, adenine, and ribose can each arise under certain conditions. She does not linger over details. While it is pleasing that she mentions that the interpretations of these results can be questioned, it is unfortunate she does not mention how extensively. Nightmare is a word that appears in the literature to describe worries surrounding some of these scenarios.6 James Tour and Robert Shapiro provide an appropriately pessimistic perspective.7

The sequence by which life began and the nature of its initial form may be uncertain, Skalka admits, but it is rather secure that at an early phase of evolution, RNA dominated as the primary genetic material. DNA had not yet evolved; neither had sophisticated proteins. This is the RNA world hypothesis.

There are numerous reasons to believe there was an RNA world. Some are merely suggestive; others are more substantial. For example, some form of RNA, and not the proteins, plays a crucial part in fundamental and ancient processes in the ribosome. A natural inference is that in primordial antiquity, when the ribosome was new, RNA was more prevalent than proteins. Another bit of evidence is RNA’s diverse skill set. Though less efficient, RNA can code like DNA, and can catalyze like proteins. It is a bargain two-for-one molecule for any early evolutionary scenario. RNA has for decades been a laboratory wonder. It has been prodded to reveal considerable dexterity in function and structure. Natural ribozymes exist, like self-splicing introns. And ribonucleoproteins prevail in living systems, such as the spliceosome and RNase P, systems in which the RNA component overshadows its protein partner in chemical significance. These are thought to be remnants of the RNA world that once was. The reverse transcriptase found in retroviruses, and other similar genetic entities, may have been the mechanism by which the RNA world gave way to the DNA world. Finally, the very fact that RNA in retroviruses is the primary genetic constituent, and not merely an intermediary as in autonomous cells, makes it imaginable that it may have been the primary constituent for earlier life-forms.

In recounting the facts, does the chapter live up to its own name, providing an account of “The Origins of Retroviruses”? The answer is much more no than yes. An RNA world existed, surely. Retroviruses originated in it. Beyond this fact, the most important aspects of the picture are conspicuously absent. It is perhaps understandable. The book is not meant as a tome. Nonetheless, the explanation given is no deeper and essentially no more detailed than a hand-waving motion toward natural selection and random mutation. Is this not too vague to be of interest? A reader might want a deeper and more detailed discussion, for example, concerning the evolutionary sources of the LTR regulatory elements, as well as the gag-pol genes in some mobile genetic elements. Where in precursor cells did these genes and regulatory elements come from exactly, and what was their prior role? In what material form did they exist, and did it change over time? Skalka does mention that retroviruses originated when a retrotransposon acquired env genes.8 But this passing sentence is again too general to be a compelling account. Nonetheless, there is much of value in the chapter, both in its depiction of the RNA world and especially in its manageable overview of transposons, retrotransposons, and telomeres.


If aspects of the origins of retroviruses are opaque, what happened afterward is clearer. Skalka provides a sense of their significance with respect to matters of evolution.

Temin created the proviral hypothesis, but even before it was confirmed, there were signs that viral genomes, or portions thereof, might be embedded in the DNA of animals. Puzzlingly, these hints were found even in animals that had never suffered an infection. If there is no fire, there should be no smoke. A potential explanation was offered. If some proviruses were not acquired after birth, perhaps they had been inherited? Viral DNA may be endogenous, inherited from some infected ancestor. This hypothesis, made by Robin Weiss, was labeled impossible by critics. It is now validated beyond all doubt.

The genomes of nearly all animals contain elements that look like proviruses. These are called endogenous retroviruses (ERVs); in humans, they are named human endogenous retroviruses (HERVs). In recent years it was revealed that “retroviral sequences are distributed in an astounding 700,000 distinct sites in all forty-six chromosomes, representing about 8 percent of our genome.”9 Mercifully, none of these HERVs is an active virus; they have undergone debilitating mutations and alterations. They are dead. In many cases, due to details of recombination, everything except a single LTR has been deleted from the proviral structure, so the genome pattern at an integration site is TSD-LTR-TSD. These are called solo LTRs: they outnumber fully intact proviruses ten to one.

Endogenous retroviruses proved to be a great boon for a key tenet of evolutionary theory, since in humans and other primates they occur in corresponding positions. This cannot be a coincidence. Skalka gives the standard explanation:

Species that share related endogenous retroviral sequences in analogous positions in their genomes are assumed to have inherited these sequences from a common ancestor infected by the retrovirus before the species diverged from one another.10

Thus, humans and other primates must share a common ancestor.

Under this assumption, a comparison of the presence and absence of ERVs at like positions among different species allows an evolutionary timeline to be constructed. This timeline is cross-checkable. Divergence times can be calculated by the independent method of LTR comparison using known mutation rates.11 Analysis of pol genes has allowed a taxonomical classification of many ERVs. More than 50 different ERV families have taken up residence in the genomes of our ancestors, through more than 200 independent germline invasions over the course of a hundred million years or so.
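The LTR method reduces to a short calculation. At integration the two LTRs of a provirus are identical; afterward each accumulates substitutions independently, so the divergence between them grows at twice the per-site rate, giving an age of roughly T = d / (2μ). The sketch below is a simplified illustration, not the procedure used in any particular study: it assumes pre-aligned, gap-free sequences, applies no correction for repeated substitutions at the same site, and uses an invented pair of sequences with a purely illustrative rate. The function names are hypothetical.

```python
def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Fraction of aligned sites at which the two LTRs of one provirus differ."""
    if not ltr5 or len(ltr5) != len(ltr3):
        raise ValueError("LTRs must be non-empty and aligned to equal length")
    mismatches = sum(a != b for a, b in zip(ltr5, ltr3))
    return mismatches / len(ltr5)


def integration_age_years(ltr5: str, ltr3: str, rate_per_site_per_year: float) -> float:
    """Estimate time since integration as T = d / (2 * mu).

    The factor of 2 reflects that both LTRs have been mutating
    independently since they were identical.
    """
    return ltr_divergence(ltr5, ltr3) / (2 * rate_per_site_per_year)


# Invented 20-base LTRs differing at one site, and an assumed rate of
# 2.5e-9 substitutions per site per year (illustrative value only).
five_prime = "ACGTACGTACGTACGTACGT"
three_prime = "ACGTACGTACTTACGTACGT"
age = integration_age_years(five_prime, three_prime, 2.5e-9)
print(f"estimated age: {age:.3g} years")  # on the order of ten million years
```

Because this estimate depends only on the two LTRs of a single provirus, it does not rely on which species carry the insertion, which is why it can serve as an independent check on the presence-and-absence timeline.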

Skalka gives this information in a flurry. One can almost hear the protest of an honest critic, an appeal to slow down. All this depends on assumptions. Could some other reason exist for the similarities in location and sequence of ERVs, other than Charles Darwin’s model of common descent? The question has been raised before.12

The answer seems to be no; common descent is the only viable explanation. The structure of the ERVs and the surrounding TSDs implies that they are genuinely retroviral in origin. Consequently, ERVs cannot be thought of as original to the design plan. At some earlier point, retroviruses invaded. Either a) a common ancestor was infected, which through genetic inheritance gave rise to ERVs at orthologous loci in different species, or b) the various primate species were infected separately, and the ERVs were retained at orthologous loci by mechanisms not involving common descent. Option b) should be ruled out. In modern retroviruses, parallel invasions almost never land at exactly the same nucleotide position.13 If primate species were separately infected, and if mainly neutral evolutionary forces carried some of the resulting ERVs to fixation, we would not expect to see many ERVs at the same loci, let alone more than 99% of hundreds of thousands of ERVs at the same relative genomic position. By contrast, proviruses and ERVs always occur at the same location if inherited from a common ancestor.14 The evidence matches this hypothesis much better.

Still, these extrapolations are based on data from modern viruses. Could the hypothesis of independent ancestry be salvaged if ancient retroviruses had stronger insertion preferences than modern ones? No again. Scientists have reconstructed and revived an ancient retrovirus, using some of the youngest members of the HERV-K (HML-2) family. When tested, certain biases were detected, but none significant enough to explain the data by parallel infection.15 If integration bias is out, could natural selection, as opposed to drift, be responsible, winnowing a large pool of variants the same way in different species? There appears little motivation to think so.16 Even more ad hoc scenarios would have to be invoked to avoid the natural conclusion of common descent.17 Thus, it appears that in deciding between the mutually exclusive and jointly exhaustive options of common ancestry versus independent ancestry for ERV origins, the data far better confirm the common ancestry model, and to that degree confirm a key tenet of modern evolutionary theory. ERVs represent some of the best available scientific evidence for Darwin’s model of common descent.

In keeping with the theme of the book, Skalka turns her attention to the dual nature of the retrovirus. The virus as poison needs no defense: it is obvious. But how could a retrovirus ever be a gift useful to an organism? In several ways, actually. The first is evolutionary. Retroviruses and ERVs increase the pool of available genetic variation. The richer the variation, the better the evolvability. ERVs increase the potential for recombination, resulting in more deletions, inversions, or duplications. There will be more bad mutations, but also more good, and selection can preserve the good. Moreover, the LTRs of retroviruses have rich regulatory signals: promoters and enhancers. When implanted in germ cells, these signals can alter regulation of downstream host genes, leading to new alternative mRNA transcripts, deletions, and additions that can be preserved over evolutionary timescales. Finally, the proteins coded for in viral DNA can be expressed in different developmental contexts, depending on the location of integration, and they may offer functionality in these novel contexts.

Examples abound. Some are nontrivial. Skalka reviews these in detail. If it were not for retroviruses, humanity might still be primitive hunter-gatherers without civilization; our immune systems would be weaker; we would be hatching from eggs. How so? A regulatory segment in an ERV controls a copy of the enzyme amylase, which allowed humans to incorporate starch into their diets. Control of amylase opened the potential for the species to become farmers instead of hunter-gatherers. Similarly, HERVs in the major histocompatibility complex have shaped the evolution of the immune system. And certain ERV genes were recruited during the evolution of the mammalian placenta: specifically those assisting with cell fusion and immune suppression, which is co-opted for the protection of the fetus.

The chapter on evolution closes with an account of the never-ending host-virus arms race. Viruses want a sure thing, and so tend to target stable host genes unlikely to change; in response, the host is forced to change them, but in noncritical regions that may help repel the virus while doing minimal damage. In humans, roughly 5% of codon changes in mundane housekeeping genes are due to the arms race with viruses. In other mammals, it is an astounding 30%.18 This attests to the pervading influence of retroviruses, not just in terms of their mere presence in the genome, but their indirect effects even on the seemingly unrelated housekeeping genes of our biological constitution.

Cancer, AIDS, and Precision Medicine

The fifth and sixth chapters of Skalka’s book address the relevance of retroviruses to two supreme menaces: cancer and AIDS. Ever since Rous’s work, it has been known that some retroviruses induce cancers in animals. There existed the unproven possibility that retroviruses might cause some human cancers as well. Well-funded initiatives commenced to study retroviruses and how cancer forms.

Skalka provides an overview of pertinent details. It was found, for instance, that two types of cancer-causing, or oncogenic, viruses exist. The first is known as acutely transforming. Almost all animals infected with it become cancerous in just a few days. The second is nonacutely transforming. Some but not all infected individuals will get cancer, and the onset is much longer, seen over weeks or months instead of just days. Researchers found that a special piece of genetic code was present in acutely transforming viruses and absent in the nonacutely transforming ones. The potent gene that caused cancer in short order came to be called an oncogene. In the case of Rous sarcoma virus, an acutely transforming virus, src was the sarcoma-inducing gene.

Where did src come from? The answer to this question would bear on the important topic of how cancers arise from oncogenes. It was discovered that viral src (v-src) derived from a conserved cellular gene that is present in almost all multicellular life-forms and known as cellular src, or c-src. The c-src gene codes for a protein that modifies membrane receptor proteins. In its original cellular context, there is finely controlled regulation: c-src has an on–off switch. This gene was once captured by a retrovirus, but the on–off switch was at some point broken. Thus in a retrovirus, src’s protein modifies cell receptors to make host cells divide without end, leading to cancer. Other oncogenes in different acutely transforming viruses were later characterized, such as v-myc in avian retrovirus MC29. This gene also derived from a cellular counterpart and likewise drives uncontrolled host cell division, albeit through a different mechanism.

If captured oncogenes were responsible for cancers in acutely transforming viruses, what is the cause of cancer in nonacutely transforming viruses, which lack them? It is LTRs. Retroviruses are mobile elements. One provirus leads to another. By chance a provirus might land beside a cellular gene that can alter cell division or some aspect of physiology. If the signals in a newly implanted proviral LTR are oriented so as to adversely influence the regulation of such a gene, cancer can result. This is a chance-driven process. There is no intrinsic piece of code that each provirus carries which causes cancer automatically. It may take much time and many gene-jumping events for a provirus to be in just the right orientation to cause damage. This accounts for the cancer’s delayed onset. Additionally, researchers have found a sequence of events by which a nonacutely transforming virus can evolve into an acutely transforming one.

Although one-fifth of human cancers involve the action of viruses, most of them are not retro. The only retrovirus known to cause cancer in humans is HTLV-1 (human T-cell lymphotropic virus type 1), this through the action of two of its viral genes, Tax and HBZ. When retroviruses infect a host, genetic deletion events or rearrangements may place a cellular oncogene under the control of an alien regulatory sequence: the viral LTR. This leads to altered cellular division. But the same sort of genetic rearrangements in cells without viruses can have the same effect. For example, exchange of genetic material between human chromosomes 8 and 14 can place the oncogene c-myc under the control of a new regulatory sequence, which may lead to cancer. Genetic rearrangements of the same sort seen in cancer-causing retroviruses account for the majority of human cancers. Much of our present knowledge about how cancer forms has its roots in the study of the genetic rearrangements associated with retroviruses, even though most human cancers do not trace to the actions of retroviruses.

Like cancer, AIDS is a well-known killer. AIDS, or acquired immune deficiency syndrome, was catapulted into prominence in the early 1980s, when otherwise healthy gay men were contracting unusual diseases due to severe immune system deficiency. And not only gay men: drug users, infants, the female partners of men with this immune deficiency, and patients accepting blood from donors. It became a terrifying epidemic. A tremendous sense of urgency was felt in the scientific community to characterize the cause of the condition, to find a way to detect it, and to cure it.

Skalka masterfully describes the race to discover the AIDS-causing agent. Robert Gallo, famous for discovering the cancer-causing HTLV-1 retrovirus, conjectured that AIDS might be caused by a retrovirus similar to HTLV-1, since it affected the same T cells as HTLV-1. Françoise Barré-Sinoussi found convincing evidence for a retroviral cause of AIDS: reverse transcriptase activity in the tissue of an AIDS patient, and electron microscopic images of viral particles budding from T cells. Finally, Jay Levy and Paul Volberding also characterized an AIDS-related retrovirus. For a time, the AIDS-causing agent operated under three different names—LAV, HTLV-3, and ARV—until the International Committee on Taxonomy of Viruses offered a single name to rule them all: human immunodeficiency virus, or HIV. The name was deliberately chosen to avoid mention of AIDS. HIV is a virus; AIDS, on the other hand, is an immune condition that can result from it. While a person may have HIV, it does not necessarily follow that they have AIDS.

The story of the HIV discovery is a rich and dramatic one. Due to an unintentional laboratory mix-up, Gallo was accused of scientific fraud; later his name was cleared. France and the United States engaged in legal squabbles over commercial rights, patents for blood screens, and recognition for the AIDS discovery. The suit was mercifully settled out of court, the terms announced in 1987 by French prime minister Jacques Chirac and United States president Ronald Reagan.

Skalka also describes research addressing the origins of HIV.19 In the early 1980s, HIV was spreading rapidly, and researchers were curious to know from where. Analysis of many tissue samples by the evolutionary biologist Michael Worobey and others places the common ancestor of pandemic HIV strains in the Democratic Republic of the Congo, between 1900 and 1910, in the city of Kinshasa. Up until that point, HIV was confined only to a few rural villages. At the turn of the twentieth century, Kinshasa had become a major population center, and was connected by the Congo River to several other surrounding countries. Once HIV made its way from a rural setting to Kinshasa, the retrovirus spread quickly. Haitian professionals stationed in Kinshasa carried it back to Haiti, where it also spread. A single Haitian patient carried it to New York City in 1969, which kickstarted the epidemic in the US. Of course, HIV did not simply appear in 1900 out of thin air. HIV has several viral subtypes which evolutionarily derive from retroviruses affecting African primates, collectively called simian immunodeficiency viruses (SIVs). Beatrice Hahn and others, comparing strains of SIV obtained from chimpanzee fecal and urine samples, determined the evolutionary sources and antecedents of HIV. The most prominent strain of pandemic HIV (HIV-1, group M) derives from the virus SIVcpz, which is carried by a subspecies of chimpanzee in central Africa, Pan troglodytes troglodytes. It most likely found its way to human hosts through the local practices of hunting and eating chimps. Through further comparisons, the evolutionary origin of SIVcpz itself was revealed, along with the other strains of HIV.

By the mid-1990s, the death count of HIV had skyrocketed. It was, at that time, the leading cause of death among young adults in the United States, even above unintentional injury, suicide, and cancer.20 Drugs inhibiting HIV reverse transcriptase, integrase, and protease were soon developed. Yet the victory was short-lived. HIV reproduces too prolifically and mutates too quickly for one drug to treat it. Resistant strains of HIV arose. Effective control of symptoms and viral infectivity would, it was realized, require the simultaneous use of several drugs to confound the evolution of drug resistance. Hopes quickly increased again, as a spate of antiviral drugs became available, and drug cocktails were formed to stop the ravaging effects of the virus. Elimination of HIV is so difficult because provirus copies of HIV can integrate into inactive or dormant cells and remain there in perpetuity, occasionally reawakening to produce a blizzard of viral particles. A true cure is not yet available. Nonetheless, the existing antiviral drugs can allow people with HIV to live nearly normal lives, as these drugs reduce the highly dangerous active virus particles in the blood to undetectable levels. Skalka recounts the intriguing story of the “Berlin patient,” Timothy Ray Brown, the only human being to have been permanently cured of HIV. Skalka also describes the prospects of developing a future HIV vaccine and implementing a strategy for breaking the global chain of HIV transmission.

Though retroviruses are superb hijackers of cells, scientists are now learning to beat them at their own game: experimentalists can commandeer the retrovirus and use its powers for good rather than for destruction. In the epilogue, Skalka highlights an emerging front in biomedical science called precision medicine. It is possible now to go beyond treating generalities. Procedures exist to target specific idiosyncrasies of certain malignancies, such as tumors with unique genetic profiles. Researchers can gut a virus’s genes, while retaining its regulatory signals, and replace them with DNA sequences of their own choosing. The retrovirus can then be released into a host, and a much-needed DNA sequence can be promptly delivered at various sites throughout the genome. By using retroviruses as vectors in conjunction with the remarkable CRISPR-Cas system, which is a defense system against viruses in some prokaryotic organisms, virtually any DNA segment in any cell type can be targeted and replaced. This has massive potential for treating genetic diseases. However, it also involves ethical issues, as the system could be used to sculpt the genetic architecture of human embryos and make them into designer children. This again returns to the recurring theme of the book: the dual nature of retroviruses. Viruses deform and kill millions, but also provide opportunities to help millions. The principle is illustrated with no greater irony than in the case of HIV. Humans are now using genetically modified HIV particles as vectors to treat HIV symptoms, as well as other genetic diseases.

Discovering Retroviruses is intensely thought-provoking and satisfying, the material expertly handled. Though it is not by any stretch a light read, it is nonetheless immensely valuable.


  1. At least in one of the dogma’s popular (mis)interpretations, as explained below. 
  2. Anna Marie Skalka, Discovering Retroviruses: Beacons in the Biosphere (Cambridge, Massachusetts: Harvard University Press, 2018), 2. 
  3. Skalka, Discovering Retroviruses, 32. 
  4. Wikipedia, “Central Dogma of Molecular Biology.” 
  5. Skalka, Discovering Retroviruses, 46. 
  6. Michael Robertson and Gerald Joyce, “The Origins of the RNA World,” Cold Spring Harbor Perspectives in Biology 4, no. 5 (2012): a003608, doi:10.1101/cshperspect.a003608. 
  7. Robert Shapiro, “A Simpler Origin for Life,” Scientific American 296, no. 6 (2007): 46–53; James Tour, “Animadversions of a Synthetic Chemist,” Inference 2, no. 2 (2016). 
  8. A retrotransposon is a type of jumping gene. Barbara McClintock famously discovered that a great many genes jump in the genome. They are mobile and can change a cell’s genetic identity. Such entities came to be called transposable elements. Some were found to have a molecular anatomy very similar to retroviruses. Shockingly, transposable elements seem to account for roughly half of the human genome. There was something almost Copernican in the realization: just as our world is a speck in the galaxy, so our genes are a speck in the DNA molecule. Whether this Copernican philosophy is true no one yet knows, though there are reasons to suspect it is. 
  9. Skalka, Discovering Retroviruses, 74. 
  10. Skalka, Discovering Retroviruses, 76. 
  11. When initially integrated, a provirus includes two identical LTRs, one on either end. Once implanted, each LTR undergoes mutation independently. Assuming no strong selective pressures, drift occurs in each LTR. Using known mutation rates, one can work backward to estimate the time elapsed since they were identical, which is roughly the same as the time of infection. 
  12. Jonathan McLatchie, “Revisiting an Old Chestnut: Retroviruses and Common Descent (Updated),” Evolution News & Science Today (blog), Discovery Institute, May 25, 2011; Jonathan McLatchie, “Do Shared ERVs Support Common Ancestry?” Evolution News & Science Today (blog), Discovery Institute, May 26, 2011; Jonathan McLatchie, “More Points on ERVs,” Evolution News & Science Today (blog), Discovery Institute, May 28, 2011; Anjeanette Roberts, “Questioning Evolutionary Presuppositions about Endogenous Retroviruses,” Today’s New Reason to Believe (blog), Reasons to Believe, September 17, 2015; Anjeanette Roberts, “A Common Design View of ERVs Encourages Scientific Investigation,” Today’s New Reason to Believe (blog), Reasons to Believe, September 17, 2015. 
  13. Elizabeth Withers-Ward et al., “Distribution of Targets for Avian Retrovirus DNA Integration in Vivo,” Genes & Development 8, no. 12 (1994): 1,473–87; Rick Mitchell et al., “Retroviral DNA Integration: ASLV, HIV, and MLV Show Distinct Target Site Preferences,” PLoS Biology 2, no. 8 (2004): e234. See also Paul Lesbats, Alan Engelman, and Peter Cherepanov, “Retroviral DNA Integration,” Chemical Reviews 116, no. 20 (2016): 12,730–57; Tania Sultana et al., “Integration Site Selection by Retroviruses and Transposable Elements in Eukaryotes,” Nature Reviews Genetics 18, no. 5 (2017): 292. 
  14. Endogenous retroviruses are inherited in a Mendelian fashion, which means that there is a faithful transmission of the once-active provirus at precisely the same genetic locus from generation to generation. In contrast, independent infections can land at very different genetic loci, and are unlikely to coincide at precisely the same locus. For instance, related Australian koalas were found to have higher ERV similarity than unrelated koalas. See Rachael Tarlinton, Joanne Meers, and Paul Young, “Retroviral Invasion of the Koala Genome,” Nature 442, no. 7,098 (2006): 79. 
  15. Troy Brady et al., “Integration Target Site Selection by a Resurrected Human Endogenous Retrovirus,” Genes & Development 23, no. 5 (2009): 633–42. A critic could argue that this resurrected retrovirus may be an odd exception, and that all the other families of ancient retroviruses may still be overwhelmingly locus-specific. That is possible, but the hypothesis is thus far unsupported by any evidence. It must be considered more ad hoc than the alternative: that most retroviruses, both modern and ancient, do not have perfectly locus-specific integration preferences but can integrate into a wide range of genomic locations. The alternative is a generalization from observations, whereas the convergent hypothesis rests on pure possibility and speculation. 
  16. There is good evidence that natural selection did impose general biases on ERV location, though not necessarily the locus-specific biases necessary to avoid an inference of common descent. See Troy Brady et al., “Integration Target Site Selection by a Resurrected Human Endogenous Retrovirus,” Genes & Development 23, no. 5 (2009): 638–40.

    It is implausible that a significant selective difference would exist between two genomes with the same provirus, just shifted one nucleotide to the left or right. Yet this sort of selective difference is what the convergent hypothesis requires. Furthermore, data from a few papers seem to more naturally fit with the hypothesis that selection is general and that convergent evolutionary processes are not locus-specific. See Chris Yohn et al., “Lineage-Specific Expansions of Retroviral Insertions within the Genomes of African Great Apes but Not Humans and Orangutans,” PLoS Biology 3, no. 4 (2005): e110. 
  17. In existing LTR sequences, there is (as a general rule) a nested hierarchy: a 5' LTR of a particular ERV is more similar to the 5' LTRs of the orthologous ERV in other primate species than it is to the 3' LTR of the very same provirus in the same individual. Why should this be? This is well accounted for if all 5' LTRs of a particular provirus in the various primate species descended from a common ancestor provirus, and likewise for the 3' LTRs. There would be no reason for this pattern in the absence of common ancestry, since under the independent infection scenario, the LTRs should be accumulating mutations randomly in different species, and there is no reason to expect a pattern or coincidence of the same mutations in different lineages. One could posit that perhaps convergent selection pressures existed, so that selection not only favored ERVs in the same genomic position but also with the same sequences to form a nested hierarchy. In this scenario, LTRs are under intense selection for most of their evolutionary history. However, there may be reason to believe typical cases of convergent evolution do not produce a nested hierarchical pattern, which would work against the hypothesis.

    The convergent evolution hypothesis is also implausible for a different reason. Under this scenario, breaks in the nested hierarchical pattern, which arise from gene conversion events between the 5' and 3' LTRs, should never become fixed. If a conversion occurred early in the divergence of primate species, the posited strong convergent selection would have driven the population away from that state, and no trace of the conversion should remain. If it occurred late, the same selection would have prevented it from spreading through the population. But there is evidence of gene conversion in some LTRs of some species, which does not fit well with the convergent hypothesis. If LTRs were instead subject to largely neutral evolutionary forces, one would expect gene conversion events to drift to fixation in some species and some LTRs, as an exception rather than as a rule. This pattern matches observation. The convergent scenario would need further auxiliary hypotheses to account for it. In view of these details, the most parsimonious explanation appears to be common descent.

    For analysis of LTRs, nested hierarchies, and gene conversion events, see Welkin Johnson and John Coffin, “Constructing Primate Phylogenies from Ancient Retrovirus Sequences,” Proceedings of the National Academy of Sciences 96, no. 18 (1999): 10,254–60. For information about convergent molecular evolution as it pertains to retroviruses, see “Responding to the ‘Evolution News & Views’ Articles Addressing My Essay on the ERV Evidence for Common Ancestry,” Evidence for the Evolutionary Model, last modified May 2012. 
  18. Skalka, Discovering Retroviruses, 94. 
  19. For clarity, it should be noted that although HIV is a retrovirus, it is not an endogenous retrovirus in humans, since HIV does not infect germ cells, and therefore its proviruses cannot be transmitted in Mendelian fashion. 
  20. Skalka, Discovering Retroviruses, 143. 
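
The dating logic described in note 11 can be sketched numerically. What follows is only a rough illustration, not the procedure used by Skalka or by any paper cited above. It assumes the two LTRs of a provirus are already aligned, that they evolve neutrally at one constant rate, and that a Jukes–Cantor correction for repeated substitutions at the same site is adequate. The function names are hypothetical, and any rate supplied by the reader is an assumption rather than a measured value. Because both LTRs accumulate changes independently after integration, the corrected divergence is divided by twice the rate.

```python
import math


def ltr_divergence(ltr5: str, ltr3: str) -> float:
    """Fraction of aligned, ungapped sites at which the two LTRs differ."""
    if len(ltr5) != len(ltr3):
        raise ValueError("LTR sequences must be aligned to the same length")
    # Skip alignment columns where either sequence has a gap.
    pairs = [
        (a, b)
        for a, b in zip(ltr5.upper(), ltr3.upper())
        if a != "-" and b != "-"
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance: corrects an observed difference p for
    multiple substitutions at the same site."""
    if not 0.0 <= p < 0.75:
        raise ValueError("observed difference must lie in [0, 0.75)")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def integration_age_years(ltr5: str, ltr3: str, rate: float) -> float:
    """Estimated time since integration, in years.

    `rate` is an assumed neutral substitution rate per site per year.
    The factor of 2 reflects that each LTR mutates independently.
    """
    if rate <= 0:
        raise ValueError("rate must be positive")
    return jukes_cantor(ltr_divergence(ltr5, ltr3)) / (2.0 * rate)
```

Calling `integration_age_years(five_prime, three_prime, rate)` on two aligned LTR strings returns an age in years. The estimate is only as good as the assumed rate, and it breaks down when gene conversion has homogenized the two LTRs, as discussed in note 17.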

Tyler Hampton is an independent researcher in Pineville, Kentucky.

Copyright © Inference 2024

ISSN #2576–4403