Progress Overlooked

Pedro Tiago Martins

To the editors:

After warning against using stories, as opposed to testable hypotheses, to describe the evolution of language, Robert Berwick and Noam Chomsky state that the human language faculty is a species-specific property. This much we know. There is no evidence to the contrary, and most likely it will remain that way; only people have language. But Berwick and Chomsky go on to say:

There are no significant analogues or homologues to the human language faculty in other species. The notion of a species-specific biological trait is itself unremarkable. Species-specific traits are essential to the very definition of a species, at least for multicellular animals requiring reproductive isolation, and species specificity is both widespread and expected according to conventional evolutionary theory.

This is misleading, for two reasons. The first is that there are no set criteria for the definition of a species, and in this regard, there is no “conventional evolutionary theory.” John Wilkins describes twenty-six ways of defining a species.¹ Of these, maybe a couple would partially correspond to the criterion Berwick and Chomsky present as essential. Most likely, there is no way of clearly defining a species, as Julian Huxley pointed out in the 1940s.² This example illustrates the style of rhetoric that Berwick and Chomsky employ in Why Only Us.³ It would be misleading to pass over such statements as trivial.

Language is a complex trait. We are seeking to understand the biology underlying language and not the general notion of having it as a capacity. The biology is what the field of language evolution is concerned with, and in that sense, it can be seen as a subsidiary of comparative biology, aided by other disciplines. Again, that only humans have language is circumstantially self-evident. We already know what a phylogenetic tree with nodes labeled “language” and “no language” would look like. The lack of homologues and analogues that Berwick and Chomsky refer to does not hold once one leaves the surface, below which homologues and analogues do exist, sometimes quite deeply.⁴

Berwick and Chomsky shun the idea that they reduce language to a single trait, “the Basic Property” (BP)—or recursion, after the famous paper by Marc Hauser, Chomsky, and Tecumseh Fitch in 2002.⁵ They claim that critics making this argument, such as Cedric Boeckx in a review of Why Only Us published in this journal, are mistaken.⁶ Instead, Berwick and Chomsky describe “the BP as a Basic Property, one among others [emphasis added].” Yet, they go on to talk about the BP and relate all work that they cite to it. It is, moreover, easy to notice that whenever they mention anything else that does not address the BP, they invariably conclude by sweeping any progress being made in that area under the rug of externalization. In the same vein, they attribute any progress toward answering the question of language evolution to the refinement of generative linguistics in the 1990s, made possible by Chomsky’s Minimalist Program, which gave pride of place to the Merge operation, which yields the BP.⁷ While many linguists do recognize the virtues of minimalism compared to previous iterations of generative grammar theory, to present it as the one thing that was missing in order for sensible theories of language evolution to be devised seems excessive.

Eric Lenneberg, as Berwick and Chomsky mention, realized in the 1960s that linguistic theory was not yet adequate to address certain biological questions. His concerns included the dangers of speculation on the basis of impoverished evidence, as in the following passage, quoted by Berwick and Chomsky:

Another recent practice is to give speculative accounts of just how, why, and when human language developed … Most speculations on the nature of the most primitive sounds, on the first discovery of their usefulness, on the reasons for the hypertrophy of the brain, or the consequences of a narrow pelvis are in vain. We can no longer reconstruct what the selection pressures were or in what order they came, because we know too little that is securely established by hard evidence about the ecological and social conditions of fossil man. Moreover, we do not even know what the targets of actual selection were. This is particularly troublesome because every genetic alteration brings about several changes at once, some of which must be quite incidental to the selective process.⁸

But in this regard, it is hard to see whether Berwick and Chomsky see progress in the intervening five decades. Lenneberg’s warning was important, but it is also true that we know much more today than we did back then. We can now reconstruct things that no one could even dream of in the 1960s. Genomics is a case in point. Genomic data must be used with caution, but the kind of hypotheses that can be derived are probably beyond Lenneberg’s wildest dreams. I am sure he would welcome them.

Berwick and Chomsky’s reductionist perspective prevents them from recognizing most work as progress. All the work they cite seems to suffer, in their view, from not relating to the BP. Consider the following passages (emphasis added in each instance):

“Recent work continues to point to the role played by FOXP2 in the sequential ordering of motor gestures, but without identifying its adaptive role in the origin of the BP.”
“For all that, the chasm between phenotype, algorithm, and neural implementation remains just that—a chasm. We do not yet understand the space of algorithms that might inform, or guide, the BP.”
“Birdsong and speech follow linear order rather than hierarchical structure and, for this reason, they are remote from the BP.”
“It remains true that we do not have a soup-to-nuts, or gene-to-neural-circuit-to-phenotype account for any trait of interest, let alone the BP.”

But even if one focuses on what allows for the BP, their single-mutation theory seems unwarranted. After providing an example of how a push-down stack can be achieved by rewiring a sequential processor, they write:

Detailed knowledge of axon guidance might prove useful in the future, but for the moment, we do not know whether or how shift registers are implemented in the brain, nor do we know much about the phenotypic changes that produce such changes in neuronal structure.

That is perfectly true. No one knows. Yet, subject to the same lack of knowledge as anyone else, Berwick and Chomsky assume that all it took for the BP to be instantiated was a slight rewiring of the brain:

Absent a more complete, concrete understanding of the space of genomic, developmental, and neurological possibilities, it is difficult to go beyond the phrase that we have so often adopted: the BP emerged by means of a slight rewiring of the brain.

It does not follow, out of necessity, that a single, simple change at the computational level corresponds to a single, simple change at the implementational level.⁹ This is tantamount to conflating computational and biological complexity.¹⁰ This view follows neither logically nor biologically. A recent paper has modeled the scenario proposed by Berwick and Chomsky’s Why Only Us and found no reason to prefer it over a multi-step hypothesis; it found quite the opposite.¹¹

No one really knows how language evolved; that is what the field of language evolution, made up of several disciplines, is devoted to investigating. Berwick and Chomsky’s proposal, which takes the BP as the be-all and end-all of language evolution, has not led to testable hypotheses. The particular story they put forward, human language arising from a sudden event resulting from a single mutation, has not borne any fruit. They have defended this position for some years now and use it as a measure of progress: if work on language evolution has nothing to say about the BP, they find progress to be negligible, or at least not really about what matters.

One day, we may indeed understand how language evolved. So far, the heavy lifting is being done in work that Berwick and Chomsky cite only to ignore. There are no indications that anything will change in that respect.

Pedro Tiago Martins

Robert Berwick and Noam Chomsky reply:

Burning academic straw generally serves only to stir up obscuring smoke, not clarifying fire. The letter from Pedro Tiago Martins is no exception. Martins believes we have overlooked progress; he provides alleged examples of many kinds. The only way to proceed is to go through them one by one, explaining why we see no substance to his critique.

Martins begins with a disquisition on the difficulty of defining species, responding to our observation that human language is a common human possession, with no known group differences and with “no significant analogues or homologues … in other species.” He challenges this observation only on one point: “[t]he lack of homologues and analogues that Berwick and Chomsky refer to does not hold once one leaves the surface, below which homologues and analogues do exist, sometimes quite deeply.” He refers to a study, one that concerns vocal learning in birdsong. But vocally learned birdsong is an analogue of speech, not of language. We discussed these matters carefully in Why Only Us, explaining why they are not relevant to our observation.¹²

Next, Martins objects to our correction of Cedric Boeckx, who mistakenly claimed that we reduce language to a single trait, the Basic Property. Martins’s objection is based on simple confusion. We never reduced language to the BP, as we explicitly say on many pages in Why Only Us.¹³ But in proceeding to discuss the evolution of language, we did focus on the BP rather than on anaphora, agreement, phase theory, cartography, and many other aspects of language. We did so for sound reasons. These other aspects of language are not yet understood well enough to allow the question of evolution to be raised seriously. In contrast, as we have shown, the BP is not only a core property of language, but, crucially, it is explained in terms of the simplest computational operation, Merge. The latter, furthermore, explains a number of fundamental properties of human language, solving longstanding puzzles. Accordingly, it makes perfect sense to focus discussion of the evolution of language on this property of the phenotype. And Martins makes no attempt to refute the four examples he cites that illustrate our focus on the BP.

Martins next points out that there has been progress in genomics, which he claims we ignore. We quite agree about the progress. We have indeed made use of it when it has been relevant to the study of language. One example is the new genomic evidence on the separation of human groups, which, as we discussed, corrects what we wrote in Why Only Us and provides additional evidence for the apparent suddenness, in evolutionary time, of the emergence of human language. Indeed, Martin Kuhlwilm and Boeckx himself have assembled a substantial catalog of modern human versus Neandertal and Denisovan variation in single “DNA letters” (i.e., single nucleotide polymorphisms), with an eye to determining functional associations between the DNA level and “what makes us human,” as these two authors put it.¹⁴

Rob DeSalle and Ian Tattersall’s letter in response to our essay points out that the genomic pickings have been slim. This is true. Still, we do know some things. While modern humans hung on to selectively advantageous genomic regions introgressed from interbreeding with Neandertals and their Denisovan cousins, adaptive for skin pigmentation or high-altitude life, the Neandertal and Denisovan DNA corresponding to the FOXP2 genomic region has been purged; it is not present in modern humans.¹⁵

Martins then claims that a reductionist perspective prevents us from recognizing progress, such as with regard to the role of FOXP2 “in the sequential ordering of motor gestures.” Its role is, again, for speech. But we know that this particular mode of externalization does not matter, because language can just as well use visual gestures or touch. Consequently, the entire line of work on birdsong, FOXP2, the lowered larynx in primates, and much more becomes of peripheral interest for the study of language. Knowing what the phenotype is comes first. As we discuss in Why Only Us, there is strong linguistic evidence that externalization is not part of the core system that generates semantic and pragmatic interpretations of expressions. Rather, it is a peripheral system, much like a printer attached to a computer, an amalgam of two unrelated systems: core I-language and a sensorimotor system. Much of what is central to externalization appears to be generated by the output system, not strictly speaking by a property of language itself. Flattening or linearizing structures in the mind into a string of spoken words is required for the articulatory and auditory system, but the requirement is relaxed for signs because of the extra dimensions of visual space. We can sympathize with the unhappiness of Martins at the collapse of what had seemed to be an archetypal example, but that happens to be the way the apples fall from the tree.

Martins further objects to our single-mutation theory. But we have not, in fact, proposed this. While one place in Why Only Us tentatively advances it as “the simplest assumption,” this claim is tempered in several other places.¹⁶ We do suggest that the BP might have arisen from “a slight rewiring of the brain.” This is a reasonable supposition and one that lines up naturally with DNA changes in relation to axon guidance systems of the sort discussed by Kuhlwilm and Boeckx, about which Martins has nothing to say.

Martins finally turns to his pièce de résistance, a paper coauthored with Boeckx that offers a multi-step hypothesis for the development of Merge. Unfortunately, this paper is simply a mélange of confusions and misunderstandings. The primary problem is that the hypothesis does not seem to grasp the formulation of Merge. It claims the impossible: that there can be something called “half Merge,” that is, half of binary set formation. Mathematically, there can be no halfway point to binary set formation. It either exists or it does not.

Second, the paper attempts to implement a different linguistic theory, one that breaks apart into multiple steps, resorting to the Chomsky hierarchy. It claims that a simpler operation than Merge exists, at an intermediate level of the hierarchy below Merge. This operation apparently evolved before the more complex computational machinery involved in the next step. This hierarchy has to do with the strings generated by rewriting systems, or semi-Thue systems, which can be traced back to the fundamental ideas of Emil Post. But this track is wrong. Merge does not fall along the hierarchy in this way; it does not properly belong in the hierarchy at all. Merge applies to structures that have no linear order. In contrast, the hierarchy generates strings with linear order.

The final claim made by Martins is that “[s]o far, the heavy lifting is being done in work that Berwick and Chomsky cite only to ignore.” Actually, we reviewed and provided whatever heavy lifting we knew about that was relevant to the questions we addressed. If Martins has some specific suggestions about what we omitted—he offers none—we will be more than glad to consider them.
