Invariant and Variable Properties

David Lightfoot

To the editors:

Noam Chomsky sketches ways in which, over several decades, the Generative Enterprise has addressed the Galilean Challenge and made progress in our understanding of language that would have been hard to imagine not long ago.¹ The enterprise, now mutated into the Biolinguistic Program, reflects the work of many people from many countries analyzing many very different grammars. A grammar is what we used to call the formal, generative system that characterizes a person’s mature language faculty, which is represented in an individual’s mind/brain; this is now often referred to as an internal, individual or I-language.

Rich, invariant principles were discovered, often in response to arguments from the poverty of the stimulus; such principles, defined universally, bridged the gap between information conveyed by a child’s typically very limited experience and the rich structures that characterize the mature grammar. Those invariant principles, it was postulated, were available to children through their biology, as attributes of their genetic material. The principles explained how simple experiences could trigger rich structures in the biological grammars of some form of Japanese or of Javanese.

Over the last two decades, under the Minimalist Program, linguists have sought to simplify the principles, minimizing the information they embodied. Part of the motivation is a commitment to the simplicity that was the gift to science of William of Occam and part is to provide a plausible biological account whereby we might attribute the evolution of the language faculty in the species to a single mutation. Invariant computational operations of Project and (internal and external) Merge build hierarchical structures bottom-up, which suffice for all languages. A repeatable operation assembles two syntactic elements a and b into a unit, which may, in turn, be merged with another element to form another phrase, and so on. That universal, invariant property raises the prospect that the option of Merge was the mutation that made language and thought possible for Homo sapiens.²
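As a rough illustration only, the repeatable combinatory step described above can be sketched in a few lines of Python. This is a toy under stated assumptions, not a linguistic formalism: the names merge and depth are hypothetical, nested tuples stand in for syntactic units, and the sketch ignores Project, labeling, and the distinction between internal and external Merge.

```python
# Toy sketch: a syntactic object is either a word (a string) or a pair
# produced by merge. Tuples are an assumption made for illustration.

def merge(a, b):
    """Assemble two syntactic objects into a single unit."""
    return (a, b)

def depth(obj):
    """Count the levels of hierarchical structure in a syntactic object."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(part) for part in obj)

# Build a structure bottom-up by applying merge repeatedly.
vp = merge("visited", "Washington")
clause = merge("Kim", merge("has", vp))

print(clause)         # ('Kim', ('has', ('visited', 'Washington')))
print(depth(clause))  # 3
```

The point of the sketch is only that one binary operation, applied to its own output, yields hierarchy of unbounded depth.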

Thinking in terms of hierarchical structures resulting from minimalist computational operations has also facilitated remarkable neuroscientific work linking brain activity to the structural units underlying language and thought in novel ways. Nai Ding et al. showed that when people listen to connected speech, cortical activity of different timescales tracks the time course of abstract structures at different hierarchical levels, such as words, phrases, and sentences.³ Results indicate that “a hierarchy of neural processing timescales underlies grammar-based internal construction of hierarchical linguistic structure.”⁴

Minimalist ideas about hierarchical structures being formed by multiple applications of Merge not only help us think differently about the evolution of the language faculty and of thought in the species and enable new neuroscientific work, but they have also facilitated new approaches to the acquisition of language by children. The hierarchical structures formed by multiple applications of Merge constitute the means by which people, including very young children, begin to analyze and parse what they hear—the key component of the acquisition process. Work has shown repeatedly that children rely on the tools provided by their biology and learn much from very little experience.

Work has examined language acquisition by children exposed only to unusually restricted data, much of the work focusing on the acquisition of signed systems. A striking fact is that ninety percent of deaf children are born to hearing parents, who are normally not experienced in using signed systems and often learn a primitive kind of pidgin to permit some communication. In such contexts, children surpass their models readily and dramatically and develop effectively normal mature capacities, despite the limitations of their parents’ signing.⁵

This is not surprising in light of studies of creoles more generally, and of new languages beyond creoles, which show that children exposed to very limited experiences go well beyond their models in quickly developing the first instances of rich, new I-languages.⁶ Not much is needed for a rich capacity to emerge, as demonstrated by many of the contributors to the 2013 volume edited by Massimo Piattelli-Palmarini and Robert Berwick.⁷

Extraordinary events have cast new light on these matters: the birth of new languages in Nicaragua and in the Bedouin community. In Nicaragua the Somoza dictatorship treated the deaf as sub-human and barred them from congregating. Consequently, deaf children were raised mostly at home, had no exposure to fluent signers or to a language community, were isolated from each other, and had access only to home-signs and gestures. In 1979, the Sandinistas took over the government and provided a school where the deaf could congregate, which quickly enrolled four hundred deaf children. Initially the goal was to have them learn spoken Spanish through lip reading and finger spelling, but this was not successful. Instead, the schoolyard, streets, and school buses provided good vehicles for communication, and the students combined gestures and home-signs to create first a pidgin-like system, then a kind of productive creole, and eventually their own language, Nicaraguan Sign Language. The creation of a new language community took place over only a few decades. This may be the first time that linguists have witnessed the birth of a new language, and they were able to analyze it and its development in detail. Judy Kegl et al. provide a good general account and Ann Senghas et al. examine one striking development, whereby certain signs proved not to be acquirable by children and were eliminated from the emerging language.⁸

Wendy Sandler et al. discuss the birth of another sign language among the Bedouin community, which has arisen in ways similar to Nicaraguan Sign Language and was discovered at about the same time.⁹ These two discoveries have provided natural laboratories to study the capacity of children exposed to unusually limited linguistic experience to go far beyond their models and to attain more or less normal mature I-languages.

If successful language acquisition may be triggered by exposure only to very restricted data, then perhaps children learn only from simple expressions. They only need to hear simple expressions, because there is nothing new to be learned from complex ones. This is degree-0 learnability, which hypothesizes that children need access only to unembedded material.¹⁰ Such a restriction would explain why many languages manifest computational operations in simple, unembedded clauses that do not appear in embedded clauses (e.g., English subject-inversion sentences like “Has Kim visited Washington?” but not comparable embedded clauses like *“I wonder whether has Kim visited Washington”), but no language manifests the reverse (operations that appear only in embedded clauses and not in matrix clauses). One explanation for this striking asymmetry is that children do not learn from embedded domains. Therefore, much that children hear has no consequences for the developing I-language; nothing complex triggers any aspect of I-languages.

Postulating hierarchical linguistic structures formed by a simple Merge operation has yielded new understanding of the invariant properties of language and has generated an immensely fruitful research program, bringing explanatory depth to a wide range of phenomena, indeed discovering a huge range of properties (as Chomsky notes and I have reinforced here). However, a hallmark of human language, alongside its invariant properties, is its variation. The environmentally induced variation that one finds in language is biologically unusual, not seen in other species or in other areas of human cognition, and requires a biologically coherent treatment. Children attain significantly different internal languages, depending on whether they are raised in contexts using some form of Swedish or a kind of Vietnamese. English-speakers in seventeenth-century London typically acquired different grammars from those acquired three generations earlier. Furthermore, people speak differently depending on their class background, their geography, their interlocutors, their mood, their alcohol consumption, and other factors.

For a good biological understanding, variation in grammars, I-languages, needs to take its place among other types of variation. This is an area where we have made much less progress than with invariant properties; there needs to be new thinking. Chomsky initiated the Principles and Parameters approach in 1981, seeking to find a Universal Grammar with both invariant principles and a set of formal parameters that children were thought to set on exposure to Primary Linguistic Data.¹¹ For four decades, linguists have been postulating parameters, usually binary (either structure a or structure b), but no general theory has emerged. Minimalists, set on reducing the complexities of the invariant principles that had emerged by the mid-nineties, have not devoted equivalent efforts to minimizing the complexities of parameters or to giving an account of how parameter settings might be acquired by young children.

The difficulty is aggravated by the absence of an adequate account of which Primary Linguistic Data set which parameters. It is often supposed that children evaluate candidate grammars by checking their generative capacity against a global corpus of data experienced, converging on the grammar that best generates all the data stored in the memory bank. But that entails huge feasibility problems.

Even considering a small number of parameters, the problems become clear. If the parameters are independent of each other, forty binary parameters would entail over a trillion possible grammars, each generating an infinite number of structures. Parameters, of course, are not independent of each other and therefore the number of grammars to be evaluated might be somewhat smaller. On the other hand, Giuseppe Longobardi et al. postulate fifty-six binary parameters just for the analysis of noun phrases in Indo-European languages, which would suggest much larger numbers. On any account, the relevant numbers are astronomical.¹²
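The arithmetic behind these figures is easy to check. The short Python sketch below assumes, as the paragraph does for the sake of argument, that the parameters are binary and fully independent, so that n parameters define 2^n candidate grammars; the function name grammar_count is hypothetical and chosen only for illustration.

```python
def grammar_count(n_binary_parameters: int) -> int:
    """Number of distinct grammars if every parameter is binary and independent."""
    return 2 ** n_binary_parameters

# Forty independent binary parameters: just over a trillion grammars.
print(f"{grammar_count(40):,}")  # 1,099,511,627,776

# Fifty-six independent binary parameters: roughly 7.2 * 10**16 grammars.
print(f"{grammar_count(56):,}")  # 72,057,594,037,927,936
```

Dependencies among parameters would reduce these counts, but each added independent binary parameter doubles the search space.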

There are other conceptual problems with viewing children as evaluating the generative capacity of numerous grammars and setting binary parameters accordingly. It may be opportune to consider other approaches.

I have proposed that rather than evaluating the generative capacity of grammars and setting formal parameters, children are born to parse, endowed with the tools provided by a restrictive and minimalist Universal Grammar.¹³ They parse what they hear and use the structures necessary to do so, thereby discovering them, in principle one by one. Those structures are part of the child’s emerging I-language and in aggregate they constitute the mature I-language. Such an approach enables us to understand how children develop their internal system and how those systems may change from one generation to another, as revealed by work on historical change in syntactic systems. After all, all syntactic variation must originate in change.

In short, a common approach is to think of I-languages as consisting of invariant properties and a set of formal parameter settings; children evaluate numerous grammars against a corpus of data experienced and flick on/off switches on binary parameters. A better alternative might be to think of internal languages as consisting of invariant properties over a certain domain plus supplementary structures that are not invariant but required in order to parse what children hear, consistent with the invariant properties. Put differently, children parse the external language they hear and postulate specific I-language elements required for certain aspects of the parse. To do so, they make use of what UG makes available, notably the bottom-up procedures of Project and Merge. The aggregation of those elements constitutes the complete I-language. When external language shifts, children may parse differently and thus attain a new I-language, as revealed in work on syntactic change. Children discover variable properties of their I-languages through parsing with the available hierarchical structures; there is no evaluation of I-languages and no binary parameters.

This approach allows a good understanding of why English has adopted so many structural innovations not shared by its closest historical relatives. Two examples briefly: first, in the early sixteenth century a change was completed whereby, after the loss of very rich verbal morphology, a set of preterite-present verbs came to be categorized as Inflection elements with a very different syntax from verbs, unlike in any other European language. Second, with the loss of case morphology on nouns, a set of forty or so psych-verbs underwent a kind of reversal of meaning and acquired a new syntax, the verb like changing from “please” to “enjoy,” the verb repent changing from “cause sorrow” to “feel sorrow,” and a Theme subject changing to a Patient with both verbs. With the loss of morphology, children came to parse expressions differently, assigning different structures. Hence they acquired new I-languages that entail the new structures and semantics of these psych-verbs. It is hard to see how an explanation could be provided if children are evaluating multiple grammars and setting formal parameters.

This is a brief account, probably too brief, but it is time for an alternative to parameter setting as an approach to variable properties that matches our success with invariant properties. I suggest that, rather than being parameterized, UG is open.¹⁴

David Lightfoot
