Galileo’s Other Challenge

Hajime Hoji

To the editors:

Chomsky’s conception of the language faculty, as described in “The Galilean Challenge,” is that its core property is the simplest computational operation (termed Merge) of forming a set out of two objects, which results in a purely hierarchical organization of its building blocks.¹ In the technical discussion, the purely hierarchical representation is called the LF representation and the basic structural relation is called c-command.² Chomsky considers the linear-precedence relation, necessitated by the need to externalize the purely hierarchical representation for the access of the sensorimotor system, to be of secondary importance and communication to be of tertiary importance to the language faculty. This conception, Chomsky suggests, provides a straightforward explanation for the universally observed property of structure dependence as an innate property of the language faculty. It is also in harmony with, and hence lends support to, two theses: that the language faculty emerged quite abruptly, and that it is unique to the human species.³

Chomsky offers a reasonable account for the above-mentioned theses, and I am inclined to agree. One may, however, wonder if it is possible to put the claim in question to a rigorous empirical test. If by empirical test we mean the testing of a definite prediction by experiment or observation, it is not immediately clear how we can do so. A more basic question is how we can put hypotheses about the language faculty to a rigorous empirical test at all. In what follows, I would like to suggest how we can do so under Chomsky’s conception of the language faculty.⁴

Detecting the Effects of LF c-command

The idea that Merge is the only structure-building operation in the Computational System of the language faculty goes back to Chomsky in 1993, and the general idea that the LF is the basis of meaning, as it reflects the properties of the Computational System of the language faculty, has a longer history in the generative tradition.⁵ These ideas lead us to expect to detect LF c-command effects. But they do not tell us, by themselves, how such effects can be detected. We must formulate hypotheses, in terms of these ideas, so as to be able to deduce definite and testable predictions. To the extent that we successfully detect the effects of LF c-command precisely in accordance with our predictions, we obtain empirical evidence for the conception of the language faculty in question.

A major task for the language faculty scientist is, therefore, to design an experiment for detecting the effects of LF c-command. It took researchers many years to compute the exact consequences of Einstein’s General Relativity with respect to gravitational waves in a testable form, and it took them even longer to design and build a device that was able to detect gravitational waves in accordance with Einstein’s predictions.⁶ A successful detection of LF c-command effects likewise requires that we first predict observable effects of LF c-command, then design and conduct an experiment to test the predictions, while constantly trying to enhance the precision of the experimental device.

There is no a priori answer as to what form such an observable effect might take. Whatever form it might take (i.e., whatever we might decide to consider as evidence for or against our hypotheses about the language faculty), the definite effects of LF c-command would have to be predicted by a set of hypotheses about the language faculty and those that relate such observable effects to the hypothesized properties of the language faculty. Given the working hypothesis that the language faculty (more precisely, its steady state) underlies our ability to relate linguistic sounds/signs (henceforth “sounds,” to make the exposition simpler) and meaning,⁷ it may be a reasonable strategy to take an informant’s introspective judgments about the relation between sounds and meaning as evidence for or against our hypotheses about the language faculty.

Two things should be understood clearly. First, the language faculty is internal to an individual’s mind/brain. Our predictions must therefore be about an individual. If we consider an informant’s introspective judgments on the relation between sounds and meaning as evidence for or against our hypotheses about the language faculty, we must be making predictions about an individual informant’s judgments. Second, we should make our predictions as definite as possible so as to be able to compare them with observational/experimental results as definitively as possible. We should therefore be making definite predictions about an individual informant’s judgments.

In what follows, I will try to articulate a conceptual basis for answering the following two closely related questions: first, how we can deduce a definite prediction about an individual informant’s linguistic judgments, and second, how we can aspire to obtain an experimental result precisely in accordance with such definite predictions in a reproducible manner. The first question is related not only to how we deduce definite predictions, but also to what definite experimental results we can expect to obtain. The second question concerns the nature of reproducibility when we deal with an individual informant’s introspective judgments.

The answers to these questions, to be outlined below, are based on Chomsky’s conception of the language faculty: its core properties are quite simple and universal, and observed differences among speakers, within a linguistic community or across linguistic communities, are due to externalization. It is hypothesized that in its initial state, the language faculty, the genetic endowment that underlies our ability to relate sounds and meaning, is uniform across the members of the species and that, in its steady state where its non-trivial growth has stopped, it varies in accordance with one’s linguistic experience, within the limit imposed by the genetic endowment. The language faculty in its steady state was termed “I-language” by Chomsky in 1986.⁸ Given that externalization results in differences among the different I-languages, within a linguistic community and across linguistic communities, it comes as no surprise that the informants’ introspective judgments on the acceptability of a sentence under a specified interpretation may exhibit a great deal of variation and fluctuation even within a single linguistic community.

This state of affairs has given rise to the concern that the informants’ introspective judgments do not form a reliable source of evidence for or against our hypotheses about the language faculty. One should thus naturally wonder how it is possible to obtain a definite experimental result in accordance with a definite prediction about an individual informant’s introspective judgments in a reproducible manner, within a single informant, within a linguistic community, and across linguistic communities.

One’s I-language is a result of modification of the initial state of the language faculty through one’s linguistic experience. Our definite predictions about an informant’s introspective judgments must, therefore, be based on hypotheses about the individual I-language, which must retain much of the initial state of the language faculty. The key to answering the two questions mentioned above is how we can extract the effects of the initial state of the language faculty from an individual informant’s introspective judgments, when the latter must be based not only on the initial state of the language faculty, but also on the properties acquired through one’s linguistic experience. It is by controlling the effects of the various factors due to externalization that we can consider an individual informant’s introspective judgments as a reflection of the universal properties of the language faculty. This is what makes it possible for us to aspire to pursue reproducibility among speakers of different I-languages.⁹

In order to detect the effects of LF c-command, we must consider how the c-command relation between two LF objects can be identified through something that the informants can observe and detect. Since we cannot directly observe the c-command relation among LF objects, we must put forth a hypothesis about the c-command relation at LF between what corresponds to two linguistic expressions in a given phonetic sequence, such as a sentence, along with a hypothesis about what kind of interpretation holding between the two linguistic expressions requires an LF c-command relation. For the purpose of pursuing testability, we therefore must make crucial reference to Chomsky’s 1993 model (or its variant) of the Computational System (CS), according to which the generative procedure of the CS of the language faculty takes a set of “lexical items” as its input, and the structure-building operation of Merge gives rise to two output representations, one of which serves as a basis for meaning and the other as a basis for what the sensorimotor system accesses. If we focus on the particular sensorimotor system that makes crucial use of sounds, as we do in our discussion here, the two output representations are, following Chomsky’s 1993 model of the CS, the LF representation and the PF representation.

We thus see that the experimental detection of the effects of LF c-command necessarily requires that we consider a phonetic sequence of linguistic expressions, that we focus on a certain interpretive relation between two linguistic expressions therein, and that we pursue the idea that the (un)availability of the interpretive relation that the informant can detect tells us about the presence or the absence of the c-command relation at LF between the two LF objects corresponding to the two expressions.

Consider the sequence of linguistic expressions as schematized in (1), where “…” represents any sequence of linguistic expressions, including nothing.

(1) … A … B …

Let us refer to the interpretive relation holding between A and B as “R(A, B).” The relevant consequence of the hypotheses in question should be such that R(A, B) is possible only if what corresponds to A at LF (let us represent it as “LF(A)”) c-commands what corresponds to B at LF (let us represent it as “LF(B)”).

For any sequence of linguistic expressions that includes A and B, the LF representation corresponding to it should be such that either LF(A) c-commands LF(B) or it does not, provided that the CS indeed generates an LF representation corresponding to the sequence. For the detection of the effects of LF c-command, therefore, we must have hypotheses that have the consequence of determining whether LF(A) c-commands LF(B) in the LF representation corresponding to a given sequence of linguistic expressions that includes A and B. Given the definition of “c-command” (see endnote 2), a sequence such as (1) must be understood as in (2), for example, so that we can express LF(A) c-commanding LF(B) as a consequence of our hypotheses that LF(A) is Merged with LF(C).

(2) … A [_C … B … ] …

This line of thinking leads us to consider a sentence in our experiment as an instantiation of a schema, as in (2), for example.
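The relation between Merge and c-command appealed to here can be illustrated with a minimal Python sketch. It assumes a simplified working definition (x c-commands y if x has been Merged with an object that is, or contains, y), which may differ in detail from the definition given in endnote 2. The function names and the labels "A", "B", and "V" are hypothetical placeholders, not part of any established formalism.

```python
# A minimal sketch, assuming Merge is plain set formation and that
# c-command means "Merged with something that is or contains the target".
# All names here are illustrative assumptions.

def merge(x, y):
    """Form the unordered set {x, y}; no linear order is encoded."""
    return frozenset({x, y})

def contains(obj, target):
    """True if target is obj itself or occurs anywhere inside obj."""
    if obj == target:
        return True
    if isinstance(obj, frozenset):
        return any(contains(part, target) for part in obj)
    return False

def c_commands(root, x, y):
    """True if, somewhere in root, x is Merged with an object containing y."""
    if not isinstance(root, frozenset):
        return False
    if x in root:
        sisters = [part for part in root if part != x]
        if any(contains(sister, y) for sister in sisters):
            return True
    return any(c_commands(part, x, y) for part in root)

# Schema (2): A is Merged with C, and C contains B.
c = merge("V", "B")
root = merge("A", c)

a_over_b = c_commands(root, "A", "B")    # A c-commands B
b_over_a = c_commands(root, "B", "A")    # B does not c-command A
order_free = merge("A", c) == merge(c, "A")  # sets carry no precedence
```

Because the objects are sets, the same hierarchical object results whichever way the two arguments are written, which is the sense in which linear precedence is external to the structure.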

Let us now recognize that it is not possible to assign a numerical value to the degree of acceptability of a sentence. We can see this by asking ourselves how acceptable we can confidently judge a given sentence to be (under a specified interpretation). Depending upon the sentence (and the specified interpretation), we can perhaps say, with some degree of confidence, that the sentence is not acceptable at all or that it is fully acceptable. But it is highly doubtful that we can say, with any degree of confidence, that the sentence is, for example, 76.5% acceptable under the specified interpretation. Our predictions, therefore, should not be in terms of a numerical value as long as we take the individual informant’s introspective judgments as evidence for or against our hypotheses about the language faculty and try to deduce and test definite predictions about them.

One might think that our definite prediction about an individual informant’s judgment can be either “not acceptable at all” or “fully acceptable.” It turns out, however, that we can only predict complete unacceptability or the lack thereof.

There are two types of schemata, with regard to R(A, B). One type is such that one of the consequences of our hypotheses is that there is an LF representation corresponding to it where LF(A) c-commands LF(B). The other type is such that there is no LF representation corresponding to the schema in question where LF(A) c-commands LF(B), according to our hypotheses. Let us refer to the first type as “^okSchema” and the second as “*Schema.”

Once we have recognized that our predictions are about a schema, and that each schema represents an infinite number of sentences, we come to understand the following. In the context of our current discussion, the “clearly unacceptable” judgment should be due to the predicted absence of LF c-command, and if a particular sentence is predicted to be “clearly unacceptable,” it is so predicted regardless of how simple or complex and how natural or unnatural the sentence might be. The “clearly unacceptable” judgment on a sentence that is extremely complex or extremely unnatural might well be due to factors that are quite independent of the LF c-command relation in question. We should, therefore, try to make the sentence instantiating the *Schema in question as simple and as natural as possible. If the hypotheses that have led to the *Schema are valid, the sentence should still be judged completely unacceptable under the specified dependency interpretation. The acceptability status of the sentences instantiating a *Schema is thus not affected by the fact that there is no limit to the length and the complexity of a sentence that instantiates a schema. When it comes to an ^okSchema, on the other hand, we cannot guarantee the full acceptability of every sentence that instantiates it, because it is always possible to construct an instantiating sentence that is extremely complex and/or extremely unnatural, making it highly unlikely that the sentence will be judged fully acceptable.

We are thus led to the fundamental asymmetry between the two types of predictions.

(3) The fundamental schematic asymmetry:
1. The *Schema-based prediction: every sentence that instantiates a *Schema is completely unacceptable under the specified dependency interpretation.
2. The ^okSchema-based prediction: it is not the case that every sentence that instantiates an ^okSchema is completely unacceptable under the specified dependency interpretation. (Hence, some sentences that instantiate the ^okSchema are acceptable at least to some extent under the specified dependency interpretation.)
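The logical shape of the two predictions above can be sketched in Python. This is only an illustration under stated assumptions: each judgment is coded as a boolean (True if the informant finds the specified interpretation available to any extent, False if completely unacceptable), and the function names and sample judgments are hypothetical placeholders rather than experimental results.

```python
# A minimal sketch of the asymmetry between the two prediction types.
# Judgments are categorical booleans, not numerical acceptability scores:
#   True  = interpretation judged available at least to some extent
#   False = judged completely unacceptable under that interpretation

def star_prediction_holds(judgments):
    """Universal claim: every *Schema sentence is completely unacceptable."""
    return len(judgments) > 0 and not any(judgments)

def ok_prediction_holds(judgments):
    """Existential claim: at least one ^okSchema sentence is acceptable."""
    return any(judgments)

def confirmed_asymmetry(star_judgments, ok_judgments):
    """Both predictions must hold for the same informant."""
    return star_prediction_holds(star_judgments) and ok_prediction_holds(ok_judgments)

# Hypothetical judgments from one informant (placeholders, not real data).
star_judgments = [False, False, False]
ok_judgments = [True, False, True]

result = confirmed_asymmetry(star_judgments, ok_judgments)

# A single accepted *Schema sentence disconfirms the universal prediction,
# whereas rejected ^okSchema sentences do not disconfirm the existential one.
disconfirmed = confirmed_asymmetry([False, True, False], ok_judgments)
```

The sketch makes the asymmetry explicit: the *Schema-based prediction can be disconfirmed by one judgment, while the ^okSchema-based prediction can only be confirmed by one.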

In Language Faculty Science (2015), I argued that what constitutes a fact in language faculty science is an experimental result in harmony with the two types of predictions in (3).¹⁰ Such a result is called a confirmed predicted schematic asymmetry. In the context of the present discussion, obtaining a confirmed predicted schematic asymmetry regarding R(A, B) in a reproducible manner would count as the detection of LF c-command effects. To exclude the possibility that the (un)availability of R(A, B) is due to the precedence relation (A preceding B), we make sure to consider an ^okSchema in which C precedes A as in (4), along with (2).¹¹

(4) … [_C … B … ] … A …

Facts and Hypotheses

One might think that we consider something like (5) as a hypothesis about the initial state of the language faculty.¹²

(5) R(A, B) is possible only if LF(A) c-commands LF(B), where A and B are particular linguistic expressions and R(A, B) is an interpretation pertaining to A and B.

This cannot be so because we should not be able to formulate a hypothesis about the initial state of the language faculty in terms of specific linguistic expressions. (5) should, therefore, be a consequence of a combination of a universal hypothesis that makes crucial reference to the c-command relation between two LF objects and some language-particular hypotheses about how R(A, B) can arise.

We can, for example, entertain a hypothesis about a formal relation at LF, which we may understand as an abstract object at LF, pertaining to two LF objects, a and b, such that its existence crucially depends upon a c-commanding b. The language faculty scientist should then be concerned with how this formal object manifests itself in an observable way (and what additional properties it may have). When judging the availability of R(A, B) in a given sentence containing A and B, the informant must be relating A and B to particular building blocks for a language of thought in the terms described by Chomsky in his essay. To the extent that a particular phonetic sequence can correspond to more than one building block (or more than one hierarchically-organized set of building blocks) and to the extent that R(A, B) can arise in more than one way,¹³ we cannot make a definite prediction about the availability of R(A, B) unless we can determine what choices of A and B make R(A, B) a good probe into the effects of LF c-command, independently of R(A, B), for a given informant at a given time.

One way to determine such choices of A and B is as follows. Suppose we consider another dependency interpretation pertaining to two linguistic expressions A’ and B, R2(A’, B), such that its availability is affected by the choice of B but not by the choice of A’. Suppose further that we can determine what choice of B makes R2(A’, B) a good probe for the detection of the effects of LF c-command for a given informant at a given time, by the criterion of what qualifies as a confirmed predicted schematic asymmetry. Suppose also that there is another dependency interpretation pertaining to two linguistic expressions A and B’, R3(A, B’), such that its availability is affected by the choice of A but not by the choice of B’, and that we can determine what choice of A makes R3(A, B’) a good probe for the detection of the effects of LF c-command for a given informant at a given time, by the same criterion. On the basis of the (un)availability of R2(A’, B) and that of R3(A, B’) in relation to various choices of B for R2(A’, B) and various choices of A for R3(A, B’), we can identify, for a given informant at a given time, the best choice of B and that of A for the purpose of detecting LF c-command effects. Notice that the good choices may differ among informants (and even within a single informant on different occasions, insofar as pragmatic considerations may affect how the informant treats a given expression with regard to the interpretation in question).

Our prediction is then that if we use the B for R2(A’, B) and the A for R3(A, B’) identified in this way, R(A, B) will likely be a good probe for the detection of the effects of LF c-command for the informant in question; that is, it will be able to arise only if LF(A) c-commands LF(B). That informant’s judgments on the (un)availability of R(A, B) in sentences that instantiate the *Schema and those that instantiate the ^okSchema will then be revealing about the validity of the hypotheses that have given rise to the predicted schematic asymmetry, including one about the formal object whose existence crucially depends on LF c-command.¹⁴
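The selection procedure just described can be sketched in Python for a single informant at a single time. This is a hedged illustration, not the experimental design itself: the data layout, the function names, and the labels "A1", "A2", "B1", and "B2" are hypothetical, and the judgments are invented placeholders (True means the interpretation was judged available to some extent).

```python
# A minimal sketch of a predicted correlation of schematic asymmetries.
# Each entry maps a candidate expression (or pair) to a tuple of
# (*Schema judgments, ^okSchema judgments) from one informant.

def confirmed(star, ok):
    """Confirmed asymmetry: no *Schema sentence accepted, some ^okSchema one is."""
    return len(star) > 0 and not any(star) and any(ok)

def good_choices(results):
    """Candidates for which the sub-experiment yields a confirmed asymmetry."""
    return {expr for expr, (star, ok) in results.items() if confirmed(star, ok)}

def correlation_holds(r2_by_b, r3_by_a, r_by_pair):
    """Check R(A, B) only for pairs whose A and B passed the sub-experiments."""
    good_b = good_choices(r2_by_b)   # choices of B validated via R2(A', B)
    good_a = good_choices(r3_by_a)   # choices of A validated via R3(A, B')
    tested = [(a, b) for (a, b) in r_by_pair if a in good_a and b in good_b]
    return all(confirmed(*r_by_pair[pair]) for pair in tested), tested

# Hypothetical placeholder judgments (not real data).
r2_by_b = {"B1": ([False, False], [True, True]),
           "B2": ([True, False], [True, True])}
r3_by_a = {"A1": ([False], [True]),
           "A2": ([True], [True])}
r_by_pair = {("A1", "B1"): ([False, False], [True, False]),
             ("A2", "B2"): ([True, False], [True, True])}

holds, tested = correlation_holds(r2_by_b, r3_by_a, r_by_pair)
```

In this sketch the pair ("A2", "B2") is set aside rather than counted as disconfirming evidence, because neither expression qualified as a good probe for this informant; only ("A1", "B1") bears on the prediction.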

The predictions in language faculty science, as described in a forthcoming publication, take the form of the predicted correlations of schematic asymmetries, rather than the form of predicted schematic asymmetries (see (3)).¹⁵ It is by focusing on the predicted correlations of schematic asymmetries that we can expect to be able to extract information about the initial state of the language faculty on the basis of an individual informant’s introspective judgments, which are necessarily tied to the informant’s specific I-language addressed in a given experiment. As Chomsky has emphasized over the years, linguistic phenomena are of interest to the language faculty scientist only when we have the hunch that they can serve as an effective probe into the properties of the language faculty, and more strictly speaking, only when we have a means to substantiate the hunch and test its validity.

Rigorous Testability

The language faculty is what is hypothesized to underlie our ability to relate linguistic sounds and meaning. Hypotheses in language faculty science are thus about what is hypothesized to exist. To deduce testable predictions, we must have hypotheses about some hypothesized objects pertaining to the language faculty. That, I understand, is the essential aspect of the abstract nature of theorizing in language faculty science that Chomsky has been stressing over the years.

What needs to be emphasized, especially in light of this abstract nature of the research program, is the importance of pursuing rigorous testability, as stressed by Richard Feynman, for example:

It does not make any difference how beautiful your guess is. It does not make any difference how smart you are, who made the guess, or what his name is—if it disagrees with the experiment, it is wrong. That’s all there is to it.¹⁶

Over the years, Chomsky has stressed the importance of pursuing theoretical unity and abstract theorizing. But he has not, as far as I am aware, directly addressed the importance of testability and reproducibility, especially since the early 1980s.

It is important to recognize again that we cannot measure or observe anything about the language faculty without hypotheses. It is, therefore, of the utmost importance that we establish what can be regarded, provisionally, as facts to be explained. Because of this inseparability of facts and hypotheses, the establishment of such facts requires that we put forth hypotheses and subject them to rigorous empirical testing. It has sometimes been asserted, in reference to works by Kuhn and Lakatos, for example, and in reference to physics, that the ultimate choice among alternative theories should not be determined by empirical considerations and that this should apply also to language faculty science.¹⁷ In physics, there is a vast accumulation of facts, many of which will remain facts even under (often radically) different ways of expressing them, as one can see by considering how much is empirically shared by, for example, the Newtonian conception of the universe and the Einsteinian conception.

What should be asked in relation to language faculty science is whether we have accumulated enough facts to the point where we can afford to deliberate on theory choices purely on the basis of theoretical considerations. People may have different answers here. But if we accept that what has to be established as a fact in language faculty science is a confirmed predicted correlation of schematic asymmetries, as outlined above, we should realize that we are still very much at a stage where we are trying to accumulate facts, with the clear proviso that the establishment of a fact requires hypotheses in language faculty science even at its earliest stages of development.¹⁸

Galileo is considered the father of modern science due to his recognition of the importance of the experimental/observational testing of hypotheses, which Feynman states as “The test of all knowledge is experiment. Experiment is the sole judge of scientific ‘truth.’ [emphasis original]”¹⁹ Galileo is also recognized for his realization and belief that nature follows what is expressible in terms of mathematical laws. Language faculty science has to proceed without the “service that mathematical physics may render us” in physics, as Henri Poincaré puts it, because our data, i.e., the individual informant’s introspective judgments, cannot be expressed in terms of numerical values of mathematical significance.²⁰ What has been suggested here is then a way to deduce and test definite predictions in language faculty science, to meet Galileo’s challenge, which is also Feynman’s, about the experimental and observational testing of the validity of our hypotheses.

As discussed, the detection of the effects of LF c-command requires that we formulate universal and language-particular hypotheses to make definite predictions and that we try to enhance the precision and reliability of our experimental device as much as possible. What corresponds in language faculty science to the observational device for detecting gravitational waves does not involve any physical equipment of significance, at least at the present time. It only involves the designing of an experiment on the basis of the hypotheses and assumptions. It is imperative that we formulate our hypotheses about the language faculty, design our experiment to test our predictions, and interpret the results, with as much checking and care as necessary.²¹

I have only been able to offer a rather abstract, and incomplete, sketch of how that can be done. But the preceding abstract sketch is based on actual hypotheses, their predictions, and a number of actual experiments that have been, and are currently being, conducted to test them. The points about the predicted schematic asymmetries are described in a 2015 book and those about predicted correlations of schematic asymmetries will be described in a forthcoming publication.²² The latter deals with R(A, B), R2(A’, B) and R3(A, B’) in Japanese, while the former deals with only R(A, B) but in both English and Japanese.

In those works, R(A, B) is a dependency interpretation between a singular-denoting expression (e.g., his in English) and a non-singular-denoting one (e.g., every boy in English) as in a sentence such as every boy praised his father. R2(A’, B) is a coreference relation. Japanese has expressions that serve the function of a third-person personal pronoun in English; one type must be used referentially while the other cannot be used referentially, apart from its deictic use, and the difference is made by the use of two distinct demonstrative prefixes. It would be like having two phonetically distinct he’s in English: one corresponding to He in He just left and the other corresponding to he in every boy thinks that he is a genius as expressing “every individual x that is a boy thinks that x is a genius.” Japanese phonetically distinguishes between the two while English does not.²³ This directly contributes to making R2(A’, B) a good probe for the detection of LF c-command in Japanese but not in English (unless we get into a more elaborate experimental design where we address the so-called sloppy identity²⁴). R3(A, B’) is a scope-dependency interpretation that necessarily involves a “distributive reading” between A and B’, as in every teacher praised three students when it is interpreted as expressing that each teacher in question praised a different set of three students. Such an interpretation is a subset of the EVERY>THREE readings as commonly discussed in the context of quantifier-scope discussion in relation to sentences like the one mentioned. We focus on the subset, and do not deal with SOME>EVERY, for example, to make R3(A, B’) a good probe for the detection of LF c-command.

The effectiveness of R(A, B) as a probe for the detection of LF c-command effects depends upon the choice of A and that of B, as noted above. The choice of B does not seem to have much impact in English with regard to the effectiveness of R(A, B). In Japanese, the choice of B greatly affects the effectiveness of R(A, B) and R2(A’, B), restricting ourselves to the expressions that cannot be used referentially. The choice of A seems to affect the effectiveness of R3(A, B’) and that of R(A, B) both in English and in Japanese.²⁵

When we conduct an experiment dealing with R(A, B), R2(A’, B) and R3(A, B’) and ask our informants to judge the availability of the relevant interpretation in a given sentence in question, it is crucial that we make sure that our informants understand what is meant by the interpretation in question and that they are paying close attention to our instructions. For each experiment on R(A, B), R2(A’, B) and R3(A, B’), we thus conduct sub-experiments to maximize the reliability of its result. As discussed above, the experiments on R2(A’, B) and those on R3(A, B’) are sub-experiments for the ones on R(A, B). It is by having a network of sub-experiments for our main experiment that we can expect the result of our main experiment to be as informative as possible with regard to what we want to find out.²⁶ In the absence of a physical experimental/observational device, that is how we try to enhance the precision of our experimental/observational device in language faculty science, at least at the moment.

In the methodology sketched above, we always consider whether we obtain a confirmed predicted schematic asymmetry, i.e., reproducible results of a combination of judgments of “complete impossibility” and the lack thereof; see (3). While what proves to be an effective probe for the detection of LF c-command effects may well differ among different informants, we expect uniform correlations of judgments in line with what is predicted by our hypotheses about the initial state of the language faculty. In order to successfully carry out such research, we must pay close attention to the properties of I-languages and “the complexity and variety of linguistic behavior” due to externalization.

Concluding Remarks

In Reflections on Language, Chomsky remarks that

it is not unreasonable to suppose that the study of this particular human achievement, the ability to speak and understand a human language, may serve as a suggestive model for inquiry into other domains of human competence and action that are not quite so amenable to direct investigation.²⁷

Here I have sketched one way to make it possible for the study of the language faculty to become a rigorous empirical science, thereby serving as such a model. As noted, a key to language faculty science as an exact science is that it aspires to deduce and test definite predictions about an individual, despite the widespread belief that it is not possible to do anything like that successfully. The hypothesized existence of the Computational System of the language faculty makes this possible. Chomsky’s conception has led us to pursue the detection of the effects of LF c-command in terms of categorical rather than numerical predictions. If other domains of the human mind/brain lack a similar Computational System, it may not be possible to pursue a scientific understanding of their subject matters, and they may remain a mystery, for a principled reason, as Chomsky pointed out at least as early as 1975.

In conclusion, I agree with Chomsky’s conception of the language faculty. Even if, as Chomsky suggests, the externalization of language for our sensorimotor systems is a peripheral aspect of the language faculty, externalization still has an indispensable role to play in a research program aspiring to discover the properties of the language faculty by following what Feynman termed the “Guess-Compute-Compare” method. This is widely known as the hypothetico-deductive method, but with a particular emphasis on making and testing definite predictions. Without externalization, e.g., without making reference to how the LF representations are paired with a particular phonetic sequence, we would not be able to conduct experiments, at least until the human species evolves to the point where we can test our predictions about the language faculty without recourse to externalization.²⁸ If externalization processes are “the locus of the complexity and variety of linguistic behavior,” as Chomsky suggests, it is by carefully controlling the factors that contribute to the complexity and variety of linguistic behavior that we can hope to attain a better understanding of how to detect the effects of the initial state of the language faculty, including LF c-command effects. Conversely, with an improved device for detecting such effects, we can hope to attain a better understanding of the properties of particular steady states of the language faculty that result from particular linguistic experience. The effective device for detecting LF c-command effects will then be serving a role very much like a device for detecting gravitational waves; it will allow us to observe what we cannot see.²⁹

Hajime Hoji
