
Biolinguistics and the human language faculty
In recent years linguists have gained new insight into human language capacities on the basis of results from linguistics and biology. The so-called biolinguistic enterprise aims to fill in the explanatory gap between language and biology, on both theoretical and experimental grounds, hoping to reach a deeper understanding of language as a phenomenon rooted in biology. This research program is taking its first steps, and it has already given rise to new insights on the human language capacity, as well as to controversies echoing debates that go back to the early days of generative grammar. The present discussion piece provides a high-level characterization of biolinguistics. It highlights the main articulations of this research program and points to recent studies linking language and biology. It also compares the biolinguistic program, as defined in Chomsky 2005 and Di Sciullo & Boeckx 2011, to the view of the human language faculty presented in Jackendoff 2002 and Culicover & Jackendoff 2005, and to the discussion in Jackendoff 2011.*
biolinguistics, biology, human language faculty, universal grammar, Merge, language acquisition, minimalist program, symmetry, asymmetries
‘The study of the biological basis for human language capacities may prove to be one of the most exciting frontiers of science in coming years.’
1. A high-level characterization of biolinguistics
Biolinguistics is the study of the biology of language. It aims to shed light on the biological nature of human language, focusing on foundational questions such as the following: What are the properties of the language phenotype? How does language ability grow and mature in individuals? How is language put to use? How is language implemented in the brain? What evolutionary processes led to human language? These questions have been on the agenda in generative grammar since its beginnings (Chomsky 1965, 1976, among others); the biolinguistic program brings them to the forefront.
How does one go about answering these questions? Let us take a simple and well-studied case. English has a rule (see discussion of Merge below) that can move an auxiliary verb to the beginning of the sentence in order to form a question.
(1) The child that is in the corner is happy.
(2) Is the child that is in the corner happy?
Interestingly, this rule cannot apply to the first is, as seen in 3.
(3) *Is the child that in the corner is happy?
Why is that? It appears that the rule is sensitive to the structure of the sentence. Note that the first is is embedded in the subject noun phrase the child that is in the corner.
(4) [The child that is in the corner] is happy.
The second is is not. The rule is somehow ‘structure-dependent’, even though one can easily imagine simpler rules such as one that just says ‘front the first is (or other auxiliary verb)’. This rule takes only the linear order of the words in the sentence into account and ignores the sentence structure, and it is therefore much simpler from a computational point of view.
This is not an isolated example, but turns out to reflect a deep-seated property of human language. Many other rules in English besides this rule of question formation share the property that they take into account the hierarchical structure of the sentence: that is, they are structure-dependent and cannot be formulated solely on the basis of linear order. Moreover, every language that has been studied in depth appears to have structure-dependent grammatical rules.
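To make the contrast concrete, the following minimal Python sketch (our own illustration, not drawn from the acquisition literature) applies both candidate rules to the example above; the bracketing of the subject noun phrase in 4 is simply stipulated by hand.

```python
# Toy contrast between a linear and a structure-dependent rule of
# auxiliary fronting. The bracketing of the subject noun phrase in (4)
# is stipulated by hand; 'is' is the only auxiliary considered.

sentence = ["the", "child", "that", "is", "in", "the", "corner", "is", "happy"]
subject_np = (0, 6)   # token span of [the child that is in the corner]

def front_linearly(tokens):
    """Structure-independent rule: front the first 'is'."""
    i = tokens.index("is")
    return [tokens[i]] + tokens[:i] + tokens[i + 1:]

def front_structurally(tokens, np_span):
    """Structure-dependent rule: front the first 'is' that is not
    embedded inside the subject noun phrase."""
    start, end = np_span
    for i, tok in enumerate(tokens):
        if tok == "is" and not (start <= i <= end):
            return [tokens[i]] + tokens[:i] + tokens[i + 1:]
    raise ValueError("no main-clause auxiliary found")

print(" ".join(front_linearly(sentence)))
# -> 'is the child that in the corner is happy'   (ungrammatical, cf. 3)
print(" ".join(front_structurally(sentence, subject_np)))
# -> 'is the child that is in the corner happy'   (grammatical, cf. 2)
```

Only the structure-dependent rule yields the grammatical question in 2; the computationally simpler linear rule yields the ungrammatical string in 3.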
So at a minimum, every language has (i) lexical items like child and happy, (ii) rules that combine phrases like the child that is in the corner and is happy, and (iii) rules like question formation that operate on sentence structures. We can ask how we can capture these properties in a computational system that is in some sense both simple and optimal. This system is a central component of what is called the language faculty.
Next we ask how the child learning language knows that the structure-dependent formulation of questions is the correct one, not the linear formulation in terms of the first is. Language acquisition studies show that children always choose the structure-dependent rule and do not use the structure-independent rule in error; hence, children are never corrected by their language communities on this aspect of question formation. Since there are no data available to children that would allow them to choose between the two formulations, this situation is referred to as the ‘poverty of the stimulus’. We conclude, then, that the fact that rules across languages are structure-dependent is part of our genetic endowment. ‘Structure-dependence’ is an example of what is referred to as universal grammar (UG).
However, there is also variation across languages. For example, the verb precedes the object in English, but follows it in Japanese. Some languages, like Italian, permit a subject pronoun not to be pronounced, so called pro(noun)-drop, while other languages, like French, do not. Learning a language has been compared to choosing from a menu. The child is born equipped with the general principles of UG, but certain choices (parameters) such as pro-drop are left open. The task for the language learner is to go through the menu of choices and pick the appropriate ones on the basis of the data presented.
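The menu metaphor can be rendered schematically. The toy fragment below is entirely our illustration; the parameter names and triggering configurations are invented. It shows a learner fixing two binary parameters, head-direction and pro-drop, from surface properties of the input.

```python
# Toy sketch of parameter setting: UG supplies the parameters; the
# learner fixes their values from properties of the input. The
# parameter names and triggering configurations are invented here.

def set_parameters(input_clauses):
    params = {"head_initial": None, "pro_drop": None}  # None = not yet set
    for clause in input_clauses:
        pairs = list(zip(clause, clause[1:]))
        if ("V", "O") in pairs:          # verb immediately precedes object
            params["head_initial"] = True
        elif ("O", "V") in pairs:        # object immediately precedes verb
            params["head_initial"] = False
        if clause and clause[0] == "V":  # finite clause, no overt subject
            params["pro_drop"] = True
    return params

# Schematic 'English'-type input: subject, verb, object
print(set_parameters([["S", "V", "O"]]))
# -> {'head_initial': True, 'pro_drop': None}

# Schematic 'Italian'-type input: subjectless finite clauses also occur
print(set_parameters([["V", "O"], ["S", "V", "O"]]))
# -> {'head_initial': True, 'pro_drop': True}
```

Actual models of parameter setting are far more articulated; the point here is only the logic of choosing among options left open by UG on the basis of the data presented.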
Finally, we can ask how the faculty of language evolved. What properties of language are unique, and which are shared with other cognitive systems? What properties have antecedents in other species? Did language evolve slowly or rapidly? What genes are involved?
We started off by examining English questions and, in turn, were led to very partial answers for the following questions:
• What is knowledge of language?
• How does the child acquire language?
• How does language evolve?
We saw that, at a minimum, knowledge of language includes a system of computation that computes such structures as Is the child that is in the corner happy?. Furthermore, some properties of these computations, such as structure-dependence, appear to be part of our genetic endowment. So children are able to acquire language by (i) accessing their UG and (ii) processing the input data in order to set the parameters for a specific language. Finally, we can inquire into the evolution of our genetic endowment for language by, for example, searching for and investigating genes associated with human language. In the following we provide a number of references for those who would like to pursue particular topics in more depth.
• Structure of the language faculty: Several works discuss the properties of the architecture and the operations of the language faculty from a biolinguistic perspective (Chomsky 1995, 2005, 2008, 2013, 2015a, b, Jenkins 2000, 2004, Hauser et al. 2002, Di Sciullo et al. 2010, Berwick et al. 2013, Boeckx & Grohmann 2013, Piattelli-Palmarini & Vitiello 2015, Berwick & Chomsky 2016, among others).
• Animal communication: Biolinguistic research also covers experimental studies aiming to understand what differentiates human language from animal communication (Fitch & Hauser 2004, Jarvis 2004, Friederici 2009, Fitch 2010, Berwick et al. 2012, Bolhuis & Everaert 2013, among others).
• Neuroscience: Results from neuroscience point to the special properties of the human brain for language (Embick et al. 2000, Moro et al. 2001, Friedrich & Friederici 2009, 2013, Friederici et al. 2011, Albertini et al. 2012, Blanco-Elorrieta & Pylkkänen 2015, Lewis et al. 2015, Magrassi et al. 2015, Zaccarella & Friederici 2015, Xiao et al. 2016, among others).
• The genetic basis of normal and impaired language development: Studies on genetically based language impairments also fall into the realm of the biology of language (Wexler 2003, Ross & Bever 2004, Bishop et al. 2005, Hancock & Bever 2013, among others). Models of language acquisition can be tested in normally developing children and in children with language disorders, as in the case of the KE family, discussed below, as well as in children with so-called specific language impairments (Bishop et al. 1995, Wexler 2003, Bishop & Snowling 2004, Di Sciullo & Agüero-Bautista 2008, Bishop 2015, Männel et al. 2015).
• Language variation: Language variation is another important area of biolinguistic research. While the properties of the language faculty are stable, variation is pervasive crosslinguistically. This is not surprising, given that language is a biological object and variation is a constant in the biological world (Lewontin 1974, 2000, Cavalli-Sforza & Feldman 1981, Hallgrimsson & Hall 2005, among others). The principles and parameters model (Chomsky 1981) gave rise to a systematic approach to language variation (Borer 1984, Rizzi 2000, 2009, Cinque & Kayne 2005, Biberauer 2008, Cinque & Rizzi 2010, among others). According to this model, linguistic variation arises from language acquisition and languages in contact, and follows from the setting of a limited set of options left open in UG.
• Language phylogeny: More recent models of parametric syntax opened new avenues for the understanding of language phylogeny (Bever 1981, Longobardi & Guardiano 2011, Longobardi et al. 2013). Yet other works address the question of why parameters emerge and why resetting of parameters occurs, as well as take into account the role of factors external to the language faculty in language variation (Longobardi & Roberts 2010, Di Sciullo 2011, 2012a, Di Sciullo & Somesfalean 2013, 2015, Biberauer et al. 2014). Some inferences about language evolution can be made on the basis of comparative studies with other species on both the anatomical level (Sherwood et al. 2003, Fitch 2010, among others) and the genetic level (Sun & Walsh 2006).
• Language and dynamic systems: While the poverty of the stimulus (Chomsky 2013) and the critical period (Stromswold 2007, 2008, 2010) point to the biological nature of language, theoretical approaches to language development stemming from works on dynamic systems and population genetics (Nowak et al. 2001, Niyogi 2006, Niyogi & Berwick 2009, among others) opened new horizons for the study of language variation. Other studies address interesting issues related to deterministic/probabilistic theories of language learning (Yang 2002, 2004a, b, 2008, 2011, 2013, 2015).
The topics and references provided above are by no means exhaustive. Nevertheless, they are indicative of the liveliness of biolinguistic research.
2. Biolinguistic investigations
We have already seen that both genetic endowment and experience play an important role in the growth of language in the individual. Chomsky noted that an additional factor is equally important, viz., ‘principles not specific to the faculty of language’ (2005:6).
The idea is that there may be external principles accounting for properties of the computational system of language that originate outside of the faculty of language, for example, in biology or physics. One such proposal is that there are principles of efficient computation. For example, the idea of principles reducing complexity has been part of the research agenda in the generative enterprise since the 1950s. Framed within biolinguistics, the principles of efficient computation are thought of as natural laws affecting the computation of the (narrow) language faculty (Chomsky 2005, 2011). They apply to syntactic derivations (no tampering condition, minimal search, phases) and to the externalization of the linguistic expressions at the sensorimotor (SM) interface (pronounce the minimum; Chomsky 2011) and at the conceptual-intentional (CI) interface (reference set (Reinhart 2006); local economy (Fox 1999)). One might also ask whether these principles relate to classical notions of complexity, including those of Kolmogorov 1965, and whether the more differentiated notions of internal and external complexity are needed (Di Sciullo 2012c, 2014).
Note that when one says that principles of efficient computation may come from outside the language faculty—for example, from other cognitive systems, from biology, or even physics—it must be understood that this is a part of a program of research. As we learn more about the conditions on computation internal to the language faculty, it might be found that these conditions are specific cases of more general laws. This holds true across all of the sciences. For example, ‘minimality’ principles have played an important role in the development of physics, although the terminology is different, for example, ‘principle of least action’. The law of refraction (Snell’s law), which is responsible for the bending of light when it passes from air into water and which is learned in high school, was originally an empirical observation. Later Fermat formulated it as a principle of least time, and a few more centuries passed before it was realized to be a special case of a least action principle in quantum physics. Other principles not specific to the faculty of language are principles such as symmetry, symmetry breaking, and asymmetry. These can often be analyzed mathematically (both quantitatively and qualitatively) with such concepts as symmetry groups, dynamical systems, (a)symmetrical relations, and so forth. See examples below as they apply to language.
Moreover, the unification between language, biology, and the other natural sciences is an important aspect of biolinguistics. The understanding of the world proceeds by solving smaller puzzles and in parallel trying to unify the answers. In this regard, principles of symmetry, symmetry breaking, and asymmetry may help to unify many areas of the sciences, as they are key concepts in biology, physics, and mathematics.
An example of unification in mathematics is the Erlangen program, initiated by Felix Klein in 1872, which classified geometries using the tools of group theory (Klein 2004 [1939]). In modern times we have the Langlands program, a body of mathematical conjectures, only a few of which have been proven, which seeks to unify apparently unrelated areas of mathematics (Gowers & Barrow-Green 2008). For example, the theory of elliptic curves (number theory) was shown to be connected to the theory of modular forms, as part of the proof of Fermat’s Last Theorem by Andrew Wiles and Richard Taylor (Singh 1998). As in the case of the Erlangen program, the Langlands program makes crucial use of the tools of symmetry theory (including representation theory), relying on the basic notions of symmetry and asymmetry. There are many other areas of mathematics in which symmetry plays an important role in understanding and unification. In physics, Maxwell’s theories of electricity and magnetism, along with the theory of light, were unified in his theory of electromagnetism. Quantum mechanics in turn unified atomic physics with chemistry.
The last frontier in unification is in biology. Of course, there had already been much unification. For example, it was shown that no vital force is necessary to describe the animate world. It was also shown that the same laws of biochemistry that held for the inanimate world could be extended to the organic world. And, of course, many physical principles carry over to the biological domain (such as conservation of energy) and are studied in the field of biophysics. Thus, principles relying on symmetry, symmetry breaking, and asymmetry, as well as other kinds of principles, may help to unify many areas of biology, including the systems of the brain involved in language (Di Sciullo et al. 2010, Jenkins 2013a, b).
In sum, biolinguistics relies on advances in theoretical linguistics, as well as on results from language acquisition and variation. However, it goes beyond linguistics, to biology, physics, and chemistry, and asks the question of why linguistic phenomena are the way they are. Conversely, results from biology, physics, and chemistry serve as an impetus for the development of biolinguistically grounded theories of the language faculty. Biolinguistics aims to close the explanatory gap between language and other areas of biology by seeking to discover principles that unify the fields.
In the following sections we discuss the three core aspects of biolinguistic investigation and point to recent studies linking language and biology. In the last section we compare two approaches to the human language faculty. We contrast the biolinguistic approach developed in Chomsky 2005 and Di Sciullo & Boeckx 2011 with the view in Jackendoff 2002 and Culicover & Jackendoff 2005, and we identify some differing points of view emerging from the discussion in Jackendoff 2011.
3. The three factors
Biolinguistic investigations explore the biological basis of language, language development in ontogeny and in phylogeny, and the effects of external efficiency principles on linguistic derivations. The following subsections provide further details on each of the three factors in language design.
3.1. Genetic endowment
FOXP2
The human capacity for language is part of the human genetic endowment; however, its genetic underpinnings are yet to be discovered. This can be seen in the work on the FOXP2 gene and its mutation in the KE family. FOXP2 was the first gene associated with a language disorder that could be analyzed at the molecular level, and it is probably the most studied (Marcus & Fisher 2003, Fisher & Marcus 2006). However, it is important to point out the usual caveat when discussing genes and language: ‘the’ gene for language does not exist. We now know that most genetic disorders can result from interactions among many different genes and regulatory elements.
When dealing with a genetic disorder, it is important that the phenotype for the disorder be characterized. In the case of FOXP2, a speech impairment was noted in a family (called the KE family) in which the patients had problems in a number of areas, including pronunciation, syntax, and semantics (Hurst et al. 1990). Additional studies of the phenotype were carried out, including on the difficulties in syntax/morphology (Gopnik & Crago 1991) and on the difficulties with articulation. It was found that some of the difficulties in articulation derived from problems with verbal sequencing (Alcock et al. 2000).
It was also found that the pattern of inheritance of the disorder was autosomal dominant, so that only one copy of the gene mutation was necessary to trigger the impairment. The next step was to determine, in parallel with studies of the phenotype, the locus of the gene, that is, what chromosome the gene was located on. Mapping studies led to the discovery of the gene locus on chromosome 7 (known as 7q31) (Lai et al. 2001). Once the gene was isolated, the DNA sequence of the gene could be determined. Now it was possible to ask whether the same or other mutations were found in other families. Additional mutations, including point mutations and translocations, were discovered (MacDermot et al. 2005). In addition, it was possible to deduce the protein sequence. It was found that the protein contains a particular kind of protein motif and belongs to a known family of proteins containing a forkhead box (FOX) domain; the protein was therefore named FOXP2 (the P2 is a subclass based on phylogenetic analysis). It was deduced that the function of the protein FOXP2 was that of a ‘transcription factor’, meaning that it was involved in controlling other genes. The next question was what the targets for FOXP2 might be; a number of candidate genes were identified (Konopka et al. 2009), and recent work proposes that FOXP2 interacts with the retinoic acid-signaling pathway involved in fine motor control and speech motor output (van Rhijn & Vernes 2015). Note that until the functions of these candidate genes are known, the question of the phenotype for this disorder is still up in the air. It could involve both grammar and verbal sequencing.
In addition, studies were undertaken to determine in what areas of the brain FOXP2 was expressed (Lai et al. 2001). It is of interest to compare the FOXP2 gene and protein product in other species; this was done in some nonhuman primates, in the mouse, and in songbirds, among other species (Scharff & Haesler 2005). This allowed researchers to pose questions such as how strongly the gene was selected for in evolution (Enard et al. 2002). Hilliard and colleagues (2012:537) reported that they had ‘found ~2,000 singing-regulated genes … in area X, the basal ganglia subregion dedicated to learned vocalizations. These contained known targets of human FOXP2 and potential avian targets’.
Another disorder affecting language semantics was recently reported (Briscoe et al. 2012). Eight members of a family over four generations had difficulty with mapping word meanings to concepts, for example, substituting tripod for stool or evolving for breeding. Preliminary work indicates that the disorder could be due to a single genetic mutation. The family members reported that they had long had problems in school and at work and were aware that they could not easily follow the plot narration in books or on TV. Reduced gray matter was found in neuroimaging studies in ‘a brain area known to be involved in the interaction between language and semantic systems’ (the posterior inferior portion of the temporal lobe), and the researchers consider this case to be ‘the first example of a heritable, highly specific abnormality affecting semantic cognition in humans’ (Briscoe et al. 2012:3659). Genetic studies of the kind discussed earlier for FOXP2 are to be carried out.
Ultimately, of course, we wish to link work on the genetics of language to neural circuits in the brain. As we work bottom-up from the level of the gene, we simultaneously work top-down to understand the brain. From this point of view and for the time being, work in theoretical linguistics can reveal much more to us about the nature of the language faculty than FOXP2 can. In addition, one can learn much from the study of language disorders, including genetic disorders, as we mentioned earlier—aphasia, dyslexia, and so forth. Other perspectives on the organization of brain and language are provided by work on sign language, pidgins and creoles, split brains, bilingual brains, savants, and computational modeling (e.g. parsing). One can combine linguistic studies with other tools, such as imaging (e.g. fMRI, MEG, diffusion tensor imaging, and so forth; Shapiro et al. 2006). Thus one can study language on different levels—the functional, anatomic, cytoarchitectonic, and molecular levels (Geschwind & Galaburda 1984, 1987, Grodzinsky & Amunts 2006, Hugdahl & Westerhausen 2010). For a review of research at the neural circuit level, with an emphasis on asymmetry, see Concha et al. 2012. Note that all of the types of studies above can be done as part of developmental studies, to answer the question about how language develops (or grows) in the child. Graham and Fisher (2015) provide a recent overview of genetic research on language.
Thus, FOXP2 was believed by some to be a ‘language gene’, until homologous genes were found in other species. However, the fact that the human species is the only one that developed natural language suggests that there are genetic properties, or combinations of properties, yet to be discovered, that are specific to human language. Moreover, given our knowledge of the initial stages of human embryogenesis, it is reasonable to think that the language ability grows and matures in individuals as a biological system. The fact that the critical period for language growth in the individual is anchored in time, around puberty (Stromswold 2007, 2008, 2010), also indicates that the language faculty is a biological system, with a determined time span for full development, under normal conditions. Finally, the poverty of the stimulus, which constrains the way children learn language, and the fact that they typically do not make ‘mistakes’ that violate core structure-dependency principles (Chomsky 2013) also point to the human biological predisposition for language growth.
Merge
We saw earlier that the computational system of the language faculty must at a minimum be able to generate sentence structures by combining lexical items into larger units. Research in the minimalist program (Chomsky 1995 and related works) has revealed that a core operation called Merge can account for many important syntactic properties of the computational system, for example, binary structure, recursion, and (a)symmetry. Before discussing Merge, let us say a few words about the architecture of the language faculty.
The architecture of the language faculty in this research program is represented in 5, where narrow syntax relates sounds, legible at the SM interface, and meaning, legible at the CI interface, in order to express complex thoughts.
(5)
Merge is the basic combinatorial operation capable of deriving the discrete infinity of language. It is necessarily a part of the computational procedure of the language faculty. Merge is a binary operation that takes two syntactic objects a and b and derives another syntactic object consisting of the two objects that have been merged. A binary operation is preferable to an n-ary operation on both theoretical and empirical grounds. It restricts the choices of combinations between syntactic objects to a minimum and derives constituents that are motivated by syntactic and prosodic properties. This is not the case for operations deriving n-ary structures. In 6, Merge (M) applies to the objects a and b and derives the set {a,b}.1 This operation applies to objects that have not been merged in a previous step of a derivation. Call this instance of Merge ‘external Merge’ (7). Merge may also apply to objects that were already merged in previous stages of a derivation in order to remerge a given object. Call this instance of Merge ‘internal Merge’ (8).2
(6) M(a,b) : {a,b}
(7)
(8)
Merge has been proposed to derive morphological structures in distributed morphology (Halle & Marantz 1993, among others) and asymmetry morphology (Di Sciullo 2005, among others). In these theories the basic combinatorial operation, Merge, may also combine already derived trees. Thus the generative capacity of Merge, initially proposed for syntactic merger, extends to morphological merger in specific ways.
The human capacity for language must rely on a recursive procedure able to derive the discrete infinity of language. Recursion, as defined in 9, is a property of Merge, which may reapply to its own output, as the derivation in 10 illustrates for external Merge. Internal Merge implies the displacement of categories that will be properly included in the resulting binary-branching hierarchical structure. This is illustrated in 11, from Di Sciullo & Isac 2008, with the displacement of the DP subject in the specifier of vP to the specifier of TP. In 11, the proper inclusion relation also holds between the set of features of the items undergoing Merge. Thus, in the merger of Num with NP, the set of features of Num is the superset, and the set of features of NP is the proper subset, and so on for the other steps of the derivation. The proper inclusion relation is an asymmetrical relation, since, if a is the proper subset of b, b is not the proper subset of a. This structural and feature asymmetry is part of syntactic derivations and is expected if properties of relations, including asymmetry, are core properties of the computational procedure of the language faculty.
(9) Recursion: the property of a rule to reapply to its own output.
(10) Merge (a,b) : {a,b}
Merge (c, {a,b}) : {c, {a,b}}
Merge (d, {c, {a,b}}) : {d, {c, {a,b}}}
(11)
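The set-theoretic behavior of Merge in 6 and 10 can be rendered directly in code. The sketch below is our own illustration (not a fragment of any published implementation): syntactic objects are modeled as atoms or two-membered sets, external Merge builds a new set, and internal Merge re-merges an object already contained in the current structure.

```python
# Minimal rendering of Merge as binary set formation (cf. 6 and 10).
# Syntactic objects are atoms (strings) or two-membered frozensets.

def merge(a, b):
    """External Merge: M(a, b) -> {a, b}."""
    return frozenset({a, b})

# Recursion: Merge reapplies to its own output, as in derivation (10).
ab = merge("a", "b")        # {a, b}
cab = merge("c", ab)        # {c, {a, b}}
dcab = merge("d", cab)      # {d, {c, {a, b}}}

def contains(obj, part):
    """True if part is contained somewhere inside obj. Internal Merge
    re-merges an object already contained in the current structure."""
    if obj == part:
        return True
    return isinstance(obj, frozenset) and any(contains(x, part) for x in obj)

# Internal Merge of 'a' out of {d, {c, {a, b}}}: remerge it at the root,
# leaving the original occurrence in place (the 'copy' inside dcab).
assert contains(dcab, "a")
remerged = merge("a", dcab)  # {a, {d, {c, {a, b}}}}
```

The nesting of sets directly encodes the binary-branching hierarchical structure; nothing in the representation records linear order, which is consistent with the view that order is imposed at the SM interface.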
Merge is especially important for the study of the biology of language, since the hierarchical structures derived by Merge are a core property of the human language phenotype. This is a biolinguistic reason why it is important to understand its properties.
According to Hauser, Chomsky, and Fitch (2002), unbounded recursion is unique to human language. There is in principle no limit on the number of words in a sentence. This view has been challenged in several works, including Jackendoff 2011. It has been claimed, for example, that recursion is also part of other human cognitive faculties, and that it is also found in communication systems in nonhuman primates. However, the generative capacity needed to express infinite complex thoughts may very well fall into a class of grammars with higher recursive capacities.
Chomsky’s (1956) hierarchy of formal grammars provides a ranking of the expressive power of grammars according to a scale of increasing complexity: (type 0/Turing-equivalent (context-sensitive (context-free (finite-state)))). Finite-state grammars occupy the lowest ranking in this scale. Such grammars have limited generative capacities. For example, they do not derive hierarchical structure, and they are more limited than phrase structure grammars with respect to recursion. For example, a finite-state grammar may simulate recursion by iteration if the recursive node occurs at the right or the left edge of a phrase structure rule, but not if terminal nodes occur on both sides of the recursive node.
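This limitation can be illustrated with the classic aⁿbⁿ pattern, a minimal case of center-embedding (X → a X b). In the sketch below (our own illustration), the edge-recursive pattern a* is recognized by a fixed finite-state procedure, whereas recognizing aⁿbⁿ requires an unbounded counter, that is, more memory than any finite-state device has.

```python
# Edge recursion (e.g. X -> a X) collapses to iteration: the pattern a*
# is recognized with a fixed, finite number of states.
def accepts_a_star(s):
    for ch in s:
        if ch != "a":
            return False
    return True

# Center-embedding (X -> a X b) yields the language a^n b^n: recognizing
# it needs an unbounded counter (a stack, in general), which no
# finite-state device has.
def accepts_anbn(s):
    count = 0
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:          # an 'a' after a 'b': reject
                return False
            count += 1
        elif ch == "b":
            seen_b = True
            count -= 1
            if count < 0:       # more b's than a's so far: reject
                return False
        else:
            return False
    return count == 0

print(accepts_a_star("aaaa"))   # True
print(accepts_anbn("aaabbb"))   # True
print(accepts_anbn("aabbb"))    # False
```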
Behavioral and neurological experiments have been conducted in order to test the learning ability of nonhuman primates as opposed to humans. For example, the results of behavioral experiments conducted by Fitch and Hauser (2004) with cotton-top tamarins indicate that nonhuman primates are able to learn finite-state grammars, which derive linear sequences, but not context-free grammars, which derive hierarchical structure. Neuroimaging experiments (Friederici et al. 2006, Friederici 2009, Makuuchi et al. 2009) point to the fact that specific areas in the human brain for processing language (BA44, BA45B) are also present in the macaque brain (Petrides & Pandya 1994), albeit with a more limited size and granularity:
… the human ability to process hierarchical structures may depend on the brain region which is not fully developed in monkeys but is fully developed in humans, and that this phylogenetically younger piece of cortex may be fundamentally relevant for the learning of the PSG.
Other neuroimaging experiments (e.g. Embick et al. 2000, Moro et al. 2001) have investigated whether recursive syntactic (hierarchical) computations activate a dedicated network in the human brain. Results from recent experiments reported in Chesi & Moro 2012 indicate that:
the theoretical distinction between recursive vs. non-recursive rules is reflected in brain activity. More specifically, the activity of (a deep component of) Broca’s area within a more complex network including subcortical elements such as the left nucleus caudatus appears to be sensitive to this distinction as the BOLD signal is increased in this area only when the subjects increase their performance in manipulating recursive rules.3
While neuroimaging studies show that the human brain is sensitive to recursive rules, it is unclear whether recursive processes can be observed at the cellular level. For example, iterative processes are observed in cell duplication and morphogenesis, whereby cells divide into two generally identical copies; see Figure 1.
Cells duplicate by dividing in half, with both halves containing all the necessary DNA information of the organism. Thus, one cell becomes two, which in turn divide to become four, eight, sixteen, thirty-two, sixty-four … cells. Figure via Wikimedia Commons, Creative Commons Attribution-ShareAlike 4.0 International license.4
However, cell duplication and morphogenesis cannot be equated with recursion, as defined above in 9. After a cell divides into two, the two cells no longer form a unit of any sort. This is not the case for the recursive merger of two linguistic objects deriving a more complex object, as illustrated in 10 and 11 above. Moreover, it might be the case that phrasal constituents may not be merged directly, but only indirectly, by first merging with a functional head, as argued in Kayne 2011a and elsewhere. We illustrate this with the properties of complex numerals, such as twenty-one, which is an additive structure, and two hundred thousand, which is a multiplicative structure. In English, there is no legible functional head at the SM interface between the first and the second conjunct. In other languages, however, such functional elements can or must be pronounced in these structures. In Romanian additive structures, the coordinating conjunction şi ‘and’ intervenes between the first and the second numeral (12a). In multiplicative structures, the preposition de ‘of’ (12b) intervenes. In Modern Arabic, the coordinating conjunction wa ‘and’ and morphological case marking intervene between the parts of complex numerals, whether they are additive or multiplicative structures (13). These facts lend empirical support to the hypothesis that the recursion of maximal constituents is mediated by a functional projection. The representations in 14, from Di Sciullo 2012b:15, illustrate part of the derivation of complex numerals, where the functional projection is the locus of valued features, either the additive [ADD] or the multiplicative [MULT] feature, and unvalued numeral features (uNUM) on the head need to be valued in the course of the derivations.
(12)
a.
b.
(13)
a.
b.
(14)
a.
b.
The derivations in 14 comply with Kayne’s (1994, 2011a) antisymmetry framework and the hypothesis that conjunctions are asymmetrical. In Chomsky’s problems of projection framework (2013, 2015b), endocentric and exocentric structures can be derived by Merge. Under this view, the functional projection F would be merged later on in the derivation, and one or the other maximal constituent Num would be displaced higher up in the structure. In this framework (Chomsky 2013), the asymmetrical relation between the first and the second conjunct in structured coordinations requires additional steps in the derivations. Whether syntactic derivations can be freely exocentric is subject to discussion. The fact remains, however, that complex numerals are an instance of the discrete infinity of language, which only a recursive mechanism may derive. The biological correlate of this recursive mechanism is yet to be discovered, but see Friederici et al. 2011 and Zaccarella & Friederici 2015 for recent suggestions of a neural correlate of Merge, based on fMRI studies of language areas of the brain.
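One way to picture the feature-valuation step in 14 is with a toy Agree-style sketch. The representation below is entirely our illustration, with simplified feature names, and is not a claim about the actual formal implementation in Di Sciullo 2012b.

```python
# Toy Agree-style valuation in the derivation of complex numerals
# (cf. 14). Feature names and the data structure are simplified
# inventions for illustration.

class Head:
    def __init__(self, label, features):
        self.label = label
        self.features = dict(features)   # value None = unvalued feature

def merge_and_value(functional, complement):
    """Merge a functional head with its complement; any valued feature
    on the functional head values its unvalued counterpart on the
    complement head."""
    for feat, val in functional.features.items():
        if complement.features.get(feat, "absent") is None:
            complement.features[feat] = val
    return (functional, complement)

# Additive structure: F carries the valued [ADD] feature (spelled out
# as 'şi' in Romanian); the numeral head carries unvalued uNUM.
f_add = Head("F", {"NUM": "ADD"})
num = Head("Num", {"NUM": None})
merge_and_value(f_add, num)
print(num.features)   # -> {'NUM': 'ADD'}: uNUM valued in the derivation
```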
We agree with Jackendoff’s (2011) statement that recursion is not unique to language. Indeed, it is not, but recursion in language has specific properties that may or may not be found elsewhere in the mind/brain or nature. It might be the case that recursion in language is specifically mediated by a functional category in syntactic derivations. This would not come as a surprise, since asymmetries are ubiquitous throughout many areas of biology. For example, according to Montell (2008:1505), asymmetry helps account for how cells move and divide, by constraining the dynamics: ‘It is probably generally the case that signalling pathways … function to localize mechanical forces asymmetrically within cells. By definition, an asymmetry in force will cause dynamics’. Montell’s lab has developed a new in vivo model for the study of cell motility and employs a powerful combination of molecular genetics, live imaging, and photo-manipulation techniques to decipher the molecular mechanisms that determine when, where, and how cells move. The asymmetry that brings about cell division and movement cannot readily be equated with the asymmetry of phrasal projections and displacements. It is unclear what language asymmetries might have in common with asymmetries such as these, given that their neural basis is as yet unknown, but the topic deserves further study.
Thus, binarity, recursion, and asymmetry are at the very core of language and biology. There is no one-to-one mapping between these properties in language and in other systems of biology, though homologies can be identified. Further understanding of these properties may help to elucidate possible relationships.
3.2. Evolution: gradualist and emergent views
Merge has been claimed to be at the root of the human capacity for language (Berwick 2011, Berwick & Chomsky 2016). This view has been challenged in several works, according to which the human capacity for language evolved gradually from simpler capacities. The following questions are at the center of the debate: Did the human capacity for language evolve gradually or all at once? Was it the result of an evolutionary leap? The following paragraphs briefly review the main claims of the gradualist and the emergent views on the topic. The role of experience is also considered.
According to the gradualist view (see e.g. Bickerton 1990, 1998, 2000, 2008, 2014), language evolved from proto-language, which is an intermediate step in the historical development of language; this is often represented in terms of linear precedence in historical stages.
(15) presyntactic stage > proto-syntax stage > modern syntax
While there is no direct evidence for these historical stages, there are several hypotheses about the properties of each of these stages apart from the simplistic view that the presyntactic stage consists of one-word expressions and the proto-syntactic stage of two-word expressions. According to Bickerton (1990), although words may have been uttered in short sequences, there were no rules in proto-language defining the well-formedness of strings, and therefore words in proto-language could not be said to belong to separate syntactic classes, such as Noun or Verb. Some theories of proto-language are related to the development of subject-predicate relations (Gil 2011). Other theories take proto-language to be limited to just concatenation of predicates. According to Hurford (2001), proto-thought had something like the predicate calculus, but had no quantifiers or logical names. Jackendoff (1999, 2002) proposed that the relatively flat (nonhierarchical) structure of adjuncts, as well as the concatenation of compounds, still retains a bit of the flavor of proto-language. Progovac & Locke 2009 and Progovac 2010, 2015 proposed that English V-N compounds such as dare-devil should be analyzed as syntactic ‘fossils’ of a previous stage of syntax, now coexisting with more complex syntactic constructions.5 For Jackendoff (1999, 2002), minimal syntactic specification and extensive involvement of pragmatics are the hallmarks of what have been proposed to be syntactic fossils. Proto-language involves flat structure derived by concatenation or adjunction but not by binary-branching hierarchical structure. Basically, proto-language is a kind of communication system with no syntax.
According to the view that core properties of the language faculty evolved all at once (e.g. Chomsky 2008, 2011, Berwick & Chomsky 2011, Bolhuis et al. 2014, Hauser et al. 2014), there is no need to postulate a previous stage of ‘proto’ language. The emergence of core properties of the language faculty in humans could have resulted from minimal changes in the human brain, for example, in neural circuits. These neural circuits could have previously subserved nonlinguistic functions (Hauser et al. 2002). In the emergent view, there is no proto-language in language evolution, nor a preceding presyntactic (one-word) stage. The language faculty appeared late in historical development with the central operation of Merge. Merge is a binary operation deriving expressions that can be represented in terms of hierarchical branching structures. According to this view, language did not start from a simple stage and then evolve through more complex stages. In a recent review article, Hauser and colleagues (2014) argue that language origin and evolution are still a mystery, notwithstanding forty years of research in the areas of comparative animal behavior, paleontology, archeology, molecular biology, and mathematical modeling. The authors point out that much of the so-called ‘progress’ in these areas is not supported by strong evidence and offers no explanation for why and how human capacities for language evolved.
The two views of the origin and the development of language make different predictions for the properties of first language acquisition, as well as for language’s historical development. The gradualist view predicts that languages become more complex as they evolve over time (Hurford 2012, 2014). A different prediction is compatible with the emergent view of language, according to which the language faculty is stable. In this alternative view, given the effect of the principles of efficient computation that are external to the language faculty, there is a reduction of the computational load, which may result in the minimization of the length of derivations and of the pronunciation of certain constituents (Chomsky 2005, 2013, Di Sciullo 2015, and related works). Evidence in favor of the second view is presented in §3.3.
Similarly, for language acquisition, the first hypothesis often presumes that the child’s knowledge of language develops mainly on the basis of exposure to data. The second hypothesis contends that children are genetically equipped to learn any natural language they are exposed to and almost always without formal instruction. In addition, the second hypothesis argues that the computational procedure of the language faculty is not occurrence- or string-dependent and that children will not typically make errors that contravene structure-dependent constraints, as could happen under the first hypothesis.
Further support for the second hypothesis also comes from experimental results on the perception of functional elements by infants, including determiners and demonstratives, which indicate that infants have the ability to perceive functional structure, despite the fact that they do not produce functional categories in their speech (Shi et al. 2006, Shi 2007, Shi & Lepage 2008). In the emergent view of language, the lack of overt functional elements in infants’ speech, as well as the absence of overt functional structure in certain ancient languages and in creoles, does not lead to the conclusion that functional structure evolves from a state where functional structure was lacking. Covert functional structure is already in place to begin with and is necessary to account for the properties that so-called ‘proto’ languages share with modern languages. For example, languages with apparent free word order at the clause level, such as Warlpiri, a central Australian Aboriginal language, were previously thought to be nonconfigurational languages, having a flat phrasal structure (16) instead of a hierarchical structure (17).
(16)
(17)
Several works showed, however, that Warlpiri’s clause-internal relations between anaphors and their antecedents are subject to the same configurational restrictions observed in other languages, including English (e.g. Hale 1983, Simpson 1991, Legate 2002). For example, nonfinite complementizers supplete according to the grammatical function of the controller of their PRO subject, as discussed in Hale 1983, Hale et al. 1995. Furthermore, both binding and control, defined on the basis of the asymmetrical c-command relation, as in Chomsky 1981, 1995, among others, indicate that Warlpiri’s syntax is not different from that of any other language with respect to configurationality.
As noted earlier, Merge recursively derives binary-branching hierarchical structures. It is simpler on formal grounds than operations deriving n-ary structures. It correctly derives the asymmetrical relations between the constituents of linguistic expressions. It is also motivated from an evolutionary developmental perspective. According to the emergent view of language, the language faculty is likely to have emerged all at once, quite recently in evolutionary terms, as a consequence of a minimal change in the wiring of the brain. It emerged possibly at a point in time when perceptual and motor mechanisms were already in place. From this perspective, Merge did not have proto-Merge, a concatenating operation, as its predecessor. The concatenation operation is formally distinct from Merge. Furthermore, if the language faculty is human-specific and nonhuman primates can learn to produce expressions equivalent to proto-language, viz., flat structures generated by finite-state grammars, then proto-language is not the predecessor of human language. Proto-language is not conceivable in the view that the language faculty emerged all at once.6 In contrast, Merge elegantly expresses the combinatorial capacity of the language faculty, as represented by hierarchical structures. Experience, while a necessary factor in language growth in the individual and development over time, does not affect the core properties of the central operations of the language faculty.
3.3. Factors external to the language faculty
Chomsky (2005:6) suggested a few candidates for the so-called ‘third factor’, viz. ‘principles of data analysis that might be used in language acquisition and other domains, principles of structural architecture and developmental constraints that enter into canalization, organic form, and action over a wide range, including principles of efficient computation’. Since language is a computational system, Chomsky suggests that principles of efficient computation might ‘be of particular significance’.
Several works in mathematics and in computer science provide methods to measure the complexity of expressions, including strings of characters (K-complexity; Kolmogorov 1965). Linguistic expressions are not strings of characters, however, and their complexity goes beyond the number of characters, lexical items, or phrases they include. The question arises of whether standard complexity metrics are of any relevance for measuring the complexity of the expressions derived by the operations of the language faculty. Chomsky’s 1956 hierarchy of formal grammars provides a basic tool to evaluate the complexity of languages on the basis of generative capacity. Several works on human-animal studies use this hierarchy as a baseline. Earlier works in psycholinguistics (Fodor et al. 1974) focus on the computational load associated with the number of applications of the operations of the grammar (derivational complexity). The recursive operations of the language faculty bring about complexity that is tractable for the human brain up to a certain limit imposed by other subsystems of the brain, including memory (e.g. Chomsky & Miller 1963, Miller & Chomsky 1963, Bever 1970, Kimball 1973). Other works discuss the notions of complexity in terms of the number of steps (decision points) necessary to acquire a grammar (e.g. Yang 2002, Zeijlstra 2008, de Villiers & Roeper 2011). Current research (Chomsky 2011, 2013, 2015b) suggests that principles of efficient computation may very well be reduced to natural laws, operative in other natural systems.
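As a concrete point of reference, derivational complexity in the sense of Fodor et al. 1974 can be approximated by counting applications of the grammar's operations. The toy sketch below (our own illustration) counts Merge applications in a nested-set object of the kind built in 10, and contrasts this with surface string length, the kind of measure that character-based metrics track.

```python
# Derivational complexity as a count of Merge applications, contrasted
# with surface string length. Illustrative only; actual proposals count
# applications of grammar operations, not characters.

def merge_count(obj):
    """Number of Merge applications needed to build a nested-set object."""
    if not isinstance(obj, frozenset):
        return 0          # a lexical atom requires no Merge
    return 1 + sum(merge_count(x) for x in obj)

ab = frozenset({"a", "b"})
cab = frozenset({"c", ab})
dcab = frozenset({"d", cab})

print(merge_count(dcab))  # -> 3, the number of Merge steps in (10)
print(len("dcab"))        # string length is blind to hierarchical structure
```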
We would like to suggest symmetry breaking as another third-factor candidate. Symmetry breaking is not specific to language or to any other cognitive domain, for that matter. It is ubiquitous throughout the natural sciences. In physics, symmetry is an invariance property of a system under a set of transformations. Anderson (1972:394) describes it as ‘the existence of different viewpoints from which the system appears the same’. For example, human faces have approximate reflection symmetry, because humans look approximately the same in a photograph as in a mirror. A sphere has rotational symmetry because it looks the same no matter how it is rotated. Symmetry breaking is the process by which such uniformity/invariance is broken, or the number of viewpoints from which the system appears the same is reduced, in order to generate a more structured asymmetrical state. Symmetry breaking is a prevalent process in biology, because organismal survival depends critically on well-defined structures and patterns at both microscopic and macroscopic scales. At the subcellular level it can lead to the establishment of a persistent polarity of growth to generate the distinct cell shapes required for such processes as cell division and cell fusion. Li and Bowerman (2010:4) define it as ‘a result of the interplay between the system dynamics and the internal or external cues that initiate and/or orient the eventual outcome’, a ‘modern take’ on thoughts by Thompson (1942). Kuroda (2015) reviews investigations aimed at elucidating the molecular basis of left-right symmetry breaking in snails, whose chirality is determined by a single gene locus.
In addition to its applications in physics and biology, symmetry breaking has applications in language. It also might very well contribute to the explication of principles of language in terms of principles operative elsewhere in biology. Let us illustrate this with a few examples. Moro (2000) proposed that points of symmetry can be derived in syntax, such as in the case of direct and inverse copular constructions, where one or the other constituent of the small clause in the domain of the copula must be displaced in order to break the symmetry. Di Sciullo (2005) showed that points of symmetry never arise in morphology, since morphological operations combine objects, called ‘minimal trees’ (i.e. trees with one complement and one specifier only), whose internal structures are already asymmetrical. Thus, parts of words cannot be reordered without destroying the integrity of their structure. This might be possible in the syntax, however, where syntactic operations do not necessarily combine minimal trees. In syntactic derivations, points of symmetry, in the sense of Moro (2000), can be derived, and the reordering of constituents would be integrity preserving. Chomsky (2013, 2015b) relies on symmetry breaking to derive the effect of the extended projection principle (EPP), according to which the DP-subject generated within the verbal projection vP must raise to the specifier of TP, as a consequence of the labeling algorithm. It has also been proposed that symmetry breaking contributes to reducing the complexity that arises in diachronic language variation under the influence of language acquisition, languages in contact, and pragmatic factors. The directional asymmetry principle (18), from Di Sciullo 2011, has been proposed on the basis of the historical development of possessive pronouns in Greek and in Italian. This principle, which has correlates in evolutionary developmental biology, reduces the complexity that arises in the development of functional elements in the extended nominal projection, alongside other principles of efficient computation.7
(18) Directional asymmetry principle: Language development is symmetry breaking.
The predictions of the directional asymmetry principle have been validated on the basis of the historical development of the definite determiner from Old to Modern Romanian (Di Sciullo & Somesfalean 2013, 2015), as well as on the basis of the historical development of prepositions in Indo-European languages (Di Sciullo & Nicolis 2012, Di Sciullo et al. 2017). For example, in the development of both English and Italian, fluctuation in the pre- vs. post-position of the pronominal complement of the preposition is observed in earlier stages of these languages, whereas only the prepositional variant remains in Modern English and Italian. While both orders 19a and 19b are attested in Old English, this is no longer the case in Middle English or in Modern English, where a pronoun may only follow a prepositional head.
(19)
a.
b.
Likewise, the analysis of Boccaccio’s Decameron and of the thirteenth-century Old Florentine corpus TLIO reveals that P uniformly precedes its complement except in the case of the preposition con, where monosyllabic personal pronouns are cliticized onto the preposition (meco, teco, seco; examples 20, 21). Instances where con precedes a monosyllabic personal pronoun are also attested (22). In this diachronic phase (thirteenth and fourteenth centuries), meco, teco, and seco appear more often without con than with con (23). Modern Italian attests only the prepositional order con + pronoun, for example, con me, con te, con sé.
(20) … e per li compagnoni che teco fuggiro, per li dei … (Brunetto, Rettorica)
(21) neiente de lo mondo; con te le tue, parole voria conte avere … (Rinuccino, Sonetti)
(22) E perciò ch’ io so bene ch’ assai val meglio che tu parli con teco, che né io né altri, sì fo io fine alla mia diceria. (Brunetto, ProLigario)
(23) Non ti dar malinconia, figliuola, no, che egli si fa bene anche qua; Neerbale ne servira bene con esso teco Domenedio. (Boccaccio, The Decameron)
The directional asymmetry principle is not a global principle predicting the overall directionality of diachronic variation. It is a local principle applying to micro feature structures, for example, the microstructure consisting of a functional head and its complement. Once a fluctuating asymmetry stage is attained—that is, a stage where a choice point arises in the derivation of a given microstructure—the directional asymmetry principle predicts that this point of symmetry will gradually be eliminated. For example, while there is fluctuation in the position of the pronominal complement with respect to its comitative prepositional head in Old Italian (21–23), only the prepositional structure survives in Modern Italian, as discussed in Di Sciullo et al. 2017 and summarized here. We assume that the language faculty is stable, that languages vary given contact with the environment, and that linguistic variation in word order is the consequence of a change in the properties of grammatical features, which may or may not trigger the displacement of a constituent. Thus, PPs are universally head-initial (Kayne 1994, 2005); all object DPs move to F to check [uD] on F, where [uD] is plausibly Case. Further movement of DP is attested in postpositional languages, as the following P shells illustrate.
(24)
a.
b.
We have, on the one hand, the dynamics of historical variation, with parametric pressure enforced by the principle of preservation, and, on the other hand, principles reducing complexity. Thus two P heads, con me and meco, are part of the P-shell (25a,b). Both P heads are pronounced in a given linguistic expression at a given point of the historical development of Italian, con meco (25c); this derivation is too costly because it goes against the economy principle ‘pronounce the minimum’, and as a consequence con meco is eliminated. Finally, meco is eliminated because of the economy principle of preservation; Modern Italian thus displays only one option: P DP, con me. This follows ultimately from the directional asymmetry principle, a biologically grounded principle external to the language faculty, which drives evolution from a fluctuating asymmetry phase R(a,b) & R(b,a) to a directional asymmetry phase R(a,b), where R is a head-complement relation in this case.
(25)
a.
b.
c.
d.
While the computational procedure of the narrow language faculty is stable and reduced to a minimum, complexity may arise from experience (language acquisition, language contact, etc.), giving rise to choice points (symmetry) in functional feature structure, as illustrated in 26, where a set of features of a functional element includes both a valued and an unvalued variant of feature F, with the consequence of enlarging the set of possible derivations. Economy principles, falling under the third factor, will eliminate the complexity by breaking the symmetry brought about by experience.
(26)
a. F: {[F], [uF]}
b.
Complexity-reducing principles, such as the directional asymmetry principle, will manifest themselves overtly whenever grammatical principles stop mandating certain operations. Whenever the choice between two competing structures is not mandated by formal grammar principles, third-factor principles will exert their pressure, reshaping the system to reduce choice points (e.g. points of symmetry).
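The dynamics just described can be caricatured with a toy simulation; all quantities below are invented for illustration. Two competing orders of a head-complement relation, R(a,b) and R(b,a), start in free fluctuation; a small economy bias against one variant suffices to drive the system to a directional phase where only one order survives.

```python
# Toy simulation of symmetry breaking in diachrony (cf. 18 and 26).
# All rates are invented; this caricatures the dynamics, nothing more.
import random

random.seed(0)
p = 0.5        # proportion of the order R(a,b) (e.g. P > pronoun)
bias = 0.05    # small economy advantage for R(a,b); pure stipulation

for generation in range(300):
    # Learners resample the previous generation; the favored variant is
    # adopted slightly more often, plus a little random fluctuation.
    drift = bias * p * (1 - p)
    p = min(1.0, max(0.0, p + drift + random.gauss(0, 0.01)))

print(round(p, 2))  # -> close to 1.0: the fluctuating (symmetric) phase
                    #    resolves into a directional phase, R(a,b) only
```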
The directional asymmetry principle contrasts with Greenberg’s (1966) absolute and implicational universals, such as the ones for prepositions, as well as more recent proposals, including Biberauer, Holmberg, and Roberts’s (2014) proposal on head-directionality and complementation in extended projections, and Kayne’s (2011b) proposal on head-directionality and Probe-goal search. The directional asymmetry principle is a developmental universal, which provides a new approach to language variation (see Di Sciullo 2012a, Di Sciullo et al. 2017 for discussion).
As mentioned previously, symmetry breaking may help us understand the biological bases of language. There is an interesting parallel in the dynamics of variance in the [End Page e222] form of bipartite organisms and in the form of functional projections, suggesting that the language faculty is subject to external evolutionary developmental laws that may affect the external shape of the objects it generates. We mention two examples of such parallelism. Palmer (1996, 2004, 2009, 2012) identifies phylogenetic patterns of variance in the evolution of bilaterally asymmetric species, distinguishing three stages in evolution and change. First is the symmetric stage, in which there is no left/right difference in the organism. In the following antisymmetric stage, either the right or the left side of the organism is randomly prominent. In the last stage, the asymmetric stage, prominence is observed on only the right or only the left side of the organism. Palmer (2004) illustrates the evolution and development of claw asymmetry in male fiddler crabs. In evolutionary developmental biology, random asymmetry (or antisymmetry) is the stage at which right- and left-handed forms are equally frequent in a species, whereas fixed or directional asymmetry is the following stage, at which only right- or only left-handed forms occur in a species. Further work on the genetics and the evolution of floral morphology also indicates that asymmetries develop from a primary symmetry-breaking step.
Levin and Palmer (2007) describe a case of floral bending of the style: at an earlier stage of evolution, the flowers of an individual plant had both right- and left-bending styles (antisymmetric, not genetically determined), while at a later stage of evolution all flowers on an individual plant bent in the same direction, with around fifty percent of the plants in a given population bending to the right and fifty percent to the left. In this latter scenario, the development of the floral bending was shown to be under genetic control.
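Palmer’s progression from antisymmetry to directional asymmetry can be mimicked with a minimal population simulation; the sketch below is our own toy illustration, not a model from the cited work, and its bias parameter simply stands in for the onset of genetic control.

```python
import random

def next_generation(pop, bias=0.0, size=1000):
    """Offspring inherit a side with probability proportional to its
    current frequency, plus an optional directional bias toward
    'right' standing in for genetic control."""
    p_right = min(max(pop.count("right") / len(pop) + bias, 0.0), 1.0)
    return ["right" if random.random() < p_right else "left"
            for _ in range(size)]

random.seed(1)
# Antisymmetric stage: handedness is random, roughly 50/50.
pop = ["right" if random.random() < 0.5 else "left" for _ in range(1000)]
print(pop.count("right"))  # close to 500

# Directional stage: a small heritable bias fixes one side.
for _ in range(200):
    pop = next_generation(pop, bias=0.01)
print(pop.count("right"))  # at or near 1000: directional asymmetry
```

The point of the toy model is only that a weak, consistently oriented pressure suffices to turn a fifty/fifty distribution into a fixed one, the same dynamic invoked above for the elimination of word-order choice points.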
Symmetry breaking may also play a role in language acquisition. Language develops in the child because of the unique properties of the language faculty, enabling humans to naturally develop the grammar of the language(s) they are exposed to, notwithstanding the scarcity of the stimulus (Chomsky 1986, 2011, Berwick et al. 2011). The study of the relations between language development in the child and the historical development of languages has been a topic of research since the beginning of generative grammar (Lightfoot 1984, 1991). The idea is that by looking at historical change, it is possible to reconstruct what children at different times must have gone through. Lenneberg (1967) observed that irrespective of the language children are exposed to, they will develop that language, going through the same biologically determined steps, which coincide with stages of motor development. The relations between ontogeny (individual development) and phylogeny (evolution of species and lineages) in biology may further our understanding of the development of language in the child and of language’s historical development. However, as discussed in Gould 1977, 2002, among other works, the view that ontogeny recapitulates phylogeny has been challenged. It might be the case instead that innovation is a central aspect of variation. This view may offer support to theories of language acquisition that account for the fact that children’s language is not identical to the languages they are exposed to. Assuming that the elements of linguistic variation are those that determine the growth of language in the individual (Chomsky 2005, 2007), antisymmetric stages are also part of language development. For example, new compounds can be coined in any language that has them. Children produce these forms quite early, around age two or three (Clark & Barron 1988, Hiramatsu et al. 2000, Nicoladis 2007), sometimes with meanings that they are unlikely to have heard before, and, as far as one can tell, without any formal instruction. Around age three, children consistently produce compounds of the type V-N rather than the adult N-V-er form, passing through an intermediate V-N-er stage: for example, bounce-ball, then bounce-baller, then ball-bouncer. Data from language development show that these stages in the acquisition of compounds could also be understood as instances of a familiar, biologically based symmetry breaking. [End Page e223]
Thus, although language appears to have unique properties, it remains an object of the natural world and, as such, is subject to natural laws, including symmetry breaking, that are external to the language faculty. The evolutionary-developmental approach to language’s historical variation may contribute to the understanding of language as an object of the natural world exposed to natural laws, and it may lead to the discovery of new sorts of universals accounting for the residue that binary parameters do not cover. It may also contribute to our understanding of why parameters emerge and why they can be reset over time, and thus help to unify the principles of biolinguistics with other principles of the natural sciences.
4. Two views of the language faculty
It might be instructive to compare slightly different approaches from the biolinguistics perspective. We compare a biolinguistics program incorporating the minimalist program with the proposal outlined in Jackendoff 2011. We would argue that these two approaches are research variants within the biolinguistics framework.
Biolinguistics is the study of the biology of language. The main research areas of the field are knowledge of language, language acquisition, and evolution of language. This is true for any approach within the biolinguistics framework. Even when we look into more specific assumptions, we find that Jackendoff shares many assumptions with researchers working within the minimalist program (and other approaches): for example, that there is a faculty of language; that there are systems of syntax, semantics, lexicon, and phonology and mappings between them; that there is a UG of some kind; and that a useful distinction is that between the ‘broad language faculty’ and the ‘narrow language faculty’ defined in Hauser et al. 2002.
However, there are common misconceptions about biolinguistics that are worth mentioning. One of them is that the psychological/functional and neural levels are the only ways in which biolinguists make connections between language and biology. We would insist that work on formal linguistics, including all of the work that Jackendoff has done in this area, from the extended standard theory on, is doing biology. For example, Jackendoff provides an analysis of sentences like Every acorn grew into an oak and other syntactic structures to construct arguments for notions like UG, structure-dependence, and poverty of stimulus (which he calls the paradox of language acquisition). He thus demonstrates that one can quite reasonably give arguments for innate structure (genetic endowment) on the basis of linguistic structures, without needing to bring in further speculation about FOXP2 or neural circuits if it does not add anything to the argument. In doing so, biolinguists are outlining properties that the language faculty must have (in the narrow or broad sense) and are ‘doing biology’. This activity is analogous to Gregor Mendel showing what the internal properties of plants (and other organisms) must be in order to account for their inheritance patterns.
Thus, both the minimalist program and other frameworks in generative grammar assume that the core research areas are knowledge of language, acquisition, and evolution, and presuppose a (narrow/broad) faculty of language, systems of syntax, semantics, lexicon, and phonology, some variant of UG, genetic endowment, and poverty of stimulus (Jackendoff’s paradox of language acquisition). These frameworks may differ, however, on the architecture of the language faculty. Note that the issue of whether the architecture of the language faculty is ‘parallel’ does not automatically distinguish minimalist approaches from others. It is useful to underline that different proposals for the architecture of the language faculty/UG have been available in generative grammar, from Syntactic Structures through the standard theory, the extended standard theory, and government and binding to the minimalist program. Different architectures have been [End Page e224] proposed within minimalism, including a linear model (Bobaljik 2012), a clash model (Uriagereka 2012), and a workspace model (Di Sciullo 2014). Furthermore, different implementations of a parallel architecture are available, including Jackendoff 2002 and Culicover & Jackendoff 2005, as well as lexical-functional grammar, autolexical syntax, and role-and-reference grammar. Thus, the architecture of the language faculty has been part of the research agenda since its beginnings.
The minimalist program has put forth extensive proposals in the literature about the computations required to account for syntactic phenomena, such as syntactic constraints, structure dependencies, and locality, that involve the architecture of the faculty of language, principles of efficient computation, and so forth. The goal is to reduce the technical machinery of the grammar to a minimum. Other approaches, by contrast, have adopted constraint-driven unification in their frameworks, incorporating the computational operation of Unification (Shieber 1985): ‘The brain’s characteristic combinatorial operation is Unification rather than Merge’ (Jackendoff 2011:603). Although we cannot do a step-by-step comparison of derivations in the space available here, we note that many studies have pointed out that the Unification operation is too powerful and insufficiently restrictive in various ways (see e.g. Berwick & Weinberg 1984, Johnson 1988, Kobele 2006). Jackendoff (2011:603) also disputes the role of recursion in language, arguing that ‘recursion is not the defining characteristic of language; it is found everywhere in higher cognition’. However, the claim that there are recursive mechanisms in other cognitive modules, such as vision (but see Ullman 1979, 1996, 2006 for a different view), is not incompatible with the minimalist program; what requires evidence is the claim that the recursive mechanisms are the same across different cognitive domains. Pinker and Jackendoff (2005) have also noted claims that a particular language might not employ recursion (e.g. Everett 2004, 2005), though see Nevins et al. 2009 for convincing counterarguments. Most work in the minimalist program proposes that recursive mechanisms are available in UG, as part of the genetic endowment. Jackendoff further argues for the need for redundancy in the grammar and contrasts this with work in the minimalist program. But his arguments apply primarily to the lexicon, while the minimalist work he criticizes pertains to syntactic derivations, where it has been argued that in many cases an apparent syntactic redundancy disappears upon closer examination, under a simpler reformulation of the syntactic computation.
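Since the two combinatorial operations are rarely set side by side in concrete terms, the following minimal sketch, ours and deliberately simplified, contrasts Merge as binary set formation with Unification over attribute-value structures in the spirit of Shieber 1985; the particular encodings are illustrative assumptions, not either framework’s official formalism.

```python
# Minimal sketches (simplified) of the two combinatorial operations.

def merge(a, b):
    """Merge: form the set {a, b}; blind to the internal content
    of its arguments."""
    return frozenset([a, b])

def unify(f, g):
    """Unification: recursively combine two feature structures,
    here modeled as nested dicts; fail (None) on a feature clash."""
    if not isinstance(f, dict) or not isinstance(g, dict):
        return f if f == g else None  # atomic values must match
    out = dict(f)
    for key, val in g.items():
        if key in out:
            combined = unify(out[key], val)
            if combined is None:
                return None           # conflicting values: no unifier
            out[key] = combined
        else:
            out[key] = val            # information accumulates
    return out

print(merge("the", "child"))               # frozenset of the two items
print(unify({"cat": "D"}, {"num": "sg"}))  # {'cat': 'D', 'num': 'sg'}
print(unify({"num": "sg"}, {"num": "pl"})) # None: clash
```

Even at this toy scale the restrictiveness contrast is visible: Merge ignores the internal content of its arguments, whereas Unification inspects and combines arbitrarily deep structure, one source of its greater, and arguably excessive, power.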
Jackendoff’s 2011 proposal, however, does differ from some other specific proposals in the minimalist program, such as those in Chomsky 2005 and Di Sciullo & Boeckx 2011, regarding principles not specific to the faculty of language: what Chomsky 2005 calls third-factor principles and what Jackendoff calls ‘first principles’. Jackendoff regards the appeal to such principles as a misguided effort to ‘eliminate the role of natural selection’:
The situation parallels streamlining in dolphins. The advantage of streamlining is a consequence of natural law. But dolphins still had to evolve this shape, presumably through natural selection. In other words, in these cases, natural law does not eliminate the role of natural selection, as (I think) Chomsky is suggesting; rather, it shapes the adaptive landscape within which natural selection operates.
(2011:605)
Jackendoff appears to be objecting to introducing considerations of natural law into the study of the biology of language: ‘The biolinguistic approach seeks to derive properties of language from what is called “natural law” or “third factor considerations”, so that they are somehow not a burden on natural selection’ (p. 604). What is meant here by ‘burden on natural selection’? Jackendoff elaborates with an example, noting that natural law (or physics) is involved in digitizing vocal signals: ‘But notice that this does not take [End Page e225] the burden off of evolution’ (p. 604). He seems to believe that the reason physical laws are introduced into biology is to take the burden off evolution; in other words, that if we have an explanation in terms of physical laws, then we can ‘eliminate natural selection’, even evolution, from our explanatory accounts. Nothing could be further from the truth: what the biolinguist, and the biologist more generally, is trying to do is not to eliminate natural selection, much less evolution, but rather to understand evolution, and to do that one must understand how various principles from physics, chemistry, and biology interact with one another.
Finally, contrary to common assumptions, we consider that the search for the principles of evolutionary and developmental biology that could have led to a language faculty is not premature, as links have already been identified between what we know of the nature of linguistic structure and what is known about the genetic basis of biological development. The unification of linguistics with biology and physics is often misunderstood. By introducing considerations of physics and mathematics (‘the Galilean method’) into linguistics and other areas of biology, it may become possible to derive the properties of language from deep and simple principles. The program initiated by Thompson (1942) and Turing (1952), among others, which noted the importance of physical factors in understanding the mechanisms, development, and evolution of biological organisms, has become increasingly important for the analysis of biological systems as a whole and is under intensive study in a number of areas, now familiar as ‘systems biology’, ‘self-organization’, and so forth.
Although, as we have seen, there are differences between the two ‘views’ of the language faculty discussed above, they are not as divergent as one might think at first glance. In fact, when comparing any two approaches to the biology of language, a good starting point is always to ask the same high-level questions of each approach. Does each approach try to answer the standard questions asked about any biological system: what is its structure/function, how does it develop (ontogeny), and how does it evolve (phylogeny)? And for the faculty of language, one can ask questions about its neural (and genetic) basis: What is the internal structure of the language faculty that underlies the traditional mapping between sound and meaning? Further, does the approach try to account for the disparity between the richness of the language attained and the paucity of experience (‘poverty of stimulus’) by some mechanism deriving from genetic endowment (e.g. UG)? Does the language faculty have commonalities with other cognitive systems or with biological systems in other species? Are there principles unifying these systems, or even principles originating from natural sciences other than biology, such as physics?
This is not meant to be a comprehensive list. But the rule of thumb is that biolinguistics asks the same kinds of questions in the study of the biology of language that one can ask for any other biological system. We see that the views under discussion all fall within the biolinguistics framework. They have similar answers to many of the questions asked above, but differ in other respects, pointing the way to further investigation.
5. Conclusion
Biolinguistics is in our view a most promising field, bridging discoveries in linguistics and the natural sciences in order to further our understanding of the human language faculty as a unique biological object. By focusing on core aspects of this field, identifying the relevance of core notions in current generative grammar to the biological study of language, and bringing to the fore new developments, we hope to foster further contributions to this research program.
In the past sixty years or so, we have seen an explosion of interdisciplinary work in the various subfields of biolinguistics (a partial list): theoretical linguistics (syntax, semantics, [End Page e226] morphology, lexicon, phonology, phonetics, pragmatics, etc.), computational linguistics (parsing, etc.), child language acquisition, multilingual (bilingual, etc.) acquisition, perceptual studies, language change, comparative linguistics (typology, etc.), sign language, language contact (pidgins, creoles, etc.), speech disorders (dyslexia, developmental verbal dyspraxia, specific language impairment, etc.), language savants, language neurology (function, anatomy, architectonics, etc.), cross-species comparative work (nonhuman primates, songbirds, etc.), mathematical modeling and simulation (language change, development, evolution, etc.), and other cognitive domains (mathematics, vision, music, etc.). All of these areas are currently foci of active research.
Many years ago (1976), at a symposium in honor of Eric Lenneberg, Chomsky said the following: ‘The study of the biological basis for human language capacities may prove to be one of the most exciting frontiers of science in coming years’. Although we have only been able to point the reader briefly toward some of these exciting avenues of biolinguistic research, we hope that readers will have the interest and opportunity to explore this frontier further.
[di_sciullo.anne-marie@uqam.ca]
[ljenkins2@comcast.net]
[Revision invited 6 June 2013;
revision received 14 October 2014;
conditionally accepted 29 June 2015;
accepted 13 January 2016]
Footnotes
* We wish to thank the participants of the Biolinguistic Conferences organized by Anna Maria Di Sciullo in 2010 and 2011 at UQAM, in 2013 at the GLOW meeting in Lund, and in 2015 at the IUSS Institute in Pavia, as well as three anonymous referees for valuable comments on an earlier version of this paper. This work has been supported in part by funding from the Social Sciences and Humanities Research Council of Canada to the Major Collaborative Research Initiative on Interface Asymmetries (http://www.interfaceasymmetry.uqam.ca/) and the Fonds Québécois de la recherche sur la société et la culture for the research on Dynamic Interfaces; see also http://www.biolinguistics.uqam.ca/.
1. Whether or not the application of Merge is constrained is subject to discussion. See Chomsky 1995, 2011, 2015a,b, Frampton & Gutmann 2002, Di Sciullo & Isac 2008, Kayne 2011a, Zwart 2011, and Boeckx 2015, among other works.
2. The formal simplicity of the central operation of the language faculty can be appreciated by contrast with the rules proposed in earlier models of generative grammar. In Chomsky 1955, 1957, a set of phrase structure rules derived kernel sentences; transformational rules, such as passive and affix hopping, applied to kernel sentences, and kernel sentences were combined using generalized transformations. The Aspects model (Chomsky 1965), the ‘standard theory’, developed into the extended standard theory (Chomsky 1970) and still included several sorts of syntactic rules (see Emonds’s 1976 typology of transformations). In the government and binding model (Chomsky 1981), phrase structure rules were subsumed under the two metarules of X-bar theory, and the transformations were subsumed under Move NP and Move wh, unified further as ‘Move α’, where α is a category. With the minimalist program (Chomsky 1995, 2000, 2005, 2013, 2015b), X-bar theory and the Move α modules were subsumed under Merge, along with the other modules of the grammar.
3. Blood-oxygen-level-dependent (BOLD) contrast imaging is a method used in functional magnetic resonance imaging (fMRI); it measures changes in blood oxygenation and blood flow associated with neural activity.
5. See Di Sciullo 2013 and Nóbrega & Miyagawa 2015 for arguments against a flat analysis of exocentric deverbal compounds in English and in other languages.
6. There is an alternative view of proto-language mentioned by a referee. According to this view, the modern language faculty developed not gradually but in three semi-discrete leaps forward—for example, external Merge of a predicate and its arguments, then clausal embedding, then movement/internal Merge. That could, according to the referee, be still more like an emergent picture than a gradualist picture, but it would yield a well-defined sense of proto-language: the language that resulted after the first one or two leaps forward would be proto-language. The alternative the referee suggests is a more articulated notion of proto-language that has already been proposed by linguists. However, it does not seem plausible on simplicity grounds. Restricting Merge to external Merge in previous historical stages of language evolution is a complication, an unnecessary stipulation.
7. These principles include Minimal link: Limit the search space (Chomsky 1995), Pronounce the minimum: Limit the externalization (Chomsky 2011), Minimize length of derivations: Limit the computation (Di Sciullo 2012c), and Minimize symmetry: Limit the choice points (Moro 2000, Di Sciullo 2005).