Science, Vol 298, Issue 5598, 1569-1579 , 22 November 2002
[DOI: 10.1126/science.298.5598.1569]

NEUROSCIENCE:
The Faculty of Language: What Is It, Who Has It, and How Did It Evolve?

Marc D. Hauser,1* Noam Chomsky,2 W. Tecumseh Fitch1

We argue that an understanding of the faculty of language requires substantial interdisciplinary cooperation. We suggest how current developments in linguistics can be profitably wedded to work in evolutionary biology, anthropology, psychology, and neuroscience. We submit that a distinction should be made between the faculty of language in the broad sense (FLB) and in the narrow sense (FLN). FLB includes a sensory-motor system, a conceptual-intentional system, and the computational mechanisms for recursion, providing the capacity to generate an infinite range of expressions from a finite set of elements. We hypothesize that FLN only includes recursion and is the only uniquely human component of the faculty of language. We further argue that FLN may have evolved for reasons other than language, hence comparative studies might look for evidence of such computations outside of the domain of communication (for example, number, navigation, and social relations).

1 Department of Psychology, Harvard University, Cambridge, MA 02138, USA.
2 Department of Linguistics and Philosophy, Massachusetts Institute of Technology, Cambridge, MA 02138, USA.
*   To whom correspondence should be addressed. E-mail: mdhauser@wjh.harvard.edu


If a martian graced our planet, it would be struck by one remarkable similarity among Earth's living creatures and a key difference. Concerning similarity, it would note that all living things are designed on the basis of highly conserved developmental systems that read an (almost) universal language encoded in DNA base pairs. As such, life is arranged hierarchically with a foundation of discrete, unblendable units (codons, and, for the most part, genes) capable of combining to create increasingly complex and virtually limitless varieties of both species and individual organisms. In contrast, it would notice the absence of a universal code of communication (Fig. 1).
Fig. 1. The animal kingdom has been designed on the basis of highly conserved developmental systems that read an almost universal language coded in DNA base pairs. This system is shown on the left in terms of a phylogenetic tree. In contrast, animals lack a common universal code of communication, indicated on the right by unconnected animal groups. [Illustration: John Yanson]

If our martian naturalist were meticulous, it might note that the faculty mediating human communication appears remarkably different from that of other living creatures; it might further note that the human faculty of language appears to be organized like the genetic code--hierarchical, generative, recursive, and virtually limitless with respect to its scope of expression. With these pieces in hand, this martian might begin to wonder how the genetic code changed in such a way as to generate a vast number of mutually incomprehensible communication systems across species while maintaining clarity of comprehension within a given species. The martian would have stumbled onto some of the essential problems surrounding the question of language evolution, and of how humans acquired the faculty of language.

In exploring the problem of language evolution, it is important to distinguish between questions concerning language as a communicative system and questions concerning the computations underlying this system, such as those underlying recursion. As we argue below, many acrimonious debates in this field have been launched by a failure to distinguish between these problems. According to one view (1), questions concerning abstract computational mechanisms are distinct from those concerning communication, the latter targeted at problems at the interface between abstract computation and both sensory-motor and conceptual-intentional interfaces. This view should not, of course, be taken as a claim against a relationship between computation and communication. It is possible, as we discuss below, that key computational capacities evolved for reasons other than communication but, after they proved to have utility in communication, were altered because of constraints imposed at both the periphery (e.g., what we can hear and say or see and sign, the rapidity with which the auditory cortex can process rapid temporal and spectral changes) and more central levels (e.g., conceptual and cognitive structures, pragmatics, memory limitations).

At least three theoretical issues cross-cut the debate on language evolution. One of the oldest problems among theorists is the "shared versus unique" distinction. Most current commentators agree that, although bees dance, birds sing, and chimpanzees grunt, these systems of communication differ qualitatively from human language. In particular, animal communication systems lack the rich expressive and open-ended power of human language (based on humans' capacity for recursion). The evolutionary puzzle, therefore, lies in working out how we got from there to here, given this apparent discontinuity. A second issue revolves around whether the evolution of language was gradual versus saltational; this differs from the first issue because a qualitative discontinuity between extant species could have evolved gradually, involving no discontinuities during human evolution. Finally, the "continuity versus exaptation" issue revolves around the problem of whether human language evolved by gradual extension of preexisting communication systems, or whether important aspects of language have been exapted away from their previous adaptive function (e.g., spatial or numerical reasoning, Machiavellian social scheming, tool-making).

Researchers have adopted extreme or intermediate positions regarding these basically independent questions, leading to a wide variety of divergent viewpoints on the evolution of language in the current literature. There is, however, an emerging consensus that, although humans and animals share a diversity of important computational and perceptual resources, there has been substantial evolutionary remodeling since we diverged from a common ancestor some 6 million years ago. The empirical challenge is to determine what was inherited unchanged from this common ancestor, what has been subjected to minor modifications, and what (if anything) is qualitatively new. The additional evolutionary challenge is to determine what selectional pressures led to adaptive changes over time and to understand the various constraints that channeled this evolutionary process. Answering these questions requires a collaborative effort among linguists, biologists, psychologists, and anthropologists.

One aim of this essay is to promote a stronger connection between biology and linguistics by identifying points of contact and agreement between the fields. Although this interdisciplinary marriage was inaugurated more than 50 years ago, it has not yet been fully consummated. We hope to further this goal by, first, helping to clarify the biolinguistic perspective on language and its evolution (2-7). We then review some promising empirical approaches to the evolution of the language faculty, with a special focus on comparative work with nonhuman animals, and conclude with a discussion of how inquiry might profitably advance, highlighting some outstanding problems.

We make no attempt to be comprehensive in our coverage of relevant or interesting topics and problems. Nor is it our goal to review the history of the field. Rather, we focus on topics that make important contact between empirical data and theoretical positions about the nature of the language faculty. We believe that if explorations into the problem of language evolution are to progress, we need a clear explication of the computational requirements for language, the role of evolutionary theory in testing hypotheses of character evolution, and a research program that will enable a productive interchange between linguists and biologists.

Defining the Target: Two Senses of the Faculty of Language

The word "language" has highly divergent meanings in different contexts and disciplines. In informal usage, a language is understood as a culturally specific communication system (English, Navajo, etc.). In the varieties of modern linguistics that concern us here, the term "language" is used quite differently to refer to an internal component of the mind/brain (sometimes called "internal language" or "I-language"). We assume that this is the primary object of interest for the study of the evolution and function of the language faculty. However, this biologically and individually grounded usage still leaves much open to interpretation (and misunderstanding). For example, a neuroscientist might ask: What components of the human nervous system are recruited in the use of language in its broadest sense? Because any aspect of cognition appears to be, at least in principle, accessible to language, the broadest answer to this question is, probably, "most of it." Even aspects of emotion or cognition not readily verbalized may be influenced by linguistically based thought processes. Thus, this conception is too broad to be of much use. We therefore delineate two more restricted conceptions of the faculty of language, one broader and more inclusive, the other more restricted and narrow (Fig. 2).
Fig. 2. A schematic representation of organism-external and -internal factors related to the faculty of language. FLB includes sensory-motor, conceptual-intentional, and other possible systems (which we leave open); FLN includes the core grammatical computations that we suggest are limited to recursion. See text for more complete discussion.

Faculty of language--broad sense (FLB). FLB includes an internal computational system (FLN, below) combined with at least two other organism-internal systems, which we call "sensory-motor" and "conceptual-intentional." Despite debate on the precise nature of these systems, and about whether they are substantially shared with other vertebrates or uniquely adapted to the exigencies of language, we take as uncontroversial the existence of some biological capacity of humans that allows us (and not, for example, chimpanzees) to readily master any human language without explicit instruction. FLB includes this capacity, but excludes other organism-internal systems that are necessary but not sufficient for language (e.g., memory, respiration, digestion, circulation, etc.).

Faculty of language--narrow sense (FLN). FLN is the abstract linguistic computational system alone, independent of the other systems with which it interacts and interfaces. FLN is a component of FLB, and the mechanisms underlying it are some subset of those underlying FLB.

Others have agreed on the need for a restricted sense of "language" but have suggested different delineations. For example, Liberman and his associates (8) have argued that the sensory-motor systems were specifically adapted for language, and hence should be considered part of FLN. There is also a long tradition holding that the conceptual-intentional systems are an intrinsic part of language in a narrow sense. In this article, we leave these questions open, restricting attention to FLN as just defined but leaving the possibility of a more inclusive definition open to further empirical research.

The internal architecture of FLN, so conceived, is a topic of much current research and debate (4). Without prejudging the issues, we will, for concreteness, adopt a particular conception of this architecture. We assume, putting aside the precise mechanisms, that a key component of FLN is a computational system (narrow syntax) that generates internal representations and maps them into the sensory-motor interface by the phonological system, and into the conceptual-intentional interface by the (formal) semantic system; adopting alternatives that have been proposed would not materially modify the ensuing discussion. All approaches agree that a core property of FLN is recursion, attributed to narrow syntax in the conception just outlined. FLN takes a finite set of elements and yields a potentially infinite array of discrete expressions. This capacity of FLN yields discrete infinity (a property that also characterizes the natural numbers). Each of these discrete expressions is then passed to the sensory-motor and conceptual-intentional systems, which process and elaborate this information in the use of language. Each expression is, in this sense, a pairing of sound and meaning. It has been recognized for thousands of years that language is, fundamentally, a system of sound-meaning connections; the potential infiniteness of this system has been explicitly recognized by Galileo, Descartes, and the 17th-century "philosophical grammarians" and their successors, notably von Humboldt. One goal of the study of FLN and, more broadly, FLB is to discover just how the faculty of language satisfies these basic and essential conditions.

The core property of discrete infinity is intuitively familiar to every language user. Sentences are built up of discrete units: There are 6-word sentences and 7-word sentences, but no 6.5-word sentences. There is no longest sentence (any candidate sentence can be trumped by, for example, embedding it in "Mary thinks that ..."), and there is no nonarbitrary upper bound to sentence length. In these respects, language is directly analogous to the natural numbers (see below).
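The embedding argument can be made concrete with a minimal sketch in Python. This is an illustration only, not a model of narrow syntax: the function names `embed` and `word_count`, the base sentence, and the fixed frame "Mary thinks that" are all assumptions introduced here. The sketch shows that a single recursive rule applied to a finite vocabulary yields sentences whose lengths are always whole numbers and can always be exceeded.

```python
def embed(sentence: str, depth: int = 1, frame: str = "Mary thinks that") -> str:
    """Wrap `sentence` inside `frame` exactly `depth` times, recursively."""
    if depth == 0:
        # Base case: no further embedding.
        return sentence
    # Recursive case: embed a sentence that is itself already embedded.
    return f"{frame} {embed(sentence, depth - 1, frame)}"


def word_count(sentence: str) -> int:
    """Count whitespace-separated words; the result is always an integer."""
    return len(sentence.split())


if __name__ == "__main__":
    base = "birds sing"  # hypothetical starting sentence
    for depth in range(4):
        s = embed(base, depth)
        print(word_count(s), s)
```

Each additional level of embedding adds exactly three words, so for any candidate "longest" sentence the call `embed(candidate)` returns a longer one. In practice the only ceiling is an external resource limit (here, Python's recursion depth), loosely comparable to the performance limits, as opposed to limits of the computational system itself, that the text goes on to discuss.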

At a minimum, then, FLN includes the capacity of recursion. There are many organism-internal factors, outside FLN or FLB, that impose practical limits on the usage of the system. For example, lung capacity imposes limits on the length of actual spoken sentences, whereas working memory imposes limits on the complexity of sentences if they are to be understandable. Other limitations--for example, on concept formation or motor output speed--represent aspects of FLB, which have their own evolutionary histories and may have played a role in the evolution of the capacities of FLN. Nonetheless, one can profitably inquire into the evolution of FLN without an immediate concern for these limiting aspects of FLB. This is made clear by the observation that, although many aspects of FLB are shared with other vertebrates, the core recursive aspect of FLN currently appears to lack any analog in animal communication and possibly other domains as well. This point, therefore, represents the deepest challenge for a comparative evolutionary approach to language. We believe that investigations of this capacity should include domains other than communication (e.g., number, social relationships, navigation).

Given the distinctions between FLB and FLN and the theoretical distinctions raised above, we can define a research space as sketched in Fig. 3. This research space identifies, as viable, problems concerning the evolution of sensory-motor systems, of conceptual-intentional systems, and of FLN. The comparative approach, to which we turn next, provides a framework for addressing questions about each of these components of the faculty of language.


Fig. 3. Investigations into the evolution of the faculty of language are confronted with a three-dimensional research space that includes three comparative-evolutionary problems cross-cut by the core components of the faculty of language. Thus, for each problem, researchers can investigate details of the sensory-motor system, the conceptual-intentional system, FLN, and the interfaces among these systems.

The Comparative Approach to Language Evolution

The empirical study of the evolution of language is beset with difficulties. Linguistic behavior does not fossilize, and a long tradition of analysis of fossil skull shape and cranial endocasts has led to little consensus about the evolution of language (7, 9). A more tractable and, we think, powerful approach to problems of language evolution is provided by the comparative method, which uses empirical data from living species to draw detailed inferences about extinct ancestors (3, 10-12). The comparative method was the primary tool used by Darwin (13, 14) to analyze evolutionary phenomena and continues to play a central role throughout modern evolutionary biology. Although scholars interested in language evolution have often ignored comparative data altogether or focused narrowly on data from nonhuman primates, current thinking in neuroscience, molecular biology, and developmental biology indicates that many aspects of neural and developmental function are highly conserved, encouraging the extension of the comparative method to all vertebrates (and perhaps beyond). For several reasons, detailed below, we believe that the comparative method should play a more central role in future discussions of language evolution.

An overarching concern in studies of language evolution is with whether particular components of the faculty of language evolved specifically for human language and, therefore (by extension), are unique to humans. Logically, the human uniqueness claim must be based on data indicating an absence of the trait in nonhuman animals and, to be taken seriously, requires a substantial body of relevant comparative data. More concretely, if the language evolution researcher wishes to make the claim that a trait evolved uniquely in humans for the function of language processing, data indicating that no other animal has this particular trait are required.

Although this line of reasoning may appear obvious, it is surprisingly common for a trait to be held up as uniquely human before any appropriate comparative data are available. A famous example is categorical perception, which when discovered seemed so finely tuned to the details of human speech as to constitute a unique human adaptation (15, 16). It was some time before the same underlying perceptual discontinuities were discovered in chinchillas and macaques (17, 18), and even birds (19), leading to the opposite conclusion that the perceptual basis for categorical perception is a primitive vertebrate characteristic that evolved for general auditory processing, as opposed to specific speech processing. Thus, a basic and logically ineliminable role for comparative research on language evolution is this simple and essentially negative one: A trait present in nonhuman animals did not evolve specifically for human language, although it may be part of the language faculty and play an intimate role in language processing. It is possible, of course, that a trait evolved in nonhuman animals and humans independently, as analogs rather than homologs. This would preserve the possibility that the trait evolved for language in humans but evolved for some other reason in the comparative animal group. In cases where the comparative group is a nonhuman primate, and perhaps especially chimpanzees, the plausibility of this evolutionary scenario is weaker. In any case, comparative data are critical to this judgment.

Despite the crucial role of homology in comparative biology, homologous traits are not the only relevant source of evolutionary data. The convergent evolution of similar characters in two independent clades, termed "analogies" or "homoplasies," can be equally revealing (20). The remarkably similar (but nonhomologous) structures of human and octopus eyes reveal the stringent constraints placed by the laws of optics and the contingencies of development on an organ capable of focusing a sharp image onto a sheet of receptors. Detailed analogies between the parts of the vertebrate and cephalopod eye also provide independent evidence that each component is an adaptation for image formation, shaped by natural selection. Furthermore, the discovery that remarkably conservative genetic cascades underlie the development of such analogous structures provides important insights into the ways in which developmental mechanisms can channel evolution (21). Thus, although potentially misleading for taxonomists, analogies provide critical data about adaptation under physical and developmental constraints. Casting the comparative net more broadly, therefore, will most likely reveal larger regularities in evolution, helping to address the role of such constraints in the evolution of language.

An analogy recognized as particularly relevant to language is the acquisition of song by birds (12). In contrast to nonhuman primates, where the production of species-typical vocalizations is largely innate (22), most songbirds learn their species-specific song by listening to conspecifics, and they develop highly aberrant song if deprived of such experience. Current investigation of birdsong reveals detailed and intriguing parallels with speech (11, 23, 24). For instance, many songbirds pass through a critical period in development beyond which they produce defective songs that no amount of acoustic input can remedy, reminiscent of the difficulty adult humans have in fully mastering new languages. Further, and in parallel with the babbling phase of vocalizing or signing human infants (25), young birds pass through a phase of song development in which they spontaneously produce amorphous versions of adult song, termed "subsong" or "babbling." Although the mechanisms underlying the acquisition of birdsong and human language are clearly analogs and not homologs, their core components share a deeply conserved neural and developmental foundation: Most aspects of neurophysiology and development--including regulatory and structural genes, as well as neuron types and neurotransmitters--are shared among vertebrates. That such close parallels have evolved suggests the existence of important constraints on how vertebrate brains can acquire large vocabularies of complex, learned sounds. Such constraints may essentially force natural selection to come up with the same solution repeatedly when confronted with similar problems.

Testing Hypotheses About the Evolution of the Faculty of Language

Given the definitions of the faculty of language, together with the comparative framework, we can distinguish several plausible hypotheses about the evolution of its various components. Here, we suggest two hypotheses that span the diversity of opinion among current scholars, plus a third of our own.

Hypothesis 1: FLB is strictly homologous to animal communication. This hypothesis holds that homologs of FLB, including FLN, exist (perhaps in less developed or otherwise modified form) in nonhuman animals (3, 10, 26). This has historically been a popular hypothesis outside of linguistics and closely allied fields, and has been defended by some in the speech sciences. According to this hypothesis, human FLB is composed of the same functional components that underlie communication in other species.

Hypothesis 2: FLB is a derived, uniquely human adaptation for language. According to this hypothesis, FLB is a highly complex adaptation for language, on a par with the vertebrate eye, and many of its core components can be viewed as individual traits that have been subjected to selection and perfected in recent human evolutionary history. This appears to represent the null hypothesis for many scholars who take the complexity of language seriously (27, 28). The argument starts with the assumption that FLB, as a whole, is highly complex, serves the function of communication with admirable effectiveness, and has an ineliminable genetic component. Because natural selection is the only known biological mechanism capable of generating such functional complexes [the argument from design (29)], proponents of this view conclude that natural selection has played a powerful role in shaping many aspects of FLB, including FLN, and, further, that many of these are without parallel in nonhuman animals. Although homologous mechanisms may exist in other animals, the human versions have been modified by natural selection to the extent that they can be reasonably seen as constituting novel traits, perhaps exapted from other contexts [e.g., social intelligence, tool-making (7, 30-32)].

Hypothesis 3: Only FLN is uniquely human. On the basis of data reviewed below, we hypothesize that most, if not all, of FLB is based on mechanisms shared with nonhuman animals (as held by hypothesis 1). In contrast, we suggest that FLN--the computational mechanism of recursion--is recently evolved and unique to our species (33, 34). According to this hypothesis, much of the complexity manifested in language derives from complexity in the peripheral components of FLB, especially those underlying the sensory-motor (speech or sign) and conceptual-intentional interfaces, combined with sociocultural and communicative contingencies. FLB as a whole thus has an ancient evolutionary history, long predating the emergence of language, and a comparative analysis is necessary to understand this complex system. By contrast, according to recent linguistic theory, the computations underlying FLN may be quite limited. In fact, we propose in this hypothesis that FLN comprises only the core computational mechanisms of recursion as they appear in narrow syntax and the mappings to the interfaces. If FLN is indeed this restricted, this hypothesis has the interesting effect of nullifying the argument from design, and thus rendering the status of FLN as an adaptation open to question. Proponents of the idea that FLN is an adaptation would thus need to supply additional data or arguments to support this viewpoint.

The available comparative data on animal communication systems suggest that the faculty of language as a whole relies on some uniquely human capacities that have evolved recently in the approximately 6 million years since our divergence from a chimpanzee-like common ancestor (35). Hypothesis 3, in its strongest form, suggests that only FLN falls into this category (34). By this hypothesis, FLB contains a wide variety of cognitive and perceptual mechanisms shared with other species, but only those mechanisms underlying FLN--particularly its capacity for discrete infinity--are uniquely human. This hypothesis suggests that all peripheral components of FLB are shared with other animals, in more or less the same form as they exist in humans, with differences of quantity rather than kind (9, 34). What is unique to our species is quite specific to FLN, and includes its internal operations as well as its interface with the other organism-internal systems of FLB.

Each of these hypotheses is plausible to some degree. Ultimately, they can be distinguished only by empirical data, much of which is currently unavailable. Before reviewing some of the relevant data, we briefly consider some key distinctions between them. From a comparative evolutionary viewpoint, an important question is whether linguistic precursors were involved in communication or in something else. Proponents of both hypotheses 1 and 2 posit a direct correspondence, by descent with modification, between some trait involved in FLB in humans and a similar trait in another species; these hypotheses differ in whether the precursors functioned in communication. Although many aspects of FLB very likely arose in this manner, the important issue for these hypotheses is whether a series of gradual modifications could lead eventually to the capacity of language for infinite generativity. Despite the inarguable existence of a broadly shared base of homologous mechanisms involved in FLB, minor modifications to this foundational system alone seem inadequate to generate the fundamental difference--discrete infinity--between language and all known forms of animal communication. This claim is one of several reasons why we suspect that hypothesis 3 may be a productive way to characterize the problem of language evolution.

A primary issue separating hypotheses 2 and 3 is whether the uniquely human capacities of FLN constitute an adaptation. The viewpoint stated in hypothesis 2, especially the notion that FLN in particular is a highly evolved adaptation, has generated much enthusiasm recently [e.g., (36)], especially among evolutionary psychologists (37, 38). At present, however, we see little reason to believe either that FLN can be anatomized into many independent but interacting traits, each with its own independent evolutionary history, or that each of these traits could have been strongly shaped by natural selection, given their tenuous connection to communicative efficacy (the surface or phenotypic function upon which selection presumably acted).

We consider the possibility that certain specific aspects of the faculty of language are "spandrels"--by-products of preexisting constraints rather than end products of a history of natural selection (39). This possibility, which opens the door to other empirical lines of inquiry, is perfectly compatible with our firm support of the adaptationist program. Indeed, it follows directly from the foundational notion that adaptation is an "onerous concept" to be invoked only when alternative explanations fail (40). The question is not whether FLN in toto is adaptive. By allowing us to communicate an endless variety of thoughts, recursion is clearly an adaptive computation. The question is whether particular components of the functioning of FLN are adaptations for language, specifically acted upon by natural selection--or, even more broadly, whether FLN evolved for reasons other than communication.

An analogy may make this distinction clear. The trunk and branches of trees are near-optimal solutions for providing an individual tree's leaves with access to sunlight. For shrubs and small trees, a wide variety of forms (spreading, spherical, multistalked, etc.) provide good solutions to this problem. For a towering rainforest canopy tree, however, most of these forms are rendered impossible by the various constraints imposed by the properties of cellulose and the problems of sucking water and nutrients up to the leaves high in the air. Some aspects of such trees are clearly adaptations channeled by these constraints; others (e.g., the popping of xylem tubes on hot days, the propensity to be toppled in hurricanes) are presumably unavoidable by-products of such constraints.

Recent work on FLN (4, 41-43) suggests the possibility that at least the narrow-syntactic component satisfies conditions of highly efficient computation to an extent previously unsuspected. Thus, FLN may approximate a kind of "optimal solution" to the problem of linking the sensory-motor and conceptual-intentional systems. In other words, the generative processes of the language system may provide a near-optimal solution that satisfies the interface conditions to FLB. Many of the details of language that are the traditional focus of linguistic study [e.g., subjacency, Wh- movement, the existence of garden-path sentences (4, 44)] may represent by-products of this solution, generated automatically by neural/computational constraints and the structure of FLB--components that lie outside of FLN. Even novel capacities such as recursion are implemented in the same type of neural tissue as the rest of the brain and are thus constrained by biophysical, developmental, and computational factors shared with other vertebrates. Hypothesis 3 raises the possibility that structural details of FLN may result from such preexisting constraints, rather than from direct shaping by natural selection targeted specifically at communication. Insofar as this proves to be true, such structural details are not, strictly speaking, adaptations at all. This hypothesis and the alternative selectionist account are both viable and can eventually be tested with comparative data.

Comparative Evidence for the Faculty of Language

Study of the evolution of language has accelerated in the past decade (45, 46). Here, we offer a highly selective review of some of these studies, emphasizing animal work that seems particularly relevant to the hypotheses advanced above; many omissions were necessary for reasons of space, and we firmly believe that a broad diversity of methods and perspectives will ultimately provide the richest answers to the problem of language evolution. For this reason, we present a broader sampler of the field's offerings in Table 1.

Table 1. A sampler of empirical approaches to understanding the evolution of the faculty of language, including both broad (FLB) and narrow (FLN) components.


Empirical problem | Examples | References

FLB--sensory-motor system
Vocal imitation and invention | Tutoring studies of songbirds, analyses of vocal dialects in whales, spontaneous imitation of artificially created sounds in dolphins | (11, 12, 24, 65)
Neurophysiology of action-perception systems | Studies assessing whether mirror neurons, which provide a core substrate for the action-perception system, may subserve gestural and (possibly) vocal imitation | (67, 68, 71)
Discriminating the sound patterns of language | Operant conditioning studies of the prototype magnet effect in macaques and starlings | (52, 120)
Constraints imposed by vocal tract anatomy | Studies of vocal tract length and formant dispersion in birds and primates | (54-61)
Biomechanics of sound production | Studies of primate vocal production, including the role of mandibular oscillations | (121, 122)
Modalities of language production and perception | Cross-modal perception and sign language in humans versus unimodal communication in animals | (3, 25, 123)

FLB--conceptual-intentional system
Theory of mind, attribution of mental states | Studies of the seeing/knowing distinction in chimpanzees | (84, 86-89)
Capacity to acquire nonlinguistic conceptual representations | Studies of rhesus monkeys and the object/kind concept | (10, 76, 77, 124)
Referential vocal signals | Studies of primate vocalizations used to designate predators, food, and social relationships | (3, 78, 90, 91, 93, 94, 97)
Imitation as a rational, intentional system | Comparative studies of chimpanzees and human infants suggesting that only the latter read intentionality into action, and thus extract unobserved rational intent | (125-127)
Voluntary control over signal production as evidence of intentional communication | Comparative studies that explore the relationship between signal production and the composition of a social audience | (3, 10, 92, 128)

FLN--recursion
Spontaneous and training methods designed to uncover constraints on rule learning | Studies of serial order learning and finite-state grammars in tamarins and macaques | (114, 116, 117, 129)
Sign or artificial language in trained apes and dolphins | Studies exploring symbol sequencing and open-ended combinatorial manipulation | (130, 131)
Models of the faculty of language that attempt to uncover the necessary and sufficient mechanisms | Game theory models of language acquisition, reference, and universal grammar | (72-74)
Experiments with animals that explore the nature and content of number representation | Operant conditioning studies to determine whether nonhuman primates can represent number, including properties such as ordinality and cardinality, using such representations in conjunction with mathematical operands (e.g., add, divide) | (102-106, 132)
Shared mechanisms across different cognitive domains | Evolution of musical processing and structure, including analyses of brain function and comparative studies of music perception | (133-135)

How "special" is speech? Comparative study of the sensory-motor system. Starting with early work on speech perception, there has been a tradition of considering speech "special," and thus based on uniquely human mechanisms adapted for speech perception and/or production [e.g., (7, 8, 47, 48)]. This perspective has stimulated a vigorous research program studying animal speech perception and, more recently, speech production. Surprisingly, this research has turned up little evidence for uniquely human mechanisms special to speech, despite a persistent tendency to assume uniqueness even in the absence of relevant animal data.

On the side of perception, for example, many species show an impressive ability to both discriminate between and generalize over human speech sounds, using formants as the critical discriminative cue (17-19, 49-51). These data provide evidence not only of categorical perception, but also of the ability to discriminate among prototypical exemplars of different phonemes (52). Further, in the absence of training, nonhuman primates can discriminate sentences from two different languages on the basis of rhythmic differences between them (53).

On the side of production, birds and nonhuman primates naturally produce and perceive formants in their own species-typical vocalizations (54-59). The results also shed light on discussions of the uniquely human structure of the vocal tract and the unusual descended larynx of our species (7, 48, 60), because new evidence shows that several other mammalian species also have a descended larynx (61). Because these nonhuman species lack speech, a descended larynx clearly has nonphonetic functions; one possibility is exaggerating apparent size. Although this particular anatomical modification undoubtedly plays an important role in speech production in modern humans, it need not have first evolved for this function. The descended larynx may thus be an example of classic Darwinian preadaptation.

Many phenomena in human speech perception have not yet been investigated in animals [e.g., the McGurk effect, an illusion in which the syllable perceived from a talking head represents the interaction between an articulatory gesture seen and a different syllable heard; see (62)]. However, the available data suggest a much stronger continuity between animals and humans with respect to speech than previously believed. We argue that the continuity hypothesis thus deserves the status of a null hypothesis, which must be rejected by comparative work before any claims of uniqueness can be validated. For now, this null hypothesis of no truly novel traits in the speech domain appears to stand.

There is, however, a striking ability tied to speech that has received insufficient attention: the human capacity for vocal imitation (63, 64). Imitation is obviously a necessary component of the human capacity to acquire a shared and arbitrary lexicon, which is itself central to the language capacity. Thus, the capacity to imitate was a crucial prerequisite of FLB as a communicative system. Vocal imitation and learning are not uniquely human. Rich multimodal imitative capacities are seen in other mammals (dolphins) and some birds (parrots), with most songbirds exhibiting a well-developed vocal imitative capacity (65). What is surprising is that monkeys show almost no evidence of visually mediated imitation, with chimpanzees showing only slightly better capacities (66). Even more striking is the virtual absence of evidence for vocal imitation in either monkeys or apes (3). For example, intensively trained chimpanzees are incapable of acquiring anything but a few poorly articulated spoken words, whereas parrots can readily acquire a large vocal repertoire. With respect to their own vocalizations, there are few convincing studies of vocal dialects in primates, thereby suggesting that they lack a vocal imitative capacity (3, 65). Evidence for spontaneous visuomanual imitation in chimpanzees is not much stronger, although with persistent training they can learn several hundred hand signs. Further, even in cases where nonhuman animals are capable of imitating in one modality (e.g., song copying in songbirds), only dolphins and humans appear capable of imitation in multiple modalities. The detachment from modality-specific inputs may represent a substantial change in neural organization, one that affects not only imitation but also communication; only humans can lose one modality (e.g., hearing) and make up for this deficit by communicating with complete competence in a different modality (i.e., signing).

Our discussion of limitations is not meant to diminish the impressive achievements of monkeys and apes, but to highlight how different the mechanisms underlying the production of human and nonhuman primate gestures, either vocally expressed or signed, must be. After all, the average high school graduate knows up to 60,000 words, a vocabulary achieved with little effort, especially when contrasted with the herculean efforts devoted to training animals. In sum, the impressive ability of any normal human child for vocal imitation may represent a novel capacity that evolved in our recent evolutionary history, some time after the divergence from our chimpanzee-like ancestors. The existence of analogs in distantly related species, such as birds and cetaceans, suggests considerable potential for the detailed comparative study of vocal imitation. There are, however, potential traps that must be avoided, especially with respect to explorations of the neurobiological substrates of imitation. For example, although macaque monkeys and humans are equipped with so-called "mirror neurons" in the premotor cortex that respond both when an individual acts in a particular way and when the same individual sees someone else act in this same way (67, 68), these neurons are not sufficient for imitation in macaques, as many have presumed: As mentioned, there is no convincing evidence of vocal or visual imitation in monkeys. Consequently, as neuroimaging studies continue to explore the neural basis of imitation in humans (69-71), it will be important to distinguish between the necessary and sufficient neural correlates of imitation. This is especially important, given that some recent attempts to model the evolution of language begin with a hypothetical organism that is equipped with the capacity for imitation and intentionality, as opposed to working out how these mechanisms evolved in the first place [see below; (72-74)]. If a deeper evolutionary exploration is desired, one dating back to a chimpanzee-like ancestor, then we need to explain how and why such capacities emerged from an ancestral node that lacked such abilities (75) (Fig. 4).


Fig. 4. The distribution of imitation in the animal kingdom is patchy. Some animals such as songbirds, dolphins, and humans have evolved exceptional abilities to imitate; other animals, such as apes and monkeys, either lack such abilities or have them in a relatively impoverished form. [Illustration: John Yanson]

The conceptual-intentional systems of nonlinguistic animals. A wide variety of studies indicate that nonhuman mammals and birds have rich conceptual representations (76, 77). Surprisingly, however, there is a mismatch between the conceptual capacities of animals and the communicative content of their vocal and visual signals (78, 79). For example, although a wide variety of nonhuman primates have access to rich knowledge of who is related to whom, as well as who is dominant and who is subordinate, their vocalizations only coarsely express such complexities.

Studies using classical training approaches as well as methods that tap spontaneous abilities reveal that animals acquire and use a wide range of abstract concepts, including tool, color, geometric relationships, food, and number (66, 76-82). More controversially, but of considerable relevance to intentional aspects of language and conditions of felicitous use, some studies claim that animals have a theory of mind (83-85), including a sense of self and the ability to represent the beliefs and desires of other group members. On the side of positive support, recent studies of chimpanzees suggest that they recognize the perceptual act of seeing as a proxy for the mental state of knowing (84, 86, 87). These studies suggest that at least chimpanzees, but perhaps no other nonhuman animals, have a rudimentary theory of mind. On the side of negative support, other studies suggest that even chimpanzees lack a theory of mind, failing, for example, to differentiate between ignorant and knowledgeable individuals with respect to intentional communication (88, 89). Because these experiments make use of different methods and are based on small sample sizes, it is not possible at present to derive any firm conclusions about the presence or absence of mental state attribution in animals. Independently of how this controversy is resolved, however, the best evidence of referential communication in animals comes not from chimpanzees but from a variety of monkeys and birds, species for which there is no convincing evidence for a theory of mind.

The classic studies of vervet monkey alarm calls (90) have now been joined by several others, each using comparable methods, with extensions to different species (macaques, Diana monkeys, meerkats, prairie dogs, chickens) and different communicative contexts (social relationships, food, intergroup aggression) (91-97). From these studies we can derive five key points relevant to our analysis of the faculty of language. First, individuals produce acoustically distinctive calls in response to functionally important contexts, including the detection of predators and the discovery of food. Second, the acoustic morphology of the signal, although arbitrary in terms of its association with a particular context, is sufficient to enable listeners to respond appropriately without requiring any other contextual information. Third, the number of such signals in the repertoire is small, restricted to objects and events experienced in the present, with no evidence of creative production of new sounds for new situations. Fourth, the acoustic morphology of the calls is fixed, appearing early in development, with experience only playing a role in refining the range of objects or events that elicit such calls. Fifth, there is no evidence that calling is intentional in the sense of taking into account what other individuals believe or want.

Early interpretations of this work suggested that when animals vocalize, they are functionally referring to the objects and events that they have encountered. As such, vervet alarm calls and rhesus monkey food calls, to take two examples, were interpreted as word-like, with callers referring to different kinds of predators or different kinds of food. More recent discussions have considerably weakened this interpretation, suggesting that if the signal is referential at all, it is in the mind of the listener who can extract information about the signaler's current context from the acoustic structure of the call alone (78, 95). Despite this evidence that animals can extract information from the signal, there are several reasons why additional evidence is required before such signals can be considered as precursors for, or homologs of, human words.

Roughly speaking, we can think of a particular human language as consisting of words and computational procedures ("rules") for constructing expressions from them. The computational system has the recursive property briefly outlined earlier, which may be a distinct human property. However, key aspects of words may also be distinctively human. There are, first of all, qualitative differences in scale and mode of acquisition, which suggest that quite different mechanisms are involved; as pointed out above, there is no evidence for vocal imitation in nonhuman primates, and although human children may use domain-general mechanisms to acquire and recall words (98, 99), the rate at which children build the lexicon is so massively different from nonhuman primates that one must entertain the possibility of an independently evolved mechanism. Furthermore, unlike the best animal examples of putatively referential signals, most of the words of human language are not associated with specific functions (e.g., warning cries, food announcements) but can be linked to virtually any concept that humans can entertain. Such usages are often highly intricate and detached from the here and now. Even for the simplest words, there is typically no straightforward word-thing relationship, if "thing" is to be understood in mind-independent terms. Without pursuing the matter here, it appears that many of the elementary properties of words--including those that enter into referentiality--have only weak analogs or homologs in natural animal communication systems, with only slightly better evidence from the training studies with apes and dolphins. Future research must therefore provide stronger support for the precursor position, or it must instead abandon this hypothesis, arguing that this component of FLB (conceptual-intentional) is also uniquely human.

Discrete infinity and constraints on learning. The data summarized thus far, although far from complete, provide overall support for the position of continuity between humans and other animals in terms of FLB. However, we have not yet addressed one issue that many regard as lying at the heart of language: its capacity for limitless expressive power, captured by the notion of discrete infinity. It seems relatively clear, after nearly a century of intensive research on animal communication, that no species other than humans has a comparable capacity to recombine meaningful units into an unlimited variety of larger structures, each differing systematically in meaning. However, little progress has been made in identifying the specific capabilities that are lacking in other animals.
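The notion of discrete infinity can be made concrete with a minimal sketch. It assumes a hypothetical toy lexicon and a single embedding rule (a clause may contain another clause); none of the names or words below come from the studies cited, and the sketch is not a model of any natural language. It shows only how a finite set of elements plus one recursive rule yields a set of distinct expressions that grows without bound as embedding depth increases.

```python
from itertools import product

# Hypothetical toy lexicon; the words are illustrative only.
SUBJECTS = ["Mary", "John"]
EMBEDDING_VERBS = ["thinks", "says"]
SIMPLE_CLAUSES = ["it rains", "birds sing"]


def sentences(depth):
    """Return every sentence with exactly `depth` levels of clausal embedding.

    The rule is recursive: a sentence at depth d is built by wrapping a
    sentence of depth d - 1 inside "<subject> <verb> that ...".
    """
    if depth == 0:
        return list(SIMPLE_CLAUSES)
    inner = sentences(depth - 1)
    return [
        f"{subject} {verb} that {clause}"
        for subject, verb, clause in product(SUBJECTS, EMBEDDING_VERBS, inner)
    ]


# The lexicon never changes, yet each extra level of embedding
# multiplies the number of distinct sentences.
for depth in range(4):
    print(depth, len(sentences(depth)))
```

Because no depth is the last one, the set of sentences this rule defines is unbounded even though the lexicon has six entries; a repertoire of fixed, unstructured calls has no comparable property.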

The astronomical variety of sentences any natural language user can produce and understand has an important implication for language acquisition, long a core issue in developmental psychology. A child is exposed to only a small proportion of the possible sentences in its language, thus limiting its database for constructing a more general version of that language in its own mind/brain. This point has logical implications for any system that attempts to acquire a natural language on the basis of limited data. It is immediately obvious that given a finite array of data, there are infinitely many theories consistent with it but inconsistent with one another. In the present case, there are in principle infinitely many target systems (potential I-languages) consistent with the data of experience, and unless the search space and acquisition mechanisms are constrained, selection among them is impossible. A version of the problem has been formalized by Gold (100) and more recently and rigorously explored by Nowak and colleagues (72-75). No known "general learning mechanism" can acquire a natural language solely on the basis of positive or negative evidence, and the prospects for finding any such domain-independent device seem rather dim. The difficulty of this problem leads to the hypothesis that whatever system is responsible must be biased or constrained in certain ways. Such constraints have historically been termed "innate dispositions," with those underlying language referred to as "universal grammar." Although these particular terms have been forcibly rejected by many researchers, and the nature of the particular constraints on human (or animal) learning mechanisms is currently unresolved, the existence of some such constraints cannot be seriously doubted. On the other hand, other constraints in animals must have been overcome at some point in human evolution to account for our ability to acquire the unlimited class of generative systems that includes all natural languages. The nature of these latter constraints has recently become the target of empirical work. We focus here on the nature of number representation and rule learning in nonhuman animals and human infants, both of which can be investigated independently of communication and provide hints as to the nature of the constraints on FLN.
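The logical point about finite data can be illustrated with a toy case. The sketch below assumes a hypothetical three-string sample and three hand-picked candidate "languages" over the symbols a and b; it is an illustration of underdetermination, not a rendering of Gold's formal result or of the models of Nowak and colleagues. All three hypotheses accept every string in the sample, yet they disagree about strings the learner has not encountered.

```python
import re

# Hypothetical finite sample of positive evidence (strings over {a, b}).
SAMPLE = ["ab", "aabb", "aaabbb"]


def anbn(s):
    """a^n b^n with n >= 1: a block of a's followed by an equally long block of b's."""
    n = len(s) // 2
    return n >= 1 and s == "a" * n + "b" * n


def a_plus_b_plus(s):
    """One or more a's followed by one or more b's, with no length matching."""
    return re.fullmatch(r"a+b+", s) is not None


def equal_counts(s):
    """Any non-empty string over {a, b} with as many a's as b's, in any order."""
    return len(s) > 0 and set(s) <= {"a", "b"} and s.count("a") == s.count("b")


HYPOTHESES = {"a^n b^n": anbn, "a+ b+": a_plus_b_plus, "#a = #b": equal_counts}

# Every hypothesis is consistent with the observed sample ...
for name, accepts in HYPOTHESES.items():
    assert all(accepts(s) for s in SAMPLE), name

# ... but the hypotheses make different predictions about unseen strings.
for probe in ["aab", "abab"]:
    print(probe, {name: accepts(probe) for name, accepts in HYPOTHESES.items()})
```

Adding more sample strings of the same kind would never separate these three candidates; only a prior restriction on which hypotheses are entertained, or a different kind of evidence, could do so.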

More than 50 years of research using classical training studies demonstrates that animals can represent number, with careful controls for various important confounds (80). In the typical experiment, a rat or pigeon is trained to press a lever a target number of times to obtain a food reward. Results show that animals can hit the target number to within a closely matched mean, with a standard deviation that increases with magnitude: As the target number increases, so does variation around the mean. These results have led to the idea that animals, including human infants and adults, can represent number approximately as a magnitude with scalar variability (101, 102). Number discrimination is limited in this system by Weber's law, with greater discriminability among small numbers than among large numbers (keeping distances between pairs constant) and between numbers that are farther apart (e.g., 7 versus 8 is harder than 7 versus 12). The approximate number sense is accompanied by a second precise mechanism that is limited to values of about 4 or fewer but accurately distinguishes 1 from 2, 2 from 3, and 3 from 4; this second system appears to be recruited in the context of object tracking and is limited by working memory constraints (103). Of direct relevance to the current discussion, animals can be trained to understand the meaning of number words or Arabic numeral symbols. However, these studies reveal striking differences in how animals and human children acquire the integer list, and provide further evidence that animals lack the capacity to create open-ended generative systems.
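The consequences of scalar variability can be worked through numerically. The sketch below assumes one common formalization that is not spelled out in the text: each numerosity n is represented as a Gaussian magnitude with mean n and standard deviation w * n, and two numerosities are compared by taking the difference of their noisy magnitudes. The Weber fraction w = 0.2 and the function name are hypothetical, chosen only for illustration.

```python
import math

WEBER_FRACTION = 0.2  # hypothetical value, chosen only for illustration


def p_correct(a, b, w=WEBER_FRACTION):
    """Probability of correctly ordering numerosities a and b.

    Assumes each numerosity n is encoded as a Gaussian with mean n and
    standard deviation w * n, so the difference of the two magnitudes has
    standard deviation w * sqrt(a^2 + b^2).
    """
    z = abs(a - b) / (w * math.sqrt(a * a + b * b))
    # Standard normal cumulative distribution evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Distance effect: 7 versus 8 is harder than 7 versus 12.
assert p_correct(7, 8) < p_correct(7, 12)

# Magnitude effect: at a fixed distance of 1, small numbers are easier.
assert p_correct(2, 3) > p_correct(7, 8)

# Weber's law: under this model, performance depends only on the ratio.
assert math.isclose(p_correct(2, 4), p_correct(8, 16))

for pair in [(2, 3), (7, 8), (7, 12)]:
    print(pair, round(p_correct(*pair), 3))
```

Under these assumptions the distance and magnitude effects described above both follow from a single parameter, which is why the approximate system alone cannot deliver the exact distinctions (for example, 7 versus 8 with certainty) that an integer count list provides.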


Fig. 5. Human and nonhuman animals exhibit the capacity to compute numerosities, including small precise number quantification and large approximate number estimation. Humans may be unique, however, in the ability to show open-ended, precise quantificational skills with large numbers, including the integer count list. In parallel with the faculty of language, our capacity for number relies on a recursive computation. [Illustration: John Yanson]

Boysen and Matsuzawa have trained chimpanzees to map the number of objects onto a single Arabic numeral, to correctly order such numerals in either an ascending or descending list, and to indicate the sums of two numerals (104-106). For example, Boysen shows that a chimpanzee seeing two oranges placed in one box, and another two oranges placed in a second box, will pick the correct sum of four out of a lineup of three cards, each with a different Arabic numeral. The chimpanzees' performance might suggest that their representation of number is like ours. Closer inspection of how these chimpanzees acquired such competences, however, indicates that the format and content of their number representations differ fundamentally from those of human children. In particular, these chimpanzees required thousands of training trials, and often years, to acquire the integer list up to nine, with no evidence of the kind of "aha" experience that all human children