1 Department of
Psychology, Harvard University, Cambridge, MA 02138, USA.
2 Department of
Linguistics and Philosophy, Massachusetts Institute of Technology,
Cambridge, MA 02138, USA.
* To whom correspondence
should be addressed. E-mail: mdhauser@wjh.harvard.edu
If a martian graced our planet, it would be struck by one remarkable
similarity among Earth's living creatures and a key difference.
Concerning similarity, it would note that all living things
are designed on the basis of highly conserved
developmental systems that read an (almost) universal
language encoded in DNA base pairs. As such, life is
arranged hierarchically with a foundation of discrete,
unblendable units (codons, and, for the most part, genes)
capable of combining to create increasingly complex and
virtually limitless varieties of both species and
individual organisms. In contrast, it would notice the
absence of a universal code of communication (Fig. 1).
Fig. 1. The animal kingdom
has been designed on the basis of highly conserved developmental
systems that read an almost universal language coded in DNA base
pairs. This system is shown on the left in terms of a phylogenetic
tree. In contrast, animals lack a common universal code of
communication, indicated on the right by unconnected animal groups.
[Illustration: John Yanson]
If our martian naturalist were meticulous, it might note that the
faculty mediating human communication appears remarkably
different from that of other living creatures; it might
further note that the human faculty of language appears
to be organized like the genetic code--hierarchical,
generative, recursive, and virtually limitless with
respect to its scope of expression. With these pieces in
hand, this martian might begin to wonder how the genetic
code changed in such a way as to generate a vast number
of mutually incomprehensible communication systems across
species while maintaining clarity of comprehension within
a given species. The martian would have stumbled onto
some of the essential problems surrounding the question
of language evolution, and of how humans acquired the
faculty of language.
In exploring the problem of language evolution, it is important
to distinguish between questions concerning language as a
communicative system and questions concerning the
computations underlying this system, such as those
underlying recursion. As we argue below, many acrimonious
debates in this field have been launched by a failure to
distinguish between these problems. According to one view
(1),
questions concerning abstract computational mechanisms
are distinct from those concerning communication, the
latter targeted at problems at the interface between
abstract computation and both sensory-motor and
conceptual-intentional interfaces. This view should not,
of course, be taken as a claim against a relationship
between computation and communication. It is possible, as
we discuss below, that key computational capacities
evolved for reasons other than communication but, after they
proved to have utility in communication, were altered
because of constraints imposed at both the periphery
(e.g., what we can hear and say or see and sign, the
speed with which the auditory cortex can process rapid
temporal and spectral changes) and more central levels
(e.g., conceptual and cognitive structures, pragmatics,
memory limitations).
At least three theoretical issues cross-cut the debate on
language evolution. One of the oldest problems among theorists
is the "shared versus unique" distinction. Most current
commentators agree that, although bees dance, birds sing,
and chimpanzees grunt, these systems of communication
differ qualitatively from human language. In particular,
animal communication systems lack the rich expressive and
open-ended power of human language (based on humans'
capacity for recursion). The evolutionary puzzle, therefore,
lies in working out how we got from there to here, given
this apparent discontinuity. A second issue revolves
around whether the evolution of language was gradual
or saltational; this differs from the first issue
because a qualitative discontinuity between extant
species could have evolved gradually, involving no
discontinuities during human evolution. Finally, the
"continuity versus exaptation" issue revolves around the
problem of whether human language evolved by gradual
extension of preexisting communication systems, or
whether important aspects of language have been exapted
away from their previous adaptive function (e.g., spatial or
numerical reasoning, Machiavellian social scheming,
tool-making).
Researchers have adopted extreme or intermediate positions
regarding these basically independent questions, leading to a
wide variety of divergent viewpoints on the evolution of
language in the current literature. There is, however, an
emerging consensus that, although humans and animals
share a diversity of important computational and
perceptual resources, there has been substantial
evolutionary remodeling since we diverged from a common
ancestor some 6 million years ago. The empirical
challenge is to determine what was inherited unchanged
from this common ancestor, what has been subjected to
minor modifications, and what (if anything) is
qualitatively new. The additional evolutionary challenge is
to determine what selectional pressures led to adaptive
changes over time and to understand the various
constraints that channeled this evolutionary process.
Answering these questions requires a collaborative effort
among linguists, biologists, psychologists, and
anthropologists.
One aim of this essay is to promote a stronger connection between
biology and linguistics by identifying points of contact
and agreement between the fields. Although this
interdisciplinary marriage was inaugurated more than
50 years ago, it has not yet been fully consummated.
We hope to further this goal by, first, helping to
clarify the biolinguistic perspective on language and its
evolution (2-7).
We then review some promising empirical approaches to the
evolution of the language faculty, with a special focus
on comparative work with nonhuman animals, and conclude
with a discussion of how inquiry might profitably
advance, highlighting some outstanding problems.
We make no attempt to be comprehensive in our coverage of
relevant or interesting topics and problems. Nor is it our goal
to review the history of the field. Rather, we focus on
topics that make important contact between empirical data
and theoretical positions about the nature of the
language faculty. We believe that if explorations into
the problem of language evolution are to progress, we
need a clear explication of the computational
requirements for language, the role of evolutionary theory
in testing hypotheses of character evolution, and a
research program that will enable a productive
interchange between linguists and biologists.
Defining the Target: Two Senses of the Faculty of Language
The word "language" has highly divergent meanings in
different contexts and disciplines. In informal usage, a language
is understood as a culturally specific communication
system (English, Navajo, etc.). In the varieties of
modern linguistics that concern us here, the term
"language" is used quite differently to refer to an
internal component of the mind/brain (sometimes called
"internal language" or "I-language"). We assume that this
is the primary object of interest for the study of the
evolution and function of the language faculty. However,
this biologically and individually grounded usage still
leaves much open to interpretation (and misunderstanding).
For example, a neuroscientist might ask: What components of
the human nervous system are recruited in the use of
language in its broadest sense? Because any aspect of
cognition appears to be, at least in principle,
accessible to language, the broadest answer to this
question is, probably, "most of it." Even aspects of emotion
or cognition not readily verbalized may be influenced by
linguistically based thought processes. Thus, this
conception is too broad to be of much use. We therefore
delineate two more restricted conceptions of the faculty
of language, one broader and more inclusive, the other
more restricted and narrow (Fig. 2).
Fig. 2. A schematic
representation of organism-external and -internal factors related to
the faculty of language. FLB includes sensory-motor,
conceptual-intentional, and other possible systems (which we leave
open); FLN includes the core grammatical computations that we
suggest are limited to recursion. See text for more complete
discussion.
Faculty of language--broad sense (FLB). FLB includes an
internal computational system (FLN, below) combined with at least
two other organism-internal systems, which we call
"sensory-motor" and "conceptual-intentional." Despite
debate on the precise nature of these systems, and about
whether they are substantially shared with other
vertebrates or uniquely adapted to the exigencies of
language, we take as uncontroversial the existence of some
biological capacity of humans that allows us (and not,
for example, chimpanzees) to readily master any human
language without explicit instruction. FLB includes this
capacity, but excludes other organism-internal systems
that are necessary but not sufficient for language (e.g.,
memory, respiration, digestion, circulation, etc.).
Faculty of language--narrow sense (FLN). FLN is the
abstract linguistic computational system alone, independent of the
other systems with which it interacts and interfaces. FLN
is a component of FLB, and the mechanisms underlying it
are some subset of those underlying FLB.
Others have agreed on the need for a restricted sense of
"language" but have suggested different delineations. For
example, Liberman and his associates (8)
have argued that the sensory-motor systems were
specifically adapted for language, and hence should be
considered part of FLN. There is also a long tradition
holding that the conceptual-intentional systems are an
intrinsic part of language in a narrow sense. In this article,
we leave these questions open, restricting attention to FLN
as just defined but leaving the possibility of a more
inclusive definition open to further empirical
research.
The internal architecture of FLN, so conceived, is a topic of
much current research and debate (4).
Without prejudging the issues, we will, for concreteness,
adopt a particular conception of this architecture. We
assume, putting aside the precise mechanisms, that a key
component of FLN is a computational system (narrow
syntax) that generates internal representations and maps
them into the sensory-motor interface by the phonological
system, and into the conceptual-intentional interface by
the (formal) semantic system; adopting alternatives that
have been proposed would not materially modify the
ensuing discussion. All approaches agree that a core
property of FLN is recursion, attributed to narrow syntax
in the conception just outlined. FLN takes a finite set
of elements and yields a potentially infinite array of
discrete expressions. This capacity of FLN yields
discrete infinity (a property that also characterizes the
natural numbers). Each of these discrete expressions is
then passed to the sensory-motor and
conceptual-intentional systems, which process and elaborate
this information in the use of language. Each expression is,
in this sense, a pairing of sound and meaning. It has
been recognized for thousands of years that language is,
fundamentally, a system of sound-meaning connections; the
potential infiniteness of this system has been explicitly
recognized by Galileo, Descartes, and the 17th-century
"philosophical grammarians" and their successors, notably
von Humboldt. One goal of the study of FLN and, more broadly,
FLB is to discover just how the faculty of language
satisfies these basic and essential conditions.
The core property of discrete infinity is intuitively familiar to
every language user. Sentences are built up of discrete
units: There are 6-word sentences and 7-word sentences, but
no 6.5-word sentences. There is no longest sentence (any
candidate sentence can be trumped by, for example,
embedding it in "Mary thinks that ..."), and there is no
nonarbitrary upper bound to sentence length. In these
respects, language is directly analogous to the natural
numbers (see below).
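The embedding argument above can be made concrete with a toy sketch. The Python below is only an illustration under stated assumptions: the frame "Mary thinks that" comes from the text, while the seed sentence and the names `FRAME`, `embed`, and `sentences` are hypothetical, and nothing here is meant as a model of FLN itself. It shows that one recursive step applied to a finite stock of words yields an unbounded series of sentences, each with a whole-number length.

```python
from itertools import islice
from typing import Iterator, Tuple

Sentence = Tuple[str, ...]

# Embedding frame taken from the example in the text.
FRAME: Sentence = ("Mary", "thinks", "that")


def embed(sentence: Sentence) -> Sentence:
    """Return a longer sentence by embedding `sentence` under FRAME."""
    return FRAME + sentence


def sentences(seed: Sentence) -> Iterator[Sentence]:
    """Yield seed, embed(seed), embed(embed(seed)), ... without end."""
    current = seed
    while True:
        yield current
        current = embed(current)


# Hypothetical seed sentence, chosen only for illustration.
seed: Sentence = ("birds", "sing")
first_four = list(islice(sentences(seed), 4))
lengths = [len(s) for s in first_four]
print(lengths)  # [2, 5, 8, 11]
```

Every candidate for "longest sentence" can be passed to `embed` to produce a longer one, so the generator never terminates on its own; lengths step through whole numbers only, which is the sense in which the system is both discrete and unbounded.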
At a minimum, then, FLN includes the capacity of recursion. There
are many organism-internal factors, outside FLN or FLB,
that impose practical limits on the usage of the system. For
example, lung capacity imposes limits on the length of
actual spoken sentences, whereas working memory imposes
limits on the complexity of sentences if they are to be
understandable. Other limitations--for example, on
concept formation or motor output speed--represent aspects of
FLB, which have their own evolutionary histories and may
have played a role in the evolution of the capacities of
FLN. Nonetheless, one can profitably inquire into the
evolution of FLN without an immediate concern for these
limiting aspects of FLB. This is made clear by the
observation that, although many aspects of FLB are shared
with other vertebrates, the core recursive aspect of FLN
currently appears to lack any analog in animal communication
and possibly other domains as well. This point,
therefore, represents the deepest challenge for a
comparative evolutionary approach to language. We believe
that investigations of this capacity should include
domains other than communication (e.g., number, social
relationships, navigation).
Given the distinctions between FLB and FLN and the theoretical
distinctions raised above, we can define a research space as
sketched in Fig. 3. This research space identifies, as viable,
problems concerning the evolution of sensory-motor systems,
of conceptual-intentional systems, and of FLN. The
comparative approach, to which we turn next, provides a
framework for addressing questions about each of these
components of the faculty of language.
Fig. 3. Investigations
into the evolution of the faculty of language are confronted with a
three-dimensional research space that includes three
comparative-evolutionary problems cross-cut by the core components
of the faculty of language. Thus, for each problem, researchers can
investigate details of the sensory-motor system, the
conceptual-intentional system, FLN, and the interfaces among these
systems.
The Comparative Approach to Language Evolution
The empirical study of the evolution of language is beset with difficulties.
Linguistic behavior does not fossilize, and a long
tradition of analysis of fossil skull shape and cranial
endocasts has led to little consensus about the evolution
of language (7,
9).
A more tractable and, we think, powerful approach to
problems of language evolution is provided by the comparative
method, which uses empirical data from living species to
draw detailed inferences about extinct ancestors (3,
10-12).
The comparative method was the primary tool used by Darwin (13,
14)
to analyze evolutionary phenomena and continues to play a
central role throughout modern evolutionary biology.
Although scholars interested in language evolution have
often ignored comparative data altogether or focused
narrowly on data from nonhuman primates, current thinking
in neuroscience, molecular biology, and developmental
biology indicates that many aspects of neural and
developmental function are highly conserved, encouraging
the extension of the comparative method to all vertebrates
(and perhaps beyond). For several reasons, detailed
below, we believe that the comparative method should play
a more central role in future discussions of language
evolution.
An overarching concern in studies of language evolution is with
whether particular components of the faculty of language
evolved specifically for human language and, therefore
(by extension), are unique to humans. Logically, the
human uniqueness claim must be based on data indicating
an absence of the trait in nonhuman animals and, to be
taken seriously, requires a substantial body of relevant
comparative data. More concretely, if the language
evolution researcher wishes to make the claim that a trait
evolved uniquely in humans for the function of language
processing, data indicating that no other animal has this
particular trait are required.
Although this line of reasoning may appear obvious, it is
surprisingly common for a trait to be held up as uniquely human
before any appropriate comparative data are available. A
famous example is categorical perception, which when
discovered seemed so finely tuned to the details of human
speech as to constitute a unique human adaptation (15,
16).
It was some time before the same underlying perceptual
discontinuities were discovered in chinchillas and
macaques (17,
18),
and even birds (19),
leading to the opposite conclusion that the perceptual
basis for categorical perception is a primitive
vertebrate characteristic that evolved for general auditory
processing, as opposed to specific speech processing.
Thus, a basic and logically ineliminable role for
comparative research on language evolution is this simple
and essentially negative one: A trait present in nonhuman
animals did not evolve specifically for human language,
although it may be part of the language faculty and play an
intimate role in language processing. It is possible, of
course, that a trait evolved in nonhuman animals and
humans independently, as analogs rather than homologs.
This would preserve the possibility that the trait
evolved for language in humans but evolved for some other
reason in the comparative animal group. In cases where
the comparative group is a nonhuman primate, and perhaps
especially chimpanzees, the plausibility of this
evolutionary scenario is weaker. In any case, comparative
data are critical to this judgment.
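The logic of the uniqueness claim can be sketched in a few lines of Python. This is a hedged illustration, not a procedure proposed in the text: the function name `assess_uniqueness_claim`, its return strings, and the encoding of observations are all assumptions made for the example. It captures only the negative role of comparative data described above (presence in any tested nonhuman species refutes uniqueness) and does not attempt to separate homology from analogy.

```python
from typing import Dict, Optional


def assess_uniqueness_claim(observations: Dict[str, Optional[bool]]) -> str:
    """Classify a human-uniqueness claim for a single trait.

    `observations` maps a nonhuman species to True (trait found),
    False (tested, trait not found), or None (not yet tested).
    """
    # Keep only species for which comparative data actually exist.
    tested = {sp: found for sp, found in observations.items() if found is not None}
    if any(tested.values()):
        return "not unique: trait present in a nonhuman species"
    if not tested:
        return "untested: no comparative data"
    return "provisionally unique: absent in all species tested so far"


# Categorical perception before and after the animal studies cited in the text.
before = assess_uniqueness_claim({"chinchilla": None, "macaque": None})
after = assess_uniqueness_claim({"chinchilla": True, "macaque": True})
print(before)
print(after)
```

The "provisionally unique" outcome is deliberately weak: absence in the species tested so far supports the claim only to the extent that the comparative sample is substantial.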
Despite the crucial role of homology in comparative biology,
homologous traits are not the only relevant source of
evolutionary data. The convergent evolution of similar
characters in two independent clades, termed "analogies"
or "homoplasies," can be equally revealing (20).
The remarkably similar (but nonhomologous) structures of
human and octopus eyes reveal the stringent constraints placed
by the laws of optics and the contingencies of development
on an organ capable of focusing a sharp image onto a
sheet of receptors. Detailed analogies between the parts
of the vertebrate and cephalopod eye also provide
independent evidence that each component is an adaptation
for image formation, shaped by natural selection. Furthermore,
the discovery that remarkably conservative genetic cascades
underlie the development of such analogous structures
provides important insights into the ways in which
developmental mechanisms can channel evolution (21).
Thus, although potentially misleading for taxonomists,
analogies provide critical data about adaptation under
physical and developmental constraints. Casting the comparative
net more broadly, therefore, will most likely reveal larger
regularities in evolution, helping to address the role of
such constraints in the evolution of language.
An analogy recognized as particularly relevant to language is the
acquisition of song by birds (12).
In contrast to nonhuman primates, where the production of
species-typical vocalizations is largely innate (22),
most songbirds learn their species-specific song by
listening to conspecifics, and they develop highly
aberrant song if deprived of such experience. Current
investigation of birdsong reveals detailed and intriguing
parallels with speech (11,
23,
24).
For instance, many songbirds pass through a critical period
in development beyond which they produce defective songs
that no amount of acoustic input can remedy, reminiscent
of the difficulty adult humans have in fully mastering
new languages. Further, and in parallel with the babbling
phase of vocalizing or signing human infants (25),
young birds pass through a phase of song development in
which they spontaneously produce amorphous versions of
adult song, termed "subsong" or "babbling." Although the
mechanisms underlying the acquisition of birdsong and
human language are clearly analogs and not homologs,
their core components share a deeply conserved neural and
developmental foundation: Most aspects of neurophysiology
and development--including regulatory and structural
genes, as well as neuron types and neurotransmitters--are
shared among vertebrates. That such close parallels have
evolved suggests the existence of important constraints
on how vertebrate brains can acquire large vocabularies
of complex, learned sounds. Such constraints may
essentially force natural selection to come up with the
same solution repeatedly when confronted with similar
problems.
Testing Hypotheses About the Evolution of the Faculty of Language
Given the definitions of the faculty of language,
together with the comparative framework, we can distinguish several
plausible hypotheses about the evolution of its various
components. Here, we suggest two hypotheses that span the
diversity of opinion among current scholars, plus a third
of our own.
Hypothesis 1: FLB is strictly homologous to animal
communication. This hypothesis holds that homologs of FLB,
including FLN, exist (perhaps in less developed or
otherwise modified form) in nonhuman animals (3,
10,
26).
This has historically been a popular hypothesis outside
of linguistics and closely allied fields, and has been
defended by some in the speech sciences. According to
this hypothesis, human FLB is composed of the same
functional components that underlie communication in
other species.
Hypothesis 2: FLB is a derived, uniquely human adaptation for
language. According to this hypothesis, FLB is a highly
complex adaptation for language, on a par with the
vertebrate eye, and many of its core components can be
viewed as individual traits that have been subjected to
selection and perfected in recent human evolutionary
history. This appears to represent the null hypothesis
for many scholars who take the complexity of language
seriously (27,
28).
The argument starts with the assumption that FLB, as a
whole, is highly complex, serves the function of
communication with admirable effectiveness, and has an
ineliminable genetic component. Because natural selection
is the only known biological mechanism capable of generating
such functional complexes [the argument from design (29)],
proponents of this view conclude that natural selection has
played a powerful role in shaping many aspects of FLB,
including FLN, and, further, that many of these are
without parallel in nonhuman animals. Although homologous
mechanisms may exist in other animals, the human versions
have been modified by natural selection to the extent
that they can be reasonably seen as constituting novel
traits, perhaps exapted from other contexts [e.g., social
intelligence, tool-making (7,
30-32)].
Hypothesis 3: Only FLN is uniquely human. On the basis of
data reviewed below, we hypothesize that most, if not all, of
FLB is based on mechanisms shared with nonhuman animals
(as held by hypothesis 1). In contrast, we suggest that
FLN--the computational mechanism of recursion--is
recently evolved and unique to our species (33,
34).
According to this hypothesis, much of the complexity
manifested in language derives from complexity in the
peripheral components of FLB, especially those underlying
the sensory-motor (speech or sign) and
conceptual-intentional interfaces, combined with
sociocultural and communicative contingencies. FLB as a
whole thus has an ancient evolutionary history, long
predating the emergence of language, and a comparative
analysis is necessary to understand this complex system.
By contrast, according to recent linguistic theory, the
computations underlying FLN may be quite limited. In
fact, we propose in this hypothesis that FLN comprises
only the core computational mechanisms of recursion as
they appear in narrow syntax and the mappings to the
interfaces. If FLN is indeed this restricted, this
hypothesis has the interesting effect of nullifying the
argument from design, and thus rendering the status of
FLN as an adaptation open to question. Proponents of the
idea that FLN is an adaptation would thus need to supply
additional data or arguments to support this viewpoint.
The available comparative data on animal communication systems
suggest that the faculty of language as a whole relies on some
uniquely human capacities that have evolved recently in the
approximately 6 million years since our divergence
from a chimpanzee-like common ancestor (35).
Hypothesis 3, in its strongest form, suggests that
only FLN falls into this category (34).
By this hypothesis, FLB contains a wide variety of cognitive
and perceptual mechanisms shared with other species, but
only those mechanisms underlying FLN--particularly its
capacity for discrete infinity--are uniquely human. This
hypothesis suggests that all peripheral components of FLB
are shared with other animals, in more or less the same
form as they exist in humans, with differences of
quantity rather than kind (9,
34).
What is unique to our species is quite specific to FLN,
and includes its internal operations as well as its
interface with the other organism-internal systems of
FLB.
Each of these hypotheses is plausible to some degree. Ultimately,
they can be distinguished only by empirical data, much of
which is currently unavailable. Before reviewing some of the
relevant data, we briefly consider some key distinctions
between them. From a comparative evolutionary viewpoint,
an important question is whether linguistic precursors
were involved in communication or in something else.
Proponents of both hypotheses 1 and 2 posit a
direct correspondence, by descent with modification, between
some trait involved in FLB in humans and a similar trait in
another species; these hypotheses differ in whether the
precursors functioned in communication. Although many
aspects of FLB very likely arose in this manner, the
important issue for these hypotheses is whether a series
of gradual modifications could lead eventually to the
capacity of language for infinite generativity. Despite the
inarguable existence of a broadly shared base of
homologous mechanisms involved in FLB, minor
modifications to this foundational system alone seem
inadequate to generate the fundamental difference--discrete
infinity--between language and all known forms of animal
communication. This claim is one of several reasons why
we suspect that hypothesis 3 may be a productive way
to characterize the problem of language evolution.
A primary issue separating hypotheses 2 and 3 is
whether the uniquely human capacities of FLN constitute an
adaptation. The viewpoint stated in hypothesis
2, especially the notion that FLN in particular is a
highly evolved adaptation, has generated much enthusiasm
recently [e.g., (36)],
especially among evolutionary psychologists (37,
38).
At present, however, we see little reason to believe
either that FLN can be anatomized into many independent
but interacting traits, each with its own independent
evolutionary history, or that each of these traits could
have been strongly shaped by natural selection, given their
tenuous connection to communicative efficacy (the surface or
phenotypic function upon which selection presumably
acted).
We consider the possibility that certain specific aspects of the
faculty of language are "spandrels"--by-products of preexisting
constraints rather than end products of a history of natural
selection (39).
This possibility, which opens the door to other empirical
lines of inquiry, is perfectly compatible with our firm
support of the adaptationist program. Indeed, it follows
directly from the foundational notion that adaptation is
an "onerous concept" to be invoked only when alternative
explanations fail (40).
The question is not whether FLN in toto is adaptive. By
allowing us to communicate an endless variety of
thoughts, recursion is clearly an adaptive computation.
The question is whether particular components of the
functioning of FLN are adaptations for language,
specifically acted upon by natural selection--or, even more
broadly, whether FLN evolved for reasons other than
communication.
An analogy may make this distinction clear. The trunk and
branches of trees are near-optimal solutions for providing an
individual tree's leaves with access to sunlight. For
shrubs and small trees, a wide variety of forms
(spreading, spherical, multistalked, etc.) provide good
solutions to this problem. For a towering rainforest
canopy tree, however, most of these forms are rendered
impossible by the various constraints imposed by the
properties of cellulose and the problems of sucking water
and nutrients up to the leaves high in the air. Some
aspects of such trees are clearly adaptations channeled
by these constraints; others (e.g., the popping of xylem
tubes on hot days, the propensity to be toppled in
hurricanes) are presumably unavoidable by-products of
such constraints.
Recent work on FLN (4,
41-43)
suggests the possibility that at least the narrow-syntactic
component satisfies conditions of highly efficient
computation to an extent previously unsuspected. Thus,
FLN may approximate a kind of "optimal solution" to the
problem of linking the sensory-motor and
conceptual-intentional systems. In other words, the generative
processes of the language system may provide a near-optimal
solution that satisfies the interface conditions to FLB.
Many of the details of language that are the traditional
focus of linguistic study [e.g., subjacency, Wh-
movement, the existence of garden-path sentences (4,
44)]
may represent by-products of this solution, generated
automatically by neural/computational constraints and the
structure of FLB--components that lie outside of FLN.
Even novel capacities such as recursion are implemented
in the same type of neural tissue as the rest of the brain
and are thus constrained by biophysical, developmental,
and computational factors shared with other vertebrates.
Hypothesis 3 raises the possibility that structural
details of FLN may result from such preexisting
constraints, rather than from direct shaping by natural
selection targeted specifically at communication. Insofar as
this proves to be true, such structural details are not,
strictly speaking, adaptations at all. This hypothesis
and the alternative selectionist account are both viable
and can eventually be tested with comparative data.
Comparative Evidence for the Faculty of Language
Study of the evolution of language has accelerated in the past decade (45,
46).
Here, we offer a highly selective review of some of these
studies, emphasizing animal work that seems particularly
relevant to the hypotheses advanced above; many omissions
were necessary for reasons of space, and we firmly
believe that a broad diversity of methods and perspectives
will ultimately provide the richest answers to the problem
of language evolution. For this reason, we present a
broader sampler of the field's offerings in Table 1.
Table 1. A sampler of empirical
approaches to understanding the evolution of the faculty of
language, including both broad (FLB) and narrow (FLN)
components.
| Empirical problem | Examples | References |
| FLB--sensory-motor system |
| Vocal imitation and invention | Tutoring studies of songbirds, analyses of vocal dialects in whales, spontaneous imitation of artificially created sounds in dolphins | (11, 12, 24, 65) |
| Neurophysiology of action-perception systems | Studies assessing whether mirror neurons, which provide a core substrate for the action-perception system, may subserve gestural and (possibly) vocal imitation | (67, 68, 71) |
| Discriminating the sound patterns of language | Operant conditioning studies of the prototype magnet effect in macaques and starlings | (52, 120) |
| Constraints imposed by vocal tract anatomy | Studies of vocal tract length and formant dispersion in birds and primates | (54-61) |
| Biomechanics of sound production | Studies of primate vocal production, including the role of mandibular oscillations | (121, 122) |
| Modalities of language production and perception | Cross-modal perception and sign language in humans versus unimodal communication in animals | (3, 25, 123) |
| FLB--conceptual-intentional system |
| Theory of mind, attribution of mental states | Studies of the seeing/knowing distinction in chimpanzees | (84, 86-89) |
| Capacity to acquire nonlinguistic conceptual representations | Studies of rhesus monkeys and the object/kind concept | (10, 76, 77, 124) |
| Referential vocal signals | Studies of primate vocalizations used to designate predators, food, and social relationships | (3, 78, 90, 91, 93, 94, 97) |
| Imitation as a rational, intentional system | Comparative studies of chimpanzees and human infants suggesting that only the latter read intentionality into action, and thus extract unobserved rational intent | (125-127) |
| Voluntary control over signal production as evidence of intentional communication | Comparative studies that explore the relationship between signal production and the composition of a social audience | (3, 10, 92, 128) |
| FLN--recursion |
| Spontaneous and training methods designed to uncover constraints on rule learning | Studies of serial order learning and finite-state grammars in tamarins and macaques | (114, 116, 117, 129) |
| Sign or artificial language in trained apes and dolphins | Studies exploring symbol sequencing and open-ended combinatorial manipulation | (130, 131) |
| Models of the faculty of language that attempt to uncover the necessary and sufficient mechanisms | Game theory models of language acquisition, reference, and universal grammar | (72-74) |
| Experiments with animals that explore the nature and content of number representation | Operant conditioning studies to determine whether nonhuman primates can represent number, including properties such as ordinality and cardinality, using such representations in conjunction with mathematical operands (e.g., add, divide) | (102-106, 132) |
| Shared mechanisms across different cognitive domains | Evolution of musical processing and structure, including analyses of brain function and comparative studies of music perception | (133-135) |
How "special" is speech? Comparative study of the
sensory-motor system. Starting with early work on speech
perception, there has been a tradition of considering
speech "special," and thus based on uniquely human
mechanisms adapted for speech perception and/or
production [e.g., (7, 8, 47, 48)].
This perspective has stimulated a vigorous research
program studying animal speech perception and, more
recently, speech production. Surprisingly, this research
has turned up little evidence for uniquely human
mechanisms special to speech, despite a persistent
tendency to assume uniqueness even in the absence of
relevant animal data.
On the side of perception, for example, many species show an
impressive ability to both discriminate between and generalize
over human speech sounds, using formants as the critical
discriminative cue (17-19, 49-51).
These data provide evidence not only of categorical
perception, but also of the ability to discriminate among
prototypical exemplars of different phonemes (52).
Further, in the absence of training, nonhuman primates
can discriminate sentences from two different languages
on the basis of rhythmic differences between them (53).
On the side of production, birds and nonhuman primates naturally
produce and perceive formants in their own species-typical
vocalizations (54-59).
The results also shed light on discussions of the
uniquely human structure of the vocal tract and the
unusual descended larynx of our species (7, 48, 60),
because new evidence shows that several other mammalian
species also have a descended larynx (61).
Because these nonhuman species lack speech, a descended
larynx clearly has nonphonetic functions; one possibility
is exaggerating apparent size. Although this particular
anatomical modification undoubtedly plays an important
role in speech production in modern humans, it need not
have first evolved for this function. The descended
larynx may thus be an example of classic Darwinian
preadaptation.
Many phenomena in human speech perception have not yet been
investigated in animals [e.g., the McGurk effect, an illusion
in which the syllable perceived from a talking head
represents the interaction between an articulatory
gesture seen and a different syllable heard; see (62)].
However, the available data suggest a much stronger
continuity between animals and humans with respect to
speech than previously believed. We argue that the
continuity hypothesis thus deserves the status of a null
hypothesis, which must be rejected by comparative work
before any claims of uniqueness can be validated. For
now, this null hypothesis of no truly novel traits in the
speech domain appears to stand.
There is, however, a striking ability tied to speech that has
received insufficient attention: the human capacity for vocal
imitation (63, 64).
Imitation is obviously a necessary component of the human
capacity to acquire a shared and arbitrary lexicon, which
is itself central to the language capacity. Thus, the
capacity to imitate was a crucial prerequisite of FLB as
a communicative system. Vocal imitation and learning are
not uniquely human. Rich multimodal imitative capacities are
seen in other mammals (dolphins) and some birds (parrots),
with most songbirds exhibiting a well-developed vocal
imitative capacity (65).
What is surprising is that monkeys show almost no
evidence of visually mediated imitation, with chimpanzees
showing only slightly better capacities (66).
Even more striking is the virtual absence of evidence for
vocal imitation in either monkeys or apes (3).
For example, intensively trained chimpanzees are
incapable of acquiring anything but a few poorly
articulated spoken words, whereas parrots can readily
acquire a large vocal repertoire. With respect to their
own vocalizations, there are few convincing studies of
vocal dialects in primates, thereby suggesting that they
lack a vocal imitative capacity (3, 65).
Evidence for spontaneous visuomanual imitation in
chimpanzees is not much stronger, although with persistent
training they can learn several hundred hand signs. Further,
even in cases where nonhuman animals are capable of
imitating in one modality (e.g., song copying in
songbirds), only dolphins and humans appear capable of
imitation in multiple modalities. The detachment from
modality-specific inputs may represent a substantial
change in neural organization, one that affects not only
imitation but also communication; only humans can lose
one modality (e.g., hearing) and make up for this deficit
by communicating with complete competence in a different
modality (i.e., signing).
Our discussion of limitations is not meant to diminish the
impressive achievements of monkeys and apes, but to highlight
how different the mechanisms underlying the production of
human and nonhuman primate gestures, either vocally
expressed or signed, must be. After all, the average high
school graduate knows up to 60,000 words, a
vocabulary achieved with little effort, especially when
contrasted with the herculean efforts devoted to training
animals. In sum, the impressive ability of any normal human
child for vocal imitation may represent a novel capacity
that evolved in our recent evolutionary history, some
time after the divergence from our chimpanzee-like
ancestors. The existence of analogs in distantly related
species, such as birds and cetaceans, suggests
considerable potential for the detailed comparative study of
vocal imitation. There are, however, potential traps that
must be avoided, especially with respect to explorations
of the neurobiological substrates of imitation. For
example, although macaque monkeys and humans are equipped
with so-called "mirror neurons" in the premotor cortex
that respond both when an individual acts in a particular
way and when the same individual sees someone else act in
this same way (67, 68),
these neurons are not, as many have presumed, sufficient
for imitation in macaques: As mentioned, there is
no convincing evidence of vocal or visual imitation in
monkeys. Consequently, as neuroimaging studies continue
to explore the neural basis of imitation in humans (69-71),
it will be important to distinguish between the necessary
and sufficient neural correlates of imitation. This is
especially important, given that some recent attempts to
model the evolution of language begin with a hypothetical
organism that is equipped with the capacity for imitation
and intentionality, as opposed to working out how these
mechanisms evolved in the first place [see below; (72-74)].
If a deeper evolutionary exploration is desired, one
dating back to a chimpanzee-like ancestor, then we need
to explain how and why such capacities emerged from an
ancestral node that lacked such abilities (75) (Fig. 4).
Fig. 4. The distribution
of imitation in the animal kingdom is patchy. Some animals such as
songbirds, dolphins, and humans have evolved exceptional abilities
to imitate; other animals, such as apes and monkeys, either lack
such abilities or have them in a relatively impoverished form.
[Illustration: John Yanson]
The conceptual-intentional systems of nonlinguistic
animals. A wide variety of studies indicate that nonhuman
mammals and birds have rich conceptual representations
(76, 77).
Surprisingly, however, there is a mismatch between the
conceptual capacities of animals and the communicative
content of their vocal and visual signals (78, 79).
For example, although a wide variety of nonhuman primates
have access to rich knowledge of who is related to whom,
as well as who is dominant and who is subordinate, their
vocalizations only coarsely express such complexities.
Studies using classical training approaches as well as methods
that tap spontaneous abilities reveal that animals acquire
and use a wide range of abstract concepts, including tool,
color, geometric relationships, food, and number (66, 76-82).
More controversially, but of considerable relevance to
intentional aspects of language and conditions of
felicitous use, some studies claim that animals have a
theory of mind (83-85),
including a sense of self and the ability to represent the
beliefs and desires of other group members. On the side
of positive support, recent studies of chimpanzees
suggest that they recognize the perceptual act of seeing
as a proxy for the mental state of knowing (84, 86, 87).
These studies suggest that at least chimpanzees, but
perhaps no other nonhuman animals, have a rudimentary
theory of mind. On the side of negative support, other
studies suggest that even chimpanzees lack a theory of mind,
failing, for example, to differentiate between ignorant and
knowledgeable individuals with respect to intentional
communication (88, 89).
Because these experiments make use of different methods
and are based on small sample sizes, it is not possible
at present to derive any firm conclusions about the presence
or absence of mental state attribution in animals.
Independently of how this controversy is resolved,
however, the best evidence of referential communication
in animals comes not from chimpanzees but from a variety
of monkeys and birds, species for which there is no
convincing evidence for a theory of mind.
The classic studies of vervet monkey alarm calls (90)
have now been joined by several others, each using comparable
methods, with extensions to different species (macaques,
Diana monkeys, meerkats, prairie dogs, chickens) and
different communicative contexts (social relationships,
food, intergroup aggression) (91-97).
From these studies we can derive five key points relevant to
our analysis of the faculty of language. First,
individuals produce acoustically distinctive calls in
response to functionally important contexts, including
the detection of predators and the discovery of food.
Second, the acoustic morphology of the signal, although
arbitrary in terms of its association with a particular
context, is sufficient to enable listeners to respond
appropriately without requiring any other contextual
information. Third, the number of such signals in the
repertoire is small, restricted to objects and events
experienced in the present, with no evidence of creative
production of new sounds for new situations. Fourth, the
acoustic morphology of the calls is fixed, appearing
early in development, with experience only playing a role
in refining the range of objects or events that elicit
such calls. Fifth, there is no evidence that calling is
intentional in the sense of taking into account what
other individuals believe or want.
Early interpretations of this work suggested that when animals
vocalize, they are functionally referring to the objects and
events that they have encountered. As such, vervet alarm
calls and rhesus monkey food calls, to take two examples,
were interpreted as word-like, with callers referring to
different kinds of predators or different kinds of food.
More recent discussions have considerably weakened this
interpretation, suggesting that if the signal is
referential at all, it is in the mind of the listener who
can extract information about the signaler's current
context from the acoustic structure of the call alone (78, 95).
Despite this evidence that animals can extract information
from the signal, there are several reasons why additional
evidence is required before such signals can be
considered as precursors for, or homologs of, human
words.
Roughly speaking, we can think of a particular human language as
consisting of words and computational procedures ("rules")
for constructing expressions from them. The computational
system has the recursive property briefly outlined
earlier, which may be a distinct human property. However,
key aspects of words may also be distinctively human.
There are, first of all, qualitative differences in scale
and mode of acquisition, which suggest that quite
different mechanisms are involved; as pointed out above,
there is no evidence for vocal imitation in nonhuman
primates, and although human children may use
domain-general mechanisms to acquire and recall words (98, 99),
the rate at which children build the lexicon is so
massively different from that of nonhuman primates that one must
entertain the possibility of an independently evolved
mechanism. Furthermore, unlike the best animal examples
of putatively referential signals, most of the words of
human language are not associated with specific functions
(e.g., warning cries, food announcements) but can be linked
to virtually any concept that humans can entertain. Such
usages are often highly intricate and detached from the
here and now. Even for the simplest words, there is
typically no straightforward word-thing relationship, if
"thing" is to be understood in mind-independent terms.
Without pursuing the matter here, it appears that many of
the elementary properties of words--including those that enter
into referentiality--have only weak analogs or homologs in
natural animal communication systems, with only slightly
better evidence from the training studies with apes and
dolphins. Future research must therefore provide stronger
support for the precursor position, or it must instead
abandon this hypothesis, arguing that this component of
FLB (conceptual-intentional) is also uniquely human.
Discrete infinity and constraints on learning. The data
summarized thus far, although far from complete, provide overall
support for the position of continuity between humans and
other animals in terms of FLB. However, we have not yet
addressed one issue that many regard as lying at the
heart of language: its capacity for limitless expressive
power, captured by the notion of discrete infinity. It
seems relatively clear, after nearly a century of
intensive research on animal communication, that no species
other than humans has a comparable capacity to recombine
meaningful units into an unlimited variety of larger
structures, each differing systematically in meaning.
However, little progress has been made in identifying the
specific capabilities that are lacking in other animals.
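The notion of discrete infinity can be illustrated with a minimal Python sketch. The lexicon and the single embedding rule below are hypothetical toys chosen for illustration, not materials from any study reviewed here, and the sketch is not offered as a model of FLN. It shows only that a finite stock of discrete units plus one recursive rule yields a set of distinct expressions that grows without bound as embedding depth increases.

```python
# Hypothetical toy lexicon; these items are illustrative assumptions.
NOUNS = ["the monkey", "the bird"]
VERBS = ["saw", "heard"]


def sentences(depth):
    """Yield every sentence with exactly `depth` levels of embedding."""
    if depth == 0:
        # Base case: a simple, unembedded clause.
        for noun in NOUNS:
            yield f"{noun} called"
        return
    # Recursive case: embed a complete sentence inside a new clause.
    for noun in NOUNS:
        for verb in VERBS:
            for inner in sentences(depth - 1):
                yield f"{noun} {verb} that {inner}"


def count(depth):
    """Number of distinct sentences at a given embedding depth."""
    return sum(1 for _ in sentences(depth))


if __name__ == "__main__":
    for d in range(4):
        print(d, count(d))
```

Because the rule can reapply to its own output, no finite depth exhausts the set; the units themselves never blend, which is the sense of "discrete" intended above.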
The astronomical variety of sentences any natural language user
can produce and understand has an important implication for
language acquisition, long a core issue in developmental
psychology. A child is exposed to only a small proportion
of the possible sentences in its language, thus limiting
its database for constructing a more general version of
that language in its own mind/brain. This point has
logical implications for any system that attempts to
acquire a natural language on the basis of limited data. It
is immediately obvious that given a finite array of data,
there are infinitely many theories consistent with it but
inconsistent with one another. In the present case, there
are in principle infinitely many target systems
(potential I-languages) consistent with the data of
experience, and unless the search space and acquisition
mechanisms are constrained, selection among them is
impossible. A version of the problem has been formalized
by Gold (100)
and more recently and rigorously explored by Nowak and
colleagues (72-75).
No known "general learning mechanism" can acquire a
natural language solely on the basis of positive or
negative evidence, and the prospects for finding any such
domain-independent device seem rather dim. The difficulty
of this problem leads to the hypothesis that whatever system
is responsible must be biased or constrained in certain
ways. Such constraints have historically been termed
"innate dispositions," with those underlying language
referred to as "universal grammar." Although these
particular terms have been forcefully rejected by many
researchers, and the nature of the particular constraints
on human (or animal) learning mechanisms is currently
unresolved, the existence of some such constraints cannot
be seriously doubted. On the other hand, other
constraints in animals must have been overcome at some
point in human evolution to account for our ability to
acquire the unlimited class of generative systems that includes
all natural languages. The nature of these latter
constraints has recently become the target of empirical
work. We focus here on the nature of number
representation and rule learning in nonhuman animals and
human infants, both of which can be investigated independently
of communication and provide hints as to the nature of the
constraints on FLN.
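The underdetermination point made above (a finite array of data is consistent with many mutually inconsistent theories) can be sketched in a few lines of Python. This is a toy under stated assumptions, not Gold's formalization or the models of Nowak and colleagues: the three observed strings and the four candidate "grammars" are hypothetical, and only four of the infinitely many possible candidates are shown.

```python
import re

# Hypothetical positive evidence: three strings the learner has observed.
DATA = ["ab", "aabb", "aaabbb"]


def h_counting(s):
    """n a's followed by exactly n b's (n >= 1)."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n


def h_a_then_b(s):
    """One or more a's followed by one or more b's."""
    return re.fullmatch(r"a+b+", s) is not None


def h_memorized(s):
    """Only the strings already observed."""
    return s in DATA


def h_even_length(s):
    """Any nonempty even-length string over the alphabet {a, b}."""
    return len(s) > 0 and len(s) % 2 == 0 and set(s) <= {"a", "b"}


HYPOTHESES = {
    "counting": h_counting,
    "a_then_b": h_a_then_b,
    "memorized": h_memorized,
    "even_length": h_even_length,
}

# Every hypothesis that accepts all of the observed data.
consistent = [name for name, h in HYPOTHESES.items()
              if all(h(s) for s in DATA)]

# Unseen strings on which the surviving hypotheses are compared.
PROBES = ["aaaabbbb", "aab", "abab"]
verdicts = {name: tuple(h(p) for p in PROBES)
            for name, h in HYPOTHESES.items()}

if __name__ == "__main__":
    for name in consistent:
        print(name, verdicts[name])
```

All four candidates fit the observed strings, yet each gives a different pattern of judgments on the unseen probes, so the data alone cannot select among them; some prior restriction on the hypothesis space is needed.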
More than 50 years of research using classical training
studies demonstrates that animals can represent number, with
careful controls for various important confounds (80).
In the typical experiment, a rat or pigeon is trained to
press a lever x number of times to obtain a food
reward. Results show that the animals' mean number of
presses closely matches the target, with a standard
deviation that increases with magnitude: As the target
number increases, so does variation around the mean. These
results have led to the idea that animals, including
human infants and adults, can represent number
approximately as a magnitude with scalar variability (101, 102).
Number discrimination is limited in this system by
Weber's law, with greater discriminability among small
numbers than among large numbers (keeping distances
between pairs constant) and between numbers that are farther
apart (e.g., 7 versus 8 is harder than
7 versus 12). The approximate number sense is
accompanied by a second, precise mechanism that is limited
to values of about 4 or fewer but accurately distinguishes
1 from 2, 2 from 3, and 3 from 4;
this second system appears to be recruited in the context
of object tracking and is limited by working memory
constraints (103).
Of direct relevance to the current discussion, animals
can be trained to understand the meaning of number words
or Arabic numeral symbols. However, these studies reveal
striking differences in how animals and human children
acquire the integer list, and provide further evidence
that animals lack the capacity to create open-ended
generative systems.
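The consequences of scalar variability for discrimination can be made concrete with a short Python sketch. It assumes, for illustration only, that each number is represented as a Gaussian magnitude whose standard deviation is a fixed fraction of its mean; the Weber fraction of 0.2 and the function name are hypothetical choices, not values reported in the studies cited above.

```python
import math

WEBER_FRACTION = 0.2  # assumed value, for illustration only


def discriminability(a, b, w=WEBER_FRACTION):
    """Separation of two magnitudes in units of their combined noise.

    Under scalar variability the standard deviation of each
    representation is proportional to its magnitude (sd = w * n).
    """
    sd_a, sd_b = w * a, w * b
    return abs(a - b) / math.sqrt(sd_a ** 2 + sd_b ** 2)


if __name__ == "__main__":
    for pair in [(2, 3), (7, 8), (7, 12)]:
        print(pair, round(discriminability(*pair), 2))
```

Under these assumptions, 7 versus 8 is less discriminable than 7 versus 12 (a distance effect) and less discriminable than 2 versus 3 (a magnitude effect), while pairs with the same ratio, such as 2 versus 4 and 4 versus 8, are equally discriminable, which is the signature of Weber's law.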
Fig. 5. Human and nonhuman
animals exhibit the capacity to compute numerosities, including
small precise number quantification and large approximate number
estimation. Humans may be unique, however, in the ability to show
open-ended, precise quantificational skills with large numbers,
including the integer count list. In parallel with the faculty of
language, our capacity for number relies on a recursive computation.
[Illustration: John Yanson]
Boysen and Matsuzawa have trained chimpanzees to map the number
of objects onto a single Arabic numeral, to correctly order
such numerals in either an ascending or descending list, and
to indicate the sums of two numerals (104-106).
For example, Boysen shows that a chimpanzee seeing two
oranges placed in one box, and another two oranges placed
in a second box, will pick the correct sum of four out of
a lineup of three cards, each with a different Arabic
numeral. The chimpanzees' performance might suggest that
their representation of number is like ours. Closer
inspection of how these chimpanzees acquired such
competences, however, indicates that the format and content
of their number representations differ fundamentally from
those of human children. In particular, these chimpanzees
required thousands of training trials, and often years,
to acquire the integer list up to nine, with no evidence
of the kind of "aha" experience that all human children