As indicated in the previous chapter, there were many important changes in hominid brain anatomy that were not the result of some sort of simple scaling effect (either brain size with body size, or neuroanatomical region with brain size). However, how can we be confident that they are not simply the result of genetic drift, having no adaptive significance whatsoever? It is true that the three- to four-fold increase in hominid brain size in ~3 million years is not as dramatic as size changes in other anatomical characteristics (e.g., antler size) that have been documented for other Pleistocene mammals (Smith 1990; Gingerich 1983). However, this kind of comparison is misleading. The rate of brain size increase in hominids is apparently unique in vertebrate evolution (Jerison 1985), though to be fair, we lack the kind of detailed fossil record for other lineages that we have for hominids.
Fossil prosimians from the Eocene and Oligocene had estimated brain weights of between ~1.5 g (Tetonius homunculus; Lower Eocene) and ~9.6 g (Adapis parisiensis; Eocene), but estimated body weights were small as well, ranging from ~80 g (Tetonius homunculus) to 1600 g (Adapis parisiensis) (Jerison 1973, brain data originally from Radinsky 1970). Estimated EQs for the early prosimians range from .56 (Smilodectes gracilis) to 2.17 (Rooneyia viejaensis).[1] This latter value is quite high, falling at the upper end of the range for modern monkeys. The oldest catarrhine (old world monkey or ape), Aegyptopithecus zeuxis (dated to the late Oligocene, ~33 MYA), had a brain volume of between 27 cc and 34 cc (Radinsky 1973, 1977) and a body weight of ~4.5-7.5 kg (Gingerich 1977). This would give it an EQ of between .52 and .97, thus placing it at the low end of modern prosimians.[2] The next major jump in absolute brain size can be found in the Miocene Proconsul (dated to ~18 MYA), which had a brain size estimated to be ~167 cc (Walker et al. 1983). Body size estimates vary widely, but Gingerich (1977), based on molar tooth dimensions, puts it at ~27.4 kg. This would give it an EQ of 1.29 (though the confidence intervals on this are quite large). Unfortunately, we simply do not have a detailed record (as we do with hominids) of the transitions either between the early prosimians and the first monkeys, or between the first monkeys and Proconsul. We therefore do not, as yet, have a good idea how fast absolute brain size changed during these periods. We can simply say that there is no evidence of any primate (or other mammalian) lineage adding anything remotely close to ~1000 cc of brain, let alone adding it in less than three million years, as has been documented in hominids.
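To illustrate how such figures are computed (a sketch only: an encephalization quotient expresses observed brain size relative to the brain size expected for a typical mammal of the same body weight, taking 1 cc of brain as roughly 1 g), the Aegyptopithecus values quoted above are closely reproduced by one widely used allometric baseline, Martin's regression of expected brain weight on body weight:

$$ \mathrm{EQ} = \frac{E_{\text{observed}}}{11.22\,P^{0.76}}, \qquad \frac{27}{11.22 \times 7.5^{0.76}} \approx 0.52, \qquad \frac{34}{11.22 \times 4.5^{0.76}} \approx 0.97, $$

with brain weight E in grams and body weight P in kilograms. Which baseline regression is used matters for the exact numbers; the one shown here is offered only as an illustration of the logic of the calculation.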
There are indications in the fossil record that brain size increase generally requires direct selection. Primates have apparently always been relatively large-brained mammals, but in addition there has been a general increase in relative (and absolute) brain size in most primate lineages over their evolutionary history (Jerison 1973, 1985). General increases in relative and absolute brain size have also characterized many other mammalian groups since the Eocene (e.g., carnivores and ungulates; Jerison 1973). If brain size changes occurred because of random chance, we would not expect to find any such consistent trends towards larger brains in major groups of mammals during such long periods of evolutionary history.
However, there are even more compelling reasons to doubt that genetic drift played a major role. There appear to be severe evolutionary "costs" associated with maintaining brains as large as ours. Costs in this context refer to correlated effects that ultimately translate into decreased numbers of offspring produced per unit time. These costs would therefore be paid every generation. Smith (1990) outlines several specific costs relating to increasing brain size in hominids. First, it is known that humans spend approximately 20% of their basal metabolic energy to keep the brain operating. This compares with only about 9% in chimpanzees and 2% in the average marsupial (Hofman 1983). This could be offset to some extent by increasing body size, since metabolic rate scales only to the .75 power of body weight (meaning that a larger animal spends less energy per unit of weight than a smaller animal). However, a larger body still has absolutely larger energy requirements, and this has concomitant ecological costs. Second, there is a trade-off in bipedal hominids between locomotor efficiency and ease of childbirth: a narrow pelvis lessens the need to shift weight back and forth with each step but increases problems associated with childbirth (Lovejoy 1975). Increasing brain size and body size would both exacerbate this problem. Third, increasing brain size is known to be correlated very strongly with longer gestation periods, an increased period of infant dependency, and delayed reproduction (Harvey and Clutton-Brock 1985). All of these decrease the number of offspring an individual can produce per unit time. Fourth, there is potentially a problem of cooling a larger brain (Falk 1990).
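The scaling argument can be made concrete with a back-of-the-envelope illustration (a generic allometric form, not figures from Smith or Hofman):

$$ \mathrm{BMR} \propto M^{0.75} \quad\Longrightarrow\quad \frac{\mathrm{BMR}}{M} \propto M^{-0.25}, $$

so doubling body mass raises total metabolic demand by a factor of about $2^{0.75} \approx 1.7$, even though the cost per kilogram falls by a factor of about $2^{-0.25} \approx 0.84$. The absolute energy bill, and hence the ecological cost, still grows with body size.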
Smith (1990) concludes that: "The magnitude of the costs taken together emphasizes that a life taking advantage of this brain must have been immensely profitable" (p. 366, her italics). If there were no counterbalancing advantages to larger amounts of brain tissue, any individual that had a smaller brain would, by definition, have a selective advantage over its less efficient peers. Since mutations are continuously and randomly occurring at all loci, it is guaranteed that such variations would be fed into the population even if they were not already present. In fact, there is a tremendous range in brain size in modern populations. The 95% confidence intervals on brain weight in modern humans span a range of 440 to 480 cc (Holloway 1980), which is enough brain mass to run a chimpanzee, gorilla, or orangutan. This indicates that the requisite variation is available for natural selection to operate on. Unless there were benefits to larger brains, brain size should decrease.
By far the most likely explanation for the increase in hominid brain size (and the other changes outlined above) is that it is the result of natural selection. But what are the benefits? Since the brain is the seat of behavior, it is entirely reasonable to hypothesize (as a starting assumption subject to empirical testing) that the benefit(s) are related to some kind of behavioral difference. Generally, biological anthropologists (among others) make exactly this assumption, at least implicitly. In the following section, I will review the hypotheses that have been suggested to explain the evolution of the human brain.
Most explanations suggest that the benefit relates in some undefined and not clearly understood way to our intuitive concept of "intelligence." Though this concept means slightly different things to different people, I will take it to refer to the fact that individuals differ with respect to their abilities to solve problems, to generalize concepts from specific events, to think analytically, to use language, and to learn. Whatever it is that is causing variation in these abilities, we will label as "intelligence."
At the same time, some have explicitly maintained that brain size is irrelevant to behavior within humans (particularly whenever there is some acknowledgment of individual and group differences in brain size; e.g., Gould 1981; Hanske-Petitpierre and Chen 1985). However, there is an important evolutionary biological principle that is being glossed over by such statements. If natural selection is the cause of a change in a characteristic over time, there must have been a statistical advantage (in terms of surviving offspring) for those individuals in a given population that were above the average in some characteristic (or below the average if the characteristic has been getting smaller). It is not necessary for the advantage to have been consistent and gradual at all times during the entire history of the evolutionary change, but there must have been, on average, an advantage. Furthermore, this advantage must have been present in each generation during which evolutionary change occurred. This is no less true for brain size than it is for antler size in deer, or canine size in saber-toothed cats, or any other characteristic that has changed over evolutionary time in the face of evolutionary (reproductive) costs.
It is of course possible that brain size is irrelevant to "intelligence" (broadly defined) or to other cognitive behaviors for which humans differ from our closest relatives. This is ultimately an empirical question. However, we must keep in mind that if brain size is not related to behavior, we then cannot explain the most conspicuous evolutionary adaptation in the history of life.
At the same time, it is clear that we did not evolve larger amounts of brain tissue simply to do well on modern cognitive tests. While it is reasonable to expect some behavioral benefit of increasing brain size, we should keep in mind that, to the extent that any modern cognitive tests correlate with brain size in modern humans, they would likely do so because they tap into other, perhaps more fundamental cognitive abilities that presumably were selected for in human evolution.
An alternative perspective on brain size is summarized in Jerison (1985). He argues that differences in brain organization between species represent species-specific adaptations designed to construct an internal model of external reality. Species of equivalent encephalization probably share the same overall complexity of this internal model, yet do not necessarily share exactly the same brain organization and may therefore emphasize different aspects of that external reality. Different sensory projection systems in the brain may be dominant in different species, such that one might make greater use of visual information while another makes more use of auditory information. Jerison (1985) suggests that:
"Grades of encephalization presumably correspond to grades of complexity of information processing. These, in turn, correspond in some way to the complexity of the reality created by the brain, which may be another way to describe intelligence." (p. 30).
Thus, it is possible that brain size in hominids was selected for not because of the benefits of any specific ability per se, but rather because there was an advantage in having an increasingly rich and diverse internal representation of external reality. This presumably would be useful for a range of behavioral situations, assuming the increase in brain size could be paid for. In the sections that follow, I review various behavioral dimensions that have been suggested as possible "prime movers" (Falk 1987a) of brain size increase in hominids.
Humans are different from our closest living relatives in a number of interesting ways. These differences are important starting points in the search for explanations of the neuroanatomical changes that have been outlined above. Hypotheses of human brain size evolution have focused on the possible effects of tool use, language, and social complexity, all of which were probably involved synergistically. Because of the extent to which language is a hallmark of human behavior, and the fact that language use involves a wide range of neural structures, this chapter will be devoted primarily to questions surrounding the evolution of language. However, other dimensions of human behavior which may have played important roles in hominid neuroanatomical evolution will be discussed, including sociality, spatial ability, and memory.[3]
Language is one of the most obvious behavioral differences between humans and our closest living relatives. Because of this, it is one of the most obvious candidates for explaining various aspects of human neuroanatomical evolution outlined above. The evidence suggests that the effects of language evolution on hominid neuroanatomy have been wide-ranging.
A major controversy in language evolution has to do with the extent to which language is a unique adaptation, in which its constituent parts have no meaningful homologies or analogies with any other behaviors found in the animal world. Chomsky (e.g., 1972, 1980), Bickerton (1984, 1990) and Pinker (1994) argue that the most crucial aspect of language, which for them is generative grammar, has no meaningful continuities with other species. Others (e.g., Wang 1984, 1991, Lieberman 1984) argue that there are obvious continuities in all aspects of language. A review of the evidence for continuity suggests that the language faculty evolved piecemeal, through the modification of pre-existing cognitive functions and (presumably) neuroanatomical structures and pathways.
Language requires a web of interlocking abilities, which can be grouped into three general components: phonology, semantics, and syntax. Each of these components involves different neuroanatomical systems.
One of the key features distinguishing human language from the communication of other animals is that (with the exception of a few examples of onomatopoeia) meaningful information is not intrinsic to the individual sounds that make up words (Hockett 1960). Instead, meaning is derived from the specific pattern of these sounds. Hockett (1960) refers to this as "duality of patterning." These individually-meaningless units of sound are known as phonemes, and they can be defined as the smallest segments of sound that may change the meaning of a word if replaced (Fromkin and Rodman 1983). We know, for example, that the sounds [I] and [a] are different phonemes because the words [hIt] (pronounced "hit") and [hat] (pronounced "hot") have different meanings.
The neuroanatomical correlates of the ability to both produce and detect these basic kinds of sounds, as well as the ability to separate the continuous acoustic speech signal into meaningful units (words), are crucial components of language. Lieberman (1983, 1984, 1988) has been a prominent proponent of the idea that there exist unique species-specific adaptations of the generation and detection mechanisms of the speech signal. Nevertheless, the differences between humans and non-humans can be explained through a process of modification of pre-existing features and abilities and do not require the creation of new neurocognitive mechanisms. This conclusion is supported by the fossil evidence for the evolution of the vocal tract, as well as a comparative analysis of the acoustic and communicative abilities of other animals.
The tones that make up speech sounds are produced through the rapid opening and closing of the vocal cords (located in the larynx) as air is pushed out of the lungs. This changes the otherwise steady stream of air into a series of rapid puffs, creating sound waves. The rate at which the vocal cords open and close is known as the fundamental frequency of phonation (F0), and it is determined both by the pressure of the air stream rushing past the larynx and the force by which the vocal cords are pulled closed by the intrinsic muscles of the larynx (Denes and Pinson 1963).[4] The vocal cords also impose additional vibrations on the air stream, known as harmonics, which occur at integer multiples of F0.[5] The amplitude (volume) of each successive harmonic decreases as its frequency increases.
Even though a full range of harmonics is produced by the action of the vocal cords, the supralaryngeal vocal tract serves as a kind of filter, allowing some harmonics through while absorbing others. The points along the frequency spectrum at which the greatest amounts of acoustic energy are allowed to pass are known as formant frequencies. The relative frequencies at which these formants occur can be changed quite significantly by changing the shape of the supralaryngeal vocal tract, which is accomplished primarily by changing the shape and position of the tongue and lips in concert with changes in the position of the mandible.
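The physics behind formants can be sketched with the standard uniform-tube idealization (textbook approximations, not values from the sources cited here). A tube closed at the glottis and open at the lips resonates at odd quarter-wavelength frequencies,

$$ F_n = \frac{(2n-1)\,c}{4L}, \qquad n = 1, 2, 3, \ldots $$

so with a speed of sound of roughly 350 m/s and a vocal tract length of roughly 17 cm, the lowest resonances fall near 500, 1500, and 2500 Hz, approximately the formant pattern of a neutral, schwa-like vowel. Moving the tongue, lips, and mandible perturbs the shape of this tube and shifts the resonances away from these neutral values, producing the distinct formant patterns of the different vowels.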
It turns out that the relationship between the first two formants (F1 and F2) is particularly crucial for determining the identity of different vowels. By plotting F1 against F2 for a large number of vowel sounds, one finds that the various vowels occupy different regions of the plot (with some degree of overlap) (Peterson and Barney 1952). Consonants, the other major class of phonemes, are not distinguished in the same manner as vowels. Consonants are formed by obstructing the flow of air through the vocal tract in various ways.
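The claim that the (F1, F2) pattern largely identifies a vowel can be illustrated with a toy nearest-neighbor classifier. The reference values below are approximate averages of the sort reported by Peterson and Barney (1952), rounded for illustration, and the function and variable names are simply my own labels:

```python
# Toy vowel identification from the first two formant frequencies (Hz).
# Reference (F1, F2) pairs are approximate, illustrative averages only.
VOWEL_REFERENCES = {
    "[i] as in 'beet'": (270, 2290),
    "[ae] as in 'bat'": (660, 1720),
    "[a] as in 'hot'": (730, 1090),
    "[u] as in 'boot'": (300, 870),
}

def classify_vowel(f1, f2):
    """Return the reference vowel whose (F1, F2) pattern is closest."""
    return min(
        VOWEL_REFERENCES,
        key=lambda v: (VOWEL_REFERENCES[v][0] - f1) ** 2
                    + (VOWEL_REFERENCES[v][1] - f2) ** 2,
    )

print(classify_vowel(300, 2200))   # -> "[i] as in 'beet'"
print(classify_vowel(700, 1100))   # -> "[a] as in 'hot'"
```

The overlap between vowel regions, and the speaker-to-speaker shifts discussed below, are exactly what such a simple scheme cannot handle, which is part of the point of the sections that follow.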
Evidence for the Evolutionary Continuity of Speech Sounds
The fundamental characteristics of the production and comprehension of speech sounds either show clear evolutionary continuity with closely related species, or have been used analogously by other species for communicative purposes. For example, the anatomy of our larynx is not fundamentally different from that of other mammals (Negus 1949), and further, animals which use their forelimbs for climbing have well-developed larynges (Denes and Pinson 1963).[6] Humans are most closely related to the modern apes, with whom we share an upper body anatomy adapted to brachiation[7], which means that pre-linguistic hominids had probably already inherited a well-developed larynx from their proto-ape ancestors.
Also, there is nothing particularly unique about the production of the basic signal in humans as outlined above. It is known, for example, that in the bullfrog (Rana catesbeiana), formant frequencies produced by its vocal tract (which is quite different from ours) play a key role in its mating calls (Capranica 1965, Lieberman 1984). Capranica (1965) has shown that these bullfrogs will join in a chorus with a synthesized version of their mating call only if it has concentrations of acoustic energy at either F1 or F2, or both. Thus, other species are known to produce acoustic signals with some of the important features used in human speech, as well as to use these features to communicate.
It is also true that different animals, including humans, use similar sound characteristics to communicate the same kinds of underlying meanings. Other animals use pitch to indicate relative submissiveness (high frequency bias) vs. dominance/aggressiveness (low frequency bias), and this bias is also found cross-linguistically in humans (Kingston 1991; Ohala 1983).
It is also important to note that some aspects of human speech production are clearly derived from the basic physiology of an organism that breathes air. For instance, Lieberman (1984) points out that the cues that signal the end of a spoken sentence (falling fundamental frequencies of phonation and falling amplitude) "...follow from the physiology of the larynx and the segmenting of speech into episodes of expiration. At the end of an expiration the physiologic conditions that are necessary to initiate inspiration, the opening and detensioning of the vocal cords and the switch from positive to negative air pressure in the lungs, automatically generate the salient acoustic cues of the breath-group," (p. 122). This is an excellent example of how something may be found to be unique to human forms of communication, yet still not require a unique explanation, or "special mechanism."
Therefore, when we say that human speech is unique, we are not referring to the specific acoustic characteristics of individual sounds, or even necessarily of the meaning given particular signals. What is truly unique about speech is the manner in which these sounds are strung together into rapidly changing patterns.
In order to produce the complex sound sequences of language, humans have evolved remarkable neural control over the muscles of the face, larynx, pharynx, tongue, mandible, diaphragm, and ribs. These muscles are innervated by motor portions of several cranial nerves: 1) the mandibular division of the trigeminal (Vth) which controls the muscles of mastication (i.e. the movement of the lower jaw), 2) the facial (VIIth) which controls the muscles of facial expression, 3) the glossopharyngeal (IXth) which controls the stylopharyngeus muscle (and may also innervate portions of the superior pharyngeal constrictor muscle), 4) the vagus (Xth) which controls the levator veli palatini, middle and inferior pharyngeal constrictors, salpingopharyngeus, and all the laryngeal muscles, and 5) the hypoglossal (XIIth) which controls the muscles of the tongue (Carpenter and Sutin 1983). These motor fibers arise from various motor nuclei in the brainstem and constitute what may be considered the most basic level of speech control.
The motor nuclei are in turn connected to various other neuroanatomical regions. The motor nuclei for muscles of the face, jaw, and tongue receive direct projections from the various motor regions of the cerebral cortex, as well as indirect connections (via the reticular formation and central gray regions of the brainstem) with the prefrontal cortex, cingulate cortex (considered part of the limbic system), and diencephalon (Deacon 1989). The laryngeal musculature and the diaphragm, on the other hand, receive only indirect innervation (again, via the reticular formation and central gray regions of the brainstem) from the cingulate cortex and the diencephalon (Deacon 1989).
A number of other areas of the cortex are crucial for speech production, though they are not exclusively language related. Premotor and primary motor areas of the frontal lobe, for example, are involved in any conscious motor activity, whether or not these activities are language related. In general, the production of basic speech sounds involves a broad range of neuroanatomical components.
Evidence for the Evolutionary Continuity of Speech Production
The basic patterns of neural connections that control the musculature involved in vocalization are the same in humans and other primates (and mammals). The differences occur only in the relative proportions and emphases of the different connections (Deacon 1989). It is in this sense that use of the term "reorganization" in reference to evolutionary changes in the human brain is misleading (recall discussion in chapter 2). The basic rudiments of human neural connections are thought to be extremely old.
While we are a long way from fully comprehending the higher cortical control and origin of speech production, we do know that the basic cortical connections inferred from human clinical and electrical stimulation studies match connections found in axonal tracer studies of monkey cortical connections (Deacon 1988a, 1989; Galaburda and Pandya 1982). For example, in humans two areas involved in language processing in the cortex, the posterior inferior frontal convexity ("Broca's area") and Wernicke's area (located at the posterior third portion of the superior temporal gyrus), are connected by a major tract known as the arcuate fasciculus. This tract presumably allows for communication between the cortical region that mediates linguistic meaning and word forms (Wernicke's area) with the area that mediates speech production and grammatical competence (Broca's). Deacon (1984) has shown that the homologous areas of the macaque cortex share the same direct connections that we see in humans. Deacon (1988) notes that "...all of the major pathways presumed to link language areas in humans are predicted by monkey tracer data," (p. 368). As far as we are able to determine, monkeys have the same basic set of neural connections even though they do not have similar behavioral abilities.
It is generally assumed that speech perception in humans has required some sort of neuroanatomical evolutionary change. Lieberman (1984) argues that the keys to differentiating vowel sounds, formant frequencies, are not always clearly present in the acoustic signal. That is, formant frequencies are a function of the supralaryngeal vocal tract, but the harmonics produced by phonation do not necessarily closely match them. There may well be no harmonics produced that correspond exactly to the peaks of optimum resonance as determined by the shape of the supralaryngeal vocal tract for a particular vowel. Lieberman believes that humans are able to estimate the formant frequencies from the imperfect acoustic signal by a process of "analysis-by-synthesis" in which the listener has some inherent understanding of the possible combinations of formant frequencies that are available. This may indeed be true, but I believe it somewhat overstates the problem. The anatomical structure of the peripheral auditory system is constructed in a way that would appear to make the determination of formant frequencies from the typical human speech signal relatively simple.
The anatomical device responsible for translating acoustic energy into nerve impulses is the Organ of Corti, located in the cochlea of the inner ear (Denes and Pinson 1963). Acoustic energy is funneled down the external auditory meatus to the ear drum, which is connected via three small bones (malleus, incus and stapes) to a small membrane of the cochlea called the oval window. Vibration of the ear drum causes vibration of the oval window, which in turn causes viscous fluid in the cochlea to vibrate.
The cochlea is composed of two long tube-like, fluid-filled portions (scala vestibuli and scala tympani) separated by a membranous structure known as the cochlear partition. The Organ of Corti lies on the basilar membrane (one of two membranes separating the scala vestibuli from the scala tympani along their length). The basilar membrane is narrow (in breadth) near the oval window and broad at the other end (where the scala vestibuli and the scala tympani communicate). This arrangement causes different portions of the membrane to respond differently to the frequency of vibration at the oval window; the highest frequency vibrations cause the largest membrane displacement near the oval window, while lowest frequency vibrations cause the largest displacement at the far end of the scala vestibuli. There are nerve fibers attached to hairs along the length of the Organ of Corti, such that different nerve fibers will be maximally stimulated by different frequencies of sound.
Since the basilar membrane is continuous, sound transmitted at one frequency (thereby maximally displacing the membrane at a point specific to that frequency) will automatically stimulate adjacent nerve fibers. Thus it is not crucial for the speaker's vocal tract to produce acoustic energy at the exact frequency of a particular formant: the formant can be estimated automatically given the frequencies that are represented in the acoustic signal. In other words, the method of translating air pressure fluctuations (i.e., sound) into nerve impulses is "designed" to sample different frequency maxima. The areas of maximum displacement on the basilar membrane will therefore occur at locations that correspond to the formant frequencies of any vowel sounds that are transmitted to the cochlea. Our peripheral auditory system would appear to be ideally constructed to operate as a sound spectrogram analyzer.
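The point can be made concrete with a small numerical sketch (all values invented for illustration): even when the voice source puts energy only at multiples of F0, the peaks of the harmonic envelope, which is roughly what a bank of broadly tuned, overlapping frequency channels reports, land close to the true formants.

```python
import numpy as np

# Toy demonstration: a source with F0 = 110 Hz radiates energy only at
# multiples of 110 Hz, yet the formants of the (toy) vocal-tract filter
# at 500, 1500, and 2500 Hz can still be estimated from the envelope of
# those harmonics -- no harmonic has to fall exactly on a formant.

F0 = 110.0
harmonics = np.arange(1, 35) * F0             # source spectrum: multiples of F0

def vocal_tract_gain(freq, formants=(500.0, 1500.0, 2500.0), bandwidth=120.0):
    """Toy transfer function: a sum of resonance peaks at the formants."""
    return sum(1.0 / (1.0 + ((freq - f) / bandwidth) ** 2) for f in formants)

amplitudes = vocal_tract_gain(harmonics)      # what is actually radiated

# Estimate the formants as local maxima of the harmonic envelope, by analogy
# with points of maximal displacement along the basilar membrane.
estimated = [float(harmonics[i]) for i in range(1, len(harmonics) - 1)
             if amplitudes[i - 1] < amplitudes[i] > amplitudes[i + 1]]
print(estimated)   # roughly [550.0, 1540.0, 2530.0] -- near the true formants
```

The estimates are only as fine-grained as the harmonic spacing, but they are close enough to the true resonances to distinguish vowels, which is all the argument requires.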
Approximately 28,000 receptor neurons in each cochlea have fibers in the organ of Corti. Their cell bodies are located in the spiral ganglion of the cochlea. Axons from these neurons comprise part of cranial nerve VIII (Vestibulocochlear), which enters the brain stem at the level of the medulla. These axons synapse here at the cochlear nucleus. Each receptor neuron in the cochlea synapses with about 75 to 100 cells in the cochlear nucleus of the medulla, though there are only about three times as many cells in the cochlear nucleus as there are incoming fibers from the cochlea itself (Denes and Pinson 1963).
Axons from the cochlear nuclei synapse either directly on the inferior colliculus of the midbrain, or indirectly through cells in the superior olivary nucleus of the lower brainstem which synapse in the inferior colliculus. From here, fibers synapse on the medial geniculate body of the thalamus, which sends fibers directly to the primary auditory cortex of the temporal lobe. At the level of each of the brainstem nuclei there are numerous connections to the corresponding contralateral nuclei (Carpenter and Sutin 1983).
Evidence for the Evolutionary Continuity of Speech Perception
The auditory system did not appear in hominids for the express purpose of allowing the development of language, however. Our auditory system is essentially the same as is found in all higher vertebrates. Lieberman (1984) notes that the basic mammalian auditory system is about 200 million years old. The ability to extract the patterns of formant frequencies embedded in sound waves is not a unique adaptation for language. Our level of sensitivity to such patterns of frequency maxima may be more acute, but even this has apparently not been demonstrated. In this respect, humans are probably only quantitatively different from other species.
For example, it has been shown that mynah birds "copy" human speech by mimicking the relative changes in formant frequencies (they produce two different tones at a time, one from each side of the syrinx) (Klatt and Stefanski 1974; Lieberman 1984). Obviously, if they can copy the formant frequencies, they must be able to perceive them. Fouts et al. (1976) have shown that common chimpanzees (Pan troglodytes) can understand spoken English. Savage-Rumbaugh (1988) reports that one pygmy chimpanzee (Pan paniscus) named Kanzi responds correctly to a large array of spoken English words (even in strict double-blind experiments). He is also able to do this with computer-synthesized versions of the words. His ability to perform these kinds of tasks indicates quite clearly that pygmy chimps can hear the same kinds of phonemic distinctions that humans use.
Lieberman (1984, 1988) points out that our ability to follow streams of phonemes is much greater than our ability to follow a series of non-phonemic sounds. He argues that this indicates the evolution of unique detection mechanisms in humans for phonemes. However, this does not necessarily follow. The very fact that humans are able to detect non-phonemic sound sequences, albeit at a reduced rate, makes it entirely possible (and more realistic from an evolutionary perspective) that the human ability represents a fine-tuning of existing detection systems rather than the evolution of entirely new kinds of mechanisms just for this purpose. It should be noted that there is no direct evidence for unique neuroanatomical adaptations in humans for speech perception (though this is not proof that there are none).
Lieberman (1984) also points out that phonemes can be decoded by listeners even though they vary tremendously in acoustic characteristics from speaker to speaker (particularly in the specific frequencies of F0, F1, F2, etc.). Individual speakers have different length vocal tracts, which has the effect of shifting all the formant frequencies upwards (for short vocal tracts) or downwards (for long vocal tracts). Because of the position that [i] sounds occupy in the "vowel space" (low F1 and high F2), the shift does not cause [i] to overlap with other vowel spaces. [u] sounds, though they occupy a different part of the plot, are likewise not easily confused with other vowels produced by other speakers. This has led some to suggest that [i] and [u] may serve to allow the listener to calibrate the length of the speaker's vocal tract. Lieberman (1984) notes that "Human listeners probably make use of whatever cues are available to infer the probable length of a speaker's supralaryngeal vocal tract. For example, if the identity of a vowel is known, a listener can derive the length of the vocal tract that would have produced it." (p. 166)[8]
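This calibration idea follows directly from the tube idealization sketched earlier (again only an approximation). Since every resonance of a given articulation scales inversely with vocal tract length,

$$ F_n \propto \frac{1}{L}, $$

a vocal tract 15% shorter raises all the formants by roughly 18% while leaving their relative pattern intact; conversely, if the listener knows which vowel was intended, the absolute frequencies heard imply the length of the tract that produced them.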
The ability to adjust for differences in acoustic features between speakers indicates that humans have a fairly sophisticated pattern recognition ability built into our auditory system. The important information carried in the acoustic signal is conveyed by the pattern of formant frequencies, not the specific frequencies that are used. It is true that humans are good at extracting these patterns from the acoustic signal, but this ability is probably not limited to humans, though the relevant experiments have not been done. We do know that the general ability to detect patterns from series of specific examples (which vary from one another) is an ability that is not limited to humans. This ability is implicit in any kind of learning. The more the behavior of an organism is learned, as opposed to being instinctual, the more this kind of ability is essential. Without it, the organism would never be able to, for example, form a search image for food items. Each item the organism happened upon would have to be treated as a unique object.
There is indirect evidence that other animals can apply this general pattern recognition ability to human speech sounds, since we know that some animals can make adjustments for differences in the specific acoustic signal between speakers. Recall that the pygmy chimpanzee Kanzi demonstrated an ability to understand both normal spoken words and computer synthesized words. Since it is very unlikely that these two sets of words contained identical acoustic parameters, and since Kanzi showed a high degree of concordance in understanding both types of words, this set of experiments suggests that even the specific application of this pattern recognition ability can be accomplished by at least some non-human animals. Again, it would appear that humans do not have unique abilities, but rather simple extensions of abilities found in other animals.
Categorical Perception
An important feature of human perceptual ability for language involves the ability to sharply divide continuous changes in the acoustic signal into discrete categories. For example, the acoustical difference between /da/ and /ta/ lies in a difference in the start of vibration of the vocal cords following the beginning of the acoustic signal (the position of the tongue and the shape of the supralaryngeal vocal tract are the same for the two sounds). The time between the beginning of the acoustic signal and the initiation of vocal cord vibration is known as voice-onset-time, or VOT. For English speakers, a VOT of 0 milliseconds is perceived as /da/, and a VOT of greater than +50 milliseconds is perceived as /ta/ (Kuhl 1986). However, English speakers do not perceive the continuum of differences in VOT from 0 ms to +80 ms in a graded fashion. All VOT's from 0 ms to +20 ms are perceived as equally "good" /da/'s, and all VOT's of +50 to +80 are perceived as equally "good" /ta/'s (Kuhl and Miller 1975). There is a sharp perceptual transition between /da/ and /ta/ centered around a VOT of about +35 ms. Furthermore, the ability to distinguish between /da/ and /ta/ is greater than the ability to distinguish different /da/'s or different /ta/'s, even though the difference in VOT's within a category can be as great as the difference in VOT between categories. In other words, two speech sounds with VOT's of +25 ms and +45 ms are perceived to be more different from one another than two speech sounds with VOT's of 0 ms and +20 ms (Kuhl 1986; Kuhl and Miller 1975). Essentially the same phenomenon has been found for both the continuum between /ba/ and /pa/ sounds and the continuum between /ga/ and /ka/ sounds, except that the transition boundary for /ba/ and /pa/ is centered around a VOT of about +27 ms and the transition boundary for /ga/ and /ka/ is centered around a VOT of about +42 ms (Kuhl and Miller 1978).
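For concreteness, the boundary values just cited amount to a simple step function over VOT. The sketch below uses the approximate English boundaries reported above, with simplified labels of my own:

```python
# Approximate English VOT boundaries (ms) for three stop-consonant pairs,
# as cited above; values are rounded and purely illustrative.
VOT_BOUNDARY_MS = {("b", "p"): 27, ("d", "t"): 35, ("g", "k"): 42}

def perceived_stop(vot_ms, pair=("d", "t")):
    """Label a stimulus as the voiced or voiceless member of a stop pair."""
    voiced, voiceless = pair
    return voiced if vot_ms < VOT_BOUNDARY_MS[pair] else voiceless

# 0 ms and +20 ms are both heard as /da/; +25 ms and +45 ms, although also
# only 20 ms apart, straddle the boundary and are heard as /da/ versus /ta/.
print([perceived_stop(v) for v in (0, 20, 25, 45)])   # ['d', 'd', 'd', 't']
```

What the step function cannot show is the second half of the phenomenon: discrimination within a category is poorer than discrimination across the boundary, even for equal physical differences in VOT.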
"Categorical perception" has also been demonstrated for other kinds of phonemic distinctions. Mattingly, et al. (1971) demonstrated categorical perception for the /b¾/-/d¾/-/g¾/ continuum. This distinction depends not on VOT, but on differences in the second formant transition, which are caused by differences in the place of articulation of the tongue (differentiating these phonemes is therefore referred to as "place discrimination"). Miyawaki et al. (1975) have demonstrated categorical perception in the distinction between /ra/ and /la/ in English (but not Japanese) speakers, which depend on differences in the third formant transition (also caused by differences in the place of articulation of the tongue).
Snowdon (1990) points out that the categorical nature of the perception of phonemes is fundamentally different from the perception of other dimensions of auditory stimuli, such as basic duration, frequency, and intensity of tones, which have been shown to be perceived in an essentially continuous fashion (Divenyi and Sachs 1978; Kuhl 1986). It has also been shown that infants as young as one month old demonstrate categorical perception, which implies that this ability is not simply learned through years of experience (Eimas 1974, Eimas 1975, Eimas et al. 1971).
Evidence for the Evolutionary Continuity of Categorical Perception
Categorical perception was initially thought to indicate that humans had evolved unique neurological adaptations for decoding the speech signal (Kuhl 1986). It had also been found that some non-speech sounds were perceived categorically by humans (Miller et al. 1976), perhaps indicating that the explanation lay in some kind of general mammalian auditory phenomenon, but these results were usually explained as simply an over-broad application of a special speech perception mechanism (Kuhl 1986). However, when the appropriate experiments were performed on a variety of non-human animals, it became clear that categorical perception was not a phenomenon that was limited to humans. Initial comparative experiments succeeded in demonstrating categorical perception in chinchillas for the VOT distinction between /da/ and /ta/ (Kuhl and Miller 1975), as well as between /ba/ and /pa/ and between /ga/ and /ka/ (Kuhl and Miller 1978). What was perhaps most interesting, however, was that even though humans perceive different phonemic boundaries for each of these VOT continua (see above), the chinchillas nevertheless demonstrated essentially the same boundaries as humans for each of these phonetic continua. Subsequent work with macaques succeeded in demonstrating that this species also showed the same kind of categorical perception as had been shown for humans, not only for VOT (Kuhl and Padden 1982), but also for place discrimination (Morse and Snowdon 1975; Kuhl and Padden 1983). More recently it has been shown that Japanese quail can learn to categorize phonemes just as humans do (Kluender et al. 1987).
Furthermore, Snowdon (1990) reviews studies showing that categorical perception of species-specific vocalizations has been demonstrated in several other species: Snowdon and Pola (1978) demonstrated categorical perception for trill duration within pygmy marmosets, Masataka (1983) found evidence for categorical perception in the alarm calls of Goeldi's monkeys (Callimico goeldii), and Ehret (1987) demonstrated it in mice (Mus musculus) for their pups' calls. I am not aware of any studies of the ontogeny of categorical perception in other species.
Clearly, the phenomenon of categorical perception can no longer be considered evidence for an adaptation designed by natural selection specifically for human speech. Lieberman (1984) argues that a general mammalian auditory mechanism is unlikely for the categorical perception of place discrimination because rhesus macaques can discriminate between different formant transition patterns that exist within human phonemic categories (i.e., they make distinctions that humans do not make; Morse and Snowdon 1975). It is certainly possible that some aspects of the perception of human speech are due to evolutionary changes occurring in human evolution. In general, however, it is likely that during the evolution of language sounds would have been adopted both for their ability to be clearly distinguished by the existing perceptual systems and for ease in being produced by the existing vocal apparatus. Selection would have operated on both of these systems simultaneously. One scenario suggested by Kuhl (1986) is that "...speech capitalized on a set of existing auditory boundaries, but then elaborated on them, perhaps to take into account constraints inherently imposed by articulation," (p. 261). Early hominids would most likely have 1) made use of inherently obvious acoustic differences, and 2) enhanced these existing perceptual abilities. It therefore makes perfect evolutionary sense for modern humans and monkeys to share features of categorical perception while differing in the degree of discriminatory ability (Lieberman 1984).
While the experiments that have so far been performed do not support the general idea of language-specific perceptual mechanisms, Lieberman (1984) nevertheless believes that other dimensions of the speech sounds (such as VOT distinctions based on burst amplitude, aspiration amplitude, and vowel duration) probably do require some kind of special mechanism. However, it is clear that the mere demonstration of some kind of sophisticated or complex acoustic perceptual ability in humans does not license us to assume the existence of special language mechanisms. All the dimensions of categorical perception that have so far been tested on non-human animals show unmistakable parallels with human abilities, and the burden of proof now lies with those who claim that special perceptual mechanisms exist in humans. As Kuhl (1986) notes: "No one predicted that monkeys and chinchillas would divide speech continua the way they do, so it does not seem wise to advance the claim that this or that phenomenon surely won't be demonstrated," (p. 254, italics in original). Only more research with non-human animals will allow us to discriminate between what did and did not evolve specifically for speech.
Thus, it would appear that categorical perception does not constitute even indirect evidence of neuroanatomical changes related to language evolution.
There are clearly some aspects of language that evolved because of pre-existing constraints. Lieberman (1984) notes that humans can produce non-nasal sounds because of our lowered larynx. We know that, on purely acoustic grounds, nasal sounds are inherently harder to distinguish than non-nasal sounds. Furthermore, we know that languages that use nasalized vowels are not very common (Greenberg 1963). Lieberman (1984) argues that the lowered larynx in humans evolved to make the production of non-nasal sounds easier. There was, however, evidently no evolution in the detection apparatus to enable us to hear nasal sounds clearly. This kind of evolutionary change, where one system (in this case the production device) evolves in response to some kind of pre-existing constraint (the inherent "sloppiness" of nasalized signals), is quite easy to explain. However, even though the mechanisms of speech production and speech perception are physically separate systems, their evolutionary histories were clearly not independent of one another.
Lieberman (1984) believes that speech production and speech perception mechanisms are coevolved systems, but he discusses them in separate sections. He does acknowledge that it would be useful to "...differentiate between speech mechanisms that have evolved for the specific end of facilitating vocal communication and auditory mechanisms subject to the general constraints of audition that do not reflect the selective pressure of communication" (p. 170). He goes on to state that "The distinction can be tricky since particular linguistically relevant sound contrasts may have evolved in human languages to take advantage of auditory mechanisms." (p. 170, my italics). I would argue that this is not just a possibility, but that we should expect it to be the rule.
If we accept that behavior evolves and is not created instantaneously, we are implicitly recognizing that there is a continuum of intermediate steps between our present ability and our past inability. These intermediate steps necessarily involve small adaptations on what previously existed. What can occur is constrained by what has occurred. If there is selection in some direction (towards linguistic ability, for example) and if there are several evolutionary paths that potentially lead in that direction, the actual path taken is always more likely to occur along the path of least resistance. If there is some selective advantage to communicating at any given moment in time, individuals that utilize existing anatomical/auditory/neurological structures will necessarily have an advantage over those who wait generations for unique adaptations to arise. Furthermore, it is usually easier for the individual organism to change its behavior than to change its biology. If some kind of change will benefit the individual organism, such as more complex communication ability, it is much more likely to occur on a behavioral level first, using whatever biological adaptations are already available to it. Some individuals will have biological adaptations that better enable them to accomplish the behavioral changes that confer a selective advantage. A feedback process will occur over succeeding generations between increasing behavior and increasing biological change, such that these aspects are probably never, in principle, separable. The evidence discussed in the section on categorical perception clearly points to an evolutionary process in which existing behavioral abilities are capitalized upon, not one in which new abilities are created from scratch. This is exactly what we should expect (Lieberman 1984).
Lieberman (1984) quite rightly argues that the communicative signals and the perceptual equipment of any species are coevolved systems, but by separating his analysis of the two systems, he leaves the impression either that 1) the anatomy of the supralaryngeal vocal tract evolved towards the production of sounds which would not have been detectable by the existing neural mechanisms, or 2) that the neural mechanisms evolved toward the ability to detect sounds which would not have been producible by the existing supralaryngeal vocal tract. Neither of these scenarios can be considered real possibilities. Each and every allelic change along the evolutionary path must necessarily have been adaptive in the context of the constellation of alleles that existed at that time. This means that evolutionary changes in the ability to produce sounds would most likely have occurred in a direction that was, at each step along the way, inherently easier for the existing neural detection mechanisms. Concurrently, each and every evolutionary change in neural detection mechanisms would most likely have occurred in a direction which allowed a greater ability to detect sounds that were being produced by the existing anatomy.
Thus, while coevolution of these two systems made humans different from other organisms, we should expect human neuroanatomical evolution to represent quantitative changes, not entirely unique detection or production mechanisms. Lieberman (1984) clearly recognizes this necessity when he states that "Though I believe that the weight of evidence is consistent with the presence in human beings of neural mechanisms matched to the acoustic characteristics of human speech, it obviously must be the case that the initial stages of vocal communication in archaic hominids must have been based on neural mechanisms that are more general" (p. 169).
There are three dimensions to language semantics that will be discussed. The first concerns the representation of the semantic concepts that underlie word and sentence meanings, the second concerns the actual words themselves, and the third concerns the interface between the semantic concepts and the words. Areas of the cortex that focus on each of these three components can be found for many different types of words and associated meanings (Damasio and Damasio 1992).
The purpose of language is to communicate information. Thus, a key component to language is the ability to connect up arbitrary sequences of sounds (i.e., words) to concepts, ideas, and objects. This aspect of language, more than any other, would appear to involve vast amounts of cortex. The first dimension of this process is the formation of semantic concepts in the brain. This is not strictly a linguistic process, but it is an absolutely crucial foundation for language, and probably is the key to language evolution. It is not known exactly how semantic concepts are formed, but a general picture can be drawn of the process involved.
The semantic concept underlying a word has a number of dimensions. For example, the word "child" evokes a number of associations, including such things as small size, boundless energy, naiveté, lack of knowledge about the world, curiosity, etc. In fact, it may be that semantic concepts are simply the web of associations themselves, not a single entity located somewhere in the brain. This idea is consistent with the work on neural networks, in which knowledge is not contained in any one place, but is distributed throughout the network (Lieberman 1984).
Some of the associations attached to a specific semantic concept are concrete (e.g., physical size, color, texture, smell, etc.), and some are increasingly abstract (e.g., energy level, curiosity, attractiveness, etc.). The concrete associations are aspects of the semantic concept which are directly perceived by one of the primary senses. Recent work with neuroimaging techniques (PET and functional MRI) has shown that when we imagine a visual image, the same cortical areas are activated as when the actual image is seen (Posner and Raichle 1994; see also Cohen et al. 1996). It is also known that increasingly sophisticated and abstract processing of primary sensory data occurs in areas adjacent to the primary sensory cortices. For example, Damasio and Damasio (1992) report that damage to the occipital and subcalcarine portions of the right and left lingual gyri causes achromatopsia (loss of color perception) only, without loss of other aspects of visual experience. Another area of the prestriate cortex, known as V5, appears to be involved specifically in processing black and white moving objects (Zeki 1992). Thus, the farther one goes from primary sensory cortex, the more abstract the processing appears to be, and the more abstract the form of knowledge that is represented there.
Semantic concepts appear to be stored in a distributed fashion over the whole cortex, with multidimensional concepts involving webs connecting widespread locations that process many different kinds of information in many different ways. This model is consistent with studies of amnesia, which suggest that memory formation is dependent on medial temporal structures (primarily the hippocampus) and midline diencephalic structures (e.g., mammillary bodies), but that memories themselves are not kept in these locations. Damage to the hippocampus results in the loss of long term memory formation, yet complex memories persist of events that occurred prior to the damage (Squire 1987). This indicates that the memories must be located in other areas.
Neuroanatomically, the linguistic representation of semantic concepts appears to be separable from the semantic concepts themselves (Damasio and Damasio 1992). Damage to posterior perisylvian areas (including Wernicke's area and adjacent cortex) affects the construction of words from phonemes, and can also affect the selection of whole word forms (but does not affect overall fluency of the output, or the general syntactic structure of the sentence). Damasio and Damasio (1992) note also that damage to these areas affects the processing of speech sounds into words, which results in comprehension difficulties.[9] They suggest that the auditory and kinesthetic records of phonemes and phoneme sequences are stored here. Neuroimaging studies in which a subject simply listens to words show activity in areas approximating Wernicke's area (Posner and Raichle 1994). In contrast, passively reading words produces activity primarily in areas approximating the visual cortex in the occipital lobe (Posner and Raichle 1994).
In order for the semantic concepts to be communicated via words, there must be a way in which these concepts (and their interrelations) are reliably translated into words and sentences. Clinical studies of brain damaged patients point to anterior and mid-temporal cortex as areas crucial for this process with respect to at least some classes of nouns (Damasio and Damasio 1992). Patients with damage here appear to have normal understanding of semantic concepts, in that they can correctly describe and discuss all sorts of aspects and relations of a particular object, yet they cannot reliably produce the name of the item in question. However, they do not have trouble with all semantic units denoted by nouns. Such things as fruits, vegetables, animals, and musical instruments are harder to name than tools, utensils, and colors, for instance. It is not clear why this is true, or whether this indicates a fundamental conceptual division (it is hard to see what the troublesome words/concepts have in common).
In contrast, the area responsible for associating color terms with the perceptual features of different colors appears to be located in the temporal portion of the lingual gyrus (in anterior-inferior occipital cortex). Patients with damage localized in this region are unable to give the correct name for a particular color, even though they can sort objects by color and otherwise demonstrate that they have and can use the semantic concepts of color (Damasio and Damasio 1992).
The localization of the areas in which semantic concepts are translated into (or matched up with) actual word forms is an area of ongoing research. There are apparently other areas of the cortex which are crucial to this process, but which have not yet been adequately mapped. For example, Alexander et al. (1989) note that damage to the left operculum (posterior part of Broca's area) and adjacent lower motor cortex results in word-finding problems (however, this may be due to difficulty finding the correct motor sequence for the word, and not the word itself).
It goes without saying that pre-linguistic hominids must have had mental concepts to convey to one another, or they would never have begun talking. Did mental concepts originate with hominids or are they ubiquitous among living things? Several lines of evidence point to a quantitative difference between human and non-human organisms in semantic complexity, combined with a similar difference in the ease of attaching arbitrary symbols to different semantic concepts. First I will make a few general observations with respect to simple organisms, before discussing evidence that apes (and other species) not only have semantic concepts, but can also use arbitrary symbols to communicate about them.
I agree with Bickerton (1990) that the formation of mental concepts, in the form of categories, is actually quite fundamental to the existence of life. Bickerton (1990) points out that an "...essential difference between animates and inanimates is that the former are continually acting in their own interests to maximize their chances of survival, whereas the latter aren't" (p. 77). This means that it is essential for animates to react appropriately to important information in the environment. Bickerton adopts Gregory Bateson's (1979) definition of information, which states that it is "a difference that makes a difference," (pp. 62-63). In order to use this information, the organism must be able to detect it, and this means that it must differentiate it from the other less important aspects of the environment. In the most primitive organisms, the amount and variety of information that can be detected is quite limited. Nevertheless, if an organism reacts differently to some kinds of information than to others, then it must have some form of representation of this information. For example, some 'sensitive plants' respond to touch by folding their leaves together. The plant's biochemical responses to this information are, of course, separable from the actual touch, and these biochemical responses can therefore be considered the plant's representation of physical contact. Bickerton points out that the ability to "...differentiate two states where others differentiate none..." (p. 79) is a crucial step in the evolution of representational systems. It is the beginning of category formation. In general, we can conclude that if some living thing shows a differential response to different kinds of information in its environment, it must have some sort of "category" for each of these "kinds of information," even if these are simply different neural pathways.
Some phylogenetic lineages, but not all (bacteria are still extremely successful life forms), evolved in the direction of increasing complexity, which included increases in the amount and variety of information that they were able to glean from their environments. This process ultimately led to the development in multicellular organisms of many different senses: smell, sight, hearing, touch, temperature, pain (which is actually a representation of excessive or dangerous levels of stimulation of the other senses), and others. At the same time, many organisms evolved increasingly complex processors of these new sources of information. In the simplest organisms, there is a direct connection between sensory input and some response. More complex organisms retain these kinds of direct connections for important simple responses (these are often called "reflexes"), while at the same time adding on the ability to evaluate many kinds of sensory information before "deciding" how to respond.
Increasingly complex organisms form an increasingly large number of categories. Bickerton (1990) points out that "The sea anemone, for instance, divides the world into 'prey' and 'nonprey', then divides the latter category into 'potential predators' and 'others', while the frog divides the world into 'frogs', 'flying bugs', 'ponds', and perhaps a few other categories like 'large looming object (potential threat)'," (p. 87). The kinds of categories that can be formed by complex organisms are not limited to specific sets of objects, like 'flying bugs' or 'ponds'. If they have multiple senses interconnected with each other, they can form more abstract categories such as 'running', 'sleeping', or 'friendship'. As the mental complexity of an organism grows, the number and kinds of categories it can form will also grow. It is important to note, however, that the concepts recognized by one species may not be recognized by another. Dogs, for example, are one of many species that cannot differentiate colors the way humans do. Humans cannot hear the acoustic echoes that bats use to differentiate between an insect and a bird. Each species has evolved to pay attention to (i.e., form categories of) those parts of the environment which became most important for its own survival.
Bickerton (1990) believes that because the categories that make up one species' representation of the world may not be the same as another species' categories, we cannot speak of there being a "true view of the world." However, it is false to conclude that because a particular species' representation of reality is limited, it is therefore somehow 'untrue'. It is more accurate to say that a given species' representational system covers a subset of the various dimensions of reality. In fact, the representational systems of many species overlap. Bickerton (1990) himself notes that pigeons have been shown to have visual categories for such things as 'people', 'trees', 'fish', and even 'Snoopy cartoons' that are essentially the same as our own (Herrnstein 1979). This clearly shows that, to a significant extent, human languages and cultures have made use of categories that are 'real' to a wide variety of animals.
It is also known that some animals communicate different concepts vocally. Snowdon (1990) provides a thorough review of the literature on this topic. Seyfarth et al. (1980) showed that vervet monkeys use three different alarm calls that are specific to three different types of predator: eagles, snakes, and leopards. The behavioral responses to these alarm calls are quite different: upon hearing the eagle call, the vervets ran down to the ground and sought cover, whereas the leopard call induced them to run into the trees. By repeatedly playing back a recording of the same individual's call before switching to a different alarm call, Cheney and Seyfarth (1988) were able to show that habituation by a group of vervets to one of these calls was not transferred when the call was changed to one signifying a different kind of predator. This strongly suggests that the signals really do carry semantic meaning.
Furthermore, this kind of phenomenon may not be limited to primates. Owings and Hennessey (1984) have shown that California ground squirrels generally give different calls for ground predators than for aerial ones. They behave differently in response to the two calls, but it appears that the true semantic meaning of the calls may relate to the urgency of the need to respond. The 'aerial' call is often given when a ground predator has approached very close without detection, and the 'ground' call is often given in response to an aerial predator that is far away. Snowdon (1990) has argued that, "Although the relative distance of a predator or urgency of response can also be considered to be symbolic, most authors have used symbolic communication as referring to specific objects in the environment," (p. 230). Nevertheless, I do not see how we can escape the conclusion that the two calls have different semantic meanings.
Snowdon (1990) reviews several studies of food-associated calls and argues that the semantic meaning behind many of these calls relates more to motivational factors (e.g., more calls are given when more food, or higher quality food, is found) than to actual food types. Examples of this kind of calling have so far been found in chickens (Marler, et al. 1986a, 1986b), cotton-top tamarins (Snowdon 1990), toque macaques (Dittus 1984), and chimpanzees (Hauser 1987). Snowdon (1990) points out that these are generally not considered examples of symbolic communication. Again, I would argue that these calls relate to specific emotional categories, and that there is therefore no reason to differentiate them, in principle, from other kinds of symbolic communication. It is clear that these sorts of symbols are not nearly as complex or varied as those used by humans, but I do not see how we can escape the conclusion that they are symbols nonetheless. These animals are not transmitting the emotion itself; they are transmitting a vocal symbol of their emotional state.
Infant rhesus and pigtail macaques have been shown to use five discrete types of screams to elicit aid upon feeling threatened (Gouzoules et al. 1984; Gouzoules and Gouzoules 1989). Snowdon (1990) suggests that these different calls could simply indicate the degree of emotional arousal in the infant, but he points out that they do not grade into one another as one would expect if they only coded the level of arousal. He also notes that when the taped screams were played back to the mothers, they responded in a manner indicating that there was more encoded in the call than the simple emotional state of the infant. The mothers did give the greatest response (defined as the shortest latency and the longest duration of response) to the calls that indicated the highest probability of the infant being physically hurt, but their second strongest response was to calls that indicated a challenge to the dominance hierarchy along with the least probability of aggression to the infant.
Snowdon (1990) also notes that there are many examples of animal signals which are given differently depending on who is near the signaler ("audience effects"). Male chickens, for example, rarely give food calls in the presence of other males, even though they do in the presence of females (Marler, et al. 1986a, 1986b), and they give alarm calls only when conspecifics are present (Karakashian 1988). Ground squirrels (Spermophilus beecheyi) give alarm calls more frequently in the presence of danger if there are kin nearby (Sherman 1977), and the same effect has been shown for vervet monkeys (Cheney 1985). As Snowdon (1990) points out, these studies indicate that "...at least some animal communication cannot be reduced to simple reflexes or fixed-action patterns," (p. 233). These examples indicate quite clearly that humans are not the only organisms to have mental concepts and to have labels for them to facilitate communication with others.
Perhaps the most impressive example of this comes from a grey parrot (Psittacus erithacus) named Alex, trained and studied by Irene Pepperberg. Alex is interesting because he can communicate using English words. He has shown the ability to identify, request, refuse, quantify, and comment on more than 100 objects that vary in color, shape, and material (Pepperberg 1990). He can also categorize objects by color, shape, and material, and can even report the category in which two objects are the same and the category in which they differ (all the while giving his answers in English).
Further examples of non-human use of symbolic referents in communication come from those species which are phylogenetically closest to us: the great apes. Aside from humans, these creatures also have the largest brains among the primates, as noted in chapter 2. Most of these studies have focused on the common chimpanzee (Pan troglodytes), although there has been recent work with pygmy chimpanzees (Pan paniscus), and there have been studies of gorillas and orangutans.
Snowdon (1990) reviews the ape studies very concisely. He notes that the initial attempts at demonstrating symbolic abilities in common chimpanzees focused on teaching chimpanzees to speak English by raising them at home as if they were human children (Kellogg and Kellogg 1933, Hayes and Hayes 1951). These attempts were singularly unsuccessful at teaching them human speech. It became clear that their inability to speak might well be distinct from their ability to use symbols to communicate, and the focus therefore shifted to attempts at teaching them to communicate via some other channel. Starting in the mid-1960's, the Gardners attempted to train a chimpanzee, named Washoe, to communicate using American Sign Language (Gardner and Gardner 1969). Washoe and three other chimps raised by the Gardners attained vocabularies of between 122 and 168 signs (Gardner et al. 1989). Similar successes were reported for a gorilla (100 signs; Patterson 1978) and an orangutan (127 signs; Miles 1990).
However, objections were raised by Terrace et al. (1979), who also had attempted to train a chimp (Nim Chimpsky) to use American Sign Language. Among Terrace, et al.'s criticisms was the lack of control for possible "Clever Hans" (i.e., subtle cueing) effects. They argued that, for their data at least, Nim's use of the signs could easily be interpreted as simply copying his trainers without having any real understanding of the meanings of the signs. However, Gardner and Gardner (1984) showed in double-blind tests that cueing could not explain their chimps' performance, since the chimps could correctly name objects that the experimenter/observer themselves could not see.
Terrace et al. (1979) also criticized the Gardners for reading too much into Washoe's abilities. One widely quoted example is of Washoe, upon seeing a swan on a pond, signing the words for "water" and "bird" in immediate succession. Terrace believed that Washoe was simply signing what she had seen, i.e., water and a bird, in contrast to the Gardners, who believed she was creating a new word. Nevertheless, it appears that chimps can associate objects with appropriate signs, and this means they must have concepts in mind to connect to the signs.
Savage-Rumbaugh et al. (1980) argued that the sign language studies were flawed in that many signs were not really symbolic but actually iconic. Although Gardner et al. (1989) contest this argument, other studies were designed which involved the use of "truly arbitrary" symbols in a manner such that possible ambiguities could be controlled. Premack (1972) used various plastic shapes to symbolize objects, and his chimp (Sarah) was given only two choices as possible responses to questions. Rumbaugh et al. (1973, see also Rumbaugh 1977) used lexigrams on a board which allowed the chimp (Lana) to select symbols in the appropriate order to ask for various items. Gardner and Gardner (1978), in turn, pointed out that these studies asked little more from the chimps than for them to learn simple discriminations (Premack) or lists (Rumbaugh), and did not allow for any creativity. Snowdon (1990) points out that this severely limits our ability to judge exactly what the chimps were learning.
Nevertheless, Premack's chimp Sarah demonstrated, quite convincingly, that chimps not only have semantic concepts, but also that they can assign and use arbitrary symbols to communicate information about them. When asked to provide the color and shape of apples, she correctly chose the symbols for "red" and "circle", even though her icon for apple (which was used to ask her the questions) was a blue triangle. This is as clear a demonstration as possible that chimps have the basic neural structures for the semantic underpinnings of language (Desmond 1979). Subsequent work by Savage-Rumbaugh (1986) showed that chimps could be trained to use arbitrary symbols to ask for specific items from an array, to ask for items which were out of sight, to respond to symbols requesting items from another room, and ultimately to request another chimp to get items for them.
One pygmy chimpanzee (Pan paniscus) named Kanzi is particularly remarkable in his ability to use symbols in a linguistic manner. Kanzi was presented with 660 commands asking him to behave in certain ways with particular items (Savage-Rumbaugh et al. 1993). Many of these commands had never been presented to him before (e.g., "Pour the lemonade in the Coke."). The final 416 of these trials were done blind, in which the person giving the commands was not visible to Kanzi, and the person with him either covered their eyes (for the first 100 blind trials) or wore headphones playing loud music (for the remaining 316 blind trials) to ensure that they would not inadvertently cue him. Overall, Kanzi responded correctly on 72% of trials (74% on the blind trials). Furthermore, the vast majority of incorrect responses were partially correct, indicating that he had at least some understanding of the spoken sentence. In only 1% of the 660 commands was he wrong in all aspects of his response. Obviously, he must have understood the connections between the symbols (the spoken English words) and their referents in order to accomplish this feat.
The fact that chimps have been trained to communicate in these ways seems to me to be unequivocal evidence that they are able to 1) form mental concepts, 2) assign arbitrary symbols to these concepts, and 3) communicate specific ideas concerning these concepts via purely symbolic means. However, this is not to say that their abilities are identical to those of humans. Snowdon (1990) points out that, "Although the abilities of Kanzi and his companions are remarkable and come very close to some of the capacities shown by young children, there still appear to be limitations. Bonobos [pygmy chimpanzees] and chimps appear to be more limited in the topics that they find interesting to communicate about." (p. 222, italics added). This is an extremely important point. The difference between humans and non-humans does not lie in the ability to create concepts, or even to assign symbols to these concepts. The difference lies in the complexity and variety of the concepts that they are able to form. This is a difference in degree, not in kind.
It is even possible to make a rough quantitative assessment of this difference. Studies of the linguistic abilities of apes typically report vocabulary sizes on the order of a few hundred items. Miles (1990) reported that the orangutan Chantek learned 127 signs. Kanzi responded correctly at least 75% of the time to 149 English words in a double-blind vocabulary study (Savage-Rumbaugh 1988). Estimates for humans, in contrast, are orders of magnitude higher. It has been estimated that the reading vocabulary of the average high school student is in the neighborhood of 40,000 items (defining groups of related words, such as "write," "writes," "written," "writer," etc., as single vocabulary "items"; Miller and Gildea, 1991). If we add proper names, the estimate approaches 80,000 items, which means children are presumably learning new words at an average rate of about 13 per day from age one to age 17 (Miller and Gildea, 1991). This suggests, purely as a rough guide, that the difference in cognitive complexity between apes and humans is on the order of a 200- to 400-fold change. Such a huge numerical difference would surely engender differences that would appear qualitative in nature, even though the steps necessary to get from apes to humans in this regard can obviously be seen as incremental.
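Purely as an illustration, the arithmetic behind these estimates can be sketched as follows. The figures are the round numbers quoted above, and the ape figure of 200 items is simply a representative upper-end value chosen for the illustration; this is a back-of-the-envelope check, not new data.

```python
# Back-of-the-envelope check of the vocabulary figures quoted above.
# All numbers are the round estimates cited in the text (Miller and Gildea 1991);
# the ape figure of 200 items is a representative upper-end value chosen for
# illustration, not a measurement.

human_vocab_low = 40_000   # reading vocabulary of an average high school student
human_vocab_high = 80_000  # the same estimate with proper names included
ape_vocab = 200            # representative upper-end ape vocabulary ("a few hundred items")

# Average rate of word learning from age one to age 17 (16 years of learning).
words_per_day = human_vocab_high / (16 * 365)
print(f"roughly {words_per_day:.0f} new words per day")  # about 13-14 per day

# Fold difference in vocabulary size between humans and language-trained apes.
print(f"{human_vocab_low / ape_vocab:.0f}- to {human_vocab_high / ape_vocab:.0f}-fold difference")
```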
It is clear that humans have a much richer set of categories than sea anemones, frogs, and apes. We also have much more complex interactions with the external world. This world includes other complex humans with whom we must interact in order to survive. At the same time, as I have pointed out above, we have brains that are much larger than are necessary to maintain bodies of our size. I find it hard to believe that this size increase does not relate directly to the difference in the degree of richness of our mental worlds, as Jerison (1985) and others have argued. I see no reason why we must assume a qualitative difference (in the sense of some fundamental change in the way humans use symbols) to explain this difference in the richness of our semantic worlds.
One caveat should be noted, however. Irene Pepperberg's (1990) work with Alex shows that parrots can not only learn a large number of semantic units but can also form categories of these units. Alex apparently knows the English words for roughly 70 different items, the numbers one to six, several colors, and several types of material (Pepperberg 1990). Thus, Alex's abilities approach those of chimps, even though he has a much smaller brain. It would appear either that there is a qualitative difference in brain organization between primates and parrots, or that there is more to brain size differences than simple changes in the complexity of the internal representation of reality. Pepperberg (1990) points out that avian intelligence appears to be subserved by striatal (as opposed to cortical) areas, which suggests that there may in fact be qualitative differences between primates and birds with respect to functional brain organization (Stettner and Matyniak 1968, Hodos 1982). There may also be more to brain size increases than changes in the complexity of semantic organization, though these other effects might well not be related to linguistic ability. The difference in brain size between parrots and chimps clearly is not due to language as we know it, since neither displays an obvious language in the wild.
Syntax refers to the set of rules which allow us to encode specific information about the interrelationships between various semantic units. The rules in English which let us distinguish between "The tiger killed the hunter" and "The hunter killed the tiger" are simple examples of syntax. Syntax is a part of grammar, but grammar also includes other language rules, such as knowledge of which sound patterns (or hand movements, in the case of sign language) connect with which semantic units (i.e., the meanings of words). These non-syntactical aspects of grammar have been discussed in the previous section on semantics. The evidence for localization of syntax processing is not as detailed as in semantics, though there are a number of clues that generally point to areas in the frontal lobe.
Studies of aphasic patients show that damage to the anterior peri-sylvian areas often results in various kinds of syntactic difficulties (Damasio and Damasio 1992). Broca's aphasia, defined clinically, includes such characteristics as reduced grammatical complexity in output as well as comprehension problems for material dependent on syntax (Alexander, et al. 1989). However, it also includes a variety of other linguistic problems, including impaired articulation, disturbed melodic line, phonemic and semantic substitutions, reduction in phrase length, and generally effortful speech output (Alexander, et al. 1989). Broca's aphasics have damage to the posterior, inferior frontal lobe in their language-dominant hemisphere (areas known as the pars opercularis and pars triangularis), but it turns out that damage to this area alone is not sufficient to produce the suite of behavioral deficits that define Broca's aphasia (see references in Alexander, et al. 1989). Permanent Broca's aphasia occurs in patients with widespread frontal damage, including not only the pars opercularis and pars triangularis, but also the posterior portions of the middle frontal gyri (superior to "Broca's area"), the anterior-superior portions of the insula (which is found on the deepest portion of the sylvian fissure), portions of the periventricular and subcortical white matter, the head of the caudate, lateral portions of the putamen, and even portions of the anterior parietal region (Signoret et al. 1984, Alexander et al. 1989). The fact that Broca's aphasia results from damage to such wide areas of cortex and subcortex leads Alexander et al. (1989) to conclude that, "...study of Broca's aphasia provides little insight into either the anatomy or the psycholinguistics of frontal language deficits." (p. 676).
Instead, Alexander et al. (1989) suggest that studies of damage to more circumscribed areas provide a better window into the functional organization of language in the brain. They review evidence showing that aspects of grammar and/or syntax are impaired with damage to specific areas of the frontal cortex and to the underlying white matter and subcortical structures. Specifically, damage to the left lower motor cortex (on the precentral gyrus) plus the left pars opercularis (immediately anterior to the motor cortex) and the white matter deep to these areas results in restricted syntax in speech and difficulty comprehending sentences with complex syntax. Alexander et al. (1989) suggest this is due to disconnections between frontal and posterior language areas. Damage to the dorso-lateral frontal cortex (roughly the mid portion of the middle frontal gyrus and adjacent cortex), which is often accompanied by damage to the adjacent pars opercularis, can result in problems generating correct grammar and comprehending syntax. It also often results in problems of word selection, though the problem occurs primarily for words that have significant grammatical roles (e.g., conjunctions, pronouns, verbs) as opposed to more concrete nouns (Damasio and Damasio, 1992).
Nadeau (1988) reports detailed linguistic testing of two patients with extensive anterior frontal damage that included most of the third frontal convexity (which includes Broca's area) as well as the dorsal lateral frontal convexity and portions of the anterior temporal lobe. Their lesions did not extend into the primary motor cortex. These patients retained normal fluency, normal phonology, had only mild anomia, had no major problems on tasks requiring the use of grammatical words in sentences, and had minor difficulties with inflectional morphology (e.g., adding [-z] or [-s] to words to indicate plurality). However, they displayed marked difficulties with tasks designed specifically to tap the processing of complex syntax. Nadeau (1988) reports that they had difficulty manipulating word order with respect to auxiliary verbs ("will," "do", etc.) and embedded sentences. Nadeau contrasts these patients with one described by Kolk et al. (1985), in which essentially the inverse was found with respect to linguistic ability: difficulty in correctly using grammatical forms, but normal grammatical judgment and quite long spontaneous utterances. This patient had damage confined to the motor strip (which is immediately posterior to Broca's area) and the posteriorly adjacent parietal operculum. Thus, non-overlapping lesions seem to indicate a dissociation of syntax from other aspects of grammar (i.e., morphological rules). Since Broca's aphasics demonstrate both these types of linguistic deficits, this is further evidence against the idea of a unitary "Broca's area" representing the site of speech processing. Instead, Broca's aphasia seems to be the result of damage to multiple areas of the mid- to anterior cortex on the language dominant hemisphere.
Novoa and Ardila (1987) provide evidence that the prefrontal area of the cortex (that is, frontal cortex anterior to the motor, pre-motor, and frontal eye field areas and to the pars opercularis and pars triangularis) plays a role in syntax processing. They studied 21 patients with prefrontal brain damage (none of whom had Broca's aphasia), and compared them to 15 normal controls on a number of linguistic tasks. While formal language ability was preserved in the patients, those with left prefrontal damage displayed problems with various syntactically oriented tasks (e.g., discriminating the normal sentence order of "subject" + "verb" + "complement", and comprehending passive sentences). They also displayed problems with expressive aspects of language, such as overall verbal fluency (e.g., number of words per minute). In general, patients with prefrontal damage suffer from inaction and apparent lethargy, which appears to be due to an inability to plan ahead. Their decreased expressiveness may be due to this, rather than to a specifically linguistic deficit.
Damasio and Damasio (1992) also suggest that the basal ganglia (which are subcortical structures) and the cerebellum might play crucial roles in syntax processing. Both the basal ganglia and the cerebellum have extensive connections with sensory areas of the cortex, and in turn project to motor areas of the cortex, and even to various association areas including the prefrontal cortex (see Leiner et al. 1986). The basal ganglia are known to play crucial roles in creating smooth, complex motions from simpler components, and the cerebellum plays a key role in maintaining smooth motor control. It is interesting to note that Williams syndrome patients, who suffer from severe mental retardation but are remarkably verbal, have a nearly normal cerebellar vermis (the midline portion of the cerebellum, as seen on MRI scans), even though they have significantly smaller overall brain sizes (Jernigan and Bellugi 1990; see discussion in next chapter). However, the exact roles of the basal ganglia and cerebellum with respect to language have not yet been delineated.
Nadeau (1988) suggests that the role the prefrontal areas of the cortex play with respect to syntax may parallel the role this area plays for behavior in general: organization. This would be consistent with the idea that there is no syntax module in the brain coded by specific genes, but rather that syntax is an emergent characteristic of the explosion of semantic complexity evidenced during human evolution (Schoenemann and Wang 1996).
Neuroimaging studies of grammar and syntax processing have so far been limited to basic questions about the retrieval of verbs versus nouns. In a study in which subjects were asked to produce verbs appropriate for a given noun (e.g., producing "fly" in response to "bird"), it was found that areas approximating Broca's area (left infero-posterior frontal cortex) were active, along with areas in the right side of the cerebellum (Posner and Raichle 1994). This is consistent with the picture derived from the clinical studies outlined above. Of course, much more research needs to be done on the neuroimaging of grammar and syntax (a study could easily be framed around the syntax task described in chapter 6).
It should be emphasized here that while various linguistic abilities appear to have a degree of localization, this does not imply that these areas serve only language, or that language has its own dedicated processing "modules".
There is good evidence in non-human animals of the ability to understand at least some simple aspects of syntax. Snowdon (1990) provides an excellent review of this evidence and concludes that "There is an enormous gap between the simple grammars found in animals...and the syntax of any human language," (p. 229). The best evidence for non-human syntactical ability comes from studies of California sea lions (Schusterman and Gisinger 1988), bottle-nosed dolphins (Herman 1987), and chimpanzees (Premack and Premack 1972, Savage-Rumbaugh et al. 1993) who have been trained to communicate symbolically. Both California sea lions and bottle-nosed dolphins can make simple word order distinctions in the form: "Take [object A] to [object B]" versus "Take [object B] to [object A]". Premack and Premack (1972) report that Sarah was able to distinguish between sentences of the form: "If red is on green then Sarah take apple" from "If green is on red then Sarah take banana." While this does not show the hierarchical complexity of human syntax, it certainly demonstrates a fundamental aspect of human language. Two different sets of actions are conceptualized solely by a change in the order of symbols.
The work with Kanzi is also impressive. Savage-Rumbaugh (1988) argues that "Kanzi's comprehension of sentences appears to be syntactically based in that he responds differently to the same word depending upon its function in the sentence," (p. 247). As was pointed out in the section on semantics, in a subsequent experiment, Kanzi (then eight years old) acted appropriately in 72% of a set of 660 novel requests. This compares favorably with a one-and-a-half to two-year-old human child (Alia), who was exposed to similar instruction and a similar environment at the same research center where Kanzi was living, and who responded correctly to only 66% of these same sentences (Savage-Rumbaugh et al. 1993). Furthermore, for a number of the sentences Kanzi and Alia had to pay close attention to the order of the words in the sentence in order to respond correctly. For example, "Make the doggie bite the snake" required a different response than "Make the snake bite the doggie." This may be contrasted with sentences like "Pour the lemonade in the bowl," (to which Kanzi responded correctly) in which a correct response could be guessed without knowledge of English syntax (assuming one can decode the phonetic string into words, correctly match the words with their underlying concepts, etc.). In all, 42 sentences (in 21 pairs) in which word order was crucial were given to Kanzi, and he responded correctly to 79% of them (compared to 69% for Alia).
Another type of sentence that required syntactical knowledge was one in which both word order and an understanding of the meaning of the verb were needed to respond correctly. For example, contrast "Go to the microwave and get the tomato" with "Take the tomato to the microwave." Note that for these sorts of trials the object to be acted on (in this case a tomato) was present in both locations (e.g., there was a tomato both in the microwave and in front of Kanzi or Alia). Thus, while it would be possible to guess, given that microwaves are rarely moved but tomatoes frequently are, that the correct response was not to take the microwave to the tomato, it was not similarly easy for the subjects to guess, after hearing "microwave" and "tomato," which tomato was supposed to be acted upon. Furthermore, there were always several different objects at the destinations, so that Kanzi and Alia needed to remember which object they were supposed to retrieve. Of the 46 sentences (in 23 pairs) of this type, Kanzi responded correctly to 83% (compared to 59% for Alia).
Finally, sentence pairs were given in which the order of the words was constant but the verbs and syntactical structure indicated that the correct response was different, e.g., "Put your collar in the refrigerator" versus "Go get your collar that's in the refrigerator." Again, the correct objects (as well as several distracters) were located in both locations. Nevertheless, Kanzi responded correctly to 79% of the 28 sentences of this type (compared to 67% for Alia).
Note that in these studies Kanzi did not imitate his experimenters, since there was nothing to imitate (see above). Thus, pygmy chimps have demonstrated the ability to comprehend at least some of the arbitrary formal structures that comprise syntax. Note further that to even be able to comprehend these syntactical structures, he first had to be able to correctly parse the continuous speech sounds into phonemes, and then phonemes into words.
What about sentence production? Only about 10% of Kanzi's utterances (made by pointing to particular lexigrams on a board) consist of two or more elements (Greenfield and Savage-Rumbaugh 1990). Because his utterances are usually no more than two lexigrams long, it is inherently harder to find evidence of syntax in his productions than in his comprehension of English. However, there are clear indications that his utterances are not random arrangements of lexigrams and/or gestures. An analysis (done when Kanzi was five and a half years old) of his utterances of more than one lexigram indicates that he preferentially uses the English convention of placing the action first, followed by the object. For example, he shows a strong preference for "HIDE PEANUT" over "PEANUT HIDE." This was presumably learned from his caregivers, whose usage shows the same preference (Greenfield and Savage-Rumbaugh 1991). However, Kanzi seems to have invented his own preference in utterances involving a lexigram and a gesture. In these instances he showed a very strong preference for pointing to the lexigram first and using the gesture second. For example, when the context indicates that he wishes to be chased by the caregiver, he says "CHASE you" (where the capitalized "CHASE" indicates the use of the lexigram and "you" indicates the pointing gesture), instead of "you CHASE." He shows this preference even in cases where he is closer to his caregiver than to his lexigram board, where it would be less effortful to point to the agent first. This preference is also the opposite of what his caregivers model for him, indicating that he invented it himself (Greenfield and Savage-Rumbaugh 1991). Lastly, we may note that when Kanzi puts two action lexigrams together to make a request (e.g., "CHASE HIDE"), he orders them in a way that reflects the sequence in which the actions naturally occur (even in wild pygmy and common chimps; Greenfield and Savage-Rumbaugh 1991). Furthermore, analysis of hours of videotape suggests that Kanzi taught these orders to his caregivers, since in almost every case they have been shown to be directly imitating him (Greenfield and Savage-Rumbaugh 1991).
These studies indicate that Kanzi understands basic syntactical relationships. Savage-Rumbaugh (1988) points out that this research "...suggests that the capacity to comprehend speech antedates the ability to produce it. The brain must have been ready to process language (both conceptually and phonemically) long before the vocal tract was ready to produce it" (p. 250, italics added). This is completely consistent with the view that language and thinking are separate, and that linguistic evolution made use of existing general cognitive abilities, rather than creating new ones.
At a more general level, it is important to remember that a crucial aspect of language, hierarchical structure, is a cognitive feature not limited to humans. The fact that dominance hierarchies are a ubiquitous feature of social animals bears this out. Chimpanzees, for example, demonstrate a remarkable ability to understand and manipulate social hierarchies (see, for example, de Waal 1982, 1989). This suggests that our ability to think hierarchically about social relations (at least) was inherited from our primate ancestors. Whether this ability in the social realm was simply extended into other cognitive domains (e.g., language) is not clear, but it is at least possible that the general cognitive basis for hierarchical linguistic structures in humans ultimately derives from our inherited ability to think hierarchically about social relations. In any case, it would appear that chimps can comprehend the hierarchical aspects of language, since they can understand commands that depend on word order, for example.
In short, there is no evidence that animals can learn syntax as complicated as that used in human languages, but fundamental elements appear to be present (i.e., different sequential order of the same symbols communicates different meanings). Given that humans are particularly adept at paying attention to sequential order, and given that humans have such a richness of mental concepts, syntax can be seen as a natural outgrowth of the desire to communicate increasingly complex cognitive worlds.
I should also point out that a strong argument can be made that syntax is not genetic per se, but is rather an emergent characteristic of the explosion of semantic complexity documented above. This argument is based on a close analysis of the proposed features of so-called "Universal Grammar", which amounts to a list of semantic/conceptual structures describing how humans organize our internal and external reality (Schoenemann and Wang 1996). For example, one of the key features of universal grammar is said to be that it is hierarchically structured (see, e.g., Bickerton 1990). This is not a description of a syntactical rule so much as it is a description of human perception. Furthermore, hierarchical structure is intrinsic to complex systems generally (Simon 1962).
The general conception that language is lateralized to the left hemisphere has a history dating back over 100 years to the neurologists Paul Broca and Carl Wernicke, who discovered correlations between damage to the left hemisphere and the loss of different linguistic abilities. In Wernicke's aphasia, patients are usually able to speak fluently, but their utterances are devoid of meaning (though grammatically and syntactically correct). Broca's aphasics speak in slow, labored, agrammatical sentences composed almost entirely of nouns with a few verbs (Bradshaw and Nettleton 1983). Although Broca's aphasics have better language comprehension than Wernicke's aphasics, Zurif (1980) argues that Broca's aphasics still do not retain normal comprehension abilities.
Further clinical evidence suggesting a verbal/non-verbal dichotomy has come from tests in which only one hemisphere is anesthetized (by injection of sodium amytal into the ipsilateral carotid artery) while the patient is given simple tests of linguistic ability. Most patients show an abrupt interruption of speech when one hemisphere (usually the left) is anesthetized but not the other. Essentially the same effect has been found following unilateral electroconvulsive therapy (Bradshaw and Nettleton 1983). Epileptic patients who have their corpus callosum cut to control seizures (commissurotomy), thereby eliminating the direct connections between their hemispheres, show intact ability to verbally report words and pictures flashed into the right visual field (left hemisphere) but not the left visual field (right hemisphere) (Bradshaw and Nettleton, 1983). Dichotic and tachistoscopic studies of normal subjects indicate that verbal stimuli presented to the right ear or right visual field are usually reported more accurately, which has been taken to indicate a left hemisphere dominance for language (Bradshaw and Nettleton, 1983). Stroop effects (disruption in reporting the physical characteristics of a word whose semantic meaning is contradictory, such as the word "red" written in blue ink or the word "high" sung at a low pitch) have been shown for both visual and auditory modalities to be stronger if the stimulus is presented to the right visual field or right ear (Bradshaw and Nettleton, 1983). Delayed auditory feedback has also been shown to be more disruptive if presented to the right ear (Abbs and Smith 1970).
However, are these findings best understood as the lateralization of language or as the lateralization of a sequential processor? Studies of lateralization have increasingly called into question the traditional view of a strictly verbal/non-verbal dichotomy in hemisphere function (Bradshaw and Nettleton 1981, 1983). One line of research involves Japanese, which uses both a phonetically based syllabic script (kana) and an ideographic script in which single characters represent lexical morphemes (kanji). Japanese subjects have demonstrated a right visual field (left hemisphere) advantage in processing kana, but a left visual field (right hemisphere) advantage in processing kanji (Hatta 1977, Sasanuma et al. 1977). More recent studies of commissurotomy patients have revealed that the left hand (right hemisphere) can obey simple verbal instructions and pick out named or described objects, indicating that the right hemisphere is not devoid of linguistic ability (Bradshaw and Nettleton 1983).
Studies of normal subjects have shown that the degree of lateralization depends on the characteristics of the stimuli, which suggests that lateralization of language may simply reflect a more fundamental underlying difference between the hemispheres. Several studies show a general left hemisphere superiority in right-handers at discriminating temporal order and at sequencing auditory as well as visual stimuli (Bradshaw and Nettleton 1983). Cutting (1974) has shown that stop consonants show the greatest right ear advantage (REA), liquids less, and steady-state vowels none at all. Schwartz and Tallal (1980) noted that these classes of speech sounds differ in the rate of acoustic change and demonstrated that the REA for stop consonants could be significantly decreased by artificially slowing down the formant transitions. A REA in right-handers has been demonstrated for melodies that differ only in rhythm (Gordon 1978). The right ear in right-handers has been found to be better than the left at discriminating inter-pulse durations of 50 msec or less (Mills and Rollman 1979). Mills and Rollman (1980) also showed that the auditory perception of temporal order in right-handers was finer if the right-ear stimulus (a click) preceded the left-ear stimulus. This suggests that the temporal analysis was performed in the left hemisphere: the left-ear stimulus would travel primarily to the right hemisphere first and have to be transferred across the corpus callosum to reach the left hemisphere. This transfer would take time, thereby allowing the clicks to be presented closer together while still appearing distinguishable.
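The logic of this inference can be illustrated with a toy calculation. The numbers below are hypothetical, chosen only to show the reasoning; they are not Mills and Rollman's data.

```python
# Toy illustration of the callosal-transfer reasoning described above.
# Both numbers are hypothetical placeholders, not Mills and Rollman's data.

threshold_ms = 50.0  # hypothetical minimum separation needed at the left-hemisphere analyzer
transfer_ms = 10.0   # hypothetical callosal transfer time for the left-ear click

# The right-ear click reaches the left hemisphere directly; the left-ear click
# arrives transfer_ms later, after crossing the corpus callosum.  The physical
# gap between the clicks that still allows their order to be judged is therefore
# smaller when the right ear leads than when the left ear leads.
min_gap_right_ear_leads = threshold_ms - transfer_ms  # 40 ms between clicks suffices
min_gap_left_ear_leads = threshold_ms + transfer_ms   # 60 ms is needed

print(min_gap_right_ear_leads, min_gap_left_ear_leads)
```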
Similar left hemisphere dominance for temporal analysis in right-handers has been shown for both tactile and visual stimuli by Efron (1963). Tzeng and Wang (1984) demonstrated a right visual field superiority for temporal processing in a task in which words could be processed either by temporal analysis or by spatial analysis. This was done by presenting the letters of a word with a tachistoscope such that spatially they were arranged to spell one word (e.g., "ACT") while the order in which the letters were flashed spelled another (e.g., "CAT"). Depending on which word the subject reports, it is possible to determine whether sequential or spatial processing was used. All subjects reported the sequential order more often than the spatial order regardless of the visual field, but in addition all subjects reported the sequence more often when the letters were flashed in the right visual field than in the left. Furthermore, the difference between the number of words recognized by sequence and the number recognized by spatial order was larger for right visual field presentations than for left visual field presentations. The same effect was reported when the letters were replaced with three colored circles, but only on the second block of trials, when the subjects had presumably become accustomed to the different possible combinations. Tzeng and Wang suggest that left hemisphere language dominance is due to hemisphere differences in sequencing and timing abilities. They suggest that as soon as a set of stimuli becomes well encoded, through frequent exposure, it will show a left hemisphere dominance for sequencing and timing tasks. This "frequency effect" has also been demonstrated by Hardyck et al. (1978), who reported that lateralization effects for the comparison (i.e., same/different) of pairs of written words appeared only when a small number of repeated stimuli was used. Only under these circumstances could they detect a right visual field (left hemisphere) advantage for English word pairs, and a left visual field (right hemisphere) advantage for ideographic Chinese characters.
Tzeng and Wang (1984) point out that such a model explains some otherwise peculiar findings in laterality studies of music and Morse code. For instance, trained musicians often show a REA while novices show a left ear advantage (LEA) in recognizing melodies (Bever and Chiarello 1974), which can be interpreted to mean that the more acquainted an individual becomes with a certain set of stimuli, the more the left hemisphere is engaged. Bradshaw and Nettleton (1983) point out that while there are studies that report the opposite effect, the laterality effects for music may have more to do with how the subject attends to the stimuli than with the nature of the stimuli per se. Papçun et al. (1974) reported that novice Morse code operators showed a REA for simple sequences but a LEA for complex ones, whereas experienced operators showed a REA for both. Silverberg et al. (1980) showed that a left visual field superiority for new Hebrew words shifted to the usual right visual field superiority as native speakers learned to read. If lateralization relates not so much to higher-level cognitive processes such as "language" or "music" but rather to the sequencing and timing of previously learned components (or to the analysis of novel stimuli), then the effects demonstrated by these studies make sense. The extent to which higher cognitive processes appear lateralized seems to depend on how their components are most efficiently processed (Bradshaw and Nettleton 1983).
Thus, various dimensions of language processing are lateralized (to the left hemisphere in most people), but there is substantial evidence that this lateralization reflects a specific kind of processing that aspects of language require but that is not specific to language. In other sections of this chapter I will discuss the possible implications this might have for other cognitive and behavioral abilities.
Experiments investigating lateralization in non-human animals have not been nearly as extensive as those in humans, but as Snowdon (1990) shows in his review, there is some evidence that non-human primates also show lateralization of communication. Petersen, et al. (1978) and Petersen, et al. (1984) compared two Japanese macaques (Macaca fuscata) with a bonnet macaque (Macaca radiata) and a pigtailed macaque (Macaca nemestrina) in the ability to distinguish between two kinds of vocalizations that were behaviorally relevant only to Japanese macaques. Green (1975) had shown in a field study of Japanese macaques that these vocalizations were correlated with different social situations. When these vocalizations were presented to the right ear (left hemisphere) only, the Japanese macaques showed better discrimination than when they were presented only to the left ear (right hemisphere). The other monkeys showed no right ear advantage. Petersen, et al. (1978) also showed that if rewards were given only when pitch (and not phonetic cues) was used to discriminate the vocalizations (thereby requiring that vocalizations which cut across the behaviorally relevant classes be grouped together), no right ear advantage could be found in any of the monkey species. They concluded that the left hemisphere is not better at discriminating all auditory features, but only those relevant to species-specific vocalizations. Given what we know about human laterality, it would be interesting in future studies to see whether the other monkeys were primarily using pitch to discriminate the Japanese macaque vocalizations. This would explain the lack of laterality found in these other species, and would suggest that the laterality shown in Japanese macaques closely parallels the human data described above, in which laterality was greatest for the speech sounds with the most rapidly changing acoustic parameters.
Laterality in Japanese macaques has also been examined in one lesion study. Heffner and Heffner (1984) showed that the laterality displayed in distinguishing species-specific vocalizations could be eliminated by lesions to the supratemporal gyrus in the left hemisphere, and that the basic ability to distinguish species-specific calls could be completely eliminated with lesions to both supratemporal gyri. These studies demonstrate that there are remarkable parallels between non-human primates and humans in the lateralization of perception of vocalizations. While further study is clearly needed to probe the extent of these parallels, these studies are one more indication that human language ability evolved by making use of and expanding pre-existing neuroanatomical structures and organization.
While it is reasonable to suspect that the supralaryngeal vocal tract and neural mechanisms of speech perception have evolved to facilitate language, there is no reason to assume that the modern human arrangement of these features is necessary for language. The essential aspects of language seem to be 1) the use of arbitrary symbols to represent ideas (accomplished through the use of duality of patterning of elementary, meaningless units), and 2) the common rules (i.e., syntax) used to organize these symbols into communicable thoughts. Both of these aspects are cognitive processes, and it is therefore likely that cognitive advances during human evolution would be more directly indicative of language competence than the specifics of the supralaryngeal vocal tract.
Nevertheless, it is entirely on the basis of the anatomy of the vocal tract that Lieberman and Crelin (1971), Lieberman, et al. (1972), Lieberman (1984, 1988) and Laitman, et al. (1979) have argued that Neanderthals lacked fully modern speech. These researchers based their conclusions on comparative analyses of the basicrania of apes, humans, and various fossils. They argue that the increased flexion found in the cranial base in modern humans allows for an extended range of vowel sounds. The angle in the supralaryngeal vocal tract, caused primarily by the lowering of the larynx and hyoid bone into the neck, places the tongue in a more efficient position to produce the variety of vowel and consonantal sounds found in humans. For example, the /i/ sound is produced by a constriction of the last (oral) portion of the tract concurrent with an expansion of the first (pharyngeal) portion, whereas the /a/ sound is produced by the opposite pattern of constriction and expansion. It is inherently easier to accomplish this degree of glottal gymnastics if the tongue is curved and occupies a position at the middle of the bend. Relatively simple muscular movements are then necessary to produce different sounds: to produce an /i/ sound, the whole tongue can be shifted anteriorly and superiorly. Because so much of the tongue in apes occupies the oral cavity, the ability to form an /i/ sound is restricted (Lieberman 1984).
Initial work by Lieberman and Crelin (1971), Lieberman, et al. (1972), and Laitman, et al. (1979) focused on Neanderthal specimens and found that these hominids more closely resembled the ape condition with regard to the degree of cranial base flexion, which led these researchers to conclude that Neanderthals would have had a limited ability to produce speech sounds. It was also clear, however, that the Steinheim and Broken Hill (Middle Pleistocene) fossils studied show essentially the same degree of flexion as is found in modern humans (Laitman 1983). Laitman (1983) believes that the epiglottis of the larynx would have been unable to reach the soft palate in Homo erectus, which means that the ability to breathe and drink at the same time would have been lost. Given that this would have increased the probability of choking on food (a problem much more common in humans than in other animals), there must have been some significant benefit to the lowering of the larynx (Lieberman 1984, 1988). It is likely that the benefit was an increased range in the production of phonemes. Laitman (1983) believes his analysis shows that Homo erectus "...had made a quantum step toward the acquisition of the full range of human speech sounds," (p. 83). Among the specimens so far examined, the earliest hominid specimen thought to be different from the pongid condition with regard to basicranial flexion is OH 24, a badly crushed cranium dating to between 1.6 and 1.8 million years ago (Lieberman 1984; Laitman and Heimbuch 1982). OH 24 was initially classified as Homo habilis, though some argue that it is actually an australopithecine because of its relatively small brain size (Stringer 1986).
Lieberman (1984) argues that the kinds of initial changes found in these early specimens might partly be due to selection for the ability to breathe through the mouth even when the nose is blocked. He points out that Homo erectus was more heavily muscled than modern humans, and this indicates a greater amount of strenuous physical activity, which in turn might have selected for mouth breathing. However, most other animals are more heavily muscled than humans, yet they do not share with us a lowered larynx or increased basicranial flexion. Since other mammals (e.g., dogs and pigs) pant through their mouths when they overheat but have not evolved lowered larynxes, the most likely explanation for the lowered larynx in Homo erectus is that it points to the origin of language.
It should also be noted that the studies of human language origins based on basicranial flexion have been widely criticized on anatomical grounds (Falk 1975; Burr 1976; Arensburg, et al. 1990). Falk (1975) pointed out that if the hyoid bone were positioned as high as has been suggested, the muscles which connect it with the mandible would have different functions than they have in chimpanzees and in both human adults and newborns. Arensburg, et al. (1990) point out that, if this were true, the nerve to the mylohyoid muscle (which connects the medial surfaces of the mandible to the hyoid, thereby forming a muscular diaphragm across the floor of the oral cavity) would have a different orientation during its course from the basicranium. The actual orientation of these nerves in specific fossils can be reconstructed because they form grooves along the medial surfaces of the two ascending rami of the mandible on their way to the mylohyoid muscle. Arensburg, et al. (1990) have found no evidence of differences between modern and Middle Paleolithic mandibles with respect to this feature. Furthermore, Falk (1975) showed that in pongids and humans the position of the hyoid is closely associated with the inferior border of the mandible, whereas its relationship to the basicranium varies. Arensburg, et al. (1990) present data showing that Middle Paleolithic hominids in general have tall mandibles compared to modern humans, which strongly suggests that the hyoid was as low in these hominids as it is in modern humans. They further show that the Neanderthal hyoid bone they discovered (the only one that has so far been found) "...is not notably different, in either size or morphology, from that of modern human hyoids," (p. 145). They conclude, as did Falk (1975) and others, that there is no good evidence that Neanderthals were incapable of speech.
Another recent study of fossil hominid laryngeal anatomy also supports the conclusion that language dates back beyond Neanderthals to earlier Homo. Duchin (1990) performed a morphometric analysis of the length of the palate and mandible and the relative position of the hyoid bone in order to discover the extent to which the supralaryngeal vocal tract in Pan troglodytes and modern humans was governed by different structural relationships. She concluded that in humans the position of the hyoid was most closely related to the length of the mandible, whereas in chimpanzees it was most closely related to the distance from the ramus to the posterior palate and the width of the hyoid bone itself. Through the use of discriminant function analysis, she derived an equation that best differentiated between humans and chimpanzees on the variables she measured. When measurements of one Homo erectus and one Neanderthal were entered into this equation, they unequivocally grouped within modern humans (albeit toward the chimpanzee end of the human range).[10] She concludes that "...a human configuration of the anatomical region responsible for articulate speech was found in both Homo erectus and Homo sapiens neanderthalensis," (p. 694).
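To illustrate the general approach (though not Duchin's actual variables or data, which are only summarized above), a discriminant function of this sort is derived from measurements of specimens whose group membership is known and is then applied to a fossil. The sketch below uses hypothetical placeholder measurements throughout.

```python
# A minimal sketch of a discriminant function analysis of the general sort used
# by Duchin (1990).  The variable names and all values are hypothetical
# placeholders for illustration; they are not Duchin's measurements.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical measurements (arbitrary units):
# [palate length, mandible length, hyoid position]
X = np.array([
    [55.0, 110.0, 30.0],   # modern humans (placeholder values)
    [57.0, 114.0, 33.0],
    [53.0, 109.0, 28.0],
    [58.0, 111.0, 32.0],
    [72.0, 140.0, 48.0],   # chimpanzees (placeholder values)
    [75.0, 143.0, 51.0],
    [70.0, 139.0, 46.0],
    [74.0, 145.0, 47.0],
])
y = ["human", "human", "human", "human", "chimp", "chimp", "chimp", "chimp"]

# Fit the linear discriminant function that best separates the two groups.
lda = LinearDiscriminantAnalysis().fit(X, y)

# A fossil specimen's measurements can then be entered into the function to see
# which group it falls with, and where it lies along the discriminant axis.
fossil = np.array([[60.0, 118.0, 34.0]])   # hypothetical fossil values
print(lda.predict(fossil))     # group the specimen is assigned to
print(lda.transform(fossil))   # its position along the discriminant axis
```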
However, even if we were to take Lieberman et al.'s cranial-base-flexion model seriously, we run into complications. As already noted, at least two hominids predating Neanderthals show cranial base configurations that are essentially modern human in appearance: Steinheim (dated to ~250 KYA) and Broken Hill (presumed to be late Middle Pleistocene)(Laitman, et al., 1979). To accept the Lieberman et al. model, then, we would have to conclude that Steinheim and Broken Hill were able to produce the full range of modern human sounds, and were presumably in possession of a language as sophisticated as that used by modern humans, even though later evolving Neanderthals could not. To deal with this problem, Lieberman (1984) suggests that the common ancestor of Neanderthals and anatomically modern humans might have predated Steinheim and Broken Hill. This would mean that these fossils would not have been ancestral to Neanderthals, but only to anatomically modern humans. However, given that Steinheim is in Europe, along with a majority of Neanderthal sites, and since the earliest anatomically modern human sites are in South Africa (Klasies River Mouth) and the Near East (Jebel Qafzeh in Israel), it has generally been assumed that Steinheim is ancestral to Neanderthals - though this is a weak argument at best.
Another important result of Laitman, et al. (1979) that seems to be generally overlooked is that their discriminant function analysis did not unequivocally group Neanderthal specimens apart from anatomically modern humans. They derived 5 alternative variance models, but only one of these models "correctly" placed Cro Magnon 1 with modern humans.[11] Where did this model place Neanderthal specimens? La Chapelle comes out most similar to a young (dental stage 2) human - not to an ape (of any age). La Ferrassie and Saccopastore 2 come out closest to a dental stage 3 human, and Monte Circeo appears most similar to a dental stage 4 human. Stage 3 spans the period from the eruption of the first permanent molar (usually around age 6) to the eruption of the second permanent molar (around age 11; Ubelaker 1978). This gives us a fairly good idea of how small the difference would have been between Neanderthal speech and our own. Anyone who has ever heard a 6 year old human child talk knows that the difference from adult speech is not striking.
All the available evidence indicates that the beginning of language dates to the earliest Homo. It is probably not true that Neanderthals differed significantly from modern humans in their linguistic competence. Even those who maintain that differences existed make it clear that they are referring only to the range of phonemic distinctions that Neanderthals could use, not to the question of whether they communicated via a complex language. For example, Lieberman (1984) states that:
"...I am not claiming that Neanderthal hominids lacked language and culture or could not reason because their phonetic ability differed from ours...Neanderthal hominids would have had linguistic and cognitive abilities that are similar to ours if human language is built on neural mechanisms that structure the cognitive behavior of other species, plus a comparatively small number of species-specific mechanisms adapted to human speech. The genetic principle of mosaic evolution, in any case, argues against linguistic ability evolving as a complete system. Neanderthal hominids thus probably represent an interesting case of closely related hominids that had general cognitive and linguistic abilities similar to our more immediate ancestors but who lacked the special characteristics of human speech." (p. 322-323)
After noting that old Neanderthal individuals appear to have been taken care of, indicating that the elderly had some value (perhaps because of their knowledge of various life experiences, which presumably would be communicated through the use of language), Lieberman concludes that "I therefore find it hard to believe that Neanderthal hominids did not also have a well-developed language, particularly given the linguistic and cognitive ability of modern chimpanzees when they are taught sign language" (p. 323).
There is an important issue hidden beneath this controversy: Just how essential is the specific range of distinct sounds to the development of language? This depends partly on how we define "language," but we should note that world-wide variation in the number of phonemes that are used to produce a functional language is extraordinarily broad (Wang 1976). English makes use of 44 phonemes, but Hawaiian uses only 15 (Biederman 1987; Corballis 1989). Clearly, the number of phonetic distinctions that can be made cannot be considered a crucial variable when discussing whether a particular hominid species used language. Furthermore, the shape of the supralaryngeal vocal tract primarily affects the production of vowels, which constitute only 12 of the 44 English phonemes. A good many consonants (and all tonal patterns) may be producible regardless of the shape of the vocal tract.
Duchin (1990) argues that chimpanzees lack the range of articulation of consonants found in humans because of a difference in the inclination and angles of insertion of three key muscles: genioglossus (which connects the tongue to a small area on the posterior surface of the mandibular symphysis), mylohyoideus (one of two muscles that connect the hyoid bone to the mandible), and palatoglossus (which connects the palate to the posterolateral portion of the tongue). In humans the most anterior fibers of genioglossus curve upward more dramatically than in the chimpanzee, which suggests that humans can produce articulations that are not possible for the chimpanzee. Since the hyoid bone is positioned more superiorly in chimpanzees, their mylohyoideus serves more to retract and stabilize the tongue, whereas in humans this muscle works more to depress the mandible. Palatoglossus is more posteriorly positioned in chimpanzees, and therefore (arguably) serves more to restrict articulation, rather than helping to produce velar sounds (/k/, /g/). Duchin (1990) believes that the differences in function of these three muscles "...restrict articulation of consonants in the chimpanzee," (p. 689). More accurately, these differences only restrict the articulation of human speech sounds; they say nothing, in principle, about the number of distinguishable sounds of which the chimpanzee is capable. Speech sounds are nothing more than arbitrary building blocks used to produce distinguishable patterns; there is nothing particularly magical about exactly which sounds are used. The real core of language lies in how these arbitrary sounds are arranged (Wang 1976).
Given that a relatively small number of phonemic distinctions would have been required to communicate via language, and given that the current human ability to produce various phonemes did not evolve overnight, there is no reason to assume that rudimentary speech did not begin with the earliest Homo. Duchin (1990) believes that "The critical determinant of the uniquely human capacity for language is not brain size and abstract thought and is not laryngeal position and phonation. It is instead, the ability to produce articulate speech." (p. 695-696). In contrast, I would argue that since the first indications of anatomical changes in the supralaryngeal tract towards the modern condition occur in exactly the same fossils where we see the first significant increases in brain size over pongids, this brain size increase must have been intimately connected to linguistic development. It makes sense that, as hominid mental complexity increased, there would have been more to communicate. To deny that this increase in mental complexity (as indexed by massive brain size increases) was related to the development of language is to compartmentalize human behavior in a completely unrealistic manner. Language is, after all, fundamentally a means of communicating information. The kind of information communicated surely must relate directly to the cognitive complexity of the species.
Language makes use of a wide range of neural resources. A large number of brain stem nuclei are involved in the production and perception of speech sounds. The connections between these nuclei and other centers in the cerebral cortex are numerous. The semantic concepts forming the foundation of language presuppose a complex web of connections between primary sensory and secondary association areas of the cortex. A variety of areas of the language-dominant hemisphere, including the mid- to anterior temporal cortex, the anterior-inferior occipital cortex (for color terms), and portions of the parietal operculum (for grammatical terms), appear to be crucial for the translation of these concepts into the actual word forms themselves. Wernicke's area and the adjacent primary auditory cortex appear to be involved in the construction of word forms from component phonemes, as well as the reverse process of extracting words from the continuous stream of speech sounds in a sentence. Syntax processing appears to rely on prefrontal areas of the cortex, and possibly posterior portions of the inferior frontal gyrus (including Broca's area), primarily in the language-dominant hemisphere. In addition, deep subcortical structures (specifically the basal ganglia) and portions of the cerebellum appear to play important roles in language processing. Thus, language depends on large portions of the brain, and as such is an excellent candidate to explain the neuroanatomical changes during hominid evolution as outlined in chapter 2.
Evidence of continuity can be found in all aspects of language processing. Large differences are apparent in the abilities of human and non-human animals, but these are differences in degree, not differences in kind. The increase in semantic complexity appears to be a crucial component of the evolution of language, and this is also apparently its most neurally intensive aspect, requiring widespread interconnections between areas of the cortex. Because descriptions of universal grammar appear to be simply descriptions of our semantic conceptualization of the world, it is possible that increases in semantic complexity alone are sufficient to explain the evolution of complex, modern language.
The fossil evidence that some recent hominids (e.g., Neanderthals) did not have fully modern language is not convincing. The evidence is broadly consistent with the gradual emergence of language over the last two to three million years.
Humphrey (1984) has argued that the most likely selective agent for an increase in brain size was sociality. He suggests that the social dimensions of a species' environment are vastly more complicated than any other dimension, and that this helps explain why, for example, chimpanzees have brain volumes twice those of baboons even though they have no obvious technological advantages over them. This hypothesis is supported by comparative studies of primates. Sawaguchi (1990) showed that relative brain size is significantly larger in polygynous primates than in monogamous ones (controlling for diet). Furthermore, Sawaguchi and Kudo (1990) showed that relative neocortical size was also larger in polygynous than in monogamous prosimians and anthropoids (again, controlling for diet), and that there is a significant, positive correlation between relative neocortical size and troop size in both prosimians (r= 0.697, p< 0.05) and cebids (r= 0.814, p< 0.01). They also review several neuroanatomical studies that are consistent with this finding and conclude that, "It seems likely, therefore, that the development of social interactions may have been associated with the development of the neocortex, in particular, the prefrontal cortex in primates," (p. 287).
Dunbar (1992, 1995) has also shown with comparative data that brain size is directly related to the effective size of the social network. Dunbar (1993) further suggests that language represents a form of social grooming which allowed group size to increase beyond what would otherwise be possible. He notes that grooming provides a crucial bonding mechanism in primates, and further that the time spent grooming increases with average species group size (r= 0.77, N = 22, p< 0.001). Since group size can be estimated reasonably well from neocortex ratios (ratios of the neocortex to the remainder of the brain), we can estimate the percentage of time humans should be grooming each other. It turns out to be somewhere between 28 and 66 percent. Since no other primate species is known to groom more than 20 percent of the time, some other mechanism would appear to be needed for humans, which Dunbar (1993) argues is language. While a number of extrapolations are needed to arrive at this conclusion, it is nevertheless entirely plausible that language serves a social function. A number of behavioral characteristics appear to be dependent on each other, as well as on brain size and/or other neuroanatomical changes.
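The chain of extrapolation involved can be made concrete with a small sketch in Python (below). The regression coefficients and the human neocortex ratio used here are placeholders chosen only to illustrate the form of the argument (a log-log regression predicting group size from neocortex ratio, followed by a linear regression predicting grooming time from group size); they are not Dunbar's published estimates.

    import math

    # Placeholder coefficients, for illustration only (not Dunbar's published fits).
    def predicted_group_size(neocortex_ratio, intercept=0.1, slope=3.4):
        # Group size predicted from the neocortex ratio via a log-log regression.
        return 10 ** (intercept + slope * math.log10(neocortex_ratio))

    def predicted_grooming_percent(group_size, intercept=-1.0, slope=0.3):
        # Percent of time spent grooming, predicted linearly from group size.
        return intercept + slope * group_size

    human_ratio = 4.1  # assumed, approximate human neocortex ratio
    group = predicted_group_size(human_ratio)
    print("predicted group size: %.0f" % group)
    print("predicted grooming time: %.0f%% of the day" % predicted_grooming_percent(group))

With these illustrative numbers the predicted grooming requirement falls in the range Dunbar describes, far above the roughly 20 percent ceiling observed in other primates; the substantive point is simply that each link in the chain is a separate regression, each with its own uncertainty.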
Although the comparative evidence that social complexity correlates with brain size is strong, the specific abilities that are crucial to social competence within humans (or any other species) are not clearly defined or understood. Some people are more social than others, and some social people are better at understanding and/or manipulating social interactions than others. However, there does not appear to have been much research into this question from a neurocognitive and/or neuroanatomical standpoint. Presumably social competence depends on a wide variety of abilities, including language, non-verbal cue processing, memory (particularly of past interactions, and the order of past events), and probably many other basic cognitive abilities. It is of interest that monkeys at the top of their dominance hierarchies fall to the bottom after undergoing prefrontal lobotomies (Myers et al. 1973). As was pointed out earlier, this region has undergone the largest increase of all cortical regions during human evolution, and humans (empirically) have the largest social groups of all primates (Dunbar 1993). This would appear to be a promising avenue of investigation with respect to brain evolution.
Kling (1986) reviewed a number of studies on the neurological correlates of sociality in primates and concluded that the amygdaloid nuclei, the overlying temporal pole (tip of the temporal lobe), and the posterior medial orbital cortex (located in the inferior prefrontal cortex) are necessary for the maintenance of social bonds and affiliative behavior. The indicators of affiliative behavior in monkeys in these studies include behaviors such as grooming, huddling together, and prolonged proximity to other group members (e.g., Raleigh and Steklis 1981). The amygdaloid nuclei are known to be involved in the regulation of mood, aggressive behavior, motivation, sexual behavior, and memory, among other things. Studies in squirrel monkeys (Saimiri sciureus) of the electrical activity of these nuclei have shown that they are highly sensitive to the social context of behavior (Kling et al. 1984). Kling (1986) suggests that a major function of the amygdaloid nuclei is to "...place an emotional bias on information from the internal and external environment..." (p. 177), thereby effecting appropriate social behavior. The temporal pole cortex apparently acts in a facilitatory manner on the amygdaloid nuclei. The posterior medial orbital cortex appears to be involved with behavioral inhibition, such that lesions to this area cause an inability to carry out passive avoidance tasks, a flatness of affect, and a lack of facial expression (Kling 1986, see also de Bruin 1990). Human patients who have had prefrontal lobotomies show similar social deficits (Kling 1986).
By contrast, lesions to superior or inferior temporal cortex, cingulate cortex (immediately adjacent to the corpus callosum), and even lateral frontal cortex (including dorso-lateral prefrontal cortex) do not result in significant changes in social rank or affiliative behavior (Kling 1986). Thus, only small portions of the prefrontal and temporal lobes appear to be crucial for social competence, at least in monkeys.
Two studies on the importance of prefrontal cortex deserve special mention. Myers et al. (1973) compared the effects of prefrontal lobotomy to sham operations on rhesus monkeys (Macaca mulatta) in a naturalistic setting on Cayo Santiago island (Puerto Rico). Four of five monkeys with lobotomies failed to successfully rejoin their social groups, showed increased levels of aimless pacing, and died within a few weeks (the one monkey who did rejoin the social group was a juvenile). By contrast, three of four monkeys with bilateral cingulate lesions successfully rejoined the social group, as did 10 control animals (some of whom were given pinealectomies or minor "sham" surgery). Although it was not possible to determine exactly what behavioral deficits in these monkeys caused their inability to rejoin the social group, the result strongly suggests that prefrontal cortex is crucial to a successful social existence in monkeys.
Butter and Snyder (1972; reviewed in de Bruin 1990) studied the effects of orbital prefrontal lesions on the attainment and maintenance of dominance in rhesus monkeys. The experimental design involved introducing monkeys (one at a time) into an established social group containing four males that had a clear linear dominance hierarchy. Before the operations, the introduced monkeys in every case quickly achieved the alpha position (due partly to their age and larger body size in relation to the four group males). After undergoing either orbital prefrontal lesions or sham operations, the monkeys were repeatedly reintroduced into the social group (at bimonthly intervals) and their ability to regain the alpha position was assessed. In contrast to monkeys with sham operations, orbital prefrontal lesioned monkeys gradually lost their ability to recover the alpha position.
Calvin (1982, 1983a) has proposed a novel explanation of the initial brain size increase in humans that may also have broader implications for the evolution of language. Calvin argues that modern humans have a throwing accuracy that far exceeds that of any known primate (or any other mammal). By making some reasonable assumptions about the physical aspects of throwing overhand, Calvin (1983a, p. 126) calculates that the "permissible release error" required to hit a 10 cm high target at a distance of 4 meters (with less than a 10° elevation of launch) is on the order of only 5 msec. Doubling the target distance reduces the launch window by a factor of 8, due to the need to throw harder (increase velocity) at a target which is effectively smaller (the angle subtended by the target is halved). Calvin notes that 5 msec is already at the approximate limit of the known timing ability of a single neuron. This limitation is imposed by the stochastic fluctuations in the membrane potential of neurons (Calvin and Stevens 1967, 1968). The fact that accurate human throwing ability can far exceed 4 meters means that our ability to accurately time sequences can far exceed the known reliability of individual neurons.
How is this possible? Calvin (1983a) suggests that the answer may come from the study of circadian rhythms. Such endogenous daily cycles are pervasive in biology, having been found in such things as movements of plants, basic biochemical processes of cells, and the degree of pigmentation in certain crabs (Keeton 1980). What is interesting is that circadian cycles are known to continue independently of external environmental signals, such as the rising and setting of the sun.[12] They are also known to have remarkable reliability in higher vertebrates. The variability in daily cycle lengths is usually on the order of 3 to 5 minutes, which is 6 to 50 times more precise than most other kinds of rhythmic biological systems, such as our heartbeat during sleep or the human menstrual cycle (Enright 1980). To explain this temporal precision, Enright (1980) demonstrated that a system of unreliable timing circuits, if correctly arranged, could have a lower variability than any of the individual components. The individual circuits would be arranged in parallel, all feeding a single output circuit that would engage only after some set portion, say 30%, of the individual circuits had completed their cycles. It turns out that the variability of this system decreases in proportion to the square root of the number of individual circuits. This means that a proportionately larger number of individual circuits is needed to decrease the variability of the system by each successive step.[13] Calvin (1983a) points out that this model can be applied to neuronal systems to explain the apparently paradoxical accuracy of human throwing. In order to double the range of throwing accuracy, one could simply harness 64 times as many neuronal units as would be needed to accurately hit a target half as distant. It is not hard to see how accurately throwing increasingly farther and harder would be an advantage to a relatively slow ape (slow because of bipedalism) without large canines to protect itself out on the open savannahs of Africa. This would obviously have great potential to increase brain size dramatically. It is of interest that in other animals that have specialized precise timing abilities, such as bats and dolphins (which have tremendous sonar capabilities), we also find relatively (though not absolutely) larger brains (Calvin 1983a).
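Enright's averaging principle can be illustrated with a short simulation, given here as a minimal Python sketch under the simplifying assumptions that each timing unit fires with independent Gaussian jitter of about 5 msec and that the output engages once 30% of the units have fired. The spread of the resulting trigger time shrinks approximately in proportion to the square root of the number of units, which is the relationship Calvin's argument requires.

    import numpy as np

    def trigger_jitter(n_units, unit_jitter_ms=5.0, fraction=0.3, trials=5000, seed=0):
        # Standard deviation (msec) of the moment when `fraction` of n_units
        # noisy timers have fired, estimated over many simulated throws.
        rng = np.random.default_rng(seed)
        # Each unit fires at a nominal time of 0 msec plus Gaussian jitter.
        firing_times = rng.normal(0.0, unit_jitter_ms, size=(trials, n_units))
        # The output circuit engages when the chosen fraction of units has fired.
        trigger_times = np.quantile(firing_times, fraction, axis=1)
        return trigger_times.std()

    for n in (4, 16, 64, 256):
        print("%4d units -> output jitter ~ %.2f msec" % (n, trigger_jitter(n)))

Running this shows the output jitter falling by roughly half for each fourfold increase in the number of units, which is why doubling throwing distance (an eightfold tightening of the launch window) would require on the order of 64 times as many neuronal timing units.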
It hardly needs pointing out that humans typically display a preference for one hand when performing some kind of manual activity. What is not as commonly recognized is that the degree of handedness changes with different types of tasks. Calvin (1983a) notes that it is the "ballistic" skills that are most strongly right-handed, while fine motor skills are less so. In one study, 89% of subjects were right-handed for throwing and hammering while 77% were right-handed for threading a needle (Annett 1970). Watson and Kimura (1989) report that right-handers showed essentially no hand preference for a task requiring that they intercept balls launched past them at various trajectories, whereas a strong hand preference was evident for throwing darts at a target. Provins (1967) reports that the preferred hand shows a more consistent temporal pattern of movement. O'Boyle and Hoff (1987) reported that mirror-tracing complex shapes, a task that places minimal demands on sequencing and speed but is very demanding of spatial processing, showed a non-dominant hand preference in males (females showed a trend in the same direction that did not reach statistical significance). These studies strongly suggest that handedness is not a dichotomous phenomenon but rather that it is the nature of the task that influences handedness. The more a task demands precision sequencing, the more lateralized it is likely to be.
Handedness is, at the same time, closely related to hemispheric dominance for language. Calvin (1983a) notes that there is evidence that the cortical areas that play a role in sequential manual tasks overlap significantly with speech areas in the left hemisphere of right handers. A number of studies indicate that linguistic abilities can be disrupted or affected by concurrent manual tasks of the right hand but not the left (Kinsbourne and Cook 1971, Lomas and Kimura 1976, Lomas 1980). Sequential finger tapping has recently been shown to be affected by concurrent memorization of sentences as well as by concurrent speech, although the effect was larger in the concurrent speech task (Ikeda 1987). Stimulation studies of the cortex of neurosurgery patients have revealed an 81% overlap of sites where stimulation disrupts phoneme discrimination with sites where stimulation disrupts the ability to copy orofacial sequences (Ojemann 1983). It has also been shown that patients with left hemisphere damage who are aphasic show more impairment in oral and manual tasks than do patients with left hemisphere damage who are not aphasic (Kimura 1982). There is also the suggestive observation that the representation of body parts on the pre-central gyrus (motor strip), where voluntary motor activity is initiated, runs from arm to hand to face to mouth to larynx as one gets closer to the sylvian fissure (Calvin 1983a).
Clearly, there is some close connection between speech and throwing which may reflect a common underlying mechanism. Calvin suggests that, in effect, linguistic ability was able to develop because of an increasingly accurate sequencer in the dominant hemisphere that was initially utilized for intermittent throwing.
Presumably, this increased throwing accuracy was due to an increased dependence on hunting. The extent to which hunting has been important during hominid evolution has been an issue of some debate within physical anthropology and archaeology, though differences in gut morphology between humans and modern apes clearly indicate that meat was a significant part of the hominid diet (humans have a significantly larger small intestine, the site of nutrient breakdown and absorption, suggesting a long history of nutrient-dense diets; Milton 1988, 1993). A number of researchers have emphasized the role of scavenging in early hominids (Isaac and Crader 1981, Bunn 1982, Potts 1984, Blumenschine 1987). However, a significant amount of hunting has been documented in chimpanzees (see Stanford et al. 1994), which makes the argument for a scavenging phase during hominid evolution (in which scavenging was the predominant mode of meat acquisition) unlikely on parsimony grounds. The importance of projectiles in modern human hunting is obvious, though direct evidence of, e.g., atlatls (spear throwers) does not appear until relatively late (the Upper Paleolithic, ~45,000 years ago to ~10,000 years ago). However, the use of simple projectiles like baseball-sized stones would potentially have been very useful for hunting, and would not leave many obvious traces (except for an overabundance of such stones at probable hunting sites and/or out of geological context; Sarich personal communication). Hunting would also presumably have required a somewhat sophisticated ability to track animals over varied terrain, as well as the ability to find one's way home. Thus, various aspects of spatial ability, as distinct from throwing accuracy, have probably been favored by natural selection.
It has been suggested that the sex difference in spatial ability (averaging close to one standard deviation for tests of mental rotation; Linn and Petersen 1985) might be due to past selection in males for hunting skills (Kolakowski and Malina 1974, Jardine and Martin 1983). Among extant cultures today, hunting (particularly of large animals) is done almost exclusively by males (Murdock and Provost 1973). A similar situation exists with modern chimpanzees, in which almost 90% of all the kills (in which the individual killer could be identified) were made by males (Stanford et al. 1994).[14] Thus it seems likely that hunting was primarily a male activity during human evolution. If so, then to the extent that hunting puts a significant demand on spatial skills, selection acting primarily on males might account for the sex difference in spatial abilities. One key component of this hypothesis that needs to be tested is the extent to which the specific spatial ability that shows the greatest sex difference, mental rotation, actually has some association with hunting skills.
There are also other explanations for the sex difference in spatial abilities that do not necessarily involve hunting. Work by Gaulin and colleagues has shown that two closely related species of voles which differ in mating strategy (one polygynous, the other monogamous) also differ in the extent to which males do better in laboratory mazes. Males of the polygynous species are better at mazes than females of the same species, but no sex difference is seen for the monogamous species (Jacobs et al. 1990). Jacobs et al. suggest that this is because males of the polygynous species have home ranges that are 4-5 times larger than those of females, whereas there is no sex difference in home range size for the monogamous species. Thus they argue that different selection pressures for spatial abilities have been operating on males and females in the polygynous species.
This argument might also apply to some extent in humans. The majority of known societies practice polygyny: 75% of Murdock's (1957) world ethnographic sample (total N= 554) have some form of polygyny, while less than 1% are polyandrous. Judging from the degree of sexual dimorphism in fossil and living hominids (which is strongly associated with polygynous mating systems across many groups of mammals, including primates: Alexander et al. 1979), it would appear that this has also generally been the case during human evolution.
In any case, both the polygyny explanation and the hunting explanation share the idea that males are ranging farther from some central area than females, and that this would set up differing selection regimes for the two sexes.
There is some evidence that throwing accuracy and spatial ability are correlated at a low level. Kolakowski and Malina (1974) reported a correlation of r= 0.37 between spatial ability and throwing accuracy in teenage boys. Jardine and Martin (1983) reported 72 correlations between tests of spatial ability and measures of throwing accuracy, and found that 53 of these indicated a positive association (significantly more than would be expected by chance). However, these correlations were small, averaging less than about r= 0.20, and it is not clear that spatial ability per se is the explanation for these findings, as opposed to attentional abilities, or other general cognitive demands that are shared by the two sets of tasks. It is possible that other measures of throwing accuracy that have not been examined, such as throwing at a moving target, would show greater correlations with spatial ability tasks. This is purely speculation at this point, however. Calvin (1983a) suggests a single sequencer mechanism utilized by both language and throwing, and specifically notes the close physical proximity of Broca's area and the primary motor areas for hand, mouth and tongue movements. However, mental rotation ability appears to be dependent primarily on other cortical areas than these (i.e., parietal association cortex), which calls into question the likelihood of an underlying association between spatial ability and throwing accuracy (at stationary targets, at least).
Studies of individual differences on various spatial tasks indicate that spatial ability has many dimensions. In his extensive review of the literature on this topic, Lohman (1988) identifies some ten different spatial dimensions, defined through factor analysis, including: general visualization, spatial orientation, flexibility of closure (finding figures embedded in other figures), closure speed (identification of distorted or incomplete pictures), serial integration (identification of pictures in which portions of the whole are presented serially), speeded rotation (rotation of simple 2-dimensional figures), perceptual speed (matching simple visual stimuli), visual memory, and kinesthetic ability (the ability to make left-right discriminations rapidly). In general, Lohman (1988) notes that complex spatial tests are "primarily measures of G or Gf [general cognitive ability], secondarily measures of task-specific functions, and thirdly measures of something that transfers to other spatial tasks." (p. 232). In other words, individual spatial tasks do not generally share much variance in common with other spatial tasks, after taking account of the correlations these tasks have with general intelligence.
Before speculating about the role of selection for spatial processing in humans, it is useful to review what is known about the neuroanatomical basis of spatial abilities. Spatial processing appears to depend on areas in the parietal cortex that integrate visual information from the primary and secondary visual areas. A number of studies have demonstrated that the parietal association cortex (bounded anteriorly by the primary sensory areas on the post-central gyrus and posteriorly by the occipital lobe) is activated during mental rotation tasks. A recent study by Cohen et al. (1996) using functional MRI (which maps blood flow changes) reported that mental rotation invariably activated mid-parietal regions of the cortex (Brodmann's area 7, and sometimes the inferiorly adjacent area 40). In addition, 88% of the subjects showed increased blood flow in the middle frontal gyrus (Brodmann's area 8, anterior to the primary and supplementary motor areas of the frontal lobe), and 75% showed an increase in extrastriate activation (adjacent anteriorly to the primary visual cortex in the occipital lobe). Activation of the hand somatosensory cortex (inferior portion of the post-central gyrus) was detected in more than half the subjects, and 50% showed activation in the dorsolateral prefrontal cortex (Brodmann's areas 9 and/or 46) as well as the supplementary motor areas (Brodmann's area 6). Several other recent studies using different functional techniques are generally consistent with these findings (e.g., Bonda et al. 1995, Williams et al. 1995, Rosler et al. 1995, Nikolaev 1995, Hartje et al. 1994, Wendt and Risberg 1994).
Interestingly, Cohen et al. (1996) found little evidence of lateralization of activation on their mental rotation task. Other recent functional mapping studies of mental rotation have produced conflicting evidence of lateralization, with some reporting bilateral activation (Williams et al. 1995, Nikolaev 1995, Hartje et al. 1994) and others reporting evidence of greater right- versus left-hemisphere activation (Wendt and Risberg 1994). Clinical evidence suggests that the right hemisphere is crucial for certain kinds of spatial tasks, particularly those involving the processing of visual configuration (Gardner 1974). Individuals with damage to their right hemispheres generally show intact linguistic functions, but often show difficulty in naming unfamiliar and/or complex shapes, remembering the precise locations of items in an image, and identifying the whole item from a few of its parts. Studies of brain damaged patients also suggest that the right hemisphere is better at making fine color distinctions, better at depth perception, better at learning mazes while blindfolded, and better at making fine distinctions with respect to overall patterns - even in the auditory realm (see Gardner 1974 for a review). Right hemisphere patients also often show "left-neglect", in which they lack cognizance of the left side of their body, as well as the left side of their visual field. The same has not been found for left hemisphere patients (Gardner 1974). All of these deficits result in right hemisphere patients having difficulty finding their way around previously familiar locations, and even in dressing themselves, unlike left hemisphere patients. Studies of split-brain patients, in which the hemispheres cannot directly communicate with each other, also support the idea of a right hemisphere specialization for spatial tasks. For example, in such individuals the right hand (left hemisphere) has great difficulty copying 3-dimensional models of blocks, unlike the left hand (right hemisphere) (Gardner 1974). However, it should be noted that the left hemisphere does have basic spatial abilities, just as the right hemisphere has some basic (receptive) language abilities. The differences with respect to spatial processing are apparent in the complexity or detail of the processing required (Gardner 1974).
Thus, it appears unlikely on functional neuroanatomical grounds that throwing and spatial ability are somehow intrinsically linked, since they appear to be dependent on different areas of the cortex. Also, if selection for spatial ability was important for brain size evolution, we should expect to see an association between brain size and spatial ability (at least across species). This has not yet been demonstrated (see next chapter). It is important to note that the parietal association areas do not appear to have undergone as dramatic a change as did, e.g., the prefrontal cortex, though we have only indirect evidence of this (see chapter 2).
Hunting was clearly important during human evolution, and it surely placed a premium on various spatial abilities. Spatial processing depends on parietal association areas (with the right hemisphere apparently dominant), though these areas did not increase as much during hominid evolution as did prefrontal areas. Selection for better throwing accuracy could at least theoretically have led to increased brain size, but whether or not this is the major explanation for the change remains to be seen.
An alternative and/or complementary explanation for brain size evolution has been proposed by Milton (1981, 1988). She suggests that dietary changes in hominids provided the major impetus to brain size increases. Her general argument is that nutrient-dense foods are harder to find, more difficult to obtain, and/or need complex pre-processing before they can be eaten, all of which would at least theoretically require greater neural processing (Milton 1993). This can be seen clearly in her comparison of howler monkeys (Alouatta palliata) and spider monkeys (Ateles geoffroyi), which live in the same location and are almost the same size, but which have very different dietary preferences. Howler monkey diets consist of 48% leaves, 42% fruit, and 10% flowers and buds. At certain times of year when fruit is scarce, their diet is almost 100% leaves. By contrast, spider monkey diets consist of only 22% leaves, 72% fruit, and 6% flowers and buds. At times when howlers are eating 100% leaves, spiders still eat about 60% fruit.
The reason this difference is interesting is that, unlike leaves, fruit sources are not found equally distributed across space. Milton (1988) found that 65% of all tree species are found less than once per hectare in Panama. Also, fruit sources are not found equally distributed across time: Ripe fruits of a particular species are available only for a period of 1.1 months per year, on average, in Panama (Milton 1988). There is, however, some consistency in fruit sources, in that their physical location does not change over the course of a year. If the animal can remember where a fruit source is and when it is likely to be fruiting, it can count on this source for the rest of its life.
An animal that depends on a patchily distributed food source requires more behavioral flexibility, all else being equal, and this leads to an increased importance of learning. To the extent that brain size is a correlate of behavioral complexity, we should expect to find that brain size differs consistently with dietary preferences. This is exactly the case for howler and spider monkeys, where the fruit-eating spider monkeys have brain sizes approximately twice as large as the leaf-eating howler monkeys, even though they are about the same body weight (Milton 1981, 1988; see also appendix A, table 2). This general pattern has been confirmed across other primates and small mammals, in which it has been shown that folivores have smaller relative brain sizes compared to frugivores and insectivores (Harvey et al. 1980). Dunbar (1992) showed that the percentage of fruit in the diet correlated with the neocortex ratio (the ratio of neocortex to the remainder of the brain) at a level of r= 0.50 (N = 29, p< 0.05). However, he also found that neocortex ratio correlates even more strongly with mean group size (r= 0.87, N = 36, P< 0.001). He interpreted this as evidence against the dietary hypothesis, though it is not clear that this follows. The social complexity hypothesis and the dietary hypothesis are not mutually exclusive possibilities. Even if brain size increases were caused initially by the increasing demands of sociality, the brain is metabolically a very expensive organ to maintain (as pointed out at the beginning of this chapter). This would surely cause selection for dietary changes which emphasize foods with relatively high concentrations of nutrients (as in fruits, insects, and meat).
Humans today have a diet that emphasizes nutrient-dense foods. As noted in the section on hunting, the evidence points to a long history of this kind of diet in hominid evolution. Hunting requires a level of complexity far beyond that of finding fruit, because the prey items are not stationary, are actively trying to avoid being eaten, and are potentially dangerous to capture. It stands to reason that this would require much more in the way of neural resources than other sorts of dietary preferences. This conclusion is supported by comparative data on brain weights in carnivorous and herbivorous mammals. Using the average brain and body weights listed in Martin and Harvey (1985) for different orders of mammals, we can calculate (using Martin's 1981 formula) that carnivores have an average EQ of .85 (N=168), while artiodactyls (deer, antelope, cattle, etc.) have an average EQ of .60 (N=72) and perissodactyls (horses, zebras, tapirs, rhinoceroses, etc.) have an average EQ of .54 (N=9). Furthermore, this pattern has been apparent during the entire radiation of mammals. Jerison (1973) compares EQ's of ungulates (perissodactyls and artiodactyls) and carnivores across three major periods of the Cenozoic and finds that carnivores have consistently had larger average EQ's than ungulates (even though the EQ's of both groups have risen dramatically since the Paleocene).[15]
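The EQ calculation itself is straightforward; the sketch below (in Python) shows the form of the computation, using the commonly cited version of Martin's (1981) mammalian equation in which expected brain weight in grams is roughly 0.059 times body weight in grams raised to the 0.76 power. The constants should be treated as approximate, and the example values are purely illustrative rather than the Martin and Harvey (1985) order averages.

    def expected_brain_g(body_g, coeff=0.059, exponent=0.76):
        # Expected brain weight (g) for a mammal of the given body weight (g),
        # using an approximate form of Martin's (1981) allometric equation.
        return coeff * body_g ** exponent

    def encephalization_quotient(brain_g, body_g):
        # EQ = observed brain weight / brain weight expected for that body weight.
        return brain_g / expected_brain_g(body_g)

    # Illustrative values only:
    print(round(encephalization_quotient(brain_g=1350, body_g=65000), 2))  # modern human: roughly 5
    print(round(encephalization_quotient(brain_g=400, body_g=45000), 2))   # chimpanzee-sized ape: roughly 2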
Whether or not dietary changes were the initial cause of increasing brain size or simply an effect, these changes no doubt played a key role in hominid neuroanatomical evolution.
Language, hunting and associated dietary and behavioral adaptations (such as throwing accurately), as well as increases in social complexity, are all possible candidates to explain the major features of hominid neuroanatomical evolution. All of them clearly place demands on neural processing of various kinds. However, it is likely that these behavioral changes are all interdependent, and that none of them occurred in isolation from the others. For example, several researchers have suggested that language and brain size evolution are linked (e.g., Gibson 1988, Dunbar 1993). Dunbar (1993) further argues that language is a form of social grooming needed to maintain group cohesion. Dunbar's argument takes for granted that larger group sizes were adaptive, thereby selecting for larger brain sizes. The arguments that Dunbar (1992) reviews concerning the advantages of large groups center on defense, either against predators or against other humans. However, it seems just as likely that, given a social existence, there would be an advantage to individuals within each social group who were better able to understand the intricacies of social living (Humphrey 1984). Such an individual would be best able to take full advantage of the possible informational benefits available to individuals within social groups (by informational benefits I mean such things as where the best food can be found, which individuals are most dangerous, what behavioral expectations there are of the individual, how the individual might avoid being taken advantage of, and so forth).
My point here is that sociality has benefits in and of itself. Language would clearly facilitate the maintenance of larger social groups, but it might also simply be adaptive at any level of social complexity. Both increased sociality and increased linguistic capability undoubtedly had deeply intertwined feedback relationships with each other during human evolution, so much so that it is probably not profitable to attempt to disentangle the relative importance of the two. Furthermore, any increase in neural processing, for whatever reason, would presumably have led to immediate selection for the necessary neurological adaptations (of which brain size increase is one obvious possibility; see next chapter). These neurological changes would in turn select for changes in dietary preferences. Thus, all of the behavioral changes reviewed in this chapter probably played key roles in the evolution of the human brain.
There is one common component to many of the behavioral changes discussed in this chapter that bears emphasizing: memory. As discussed in the section above on language, memory appears to be dependent on broad areas of the cortex. Memories are most likely simply networks connecting numerous primary sensory and secondary association areas. It stands to reason that the larger the cortex, the richer the network of associations, and the more complex and detailed the internal representation of external reality can be. Thus, memory might well be the underlying thread connecting the fundamental behavioral changes that occurred during hominid evolution. It plays a key role in language, effective social interaction, and hunting, as well as in general subsistence activities.
[1]These EQ's are slightly different from Jerison (1973) because they (and those that follow) were recalculated with the mammalian equation of Martin (1981), which has a slightly higher slope than Jerison's (1973) equation.
[2]For comparison, modern cercopithecoids (catarrhines of the same adaptive grade as Aegyptopithecus) with the same body size have brain sizes that range from 70 to 110 cc (Stephan, et al. 1981; see appendix A, table 2).
[3]For some of the neuroanatomical discussion in this chapter, it may once again be useful to refer to appendix E, which contains line drawings of the major external features and landmarks of the human brain. These drawings are reprinted from DeArmond, et al. (1989), and are used by permission of the publisher: Oxford University Press.
[4]Specifically, the cricothyroideus, lateral cricoarytenoideus, transverse arytenoideus, and thyroarytenoideus muscles adduct (close) the vocal cords, while the posterior cricoarytenoideus muscle abducts (opens) them (Christensen and Telford 1988).
[5]The overtones occur at 2 x F0, 3 x F0, 4 x F0, etc.
[6]This is because the larynx functions not only to keep food from getting into the lungs, but also to seal air into the lungs under pressure, thereby strengthening the thorax considerably and allowing more effective use of the forelimbs.
[7]Brachiation is a mode of locomotion characterized by swinging underneath tree branches with the forearms, and is uniquely shared by apes.
[8]This can be accomplished if the listener can somehow accurately deduce what word the speaker is likely to be saying. If there is a high probability of a particular sort of opening sentence then this sort of deduction can be greatly enhanced. Lieberman (1984) suggests that greetings such as Hello, Hi, etc. may serve this function. It is certainly true that there are a limited number of likely opening phrases when we meet someone for the first time, and this restricts the possible interpretations that a listener could make about the meaning of the sounds produced.
[9] It was originally assumed that word meanings were stored in Wernicke's area because of the comprehension difficulties shown by Wernicke's patients. These difficulties are now thought to be due to problems processing speech sounds (Damasio and Damasio 1992).
[10]Duchin (1990) does not name the specific fossils she used for her analysis.
[11]Cro Magnon is universally considered to be anatomically modern.
[12]This has been done by placing organisms in constant light and temperature conditions in the laboratory. One study placed hamsters, fruit flies, cockroaches, cockleburs, soybean plants, and bread molds on a platform on the South Pole. The platform rotated at exactly the speed of the earth but in the opposite direction such that the animals were essentially stationary with respect to the sun and the earth, yet the cycles still persisted (Keeton 1980).
[13]A system of 4 components in parallel will be half as variable as any of the individual components in isolation; 16 components are needed to create a system that is one quarter as variable; and so on.
[14]Additionally, Stanford et al. 1994 report that a single female, Gigi, accounted for 27% of all kills reported for females.
[15]'Archaic' (the name Jerison gives to species of extinct mammalian orders) carnivores had an average EQ of .44 (N=4), while archaic ungulates averaged .18 (N=13). Carnivores (in orders which have survived to the present) from the Oligocene averaged .61 (N=11), while ungulates from roughly the same period averaged .38 (N=26). Carnivores from the Miocene and Pliocene averaged .76 (N=6), while ungulates from this period averaged .63 (N=13; this includes two species from an extinct family: Merycoidodontoidea).
Copyright 1997 by Paul Thomas Schoenemann