Birdsong has more to tell us about the biology of speech than one might expect, according to an “eBriefing” posted by Alisa G. Woods for the New York Academy of Sciences. It is true that birdsong serves the common communication task of controlling relations rather than directing attention, so we would expect birdsong and speech to have evolved under different selective pressures. But both depend on social learning. The sounds of song and speech are not the inevitable result of genes and anatomy, so they are not like the sounds crickets make. Crickets need no guidance in how to make their noise, and surely they sounded the same in Caesar’s day. With speech, however, we know for a fact that the speech sounds made by modern Italians do not match those used by Caesar’s Romans, and there has likely been some drift in songbird sounds as well.
Young songbirds and speakers must both learn, first, to recognize the sounds their parents make and, then, to make the same sounds themselves.
Separate species confronted with similar problems often evolve similar solutions, a process known as convergence. Birds and bats both evolved wings that let them fly. Even when a matching solution is not possible, we can see how pressures lead to similarities. Birds have no teeth, so they cannot evolve the fangs of carnivorous mammals, but carnivorous birds do have curved, hooked beaks that work like fangs.
Physically speaking, bird brains are radically unlike mammal brains, so much so that it is impossible to guess at the sensations that make up bird experience; we cannot hope for a simple match between bird and human speech/song recognition and repetition. But convergence of the hawk-beak/wolf-fang variety seems at least a possibility. Two contributors to the eBriefing who present food for thought along these lines are David Vicario, who reports on vocal learning in young songbirds, and April Ann Benasich, who reports on vocal learning in human infants.
Vicario’s presentation makes specific comparisons between the way songbirds learn their songs and the way humans learn their language. Both depend on hearing; both are born with a disposition to perceive the typical signals of their species; and both require social interaction in order to learn. Although both species appear to have innate urges to make their characteristic sounds, neither can do very well without hearing how its fellows perform.
The basic learning method Vicario has found in zebra finches is for the young bird to hear a song from a more experienced bird. The young bird remembers the sound and, through some “magical” motor process, makes something like the same sound. It then hears the sound it has made and compares it with the remembered one. Cycling through this process many times, the bird reproduces the sound with greater and greater accuracy. At first the sounds the birds make are quite nondescript, essentially undifferentiated noise, but over time the birds become more and more adept at making the sounds they have heard, and by the time they are mature they can produce the characteristic song of the zebra finch. Unlike crickets, each bird sings a slightly different song from every other bird, and it is at least conceivable, therefore, that birds can recognize some other birds by their voices. (Or so I’m assuming. Vicario was content to say each voice is slightly different.)
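The hear-remember-produce-compare cycle Vicario describes can be caricatured as a feedback loop. The sketch below is only an illustration of that loop, not a model of finch neurobiology; the template, the learning rate, and the noise terms are all invented for the example.

```python
import random

def song_distance(a, b):
    """Mean absolute difference between two 'songs' (lists of pitch values)."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def learn_song(template, rounds=200, learning_rate=0.2, seed=0):
    """Start with undifferentiated noise and repeatedly nudge the bird's
    own output toward the remembered tutor song."""
    rng = random.Random(seed)
    attempt = [rng.uniform(0, 1) for _ in template]  # initial babble
    for _ in range(rounds):
        # "Hear" own output, compare with the remembered song, adjust;
        # a little motor noise keeps each rendition slightly different.
        attempt = [a + learning_rate * (t - a) + rng.gauss(0, 0.01)
                   for a, t in zip(attempt, template)]
    return attempt

template = [0.2, 0.8, 0.5, 0.9]      # the remembered tutor song
final = learn_song(template)          # close to the template, never identical
```

The residual noise in each rendition is the part of the analogy that matters: every simulated “bird” converges on the template yet ends up with a slightly different song, as Vicario reports for real finches.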
Zebra finches are small Australian songbirds with tiny brains, yet they have specialized brain regions for learning and using songs. A region called NCM (the caudomedial neostriatum) responds specifically to the sounds of other zebra finches, enabling them to recognize finch sounds on first hearing them. Other pathways in the brain are devoted to learning the song and to singing it. This kind of biology is very different from what an engineer would likely build into a computer that communicated through song. In fact, we don’t have to speculate about how an engineer would do it: your home wi-fi network illustrates the efficient solution. There is a special receiver for detecting a signal and a special generator for transmitting one. Insect pheromone systems work in a similar way.
The social learning of the finches adds another level of complexity. Doubtless, finch ears are especially sensitive to the sounds of their song, and we know their brain’s NCM region is specialized for recognizing the song. This biology is equivalent to the wi-fi’s receiver. Finch vocal organs are surely specialized for generating their characteristic song, and their brain includes specific song-generating pathways. This biology is equivalent to the wi-fi’s generator. But between these two is a whole level of learning and mastery that computers do not require. Along with the brain apparatus to support the learning, finch biology must include the capacity to listen to oneself and adjust motor activity to better match output with remembered input. Plus there must be some strong urge to learn the species’ song. Vicario uses elaborate laboratory techniques to study the learning of individual birds, but there is no Skinnerian rewarding of improved singing with bits of food. The improved singing is its own reward.
Modern psycholinguistics began by recognizing that Skinnerian rewards could not and did not explain how children come to speak. Instead it looked at semantics, phonology, and (especially) syntax. If you look at the list of things a finch requires between ear and mouth, however, you find that learning must include the ability to
- recognize sounds,
- imitate sounds, and
- compare sound generated with sound heard.
This listing doesn’t mean children are not also concerned with meaning and syntax. Plainly, they have to do more than a finch must, but they also have to do what a finch does: perceive the characteristic sounds of their species, and learn to generate them themselves.
April Ann Benasich’s part of the eBriefing looks at how children master the perception of sounds. She has found that, by looking at infants’ ability to discriminate between sounds at age 6 months, she can predict with 93.9% accuracy whether their vocabulary at 36 months will be normal or “impaired.” The same six-month data can be used to predict the thirty-six-month-olds’ comprehension abilities with 90.9% accuracy. (Impaired, in this study, refers to a child who performs at 1 standard deviation or worse below the mean.)
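The “1 standard deviation or worse below the mean” criterion is easy to make concrete. The sketch below is only an illustration of that cutoff rule; the scores are invented, and nothing here reproduces Benasich’s actual measures or data.

```python
import statistics

def impaired(scores, cutoff_sd=1.0):
    """Flag each score that falls at least cutoff_sd sample standard
    deviations below the group mean (the study's 'impaired' criterion)."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)          # sample standard deviation
    threshold = mean - cutoff_sd * sd
    return [score <= threshold for score in scores]

# Hypothetical language scores for eight children.
scores = [95, 102, 88, 110, 70, 99, 105, 92]
flags = impaired(scores)                   # only the outlying 70 is flagged
```

Note that by construction a criterion like this will always flag roughly the lower tail of any normally distributed group, which is why the 5%–10% prevalence figure later in the piece should be read against the specific clinical cutoffs used, not as a property of the arithmetic.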
As a practical matter, a discovery like this suggests that valuable medical interventions may be possible two years before children’s speech begins to show signs of difficulty. This blog, however, is not very practical. For us, this work suggests that the evolution of speech required the introduction of many more perceptual skills than rational ones. Chimpanzees may be as logical as we are, but they are much less able to hear exactly what we are saying.
Benasich reports that 5% to 10% of children have “specific language impairment,” a problem with learning speech despite the fact that their hearing is normal, they are not autistic, and they have no intellectual disability. Socially, audiologically, and mentally they are normal, but they still have trouble with speech. Her premise is that the problem is perceptual: there is a strong correlation between perceptual discrimination in infancy and language ability at age 3.
Anybody who has ever tried to learn a foreign language knows that a fundamental problem right out of the starting gate is the speed of the speakers. Imagine what it sounds like to a baby with no experience of language. Like the finches, human infants are born with a tendency to recognize and listen to the sounds of their own species, but that trait does not guarantee learning. Infants have to discriminate between sounds, recognizing the difference between ma and pa, and they have to be able to integrate sounds into wholes so that they can recognize full words like mast and past. They also have to do this discriminating and integrating very rapidly, in time for the next phoneme to waft through on the breeze. Most infants (90–95%) do very well at this task, but a few are overwhelmed and immediately start slipping behind their fellows. It serves as a good reminder that the evolution of speech required a host of new perceptual skills if we were to reproduce accurately the sounds we heard.
It is also another bit of evidence that the archaeological argument for speech being a very recent part of the human story is untenable.