Music is universal among cultures and so is speech. How did that happen?
Before children start to talk, they babble and coo. Why? The usual answer is that this vocalization phase is a step toward speech. That answer works well enough if you believe in a language instinct as rich as the one described by Steven Pinker, but the weight of the evidence has been pushing this blog away from that line. So what's this cooing and babbling all about? The question seems especially pressing after last week's post (here) that said
It sounds good, but we cannot say that human vocalizations develop as a preparation for speech, and then that speech comes from the development of human vocalizations. Happily, a recent paper by Nobuo Masataka in Physics of Life Reviews, "The origins of language and the evolution of music," (abstract here) suggests a way out of this circle.
- Infants begin by making vocalizations within the capacity of many primates.
- They are taught by their caregivers to turn those vocalizations into the sharing of an emotion.
- These joint emotional exchanges take on a prosodic quality common to both music and language.
Step 1. The first non-crying vocalizations are coos (monosyllabic sounds) that start at around 6 to 8 weeks. Mothers typically respond by cooing back to the infant, starting up an exchange of coos. Masataka reports that
Step 2. By the age of 3 or 4 months human infants are able to time their responses to caregiver coos. The caregiver coos commonly match the pitch of the infant coo, and by the end of the fourth month the infant begins matching the caregiver pitch. A true duet of matching pitches begins. Vocal control improves greatly during this period, and by the age of 6 months infants can generate a contrastive pitch contour, such as a sound with a falling pitch at the end. Thus, the duets between caregiver and child become increasingly complex. The imitation appears to be instinctive on the infant's part, but not the sounds themselves:
Caregivers intuitively encourage infants to engage in this vocal matching. "Whereas," says Masataka, "no such encouragement is observed in nonhuman primates." This style of encouraging prosodic cooing is commonly called "motherese" and serves two functions. It gets an infant's attention and it has "affective salience," which I take to mean seizing one's emotional focus. At this point in his paper Masataka reports on how these functions "suggest a linguistic benefit for preverbal infants." Yes, yes, but what about all the nonlinguistic benefits for preverbal infants? There would seem to be some. The function of motherese is not linguistic but to bring two people together in a strong shared emotion.
Step 3. It sounds so agreeable that it seems a shame that language comes along to put an end to the exchanges. Except, of course, they do persist. At eight months human vocalization takes on the basics of both speaking words and singing songs. Infants begin making true syllables, and "In nonhuman primates, no vocalizations are as well-articulated as babbling of 9-month-old human infants." This may reflect practice and learned control rather than any special motor neurons and muscles. (It is hard for me to tell from the paper.) What seems fundamentally new is the motivation to join in such an activity, either as a caregiver or as an infant.
Every now and then somebody tries raising newborn apes or monkeys the way human infants are raised. The ape infants accept the humans as their caregivers and are satisfyingly dependent and reasonably affectionate, but they do not join in cooing, let alone advance to the babbling stage. They never become members of an emotional community. Thus, as they grow older, instead of fitting in ever more competently with their family's world, they fall out of it. Even apes who learn a bit of sign language become too uncontrollable to work with.
Conclusion. This process appears to be a fine example of the "C-induction" discussed in last week's post. C, or coordinated, mastery does not require learning some unyielding law of nature, but merely the ability to match one's behavior with one's fellows. Cooing seems to fit this definition well. It uses vocal abilities available to other primates and shows there is a critically necessary "instinct," the yen to join emotionally with others.
This motivation is not something found in the African apes and must have some sort of evolutionary history, meaning:
- the process required generations and could not have originally been just a short phase during infancy, and
- the point of such evolution cannot have been the future development of either a musical or a linguistic ability. We had to evolve the cooing motivation for its own sake. We then had to add on the distinctive babbling stage for its own sake.