Evolving a language is much easier than evolving a sense organ, such as an eye or ear, argue Nick Chater and Morten H. Christiansen in a Cognitive Science paper, “Language Acquisition Meets Language Evolution” (uncorrected proof here). Although the thesis sounds plausible, it raises a question: if it is true, why are eyes so widespread while language is unique to one species?
The authors distinguish between two types of mastery:
- Natural (N): The ability to understand/adapt to/manipulate the natural world.
- Coordinated (C): The ability to coordinate one’s understanding/adaptation/manipulation with others.
C-type mastery (“C-induction”) is much less demanding than N-type mastery (“N-induction”), say the authors.
For N mastery a person has to deal with what is physically out there. For example, a ball has the same physical properties, no matter what the culture of the person holding the ball. Therefore, learning to catch a ball requires developing some intuitions about those properties—the ball’s heft, momentum, and the motion through space. A person who cannot properly develop and organize the required perceptuo-motor skills cannot catch the ball, no matter what. Others may offer coaching, but the learner has to acquire the intuitions and skills that allow the action. Nature is a hard taskmaster, bending no rules to make the task any easier.
Meanwhile, learning to nod one’s head to indicate yes or no is relatively easy, because the physical part is built in. Apes can nod their heads, so the physical potential was in the nervous system before any functional role was assigned to the activity. The function itself is arbitrary. In some cultures, nodding the head means yes, while in Greece it means no. This difference makes life difficult for American tourists in Greece, but shows how easy it is for a child to learn head nodding. There is no natural solution, no intuitions and skills that must be learned by every head nodder in the world. You just have to learn how it is done where you are. The child has to learn the observable behavior and need have no intuitions about why it works that way.
These N/C distinctions may seem obvious, but Chater & Christiansen make a good case that too much work has assumed that C-induction is like N-induction. Their critical idea is that C-based acquisition (of language) requires different things from N-based acquisition (of non-linguistic skills), and C-based evolution (of language) works differently from N-based evolution (of non-linguistic skills).
Chomsky and others who believe that much syntactical ability depends on an innate, specialized organ (or modules) within the brain know that syntactic organization does not depend on some pre-existing laws of physics, but they don’t look to C mastery for a solution. Instead they treat language mastery as a kind of special case of N mastery in which the rules are arbitrary rather than eternal, but nonetheless have to be mastered as intuitively as children master the physics of balls. Pinker and Bloom (paper here) famously argue that the evolution of an eye (N-based evolution) is a good analogy for the evolution of a language module.
Empiricists lean away from the innate module, but again we see them tending to argue that C-induction is like N-induction. Empiricists commonly suppose that understanding the process of learning a task like catching a ball is sufficient to account for learning a cultural action like nodding the head. There are many rival empirical learning theories, but time and again researchers assume that the process that accounts for N-based learning can account for language learning as well.
The authors also make a critical distinction between the evolutionary processes of N-based adaptation and C-based adaptation. They write:
natural selection cannot lead to the creation of dedicated, domain-specific learning mechanisms for solving C-induction problems (e.g., innate modules for language acquisition). By contrast, such mechanisms may be extremely important for solving N-induction problems. [p. 9]
An example of an N-based adaptation (mine, not taken from the paper) is the frog’s eye-tongue coordination for fly-catching. It has been well demonstrated that frogs rely on a reflex rather than learning when they shoot out a tongue and catch a fly on the wing. It must have taken many generations to evolve a tongue that can reach so far and to coordinate the action so that the tongue goes exactly to the point where the eye sees the fly is headed. But once evolved, the situation is very stable. The laws of motion and the geometry of space do not change. So now natural selection conserves its achievement. Frogs may diverge into new species, but their tongue-eye coordination persists.
Meanwhile, it is much easier for a species to adapt some new social function to an existing ability. Chater & Christiansen argue that this process accounts for the evolution of syntax. The ability to vocalize or sign voluntarily was there, and we added the function. Whatever evolving occurred consisted of the functional behavior adapting to the nervous system. This process also explains why human speech is so much less stable than the frog’s reflex. Natural selection’s extremely strong powers of conservation are removed from the story. The result is an unstable system that changes but remains meaningful because the meaning is coordinated by each generation of speakers.
The N-induction/C-induction distinction strikes me as valuable and explanatory, but only to a point. Coordinated activities are not always so inevitable as the authors seem to suggest. Bird songs are a familiar example of coordinated activity between individuals that still requires much practice and genetic support. The birds’ FOXP2 gene seems to have altered in this regard (see: Birds Also Use FOXP2), so it seems overly simple to say that natural selection has no role to play in C-based adaptations. On the other hand, it has been shown that zebra finches who learn a distorted version of the wild song will, over a few generations, sing a song that is hardly distinguishable from the wild version, so there really is a tendency for the song to evolve toward the pattern common to the species (see: A Cultural Law of Gravity). Thus, even in cases that seem to be a mix of N- and C-induction processes, something more than Darwinian or genetic determinism seems to be afoot.
If this blog’s suspicion that the human lineage was engaged in some form of speech at least 2.5 million years ago is anywhere near right, at least some of the many mental and vocal changes in human anatomy during that long history probably had something to do with making speech richer. I’m not ready to throw out Terrence Deacon’s concept of body-language co-evolution just yet, but I will try to be clear in my mind and my writing about whether I’m talking about N- or C-based mastery.