Along with the question about the evolution of blank slates (see last post), Dr. Bolhuis sent me a paper just published in Trends in Cognitive Sciences. The paper, which Bolhuis teased would annoy me, is titled “Structures, not Strings: Linguistics as Part of the Cognitive Sciences” and has a number of distinguished co-authors, including Noam himself. Bolhuis is the corresponding author. I told him I would never have been able to maintain my blog for so long if I did not have a taste for being annoyed.
Truth is, however, I was more puzzled than annoyed. The paper seems a concise summary of positions held for many years. Why write such old news?
There are hints as to why scattered about the paper. The old news is being ignored. I was astonished by one sentence: “Introductions to psycholinguistics generally do not mention notions such as hierarchy, structure, or constituent.” As I recall it, the very word psycholinguistics was coined by George Miller to denote the marriage of psychology with generative grammar. Now the two fields are quite divorced.
Also, in the paper's abstract, we find: “taking language as a computational cognitive mechanism seriously, allow us to address issues left unexplained in the increasingly popular surface-oriented approaches to language.” So that’s what the paper rebuts: big data approaches to language. This method compares a text with many similar texts in a database and then makes a statistically supported guess as to the text’s meaning.
The limitations of big data approaches are demonstrated by the unlikely French sentence La pomme mange le garçon (The apple eats the boy). The authors submitted this simple sentence to Google Translate and got as their translation the more probable “The boy eats the apple.” I got a nice chuckle out of that one, but the truth is that for all its flaws, Google Translate does a better job than machine translation based on Chomskyan rules. The big data approach so disliked by the authors is paying off in a practical way that 60 years of Chomskyan linguistics never has.
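For readers who like to see the idea in code, here is a deliberately crude sketch of how a purely frequency-driven guess can turn the apple sentence around. It is not how Google Translate works; the corpus counts and the function name most_probable_reading are invented for illustration only.

```python
from collections import Counter

# Hypothetical toy "database": how often each (subject, verb, object)
# pattern was seen. The numbers are made up for illustration.
corpus_counts = Counter({
    ("boy", "eats", "apple"): 120,
    ("girl", "eats", "apple"): 95,
})

def most_probable_reading(words, counts):
    """Pick whichever subject/object order the corpus has seen more often."""
    subject, verb, obj = words
    literal = (subject, verb, obj)
    swapped = (obj, verb, subject)
    # On a tie (including two unseen orders) keep the sentence as written.
    return swapped if counts[swapped] > counts[literal] else literal

# The literal order was never seen, the swapped order was seen often,
# so the statistical guess overrides what the sentence actually says.
print(most_probable_reading(("apple", "eats", "boy"), corpus_counts))
```

Nothing in this sketch looks at who is doing what to whom; it only asks which string of words is more familiar, which is the weakness the authors' example is meant to expose.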
Chomsky began with the assumption that the brain is a type of computer and that language can best be understood as the product of an algorithmic computation. Over the years this theory has been honed so that now the decisive operation for generating sentences, an operation called Merge, is pretty much a duplicate of the way a Turing machine works. So, if the brain is a computer, and if language is the product of a computation, we should now be seeing computers that do at least a semi-decent job of generating sentences. Actually, we do. We can speak to our phones, translate news articles, and produce computer-written news reports. It is just that these computers ignore the work of Chomsky and colleagues.
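Merge is usually described as an operation that takes two syntactic objects and combines them into a single unordered set, which can then be merged again. A minimal sketch, assuming that bare-bones description and leaving out labels, features, and everything else a real grammar would need (the helper names merge and depth are illustrative, not taken from the paper):

```python
def merge(a, b):
    """Combine two syntactic objects into one unordered set {a, b}."""
    return frozenset([a, b])

def depth(obj):
    """Count levels of embedding: a bare word is 0, each merge adds one."""
    if isinstance(obj, str):
        return 0
    return 1 + max(depth(member) for member in obj)

# Build "the apple eats the boy" by repeated merging.
subject = merge("the", "apple")
predicate = merge("eats", merge("the", "boy"))
sentence = merge(subject, predicate)

# The result is a nested structure, not a left-to-right string of words.
print(depth(sentence))
```

The point of the sketch is only that repeated merging yields hierarchy: "the apple" is a unit inside the sentence, whatever order the words happen to be spoken in.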
Big data's approach to language is like Deep Blue’s approach to chess. Neither tries to emulate human thinking, but just seeks to solve a problem through a computer’s enormous speed and search powers. Successful humans, on the other hand, think strategically. That is to say, good chess players have a common purpose behind their tactical moves. Machines, however, are no good at developing purposes, so they have to play some other way. When humans use language well, they have something to say. That is, they have a purpose behind their choice of words and the formation of sentences. Generative grammar has failed, and computer developers have moved on, because Chomsky’s bold ideas turned out to overlook the purposes that lie behind language use. There can be many purposes behind a sentence, but the universal one is to draw attention to a particular phenomenon or idea. The authors of the paper seem (and probably are) oblivious to this purpose. Thus, computers using generative rules are neither taking full advantage of their computational power nor managing to think like a human.
Big data approaches to language hold very little interest for this blog, because they are irrelevant to the issue of language origins. Big data assumes the existence of a large body of sentences to analyze, and I first became interested in the problem of language origins when I realized that at some time there must have been people who were not born into a world of sentences. At the same time, the Chomskyan revolution has, as the authors say, “been forgotten, ignored, or even denied.” The authors lament this state of affairs, but by arguing only with the big data approach they have nothing new to say to followers of this blog.