One of my first posts on this blog was titled Attention! It's a Revolution. Now that I have a few more years of blogging under my belt and have gathered the results into a book-length manuscript, I believe that point even more intensely. The old idea that, at its core, language consists of a series of declarative statements to be judged by their truth value can no longer stand. At its core, language is a means of cooperation between people. It works by constructing shared perceptions through joint attention. More generally, it is a shared system that permits cooperative interactions between people. The human sciences that don't yet take cooperative interaction as the foundation of their subject will either adopt it or be left behind.
Parisi begins his report on robots by embracing the old, isolationist view of language: "the meaning of linguistic expressions is derived from the physical interactions of the organism with the environment." The striking thing is that the author believes he is advocating something new, a view that links perceptions with actions, but the difference is only a variation on an old theme. It still sets the language user apart from society. How far apart? A search of the paper shows that neither the word attention (let alone joint attention) nor the word cooperation appears anywhere in the discussion.
The main body of the paper opens with the observation, "If we want to construct human robots rather than just humanoid robots, that is, if we want to construct robots which actually behave like human beings rather than robots which only resemble human beings in their external morphology, it will be necessary for our robots to possess language because language is such a prominent feature of human beings." The statement sounds like the kind of thing teachers hear from bright students who have missed the point. Yes, language is a prominent feature of human beings, but the point deserves a stronger statement: without language you could not have human society, and without society you could not have individual human beings. Language is more than prominent; it is the sine qua non of a human community.
Also missing from the account is the way communities use language to create something new, although one can hardly fault Parisi for the absence. Our social sciences have not been good at developing a theory of cooperative interactions, probably because analysis does better at stripping things down to their atomic level, and also because cooperative interactions demand a level of action and initiative that is easier for a storyteller to imagine than for a logician to deduce.
You can see what I mean in the paper's discussion of ambiguity. Parisi proposes solving the problem of ambiguous expression by taking context into account. Fair enough, but how do you do that? The paper goes on, "We define context as any additional input, linguistic or non-linguistic, arriving from outside the brain or self-generated inside the brain, that may influence what activation patterns are sequentially elicited in the internal units of the NL sub-network."
Linguistic or non-linguistic input… from outside or inside the brain—can anything be excluded from this definition? If context can be anything in the universe, real or imaginary, stating that disambiguation depends on context does not seem to help us much. It certainly gets us no closer to understanding how Martha, upon hearing Sally's words, understands them. How did the two happen to hit upon the same context?
This analysis would do better to break the context into what Tomasello calls the common ground of speaker and listener, and the functional shifts of piloted attention. Context cannot be anything whatsoever; it must be shared, and the speaker must know it is shared. Take a sentence from a conversation between two people: "John came in and said he caught it." The pronoun it depends on something that came before the sentence. We outsiders cannot possibly know whether it refers to a disease, a ball, or a joke. We are off the common ground. The other pronoun is easier to interpret: he refers to John. We know that because the sentence opens by focusing our attention on John and never shifts it elsewhere. If Charles caught it, that would have to be specified. This analysis is not 100% reliable, because some piece of common ground might override the sentence. Maybe the two were discussing Charles and whether he caught it. Even then sharper language would be better, but common ground makes up for much sloppiness. And when it does not, the listener either misunderstands or mutters, "Say what?"
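The sentence-level part of that analysis is mechanical enough to caricature in a few lines of code. The sketch below is my own toy illustration, not anything taken from Parisi's paper or Tomasello's work: a pronoun resolves first through the attention the sentence itself pilots, then through the pair's shared common ground, and when neither applies the listener is left muttering.

```python
# Toy illustration (my own, hypothetical): disambiguation succeeds only
# when speaker and listener share the needed common ground.

def resolve_pronoun(pronoun, attention_focus, common_ground):
    """Resolve a pronoun: sentence-piloted attention first,
    then the conversation's shared common ground, else failure."""
    if pronoun in attention_focus:      # e.g. "he" -> John, fixed by the sentence itself
        return attention_focus[pronoun]
    if pronoun in common_ground:        # e.g. "it" -> whatever the pair discussed earlier
        return common_ground[pronoun]
    return "Say what?"                  # the listener is off the common ground

# "John came in and said he caught it."
attention = {"he": "John"}              # the sentence opens by focusing on John
insiders = {"it": "the flu"}            # the two speakers' shared history (assumed)
outsiders = {}                          # we readers lack that history

print(resolve_pronoun("he", attention, insiders))    # everyone gets "John"
print(resolve_pronoun("it", attention, insiders))    # insiders get "the flu"
print(resolve_pronoun("it", attention, outsiders))   # outsiders get "Say what?"
```

The point of the sketch is only that "context" does real work here precisely because it is restricted: the lookup succeeds or fails depending on what the two parties share, which is what Parisi's anything-goes definition of context leaves out.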
The communication relationship between human and computer is quite different. Either the human gives an input that tells the computer to do something and the computer does it, or the human asks for some information and gets a data dump in response. These are not cooperative interactions, but they are what one might expect between servant and master, or tool and tool owner.
A robot with a true mastery of language would be a machine that could pay joint attention and contribute its share to a conversation. It would initiate appropriate surprises in a way no computer today can manage. Computers make such useful tools that I don't know why we would want to exchange them for peers demanding respect, although I grant that making such a machine would be pretty dang impressive.