Final Thoughts on Baron-Cohen’s “Theory of Mind”, Autism, and Intentionality Detection and Violence in Humans

I breezed through Simon Baron-Cohen’s Mindblindness and took away the following:

  • His hypothesis is that Autistics lack a Theory of Mind Mechanism (ToMM), which allows one to read the intentions of another person. He believes that Autistics do not exhibit shared attention, the job of the Shared Attention Mechanism (SAM), though he gives no clear indication as to whether this stems from a deficit in the Intentionality Detector (ID) or the Eye-Direction Detector (EDD). Whatever the cause, Autistics fail to share attention: when you point somewhere, they might not look where you’re pointing. Without SAM, one cannot have ToMM, and so since Autistics don’t exhibit shared attention, they must not be able to read others’ intentions, an ability he calls “mindreading.”
  • The Intentionality Detector (ID) is observable in many primates.

Here’s my interpretation:

Autistics only *demonstrate* a lack of Theory of Mind (ToM), and a failure to demonstrate ToM is not evidence that Autistics actually lack it. If they do have ToM, then they must also be capable of shared attention. Autistics, for example, are perfectly capable of shared attention when watching screens. They’re also capable of understanding the intentions of fictional characters.

I believe Autistics have difficulty sharing attention with people because they’re expected to reciprocate. When the demands of reciprocation are dropped, shared attention is easy. An Autistic might be busy drawing while you attempt to get his attention about something else, and though he might not respond, this doesn’t mean he’s not sharing your attention. He might simply be compartmentalizing your demand for attention somewhere else for later processing while he completes the current task. He might compartmentalize for some time before taking action on a large number of these stored intents. Perhaps when he’s alone, he’ll process everything, and quite quickly. This might explain why Autistics seem frozen into inaction around people, but become fully functional when alone.

If Baron-Cohen’s researchers are not able to see Autistics demonstrate ToM, this might simply mean that the Autistics are withholding it from them, waiting to process later when they’re not required to reciprocate emotions, etc. So ToM will be present, but they won’t be able to see it. Autistics are well aware of what the researchers want out of them, and if they’re refusing to reciprocate, then this will be seen as a “failure”, when in fact the Autistics are demonstrating adequate central coherence, welling up over time.

There are some important takeaways, though, that apply to my ROBA Hypothesis:

The Intentionality Detector (ID) provides an important component of my ROBA Hypothesis. If we had to describe ID as an equation, it would be something like: stimulus is doing X and therefore intends Y. So ID can be represented as a set {X, Y}. High-level primates have ID.
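
To make the {X, Y} idea concrete, here is a minimal Python sketch. Nothing in it comes from Baron-Cohen: the names `Reading`, `KNOWN_ACTIONS` and `detect_intent`, and the toy table of behaviors, are hypothetical and invented purely for illustration. It only shows the shape of the claim, that an observed action X gets paired with an inferred intent Y.

```python
from typing import NamedTuple, Optional


class Reading(NamedTuple):
    """One output of the ID: an observed action X and the intent Y read into it."""
    action: str  # X: what the stimulus is doing
    intent: str  # Y: what it is therefore taken to intend


# Toy lookup table (an assumption for illustration, not data from any study).
KNOWN_ACTIONS = {
    "reaches toward fruit": "eat",
    "bares teeth": "threaten",
    "shakes branch": "intimidate",
}


def detect_intent(action: str) -> Optional[Reading]:
    """Return the (X, Y) pair for a recognized action, or None if no intent is read."""
    intent = KNOWN_ACTIONS.get(action)
    if intent is None:
        return None
    return Reading(action, intent)


print(detect_intent("shakes branch"))
```

Strictly speaking, the pair is ordered (the action is the input, the intent is the output), so a tuple is probably a closer fit than an unordered set.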

High-level primates (chimps in particular) are also capable of object-based aggression (OBA), as we see when they use rocks and sticks in the heat of battle. However, I’m not convinced they can plan for this. I’ll describe this later.

Chimps are also capable of using objects for foraging (cracking nuts) and hunting (dipping for ants). Many have claimed that this form of tool usage was a precursor to our Oldowan choppers, followed by the Acheulean hand axes, both of which required “culture” for transmission. Aboriginal Australians had such a culture of stone tool manufacturing. The hypothesis has repeatedly been floated that chimp objects-as-tools gradually evolved into human tool manufacturing.

Apes like Washoe and Koko have been able to learn rudimentary sign language at the level of a two-year-old, which is seen as evidence that animal communication systems (ACSs) gradually evolved into human language.

Kathleen Gibson (1993) compiled a series of scientific papers which attempted to show a connection between tool usage, language, and cognition. A gradualist hypothesis was reinforced, claiming that there was a protohuman between chimp (or bonobo) and human, who had a “protolanguage”.

However, there’s a Rubicon to cross that nobody has even looked at: how could chimp violence “gradually” become human violence? Humans are alone in being able to combine object-based aggression (OBA) with intentionality detection (ID) when engaging in intraspecific (human-on-human) violence. A chimp, however, cannot use the ID to determine “Koko has a rock and intends to bludgeon me.” If it could use the ID in this way, chimps would use it to plan for battle and produce various tactical outputs: bringing more rocks to battle, bringing more reinforcements with rocks, using sharper rocks or sticks, etc. You never see chimps bringing rocks to war. They only grab them in the heat of the fight, and when they do there’s no attempt at de-escalation by other chimps. And yet chimps will bring tools to crack nuts. So there is a functional rift between planning for hunting/foraging and planning for intraspecific (chimp-on-chimp) aggression. This is the blind spot of all the paleoanthropological and anthropological studies I’ve read: their emphasis is solely on hunting and foraging, entirely neglecting intraspecific aggression in chimps and intraspecific violence in humans.

Bottom line: If chimps could use the ID to anticipate OBA in other chimps, they would escalate violence by bringing deadlier objects and tools. They have all the mechanisms necessary to do this, and yet they don’t. This can’t simply be a choice. Certainly any beta chimp would benefit from bludgeoning an alpha chimp in his sleep by smashing his head with a boulder, or bringing sharp rocks to a carcass in case there will be other apes present. Whatever mechanism is responsible for chimp aggression, it has no access to the OBA mechanism. OBA can only be employed in the heat of the fight. If anything, chimp OBA appears to be almost incidental. The use of a rock during a fight is more a threat than an attack, as in the act of shaking branches for intimidation. The grabbing of a branch for OBA would be secondary.

I would hypothesize that chimp object-based aggression (OBA) isn’t actually OBA proper. There’s no forethought to it. If any given chimp happens to come across a stick that he can swing wildly and get some advantage during intraspecific aggression, he’ll use it for however long he needs, but then he’ll drop it, and other chimps won’t bother mimicking him. Humans, however, employ OBA using whatever we can, and we will plan by fashioning bigger tools or keeping them in strategic locations.

We can assume that, since we’re able to use OBA, we can anticipate other humans are capable of OBA. We might imagine our overall neurological “map” looks like a human, with the addition of “props” that extend out of the hands. When we map someone else’s intents onto this schematic we can pass this to the ID to anticipate OBA from other people. This is all part of the intentionality detector (ID).

Animals also use the ID to anticipate the attacks of other animals using their own neurological “maps”. Interestingly, cats hunting snakes seem to be able to use the ID to anticipate snake strikes, as we can see from their pawing maneuvers. When a cat attacks a dog, a very different strategy is involved. Does the cat hunting “map” somehow include snake and dog moves? For some reason, these neurological maps have a lot of overlaps between species.

This might be the same for chimps when it comes to foraging and hunting, but their aggression “map” doesn’t include the use of objects. Why don’t chimps have “access” to this during aggression? Why can’t otters use the rocks they use to smash shells when fending off other otters in their territory? All animal aggression “maps” end at the extremities, while their hunting and foraging “maps” might extend to include objects and tools. A chimp’s aggression “map” almost feels incomplete, as though there is more work to be done, and yet we might assume that for 5 million years there’s been zero change here.

A closer look at how the ID works might help us understand how intraspecific aggression is parsed differently than interspecific hunting. Within the ID, we might assume there are two separate functions: interspecific (kind-on-other hunting) and intraspecific (kind-on-kind aggression). In the case of intraspecific aggression between chimps (and all other animals), the opponents’ aggression “maps” correlate 100%. There’s no ambiguity. The weaponry and armor are exactly the same between the two. But in the case of intraspecific aggression between humans, there’s ambiguity in the extremities’ abilities to use objects. Neither human knows exactly what will happen if any move is taken, which will produce hesitation, anxiety, etc. as both attempt to figure out what OBA the other will use after move 1, 2, 3, etc. This ping-pong effect makes the human intraspecific ID capable of generating infinitely nested tactics. It might be (and is often the case) that once a violent move is decided upon and made, the effects are catastrophic, perhaps apocalyptic. The fear of apocalyptic intraspecific violence makes humans uniquely dangerous to ourselves, in ways animals are never a danger to themselves.

The “merge” of the ID with OBA is what creates this nested, “recursive” ability to determine OBA in other humans and update one’s OBA plans in turn. Hence, I call it Recursive Object-Based Aggression (ROBA). ID + OBA produces this recursive module, which I take to be what Chomsky calls Universal Grammar (UG). This makes sense, since when OBA is passed into UG, you get ROBA as a “grammar” of violence. And I hypothesize that this is the first “language” of humans. It doesn’t look like language as we know it, but its recursive nature and deeply nested arguments have the same structure as non-violent language (NVL).

However, the Merge of ID + OBA (Universal Grammar/UG) allows for the grammarization of other inputs. Eye contact, hand gestures, written signals, vocal cues, etc. can also be parsed by the UG. Pantomimed violence would theoretically have been passed to UG to create the first non-violent language (NVL). NVL, for the first time, allowed early humans to avoid the apocalyptic end game of human violence. Perhaps this first sign happened 80,000 years ago, though it’s possible that ROBA has existed for as long as 2.5 million years. In that case, it is difficult to determine how exactly these early humans avoided wiping themselves out with ROBA but without NVL. It’s also possible that ROBA would have been a species-killer if not for the immediate onset of NVL, in which case ROBA and NVL would both have set in ~80k years ago. All these timelines are mere conjecture, but I want to suggest that UG could have existed millions of years ago, grammarizing OBA as the first “language.” The fear of ROBA might then have resulted in the long-lasting Oldowan and Acheulean cultures, in which we see a “grammarization” of tool-making, and this might have been what deferred ROBA for so long, until a vocalized/gestural language could emerge. Again, all this is conjecture, but the ROBA Hypothesis gives us a different way to judge what qualifies as “language”, “human” and “violence.”

The continued use of NVL in place of ROBA (with ROBA flowing under it like magma the whole time) would have produced substantial changes in Homo sapiens without necessarily creating a new species. For this I look to Wrangham’s study of neural-crest cells in The Goodness Paradox, where we read that neural-crest cells usually reach the extremities of wild animals, but in domesticated animals they tend not to reach the extremities, producing white paws, white fur tips, etc. Continued use of NVL might have similarly impeded neural-crest cell transmission in the human embryo, producing whitened palms, white scleras in the eyes, a more flexible larynx, and whatever else might aid in the transmission of NVL.

Again, none of this is to say that human violence/ROBA is gone today. ROBA continues underwriting all NVL, from sports to contracts and institutions. So if we want to understand human universal grammar in the Chomskyan sense, we might start looking at the grammar of ROBA. How exactly we access that is up for debate. It seems that sports and martial arts allow the participants at least momentary access to the UG, and it produces a kind of high. Language is expressed with the body, and perhaps the euphoria we feel during the heat of battle is the raw grammar that first made us human. Though I don’t have any hopes that Chomskyan linguists will be hanging out with mixed martial artists any time soon. In fact, I expect the two fields to remain totally separated until someone comes along who can bridge them.