The Merge Function of Human Violence

I’m beginning to read Chomsky’s theory of linguistics, which gives some good insight into how language is “generated” almost without effort in a child as he or she develops. The child’s “language acquisition device” (LAD) can’t help but generate a linguistic structure based on the information given to the child during childhood. Words fall naturally into certain grammatical “buckets”, and they develop hierarchies with one another based on their positioning in sentences.

Word order in sentences does matter, but not in any linear way. In English, the sentence “birds that fly instinctively swim” is ambiguous. Do birds fly or swim instinctively? In other words, does “instinctively” modify “fly” or “swim”? A person using this statement will be asked to clarify, since an adverb can modify a verb immediately before or after it. But if he says, “Instinctively, birds that fly swim,” English grammar dictates that “instinctively” modifies “swim” even though it’s now farther away. Or we could instead say, “Birds that instinctively fly swim” to modify “fly.”

This demonstrates the capacities of the “Merge” function. Merge {instinct, fly} produces “Birds that instinctively fly swim” while Merge {instinct, swim} produces “Instinctively, birds that fly swim”. Merge forms a hierarchy between two concepts which can then be expressed in a sentence using whatever the grammatical rule is.
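To make this concrete, here is a toy Python sketch of the idea. It is only an illustration under my own assumptions: the names merge, words, and sister are invented for the example, and the bracketings are simplified rather than a real syntactic analysis. It shows that the ambiguous string “birds that fly instinctively swim” can come from two different hierarchies that flatten to the same word order.

```python
# Toy illustration only. The function names and the bracketings are
# assumptions made for this sketch, not a real syntactic analysis.

def merge(a, b):
    """Pair two items (words or earlier merges) into one nested unit."""
    return (a, b)

def words(unit):
    """Flatten a nested unit back into its left-to-right list of words."""
    if isinstance(unit, str):
        return [unit]
    return [w for part in unit for w in words(part)]

def sister(unit, word):
    """Return whatever `word` was directly merged with, or None if absent."""
    if isinstance(unit, str):
        return None
    left, right = unit
    if left == word:
        return right
    if right == word:
        return left
    return sister(left, word) or sister(right, word)

# Reading 1: "instinctively" is merged with "fly" inside the relative clause.
fly_reading = merge(
    merge("birds", merge("that", merge("fly", "instinctively"))),
    "swim",
)

# Reading 2: "instinctively" is merged with "swim" in the main clause.
swim_reading = merge(
    merge("birds", merge("that", "fly")),
    merge("instinctively", "swim"),
)

print(" ".join(words(fly_reading)))
print(" ".join(words(swim_reading)))
print(sister(fly_reading, "instinctively"))
print(sister(swim_reading, "instinctively"))
```

Both readings print the same flat string, but the partner of “instinctively” differs between them. The linear order hides the difference; only the hierarchy carries it.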

All this is to say that the LAD is what gives humans the rules for sentence structure, and Merge is the underlying function that builds a hierarchy between terms. Any given language will have different grammatical rules, but all people, presumably, have the same LAD, and Merge is one of its core functions.

Animals do not have Merge, and therefore have no LAD. They cannot form hierarchies between concepts. Chimps trained in sign language will string words together without hierarchy. When the chimp Washoe was asked to sign “I want you to help me eat”, Washoe would make a laundry list of sentences: “I you eat help”, “I help eat you”, “you I help eat”, etc., until the correct order was made. Washoe demonstrated no ability to learn this proper order, except through rote memory. Washoe might eventually learn the string of signals “you help I eat” but couldn’t form a “theory” as to why these words are ordered this way. If asked to sign, “I want you to help me sit”, the same laundry list would result.

Chomsky couldn’t understand where the LAD came from and why only humans had it. He set out for years to understand what “event” caused the Merge function to evolve. Many Darwinian theories have been put forward, none of them convincing. All are gradualist. They look to the similarities between humans and simians for origins of Merge.

I propose that linguists need to look at the differences between humans and simians to pinpoint where Merge comes from. And I believe the key difference is in intraspecific combat. Animals are optimized for combat within their own species; humans are not, and this is why humans have Merge. I’ll explain further…

Intraspecific (i.e. chimp-on-chimp, deer-on-deer) combat in animals is optimized between attacker and defender. A chimp’s teeth are optimized against a chimp’s hands and fur. A buck’s antlers are optimized against another buck’s antlers and tough hide. Animal combat is a closed-loop system: there’s little foresight, so each combatant’s intentionality detectors (IDs) feed directly into actions (As), which feed directly into IDs again, and so on. You could imagine a pair of bucks sparring as a series of clashes.

– Buck 1 intends his attack (I), Buck 2’s ID reads his intent, and meanwhile Buck 1 performs action A.
– Buck 1’s ID is processing Buck 2’s intent, and meanwhile Buck 2 performs action A.
– There’s a natural overlapping of intent, ID, and action between the contestants that looks less like a chess game and more like bricolage. Nonetheless this bricolage is still sequential, so we will call each of these “clashes” C1, C2, C3, … Cn. We can write it out in an equation like this:

C1 {ID2(I1) // A1} -> C2 {ID1(I2) // A2} -> C3 {ID2(I1) // A1} …

The series can be represented by the notation Σn+1 [ C(n) {ID2(I1) // A1} ], where n, the number of clashes, increases by one (n+1) with each clash until one of the bucks backs down. The notation itself doesn’t really matter. The important point is that it’s an un-nested series. Each clash, or C(n), is its own moment, which can only affect future clashes in the sequence.
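The flat shape of this series can be sketched in a few lines of Python. This is only an illustration: clash_series and the backs_down callback are names I made up for the example, and backs_down is a stand-in for whatever it is that actually ends the fight. Each clash is produced in its own loop step and contains no other clash inside it.

```python
# Toy illustration only. `clash_series` and `backs_down` are hypothetical
# names; `backs_down` stands in for whatever actually ends a real fight.

def clash_series(max_clashes, backs_down):
    """Build the flat sequence C1, C2, ... with the bucks alternating roles."""
    clashes = []
    for n in range(1, max_clashes + 1):
        actor = 1 if n % 2 == 1 else 2   # who intends and acts in this clash
        reader = 2 if actor == 1 else 1  # whose ID reads that intent
        clashes.append(f"C{n} {{ID{reader}(I{actor}) // A{actor}}}")
        if backs_down(n):                # one buck gives up; the series ends
            break
    return clashes

# Example: the fight ends after the third clash.
print(" -> ".join(clash_series(10, lambda n: n == 3)))
```

The loop only ever appends to the end of the list. Nothing in it calls itself, which is the sense in which the series is un-nested.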

Animal clashes rarely end in death, not because an animal doesn’t desire the other’s death, but because of the optimal strategy of animal combat. It is bounded by something, we don’t know what, which stops an animal from making repeated attacks on a downed opponent. Reciprocation is probably a factor here: a downed opponent offers no fight, and is therefore met with no fight. Chimps do kill other chimps, wolves kill wolves, brown mice kill brown mice, but this can only be due to mobbing, since the optimization strategy makes it almost impossible for one animal to kill another of its kind. And even then, death by mob often takes days. There are horrendous examples of chimp mobs ripping an individual victim apart, but this still fits the equation above. No weapons are used in any premeditated or mimicked fashion. The chimps primarily grab and bite.

No animals use object-based aggression. Chimps will use objects during aggression, but there is no forethought in this, and no chimps mimic the act. A chimp in a mob might throw a rock at the victim, or preempt combat by rolling boulders, but object usage is an act of intimidation, not violence. No other chimps will mimic this behavior, and they don’t show up to the fight with objects to begin with.

And yet chimps will carry particular stones to crack nuts and teach their young how to use those tools. Chimps, colobus monkeys, sea otters, and cormorants will use tools for hunting and foraging, but never use these for aggression, at least not in any useful way. In short, animals have no mental bridge between object usage and aggression. Only incidental use of objects can be seen, and that only in chimps.

To equate any of this with language would be a mistake. Language demands a long chain of planning, something we never see in animal combat. Modern academics are hellbent on pushing Copernican Fundamentalism to the extreme: they believe that by diminishing human specialness they can find scientific truth, claiming that Copernicus began his search for truth by assuming that Earth wasn’t special. This was not the motivation of Copernicus. He started by looking at the data, not by being “humble.” We should similarly stop pretending to be “humble” by assuming we’re just like our simian ancestors. Our violence is nothing like theirs, and Copernican Fundamentalism has been leading us astray with regard to both human violence and language for too long.

Humans, by contrast, are unoptimized for combat with one another. Unoptimization springs from the wildcard nature of what objects or weapons the human opponents might have access to. “Objects” can include rocks, sticks, cliffs, even access to blood avengers, vengeful family members, etc. “Object” here is merely the nominal “thing” that a person can use in intraspecific combat.

We are unoptimized by virtue of the fact that our intentionality detectors (IDs) also factor in the opponent’s own ignorance of OUR object-based aggression: he also doesn’t know what our access is, or he’s spent much time figuring it out before making a move. All things equal, the opponent’s wildcard is echoed by our own wildcard. And both sides form these nested statements that attempt to pinpoint what the wildcard is, long before any move is actually made. So human combat is not a series of clashes, but rather a nested statement:

= selfCombatPlan {ID(opponent), ID(ID(self))}

So before combat begins, a long, complex string of nested commands will result in some kind of hierarchy between the self and the opponent. This might be the same Merge property from linguistics. If we write it out as a Merge statement:

= selfMerge {ID(opponent), oppMerge{ID(self), selfMerge{…}}}

Merge-based combat has the potential for infinite layers of intricacy. This is limited by 1) memory space and 2) time. Each time the recursive Merge function runs in the opponents’ minds, potential violence naturally escalates. I will anticipate the opponent bringing a knife, and so I might bring a bigger knife. The opponent will anticipate something in me and bring, say, a second smaller knife. Or we both bring friends as backup. Eventually 1) memory or 2) time, or both, run out, and an action will result. This nested, escalatory nature of human violence results in far deadlier combat than any intraspecific animal combat: I bring a big knife to fight his two small knives, or two of us now fight two of them. It’s not a good situation for anyone. One might have better intel than the other and get the advantage, but most likely someone is going to die in this scenario, a very rare phenomenon in animal combat.
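Here is a toy Python sketch of that nesting, offered only as an illustration. The names nested_plan and plan_before_combat are my own inventions, and treating memory and time as simple whole-number caps on the number of layers is an assumption made to keep the example small. The function calls itself with the roles swapped until the cap is reached, at which point the planning stops and an action results.

```python
# Toy illustration only. Function names are hypothetical, and modelling
# memory and time as integer caps on nesting depth is an assumption.

def nested_plan(me, other, layers):
    """Write out the nested Merge statement to a fixed number of layers."""
    if layers == 0:
        return "act"  # memory or time has run out, so an action results
    inner = nested_plan(other, me, layers - 1)  # the other side's plan about me
    return f"{me}Merge{{ID({other}), {inner}}}"

def plan_before_combat(memory_limit, time_limit):
    """Nest until whichever limit runs out first; return (layers, statement)."""
    layers = min(memory_limit, time_limit)
    return layers, nested_plan("self", "opp", layers)

layers, statement = plan_before_combat(memory_limit=5, time_limit=3)
print(layers)
print(statement)
```

In this sketch the number of layers is a crude proxy for escalation: each extra layer is one more round of “he anticipates that I anticipate”. Unlike a series of clashes, every layer sits inside the one before it.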

The nested nature of human object-based aggression (OBA) is what I call “recursiveness”, hence the acronym ROBA (recursive object-based aggression). This creates the Merge function, and so I believe this qualifies ROBA as the first “language”.

Hypothetically, during the first moments of hominization, Merge was extremely limited in 1) memory and 2) time. With no vocal or written language, long chains of nested “statements” regarding rock-based combat would have been very difficult. They would have been short and to the point. Decisions would have been made quickly. The throwing of stones in intraspecific human combat might have come about fairly quickly with little thought, yet it would have still presented a deadlier situation than animal combat. A mob of humans throwing stones amounts to the first mob stoning. (René Girard isn’t far off when he says that this is the foundation of human society, though I would argue ROBA would have produced less deadly outbursts before this, and if we’re going to pinpoint hominization at the onset of Merge, then hominization begins before the first murder.)

It is in the natural interest of human antagonists to find ways out of this escalation to extremes. And yet without non-violent language there was only ROBA.

It might not have taken long for antagonists to realize that Merge could also parse non-aggressive actions. Pantomimed violence would have then been passed into Merge. Perhaps the first instance was, instead of throwing stones, they hit the stones together, producing flakes as a “warning” not to engage in combat. Merge utilized this to create a new “non-violent language” (NVL) of stone-cutting, and this was perfected when the right kinds of stones were found, producing flint-knapping. Perhaps this kicked off Oldowan stone culture, which was universal across the world.

Chipping stones helped defer violence for a long time. Their remains near watering holes might be evidence that chipped stones were useful for hunting. But chipping might have been percussive, producing a musical language as well. If music had any clear function here, it would have been to defer ROBA.

Eventually ROBA coopts these technologies, and Oldowan tools would have been no different: chipped stones could now be thrown more accurately at other people. Oldowan tools then would have had to be parsed by Merge in nonviolent ways. Maybe this produced Acheulean stone handaxes in the southern part of the African continent. Then the same thing happened: handaxes were used for hunting and foraging, then coopted by ROBA as weapons of war. That same weapon technology became arrowheads, tomahawks, etc. I would suspect that Oldowan users, if they weren’t able to adopt the Acheulean “language”, were then wiped out by Acheulean weapons.

We can see evidence of the limited memory space of stone technology in the fact that Oldowan and Acheulean tools persisted for millions of years with little discernible change. As stone tech was coopted by ROBA, a new level of non-violent language (NVL) would have been required. I would expect that bone and early metal technologies would have first been used to defer violence (hunting, music, etc.), then coopted by ROBA (iron warfare), followed by gunpowder, nuclear, and then whatever new physics brings in. Each new technology is an era that I would call an “aggression kernel”, wherein the memory space is increased dramatically. Both nested language and violence expand into far more layers than previously possible. Language becomes ever more capable of deferring violence, and ROBA becomes ever more capable of destroying us. The two naturally dovetail into one another, both fighting to coopt the Merge function. Merge is agnostic: you can pass it violence or non-violence, and it will grammarize it for you. Art is a complex language produced by Merge.

In conclusion, ROBA is a model for understanding the wildcard of human-on-human violence and language. Our aggression is un-optimized and nested, which produces the Merge function, wherein Merge is a hierarchy between the opponents. Merge is therefore escalatory and apocalyptic. But Merge can be coopted by non-violent gestures and vocalizations and grammarized into non-violent language (NVL), which is eventually coopted by ROBA. The game is to continue parsing new concepts into Merge to keep deferring violence.

The next frontier of my research is understanding neural-crest cell (NC) migration during embryonic development. I think this will show that continued use of Merge for NVL impedes NC migration to the extremities, altering certain characteristics like palm color, eye whites, and larynx flexibility, which assist in the utilization of NVL. These altered characteristics can continue to be parsed by Merge to develop newer forms of NVL that can defer ROBA, which will undoubtedly begin coopting our new technologies.

All of this will be written into book format soon. I’m tossing name ideas around: The Art of Violence, or The Unoptimized Strategy, or ROBA???