
Tit for Tat

  • Writer: Kruxi
  • Jan 1, 2023
  • 9 min read



I have read and heard of tit for tat being a powerful strategy in a game-theoretic framework, but only after reading "The Evolution of Cooperation" by Robert Axelrod did I realize how powerful such a strategy really is. Axelrod also gives a beautiful account of how to test strategies in different settings. I will argue that tit for tat is a necessary long-run winning strategy among all things living. We will get to this thesis in several distinct steps. First I will explain the framework in which we start off this discussion, a game-theoretic payoff matrix called the prisoner's dilemma. Then we will discuss a tournament of computer programs playing in this framework. Thirdly, we will discuss the winner of these tournaments. With the submitted strategies we will then play an ecological game and a stability game. And lastly I will give some real-life examples.




The Dilemma

This whole journey starts with a puzzle: the prisoner's dilemma. The rules are simple. There are two players, and each has two options: they can cooperate with each other (C) or they can defect (D). The payoff matrix (points for player 1, points for player 2) looks like this:

  • Both cooperate (top left): 3, 3
  • Both defect (bottom right): 1, 1
  • One defects, one cooperates: 5 for the defector, 0 for the cooperator

When they cooperate with each other, the two together get the maximum total payout. When they both defect, they get the lowest. If one defects and one cooperates, the player who defected (the meanie) gets a reward while the player who cooperated (the sucker) gets nothing. The classic example: two people got caught doing a crime, but there is not enough evidence to lock them up; the police need at least one of them to confess. If the criminals cooperate (C) with each other, they get away with the crime and split the loot. If they both spill the beans, both get locked up, and neither statement is of crucial importance to the police. If only one defects, the player who violated the cooperation is the beneficiary, while the loyal criminal is the sucker.




That isn't a dilemma yet. The dilemma arises when we look at what each player should do (that is what you should always ask yourself, because the players don't have the overview; they just do what is best for them). Let's say player 1 sits in his cell and thinks about his reward conditional on what player 2 does. What if the other player cooperates? We are then in the first row. Either I cooperate and get 3 points, or I defect and get 5. That's easy: if player 2 cooperated, I would defect. But what if he defects? Either I cooperate and get 0, or I also defect and get 1. Here too, player 1 will defect. We assume that player 2 is just as smart and comes to the same conclusion (the payoff function is symmetric). Now we arrive at the dilemma: rational decisions led both players to receive one point each, while no extra effort was needed for both to get three. It is suboptimal, but still a Nash equilibrium. That is the dilemma.
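The best-response reasoning above can be written out as a quick sanity check (a minimal sketch in Python, using the payoffs from the text):

```python
# Payoffs from the text: 3 for mutual cooperation, 1 for mutual defection,
# 5 for defecting against a cooperator, 0 for being the sucker.
PAYOFF = {  # (my move, their move) -> my points
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def best_response(their_move):
    """The move that maximizes my payoff against a fixed opponent move."""
    return max("CD", key=lambda my: PAYOFF[(my, their_move)])

# Defecting is the best response either way, hence the Nash equilibrium (D, D):
print(best_response("C"))  # D, since 5 beats 3
print(best_response("D"))  # D, since 1 beats 0
```

Defect dominates in the one-shot game, even though mutual cooperation (3 + 3) yields more total points than mutual defection (1 + 1).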



The Tournament

We have seen that this payoff function is truly a dilemma. It gets interesting if you play this game among the same players several times (the iterated prisoner's dilemma). Here the theory goes that a player can convince the other that they will actually cooperate, and both can enjoy the higher payouts. But by signaling cooperation to the other player, one risks becoming the sucker and getting 0 every time. Since each player's payout depends on what the other player does, there is no perfect strategy. But if we collect some strategies and let them play against each other, we can observe which one does best in that environment.


This is where Robert Axelrod comes in. He called for participants to submit computer programs (strategies) that play this game against all the other submitted programs. Each program plays 200 rounds against each other program. A program cannot read the other's code; it can only remember the moves made by the opponent it is currently playing. A strategy can thus be of three types:

  1. Frequentialist: disregarding the actions of the other player entirely, for example defecting 50% of the time, or always defecting (ALL-D).

  2. Reactive: reacting to what the other player did in previous rounds, for example cooperating until the other player defects, then always defecting (NICE-RETALIATORY).

  3. A combination of both: the strategy reacts to what the other player does but throws in some probabilistic moves (TESTER, TRANQUILIZER).

These submissions played against each other, and the strategy with the highest score won.
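The setup can be sketched as a small round-robin harness (hypothetical code, not Axelrod's actual tournament program), with one toy entrant from each of the first two types:

```python
# Sketch of an Axelrod-style round-robin. A strategy is a function
# (my_history, their_history) -> "C" or "D"; every entrant plays every
# other entrant, and a copy of itself, for 200 moves.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}

def all_d(mine, theirs):
    """Frequentialist: always defect, whatever the opponent does."""
    return "D"

def nice_retaliatory(mine, theirs):
    """Reactive: cooperate until the opponent's first defection, then always defect."""
    return "D" if "D" in theirs else "C"

def play_match(s1, s2, rounds=200):
    """Play one match and return (score of s1, score of s2)."""
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        score1 += PAYOFF[(m1, m2)]
        score2 += PAYOFF[(m2, m1)]
        h1.append(m1)
        h2.append(m2)
    return score1, score2

def tournament(strategies, rounds=200):
    """Round-robin including self-play; the highest total score wins."""
    totals = {name: 0 for name in strategies}
    names = list(strategies)
    for i, a in enumerate(names):
        for b in names[i:]:
            sa, sb = play_match(strategies[a], strategies[b], rounds)
            totals[a] += sa
            if a != b:
                totals[b] += sb
    return totals
```

With only these two entrants, NICE-RETALIATORY comes out ahead: it collects the mutual-cooperation payoff in self-play, while ALL-D is mostly stuck with the mutual-defection payoff.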


The Winner

The winner was Tit for Tat, one of the simplest strategies submitted. It cooperates on the first move and then mimics the player it is up against: if the other player defected on the last move, it retaliates this move with a defection; if the other player cooperated on the last move, it rewards that with a cooperation on this move.

While the code of this strategy was never published, a possible implementation could look like this:
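For illustration, here is a minimal Python sketch (the interface, a function that sees only the two move histories, is my own assumption):

```python
def tit_for_tat(my_history, their_history):
    """Cooperate on the first move, then echo the opponent's last move."""
    if not their_history:
        return "C"                # start nice
    return their_history[-1]      # retaliate a D, reward a C

# First move is cooperative; afterwards it simply mirrors the opponent:
print(tit_for_tat([], []))                  # C
print(tit_for_tat(["C"], ["D"]))            # D
print(tit_for_tat(["C", "D"], ["D", "C"]))  # C
```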




Axelrod published the logic and the results for each player, as well as the rationale for why some strategies did better than others.


After making all of this knowledge available, Axelrod called for submissions for a second tournament. The rules were exactly the same; the strategy to beat was tit for tat. There were many more submissions this time, knowingly trying to trick tit for tat. Astoundingly, tit for tat won again. In the next two sections I will give two examples of why tit for tat does so well if we take the submitted strategies and play them in slightly different environments. I will then come back to why it did so well in the actual tournament.


Evolutionary measure, let the worst 10% die

Axelrod's game is simple: each program plays every other 200 times. What if we slightly modify the game and say that after 20 games with each player we eliminate the worst 10% of strategies? Which strategy will be left at the end?
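A toy version of this culling loop might look as follows. The pairwise scores are illustrative 20-round totals I derived from the payoff matrix, not real tournament data, and the exact order of extinction depends on the mix of entrants:

```python
# Toy "ecological" contest: total up each survivor's score against all
# other survivors, cull the worst 10% (at least one), and repeat.
# PAIR holds illustrative 20-round totals built from the payoffs in the
# text (3/3 mutual C, 1/1 mutual D, 5/0 for defector vs. sucker).
PAIR = {  # (row kind, column kind) -> row's score over 20 rounds
    ("meanie", "meanie"): 20,  ("meanie", "nicie"): 100, ("meanie", "tft"): 24,
    ("nicie",  "meanie"): 0,   ("nicie",  "nicie"): 60,  ("nicie",  "tft"): 60,
    ("tft",    "meanie"): 19,  ("tft",    "nicie"): 60,  ("tft",    "tft"): 60,
}

def kind(name):
    """'tft2' -> 'tft': several individuals can share one strategy."""
    return name.rstrip("0123456789")

def ecological(population, elim_frac=0.10):
    """Cull the lowest scorers each generation until one individual remains."""
    alive = list(population)
    while len(alive) > 1:
        totals = {a: sum(PAIR[(kind(a), kind(b))] for b in alive if b != a)
                  for a in alive}
        cull = max(1, int(len(alive) * elim_frac))
        alive = sorted(alive, key=totals.get, reverse=True)[:-cull]
    return alive

# Tit for tat is the last kind standing in this toy mix:
print(ecological(["meanie1", "meanie2", "nicie1", "nicie2",
                  "tft1", "tft2", "tft3"]))
```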




What we can observe in this survival-of-the-fittest contest is that a strategy must be able to adapt to all kinds of strategies. If a strategy only does well against one other type of strategy, it is vulnerable: once that other strategy dies out, it dies out too. We see that the meanies die out first; they cannot reap the benefit of any cooperation. Then the nicies die out, since the more sophisticated preying strategies are able to take advantage of them. The problem now is that the preying strategies have no one left to prey on, and thus also die out. Left is the winner, tit for tat, and similar strategies: they are able both to cooperate with nicies and to punish meanies, predictably and reliably.


So although tit for tat was built to be played in the Axelrod tournament, it also wins in the "ecological game" described above. It is a staggering success.


Stable strategies, native vs mutant

Another variant of the game is a game of stability. Here we imagine a strategy as a "native population": say, 100 participants who all follow a given strategy and play against each other. We then inject a mutant strategy into this ecosystem and see how it does. If the mutant strategy does better than the native strategy, natives will change their minds and adopt the mutant strategy.


The key question is how many mutants need to infiltrate to change the native strategy. Imagine a native society of nicies: they always cooperate. Then a strategy walks along that always defects. While the native population gets 3 points per interaction with each other and 0 points with the meanie, the meanie always gets 5 points. He will win this contest. Soon the nicies will take notice and turn into meanies. Being only nice is thus not a very stable strategy.


Now to the example of tit for tat. The beauty is that even a small group of tit-for-tatters can infiltrate a native meanie colony. The tit-for-tat invaders can discriminate: they defect with meanies and cooperate with each other, and thus earn higher returns than the meanies do. The meanies will take notice and turn into tit-for-tatters.


Not only can tit for tat invade easily, but a native tit-for-tat group is extremely stable and cannot be easily infiltrated by a small group of any strategy. You would have to inject a great many meanies into a native tit-for-tat population to break tit for tat's dominance: the expected return (probability times reward) of two tit-for-tatters playing each other would have to drop below the cost of losing the first round (tit for tat always cooperates on the first move and then always defects against a meanie).
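To make the invasion argument concrete, here is a back-of-the-envelope calculation (my own hypothetical setup: 200-round matches, random pairing, and a fraction p of tit-for-tat invaders in a meanie population):

```python
# Per-match totals under the payoffs from the text (3/3, 1/1, 5/0).
ROUNDS = 200

def avg_score(p):
    """(tft average, meanie average) when a fraction p of players play TFT."""
    tft_vs_tft = 3 * ROUNDS                # mutual cooperation: 600
    tft_vs_meanie = 0 + 1 * (ROUNDS - 1)   # suckered once, then mutual D: 199
    meanie_vs_tft = 5 + 1 * (ROUNDS - 1)   # temptation once, then mutual D: 204
    meanie_vs_meanie = 1 * ROUNDS          # mutual defection: 200
    tft = p * tft_vs_tft + (1 - p) * tft_vs_meanie
    meanie = p * meanie_vs_tft + (1 - p) * meanie_vs_meanie
    return tft, meanie

# Even a small cluster of invaders outscores the natives:
tft, meanie = avg_score(0.05)   # 5% tit-for-tatters
print(tft > meanie)             # True
```

Solving the inequality gives a break-even fraction of roughly 1/397, about a quarter of a percent: almost any cluster of tit-for-tatters that can find each other will take over, while a meanie invading the reverse situation gains nothing.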



Why tit for tat is so great

I wanted to show the strength of tit for tat even outside the scope of the tournament itself before drawing any conclusions about why it is so strong, and about what the other strong strategies had in common. Axelrod summarizes it in three points:



  1. Start nice: if there is a single factor that separates good from bad scores in both tournaments, it is that starting nice gets you higher scores. Starting with a defection just gets you in trouble later, since some strategies remember it and will punish you hard for it.

  2. Punish hard: it is very important to show the other player that you will punish them for defecting. This discourages defections.

  3. Be forgiving: there is no reason to hold a grudge against a strategy that defected once. Punish them, then trust them again. If you don't, you may forgo the opportunity for future cooperation. The way I see it, after aptly punishing someone you start the game over, and we already know that starting nice is the most important factor.


Also, you should be as clear as possible about these rules. If you leave any wiggle room for interpretation, you let other players take advantage of you: they may stop believing that you will punish a defection, or stop believing that you will trust them again after the punishment.




Some examples from the real world


The blues and the greens

Let's look at a scenario with a population of half greens and half blues. Their native strategy is to always cooperate within their own group and defect against the other group. Unfortunately, this too is a stable environment. A mutant green who is born into this world wanting to cooperate with everyone will soon learn to defect against the blues; otherwise he will be beaten hard. It gets even crueler when we assume random pairing of players but a green minority. Then, for example, the greens play only 30% of their games against each other, earning 3 points, and 70% against blues, earning 1 point; vice versa for the blues. The blues thus come out ahead both in numbers and in points. The greens might want to take action and create safe spaces and communities where interactions with other greens are more common.
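The minority arithmetic can be written out directly (numbers taken from the example above: a 30% green share, 3 points for within-group cooperation, 1 point for cross-group mutual defection):

```python
# Average points per interaction under random pairing: within-group
# pairings yield mutual cooperation (3 each), cross-group pairings
# yield mutual defection (1 each).
green_share = 0.30

green_avg = green_share * 3 + (1 - green_share) * 1   # ~1.6 points per game
blue_avg = (1 - green_share) * 3 + green_share * 1    # ~2.4 points per game

# The majority is ahead both in numbers and in points:
print(green_avg < blue_avg)  # True
```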


This could translate to sex, race, religious affiliation…


The judicial system

I am a determinist; I don't think anyone is responsible for the crimes they commit. There is also little evidence that sticking someone in prison deters future crime (at least not at a benefit worth the cost of imprisonment). Tit for tat might be a counterargument to that.


Let's imagine the two players as (1) an individual and (2) society. Society has agreed to cooperate as laid out in the rule book (the law). If an individual defects, he commits a crime. What to do now?


I think it is fine for player 2 (society) to play a tit-for-tat strategy: society should start nice (give rights to everyone) and punish defections. The question is whether the societal player actually plays two, three, four, or five tits for one tat rather than tit for tat. People are imprisoned for years for possession of weed. That looks more like many tits for a small tat.


The longevity of the game is important

We saw that a single iteration of the prisoner's game leads to the dilemma of both players defecting. Moving to an iterated game can lead to cooperation. Axelrod gives a couple of examples: senators in US politics are very cooperative with each other when young, but defect when close to retirement. Similar logic applies to the relationship between businesses and customers. If a business is doing well and expects a long-standing relationship with its customers, both can thrive. If the business is close to bankruptcy, the relationship deteriorates and both defect: the business sells worse goods or doesn't deliver, and customers don't pay.


I hinted at this in my blog post "jewish diamond trade", where I argued that the diamond trade is structurally set up for defection, because a single interaction's defection can bring so much benefit to the meanie. It takes a system that highly rewards cooperators, which can be found in religious and orthodox Jewish circles, to trade diamonds.


Comments

Jaschi, Jan 14, 2023

In your judicial example, isn't society playing a probabilistic tit for tat, potentially with longer memory? And did these kinds of strategies also participate in Axelrod's competition?

Jaschi, Feb 17, 2023, replying

I think I meant a probabilistic tranquilizer: if a portion p of crimes is detected, then the judicial system plays a fixed number of tits for a tat, where a tat is counted with probability p (which may be increasing).