ALFRED P. SLOAN SCHOOL OF MANAGEMENT
Cooperation in Two-Person Games
with Repeated Partner Choice
Why Be Helpful Even If You Are Exploited
January, 1992 WP# 3377-BPS/92
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
50 MEMORIAL DRIVE
CAMBRIDGE, MASSACHUSETTS 02139
Massachusetts Institute of Technology
Alfred P. Sloan School of Management
50 Memorial Drive, E52-553
Cambridge, Massachusetts 02139
Cooperation in Two-Person Games with Repeated Partner Choice
Why Be Helpful Even If You Are Exploited *)
Frequently, actors face a repeated choice between several dyadic relationships
in which to interact. For example, firms may enter a series of joint ventures with
various partners, and employees may exchange information with a number of
colleagues from other firms. This paper introduces the notion of two-person games
with repeated partner choice to conceptualize such situations. Players repeatedly face
two decisions, (1) with whom to interact and (2) how to interact in a chosen dyad.
A computer simulation demonstrates that in such situations a strategy of
unconditional cooperation in a chosen dyad can perform as well as or even better than
retaliatory strategies like Tit-for-Tat, as long as the players tend to at least slightly
favor those relationships in which they gain higher payoffs.
I propose that many situations resemble more the characteristics of two-
person games with repeated partner choice than of a traditional two-person game
viewed in isolation from other relations. This might explain the puzzle of why
cooperation can frequently be observed even if defection creates an advantage for the
defecting player and the other players apparently do not retaliate. In such
situations, unconditional cooperation may not only be justified as morally
appropriate but also as "good business."
*) I gratefully acknowledge the research assistance of Ava Kuo, who skilfully developed the
simulation software as part of her Undergraduate Thesis Project at the Massachusetts Institute of
Technology. I thank Lael Brainard, Dietmar Harhoff and William Riggs for their insightful
Game theoretical frameworks are frequently applied to social interactions with
strategic components. The Prisoner's Dilemma framework, for example, is
employed to conceptualize such diverse issues as buyer-supplier relationships
(Jarillo, 1988; Johnston & Lawrence, 1988), informal information trading (von
Hippel, 1987; Schrader, 1990), marketing strategies (Hauser, 1987), inter-group
communication (Bornstein, Rapoport, Kerpel, & Katz, 1989), market fraud (Opp,
1986), trade policy (Yarbrough & Yarbrough, 1986), and trench warfare (Axelrod,
Retaliation for non-cooperative behavior and the threat of it constitute core
elements of most game-theoretical analyses (e.g., Axelrod, 1984; Axelrod, 1986;
Schelling, 1960; Yarbrough & Yarbrough, 1986). Bhide and Stevenson, however,
suggest that, at least in the business world, non-cooperative behavior seldom evokes
retaliation. "Trust breakers are not only unhindered by bad reputations, they are
usually spared retaliation by parties they injure" (Bhide & Stevenson, 1990, p. 125).
Despite the absence of retaliation, the authors observe that most businesspeople are
trustworthy, i.e. behave cooperatively. But "why be honest if honesty doesn't pay?"
(Bhide & Stevenson, 1990, p. 121). The authors argue that game theory fails to
explain how cooperation can persist in situations in which non-cooperative
behavior creates additional payoffs and in which retaliation is not used. They
interpret the dominance of cooperative behavior not with game-theoretic arguments
such as those provided, for example, by Stiglitz (1989) and Fudenberg & Tirole (1989), but as
the result of moral choices. "We keep promises because we believe it is right to do
so, not because it is good business" (Bhide & Stevenson, 1990, p. 128).
In this paper, I propose that the puzzle raised by Bhide and Stevenson can be
explained in game-theoretical terms if one conceptualizes the underlying
interactions as two-person games with repeated partner choice. The novel concept of
two-person games with repeated partner choice takes into account that players often
have the choice between several dyadic relationships in which to interact. Within
each chosen dyad a two-person game is played — frequently repeatedly if players
select each other anew.
In a two-person game with repeated partner choice, actors face two decisions:
They have to decide, first, with whom to interact and, second, how to interact in a
chosen dyad. The first decision takes into account that actors are part of a multi-
person environment and thus may switch interaction partners at any given point,
based on their preferences. The second decision relates to the interaction of a pair of
actors and can be viewed as a two-person game. Thus the designation two-person
games with repeated partner choice.
Situations that follow these characteristics can be observed frequently. Take
informal information trading as an example (von Hippel, 1987; Schrader, 1990;
Schrader, 1991a). Informal information trading refers to the exchange of valuable
information between employees working for different companies. Such exchanges
are common in many industries. Information trading takes place in the context of a
multi-person environment. Usually, several potential information trading partners
are available to a given employee. Consequently it is necessary to decide with whom
to enter information trading relationships. Employees clearly prefer to cooperate
with those colleagues with whom they had beneficial contacts in the past (Schrader,
1990). In addition to choosing trading partners, each employee has to decide on how
to behave in a given dyad. Interestingly enough — and supporting Bhide and
Stevenson's observation — retaliation within a dyad can be observed very
infrequently. In most cases, employees do not retaliate directly if a colleague acts
non-cooperatively in a situation in which cooperation would be expected. However,
they are now less inclined to interact with this colleague in the future (Schrader,
I propose that non-retaliatory, cooperative strategies can be as advantageous
as, or may even dominate, retaliatory strategies like Tit-for-Tat in two-person games
with repeated partner choice if players adjust their interaction preferences for other
players based on past payoffs. In this paper, I present results from a computer
simulation to support this claim.
The main outcome of the computer simulation of two-person Prisoner's
Dilemma games with repeated partner choice is surprisingly different from the results
of the traditional analysis of two-person games (e.g., Axelrod, 1984; Owen, 1982;
Rapoport, Guyer, & Gordon, 1976; Schelling, 1960). The strategy of unconditional
cooperation fares as well as or even better than a retaliatory strategy like Tit-for-Tat,
as long as players' preferences for specific interaction partners are at least partly
updated according to the payoffs received.^ For unconditional cooperation to be
effective, players do not need to take radical measures. It is already sufficient if they
are to a small degree less inclined to rechoose another player if interacting has
created a disadvantage, and slightly more inclined if it has proven to be beneficial.
These adjustments might be so small that at any given point they are not discernable
to an outsider.
In the next section, I present briefly the arguments that are put forward in
support of retaliatory strategies in two-person Prisoner's Dilemma games. I then
introduce the notion of two-person games with repeated partner choice. Next, I
present a computer simulation of several two-person games with repeated partner
^ The term "unconditional cooperation" refers to the interaction behavior within a chosen dyadic
relationship. The adjustment of interaction preferences might be interpreted as a form of
retaliation. I will show later, however, that even small adjustments that might not be
discernable for the partners suffice to reach the described results. Since a characteristic of effective
retaliation is supposedly its visibility to the other players, I will not use the term retaliation (or
reward) for a slight adjustment of interaction preferences based on past pay-offs.
choice. These simulations study the effectiveness of different interaction strategies
depending on how sensitive players' interaction preferences are to payoffs and
depending on the number of players involved. Finally, the results of this
simulation are put into context. I suggest that unconditional cooperation might
have some additional advantages over more sophisticated strategies like Tit-for-Tat.
Retaliation by Tit-for-Tat can be misinterpreted as a tendency to be uncooperative
and thereby create a negative image spillover.
The Argument for Retaliatory Strategies in Two-Person Games Without
In two-person games, players frequently face the possibility of exploiting their
partners, i.e. of gaining a benefit at the cost of the other.^ Repeated games provide the
partners with the opportunity to retaliate against such exploitation. A retaliatory strategy
punishes exploitation right after it occurs — even if it creates additional costs for the
exploited player (Schelling, 1960). Retaliation or sanctions and the threat of such
measures can serve as means to motivate or induce other players to perform in a
desirable way. (A threat is the ex-ante announcement of the willingness to retaliate
in order to deter undesired actions by the other player.)
A strategy that does not retaliate against exploitation will be exploited by non-
cooperative strategies in the long run (Axelrod, 1984). In the context of repeated
Prisoner's Dilemma games, Axelrod defines a strategy as retaliatory if it immediately
defects after a defection by the other player (Axelrod, 1984, p. 44). In a Prisoner's
Dilemma and related situations, retaliation serves two purposes. First, it prevents
future exploitation by the other player. Second, it might induce the other player to
return to a cooperative strategy, especially since the temptation benefit of
^ An extensive discussion and exhaustive list of two-by-two games can be found in Rapoport, Guyer,
and Gordon (1976).
exploitation can no longer be reaped (Luce & Raiffa, 1957, p. 101; Rapoport & Guyer,
To investigate how to play a Prisoner's Dilemma game effectively, Axelrod
organized a seminal computer tournament.^ Fourteen game theory experts were
invited to submit a computer program that embodied a decision rule on how to play
an interactive Prisoner's Dilemma game. Tit-for-Tat, submitted by Anatol Rapoport,
won the tournament, i.e. gained the highest overall payoff. Tit-for-Tat starts with a
cooperative choice and thereafter mimics the other player's previous action. In a
second round of the tournament, 63 programs were submitted. Again, Tit-for-Tat
turned out to be the overall most effective strategy.
The successful strategies identified by Axelrod's tournaments have at least
three characteristics in common. First, they are "nice", i.e. they start out
cooperatively. Second, they are retaliatory, i.e. they punish non-cooperative
behavior. And third, they are forgiving, i.e. after having punished non-cooperative
behavior, these strategies provide the possibility to return to a cooperative mode.
Donninger (1986) tested the generalizability of Axelrod's findings. In a similar
computer tournament, he varied both the payoff matrix and the composition of
strategies included. His results demonstrate that it is not always efficient to be nice.
Especially if the temptation payoff (i.e. the payoff a player receives if she defects while
the counterpart cooperates) is raised, non-cooperative strategies tend to perform
better. Retaliation (either direct or with some time lag), however, proved to be a
characteristic of all well-performing strategies.
In sum, retaliation or sanctioning undesirable behavior appears to be an
element of successful strategies in non-zero-sum repeated games (Hardin, 1982).
Retaliation is employed to prevent exploitation and to induce the other player to act
^ A detailed description of the tournament can be found in Axelrod (1984).
cooperatively. A strategy of unconditional cooperation, on the other hand, is
unstable (Taylor, 1976)1 and likely to be exploited by other strategies (Rapoport et al.,
1976). Although mutual cooperation can result from conditional, retaliatory
strategies, "it is never rational for a player to use the unconditionally Cooperative
strategy" (Taylor, 1976, p. 89).
Two-Person Games With Repeated Partner Choice
The traditional analysis of two-person games analyzes a specific relationship
independent of any other relationships the players might or could be engaged in.
However, many situations exist in which a specific dyadic relationship cannot be
viewed in isolation from other relationships. Individuals frequently face numerous
two-person relationships in which to potentially engage. Time and other constraints
often necessitate that only a few of all potential relationships are finally actuated.
For example, academicians frequently know many individuals with whom to write
joint papers. Unfortunately, the number of papers an individual can write at a given
time is limited. It is necessary to choose with whom to collaborate; and this choice
must be made repeatedly. A multitude of isomorphic situations exists, such as
cross licensing, joint development projects, or the exchange of birthday cards. In all
these situations, an actor can or must choose between several potential relationships.
Within each of the selected relationships, however, an interaction occurs that can be
interpreted as a two-person game. Thus, these situations clearly contain a multi-
person component. This component, however, is not addressed by the existing
research on multi-person games.
1 Taylor defines a stable equilibrium as a situation in which "no player can obtain a larger payoff by
using a different strategy while the other players continue to use the same strategies" (Taylor, 1976:
The analysis of multi-person games concentrates on different issues. Most
multi-person games conceptualize situations in which several actors interact
simultaneously and the action of one actor has direct consequences for the payoffs of
all the other actors. The provision of public or collective goods, for example, is a
central topic and issues such as the free rider problem and the emergence of
behavioral norms are discussed (e.g., Axelrod, 1986; Hardin, 1982; Olson, 1965;
Taylor, 1976; Ullmann-Margalit, 1977).
The problems investigated in this article are covered neither by the traditional
analysis of two-person games nor of multi-person games. The phenomenon of
interest is located conceptually between two-person games and multi-person games.
A player interacts with only one other player at a time. However, this specific dyadic
relationship occurs in the context of many potential relationships. Such a situation
is different from a two-person game in which a specific relationship is
conceptualized independent of any other relationship. It is also different from n-
person games that are characterized by each player simultaneously interacting with
all other players.
At least two types of decisions have to be made when two-person games are
played with repeated partner choice. The first decision concerns the formation of
dyadic relationships. These decisions determine the pairs of interacting players. The
second type of decision relates to the behavior in each dyadic relationship. The
players have to decide how to act in a given dyad. The structure of this decision is
similar to a simple two-person game — although the appropriate strategy might be
Axelrod (1984, pp. 158-168) discusses a situation that resembles two-person
games with repeated partner choice insofar as each player is involved in a number of
dyadic relationships at the same time. He investigates strategies for Prisoner's
Dilemma games in a simplistic multi-person environment. The environment is
structured so that each player has four neighbors, one to the north, one to the east,
one to the south, and one to the west. The game extends over several generations.
In each generation, a player interacts with all direct neighbors, i.e. plays four two-
person games. After all four interactions, the players attain a success score
measured by their average performance with their four neighbors. If a player faces a
more successful neighbor, the player converts to this neighbor's strategy. Axelrod
finds that nice, retaliatory, and forgiving rules tend to perform well in such an
environment. However, if future payoffs are discounted strongly, a community of
nice strategies can be invaded by non-cooperative strategies.
In Axelrod's game, the formation of dyadic relationships is predetermined by
the location of the players. A given player always interacts with the same four
players. The formation of dyadic relationships is static, i.e. unaffected by the
outcomes of the interactions and unchanged in the course of the game.
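The neighborhood model described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Axelrod's program: the function name `imitation_step`, the wrap-around at the grid edges, the rule that a player adopts the strategy of its most successful neighbor, the tie-breaking order, and the example scores are all hypothetical choices made for this sketch.

```python
def imitation_step(strategies, scores):
    """Return the next generation's strategy grid.

    strategies, scores: equally sized 2-D lists indexed [row][col].
    Assumptions: the grid wraps around at its edges, and ties between
    equally successful neighbors go to the first one found (N, E, S, W).
    """
    rows, cols = len(strategies), len(strategies[0])
    next_grid = [row[:] for row in strategies]
    for r in range(rows):
        for c in range(cols):
            best_score, best_strategy = scores[r][c], strategies[r][c]
            # Look at the four direct neighbors: north, east, south, west.
            for dr, dc in ((-1, 0), (0, 1), (1, 0), (0, -1)):
                nr, nc = (r + dr) % rows, (c + dc) % cols
                if scores[nr][nc] > best_score:
                    best_score = scores[nr][nc]
                    best_strategy = strategies[nr][nc]
            # Convert only if some neighbor was strictly more successful.
            next_grid[r][c] = best_strategy
    return next_grid


# Hypothetical example: one defector with a higher score amid Tit-for-Tat players.
grid = [["TFT", "TFT", "TFT"],
        ["TFT", "D", "TFT"],
        ["TFT", "TFT", "TFT"]]
avg_scores = [[3, 3, 3],
              [3, 4, 3],
              [3, 3, 3]]
print(imitation_step(grid, avg_scores))
```

In this example the four direct neighbors of the defector convert, while the corner players keep their strategy. The set of neighbors itself never changes, which is the static feature noted above.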
Many situations exist, however, in which the formation of dyadic
relationships depends on past payoffs. Consider informal information trading.
Persons tend to prefer those relationships that have proven to be beneficial
(Schrader, 1991b). They are less inclined to continue relationships that have been
disadvantageous. Frequently, it can be observed that individuals try to avoid
contact with those who have taken advantage of them. They might, for example,
not return a phone call. In this case, the two players do not interact at all. Yet, if a
contact cannot be avoided, the individual might be as helpful as ever. Bhide and
Stevenson (1990) have observed this tendency not to adjust one's interaction
strategy (i.e. the strategy used for interaction in a chosen dyad) and to remain
cooperative even if the partner is known to take advantage of this, as long as the
players interact at all.
The following sections present results of a computer simulation that takes into
account that players might update their preferences for specific interaction partners
based on past payoffs. It will be seen that a strategy of unconditional cooperation
might actually be more advantageous than it appears at first sight.
Simulation of Two-Person Games With Repeated Partner Choice
Computer simulation is used here to study the effectiveness of different
interaction strategies for two-person games with repeated partner choice. Computer
simulation is frequently used to investigate games with complex strategic
interactions (e.g., Axelrod, 1984; Axelrod, 1986; Behr, 1981; Donninger, 1986; Fader &
Hauser, 1987; Schüßler, 1986). Simulation makes it possible to probe into situations that do not
lend themselves to formal analysis, including many games with multiple players
using different, sometimes varying strategies with random elements.
The Structure of the Simulation Game
The computer simulation investigates a situation in which several players
have a recurrent choice of which dyadic relationship to enter. In each dyadic
relationship, the same two-person game is played. Once the players have interacted,
they choose anew an interaction partner from the set of possible partners for the next
round. Their preferences for specific relationships, however, are influenced by past
payoffs (Figure 1). In other words, if a relationship has proven to be beneficial for a
player, he is more inclined to choose this relationship again. On the other hand, if a
player was exploited by another player, he is less inclined to choose the same player.
Instead he will lean more towards choosing another interaction partner.
The Structure of the Game
In each dyadic relationship, the players face a Prisoner's Dilemma. In the
computer simulation, the payoffs remain constant.^ The classical Prisoner's
Dilemma payoff matrix (Ullmann-Margalit, 1977, p. 31) is used (Figure 2). This
matrix fulfills both conditions for a Prisoner's Dilemma as defined by Axelrod
(Axelrod, 1984: 10). The first condition refers to the ordering of payoffs. The best a
player can do is to defect while the other player cooperates, and the worst a player can
do is to cooperate while the other player defects. The reward for mutual cooperation
is higher than the payoff for mutual defection. The second condition states that an
even chance of exploitation and being exploited results in a worse outcome than
^ An interesting extension of the model would be to use dynamic pay-offs. As Granovetter's work
suggests, the expected benefit of each interaction within a specific relationship frequently decreases
over time (Granovetter, 1973). New, "weak" ties often provide greater benefits per interaction than
old, strong ties.
2 Rapoport and Guyer's definition of a Prisoner's Dilemma encompasses the first condition only
(Rapoport & Guyer, 1966: 211).
FIGURE 2. Prisoner's Dilemma Pay-Off Matrix
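The two conditions for a Prisoner's Dilemma can be written as a short check. This is a sketch only: the symbols T, R, P, and S follow Axelrod's (1984) notation, and the example values 5, 3, 1, 0 are Axelrod's tournament payoffs, used here as an illustrative assumption rather than as the entries of Figure 2.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Check both conditions for a Prisoner's Dilemma.

    T: temptation (defect while the other player cooperates)
    R: reward for mutual cooperation
    P: punishment for mutual defection
    S: sucker's payoff (cooperate while the other player defects)
    """
    # First condition: the ordering of the four payoffs.
    ordering = T > R > P > S
    # Second condition: taking turns at exploiting and being exploited
    # must pay less than mutual cooperation, i.e. R > (T + S) / 2.
    no_gain_from_alternating = 2 * R > T + S
    return ordering and no_gain_from_alternating


print(is_prisoners_dilemma(5, 3, 1, 0))  # True
print(is_prisoners_dilemma(7, 3, 1, 0))  # False: (7 + 0) / 2 exceeds 3
```

The second call shows why the second condition matters: the payoff ordering still holds, but the players could do better by alternately exploiting each other than by cooperating.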
Four strategies are included in the simulation. All of them are simple. They
are at the core of most other strategies suggested for playing a Prisoner's Dilemma.
The first strategy is Tit-for-Tat. Tit-for-Tat starts out cooperatively and thereafter
mimics the previous choice of the other player. (The previous choice of the other
player is the last choice that Tit-for-Tat's current interaction partner
made in the same dyadic relationship.) It is a conditional
strategy, i.e. Tit-for-Tat's choice depends on the other player's behavior (Taylor,
1976). The second strategy is Defect. Defect is an unconditional strategy.
Independent of the other player's behavior, defection is chosen. The third strategy,
Cooperate, is similar to Defect with the difference that cooperation is chosen all the
time. Random is the last strategy. It is an unconditional strategy as well and chooses
arbitrarily between cooperate and defect.
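The four strategies are simple enough to state as one-line decision rules. The following is a minimal sketch, not the simulation software itself: the function names, the move labels "C" and "D", the helper `play_dyad`, and the assumption that Random picks either move with equal probability are all introduced here for illustration.

```python
import random

C, D = "C", "D"  # cooperate, defect


def tit_for_tat(partner_history):
    """Cooperate first, then copy the partner's last move in this dyad."""
    return C if not partner_history else partner_history[-1]


def defect(partner_history):
    """Unconditional defection."""
    return D


def cooperate(partner_history):
    """Unconditional cooperation."""
    return C


def random_choice(partner_history, rng=random):
    """Unconditional; assumes both moves are equally likely."""
    return rng.choice([C, D])


def play_dyad(strategy_a, strategy_b, rounds):
    """Play repeated encounters within one dyad and return both move lists."""
    moves_a, moves_b = [], []
    for _ in range(rounds):
        a = strategy_a(moves_b)  # A sees only B's earlier moves in this dyad
        b = strategy_b(moves_a)
        moves_a.append(a)
        moves_b.append(b)
    return moves_a, moves_b


print(play_dyad(tit_for_tat, defect, 4))
# (['C', 'D', 'D', 'D'], ['D', 'D', 'D', 'D'])
```

Each rule receives only the partner's history within the same dyad, which matches the definition of "previous choice" given above. Against Defect, Tit-for-Tat is exploited once and then retaliates, whereas Cooperate would be exploited in every encounter that takes place.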
Choosing Interaction Partners
A player may choose any of the other players as desired interaction partner.
For the simulation it is assumed that an interaction occurs only if two players choose
each other simultaneously, i.e. Player A selects Player B and Player B picks Player A
in the same round. Without such a double coincidence no interaction occurs.
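The double-coincidence rule can be illustrated as follows. This is a sketch with hypothetical names (`matched_pairs`, the `choices` dictionary, and the player labels); it shows only the matching step for a single round.

```python
def matched_pairs(choices):
    """Return the dyads that actually interact in one round.

    choices maps each player to the partner that player selected.
    A dyad forms only if both players selected each other.
    """
    pairs = set()
    for player, partner in choices.items():
        if player != partner and choices.get(partner) == player:
            pairs.add(tuple(sorted((player, partner))))
    return sorted(pairs)


# A and B pick each other; C and D pick players who chose someone else.
choices = {"A": "B", "B": "A", "C": "A", "D": "C"}
print(matched_pairs(choices))  # [('A', 'B')]
```

In this example C and D do not interact at all in the round, because neither was chosen by the player it selected.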
Players choose interaction partners based on individual preferences. These
preferences are expressed in a probability matrix. The probability matrix reflects the
likelihood that one player chooses another player as the desired interaction partner.
These probabilities can be interpreted as a function of the expectations regarding the
usefulness of future interactions with a given partner. In the beginning of the
simulation, no prior biases towards specific interaction partners exist, i.e. the
preference matrix is uniform.^ In other words, each player has the same chance to be
picked by another player as the preferred interaction partner for that round.
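A uniform starting matrix and the resulting partner draw might look as follows. This is a sketch under assumptions: the function names are hypothetical, players are identified by index, and the diagonal is set to zero on the assumption that a player never chooses itself.

```python
import random


def uniform_preferences(n_players):
    """Row i holds player i's probabilities of choosing each other player."""
    p = 1.0 / (n_players - 1)
    # Assumption: zero on the diagonal, so no player selects itself.
    return [[0.0 if i == j else p for j in range(n_players)]
            for i in range(n_players)]


def choose_partners(preferences, rng=random):
    """Each player draws one desired partner according to its own row."""
    players = list(range(len(preferences)))
    return [rng.choices(players, weights=row)[0] for row in preferences]


prefs = uniform_preferences(4)
print(prefs[0])                # player 0 weights the three others equally
print(choose_partners(prefs))  # one desired partner per player; varies per run
```

Each row sums to one, so it can be read directly as the probability distribution over desired interaction partners for that round.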
Players update their preferences based on past payoffs. Once a player has
experienced that interacting with a specific partner is beneficial he will prefer to
interact more with this partner and less with the other players. Similarly, if a player
is exploited by another player he will be less inclined to choose that player again as
an interaction partner and will be more inclined to choose another player. This
assumption coincides with empirical research on the formation of exchange
networks, which suggests that ties are reinforced when they create benefits for the
network members (Schrader, 1991b). Thereby it is assumed that an expectation-