Replicator dynamics in public goods games with reward funds

Co-authored with Tatsuo Unemi (Soka Univ., Japan); preprint; published in 'Journal of Theoretical Biology', 2011

Tatsuya Sasaki (1, 2, *)
Email address: sasakit@iiasa.ac.at
Tatsuo Unemi (3)
Email address: unemi@iss.soka.ac.jp

1 Evolution and Ecology Program, International Institute for Applied Systems Analysis (IIASA), Schlossplatz 1, Laxenburg A-2361, Austria
2 Graduate School of Engineering, Soka University, Hachioji, Tokyo 192-8577, Japan
3 Department of Information Systems Science, Soka University, Hachioji, Tokyo 192-8577, Japan

* Corresponding author at: Evolution and Ecology Program, International Institute for Applied Systems Analysis (IIASA), Schlossplatz 1, Laxenburg A-2361, Austria. Tel: +43-2236-807 Fax: +43-2236-71313

For the submission in revised form, 23 July 2011. This has been published by the Journal of Theoretical Biology 287, 21 October 2011, Pages 109–114. Epub 3 August 2011.

Abstract

Which punishments or rewards are most effective at maintaining cooperation in public goods interactions and deterring defectors who are willing to freeload on others’ contributions? The sanction system is itself a public good and can cause problematic “second-order free riders” who do not contribute to the provision of the sanctions and thus may subvert the cooperation supported by sanctioning. Recent studies have shown that public goods games with punishment can lead to a coercion-based regime if participation in the game is optional. Here, we reveal that even with compulsory participation, rewards can maintain cooperation within an infinitely large population. We consider three strategies for players in a standard public goods game: to be a cooperator, to be a defector, or to be a rewarder who contributes to the public good and to a fund that rewards players who contribute during the game. Cooperators do not contribute to the reward fund and are therefore classified as second-order free riders.
The replicator dynamics for the three strategies exhibit a rock-scissors-paper cycle and can be analyzed fully, even though the expected payoffs are nonlinear. The model does not require repeated interaction, spatial structure, group selection, or reputation. We also discuss a simple method for second-order sanctions, which can lead to a globally stable state where 100% of the population are rewarders.

Keywords: evolutionary game theory; cooperation; sanction; second-order social dilemma; rock-scissors-paper cycle

1. Introduction

An enduring conundrum in the biological and social sciences is how cooperation can emerge and be maintained in a sizable group containing exploiters. This conundrum is the so-called social dilemma [1, 2]: groups of cooperators outperform groups of defectors, whereas in a mixed group defectors always outperform cooperators. This captures the common conflict between a social optimum and individual interests very well, and it has traditionally been modeled as the public goods game in many experimental and theoretical studies [3]. In the public goods game (PGG), cooperators confer benefits on others at some cost to themselves, whereas defectors exploit the benefits without making such a contribution. Defection is the selfish choice that decreases the total benefit to the group, but it is rational from the evolutionary viewpoint because it yields a higher individual payoff at no cost. Thus, natural selection will often drive the elimination of cooperation. Classical and evolutionary game studies have, however, identified supportive mechanisms under which cooperation is nonetheless sustained, such as repeated interactions [4, 5], reputation [6, 7], spatial structure [8, 9], and group selection [10, 11].
Punishment of defectors and rewards for cooperators are also major factors that maintain cooperation between self-interested individuals, as suggested by growing experimental and theoretical evidence [12–32]. However, sanctions are costly, and therefore pose the next conundrum: how costly sanctioning can subsist in the presence of those who freeload on others’ contributions to sanctions. This issue is the “second-order social dilemma” [12, 14], which has been particularly well addressed in the case of costly punishment. One possible solution is to punish second-order freeloaders as well [13, 15, 24, 32]. At the same time, there is the issue of how costly punishment can emerge [21, 33]. In a population of defectors, a rare punisher suffers enormous costs because of the need to continuously punish defectors. However, recent studies have shown that punishment-based cooperation can emerge if participation in the PGG is optional rather than compulsory [20, 21, 26, 32]. We note that optional participation is another way to maintain cooperation [33–39]; it can lead to “rock-scissors-paper”-type cyclic domination, well known in evolutionary game theory [40, 41], among cooperators, defectors, and loners who earn a small but fixed payoff instead of participating in the PGG [37–39].

Interestingly, Sigmund et al. [32] have found that when it comes to punishing second-order freeloaders, natural selection favors pool-punishment rather than peer-punishment. Peer-punishment, the most widely used form of punishment in PGGs, is a sanctioning technique in which players decide whether to impose fines on exploiters after the PGG. By contrast, in pool-punishment, players have to decide whether to contribute to a punishment fund before the PGG [14], analogous to forming a volunteer band of watchmen in advance.
While optional participation could be required for a population to evolve from a stalemate where everybody defects to a coercion-based regime, the problems associated with opting out of a public goods project, such as global environmental issues, remain [21]. When participation is compulsory, peer-rewarding can cause cyclical dynamics in infinite populations if reputation alone is important (for pair-wise interactions see Sigmund et al. [16], for interactions of arbitrary size see Hauert [30]). In contrast, reputation is given less weight in finite populations [29].

In this work, we explore the effects of pool-rewarding in compulsory PGGs with infinite populations. Similar to pool-punishment, players first decide whether to contribute to a reward fund. After a one-shot PGG among all group members, the common fund is divided equally among those players who contributed to the PGG, irrespective of their contribution to the fund. While real-world examples of reward funds are too numerous to list, we shall consider a generous voluntary fund, which may be threatened with collapse by second-order freeloaders. We propose a minimalistic model for infinite populations that does not require repeated interactions, reputation, spatial structure, group selection, or optional participation. We also compare two types of benefit-sharing models, which differ on whether or not a contributor in the PGG may benefit from their own contribution, thus corresponding to “weak altruism” and “strong altruism” [42, 43]. The evolution of cooperation is investigated by means of the replicator dynamics [40, 41].

2. The game-theoretical model

Consider an infinitely large, well-mixed population of constant size. From time to time, a group of $N$ players is randomly formed from the population (where $N \ge 2$). The PGG is of a one-shot version. Each player is asked to contribute $c_1 > 0$ to the public good. The contributions are then distributed in the following different ways: in the case of weak altruism (WA), the contribution, $c_1$, will be multiplied by $r_1 > 1$ and then equally shared among all $N$ players in the group, but in the case of strong altruism (SA), it will be shared among the $N - 1$ other co-players only. In both cases, if all group members contribute, they obtain a payoff of $(r_1 - 1)c_1 > 0$. The PGG with SA is a social dilemma for any rate of $r_1$, and the PGG with WA, also for $r_1 < N$. Indeed, in each case, a player that does not contribute to the public good can get an improved payoff by $c_1$ with SA, and by $c_1(1 - r_1/N) > 0$ with WA, no matter what the other players do. For the PGG with WA, we assume $r_1 < N$, as the social dilemma would otherwise be completely relaxed due to the benefits gained by switching to a contributor.

Next, we introduce the following pool-rewarding mechanism. Before participating in the PGG, each player is first asked to contribute $c_2 > 0$ to a fund to reward cooperative behaviors in the PGG. The integrated contribution to the reward fund is multiplied by $r_2 > 1$, and after the PGG distributed equally to those who have contributed to the public good, if any. We consider the following three strategies: rewarders (R) who contribute both to the PGG and to the reward fund, cooperators (C) who contribute to the PGG but not to the reward fund, and defectors (D) who contribute neither to the PGG nor to the reward fund. If all $S$ contributors in the PGG are R-players, they each obtain a net reward of $(r_2 - 1)c_2 > 0$, and if all of them are C-players, they obtain nothing. The rewarding system is a second-order social dilemma for $r_2 < S$ because withdrawing one’s contribution to the reward fund can increase individual payoff by $c_2(1 - r_2/S) > 0$. We note that pool-rewarding itself is another case of weak altruism: an R-player is allowed to obtain a return from contributing to the reward fund. We do not eliminate a return for individuals who choose to contribute to rewards. R-players would be more likely to evolve with it than without it. In the latter case D-players dominate (see Appendix A.1 for details). Nevertheless, it is not clear whether or not such a weakly altruistic reward system can subsist in the presence of second-order freeloaders. Indeed, the funding stage is set up before the PGG and thus R-players cannot avoid the risk of being exploited by C-players.

We denote the expected payoff values for R-, C-, and D-players with $P_R$, $P_C$, and $P_D$, respectively. The frequencies of the three strategies are expressed as $x$, $y$, and $z$ ($x + y + z = 1$). The average payoff for the population is given by $\bar{P} = xP_R + yP_C + zP_D$. The strategy’s expected payoff is supposed to be the sum of the payoff from the PGG and from the reward fund. The replicator equations are written as

\[ \dot{x} = x(P_R - \bar{P}), \qquad \dot{y} = y(P_C - \bar{P}), \qquad \dot{z} = z(P_D - \bar{P}). \tag{1} \]

We first calculate the expected payoffs from the PGG. In the case of WA, a D-player in a group with $S$ contributors obtains a benefit of $r_1 c_1 S/N$ ($0 \le S \le N - 1$). Hence, the expected payoff is given by

\[ P_D^{1} = \sum_{S=0}^{N-1} \frac{r_1 c_1 S}{N} \binom{N-1}{S} (1-z)^{S} z^{N-S-1} = r_1 c_1 \left(1 - \frac{1}{N}\right)(1 - z), \tag{2a} \]

where $\binom{N-1}{S}(1-z)^{S} z^{N-S-1}$ is the probability that $S$ of the $N - 1$ co-players in the PGG are contributors. In the case of SA, a D-player in the group obtains a benefit of $r_1 c_1 S/(N - 1)$, and calculating the expected payoff as in Eq. (2a),

\[ P_D^{1} = r_1 c_1 (1 - z). \tag{2b} \]

Both the expected payoffs for R- and C-players (denoted by $P_R^{1}$, resp. $P_C^{1}$) are reduced from $P_D^{1}$ by the cost for a contributor, $\sigma$: $\sigma = c_1(1 - r_1/N)$ in the case of WA and $\sigma = c_1$ in the case of SA.

Regarding the reward system, the expected payoff for D-players is $P_D^{2} = 0$. A C-player in a group with $S$ contributors and $n_R$ R-players (and thus $S - n_R$ C-players) receives a reward of $r_2 c_2 n_R/S$ ($0 \le n_R \le S - 1$). Hence, the expected reward for a C-player in a group with $S$ contributors is

\[ P_C^{2}(S) = \sum_{n_R=0}^{S-1} \frac{r_2 c_2 n_R}{S} \binom{S-1}{n_R} \left(\frac{x}{1-z}\right)^{n_R} \left(\frac{y}{1-z}\right)^{S-n_R-1} = r_2 c_2 \left(1 - \frac{1}{S}\right) \frac{x}{1-z}, \tag{3} \]

where $\binom{S-1}{n_R} \left(\frac{x}{1-z}\right)^{n_R} \left(\frac{y}{1-z}\right)^{S-n_R-1}$ is the probability that $n_R$ of the other $S - 1$ contributors are R-players. Consequently, the expected reward for a C-player is

\[ P_C^{2} = \sum_{S=1}^{N} \binom{N-1}{S-1} (1-z)^{S-1} z^{N-S} P_C^{2}(S) = r_2 c_2 \left(1 - \frac{1 - z^{N}}{N(1-z)}\right) \frac{x}{1-z}. \tag{4} \]

Among $S$ contributors, switching from R to C yields $c_2(1 - r_2/S)$. Thus, the expected net reward for an R-player, $P_R^{2}$, is reduced from $P_C^{2}$ by

\[ c_2 \sum_{S=1}^{N} \binom{N-1}{S-1} (1-z)^{S-1} z^{N-S} \left(1 - \frac{r_2}{S}\right) = c_2 \left(1 - \frac{r_2}{N} \, \frac{1 - z^{N}}{1 - z}\right) =: F(z). \tag{5} \]

$F(z)$ has a unique root $\hat{z}$ in the open interval $(0,1)$ if, and only if, $1 < r_2 < N$, because $F(z)$ is monotonic, $F(0) = c_2(1 - r_2/N) > 0$, and $F(1) = c_2(1 - r_2) < 0$. Therefore, the advantage C-players have over R-players will change from positive to negative as $z$ increases across $\hat{z}$.

Integrating the above results, we can determine that $P_R = P_R^{1} + P_R^{2}$, $P_C = P_C^{1} + P_C^{2}$, and $P_D = P_D^{1} + P_D^{2}$, and obtain a simple expression for the average payoff for the population,

\[ \bar{P} = c_1 (r_1 - 1)(1 - z) + c_2 (r_2 - 1) x, \tag{6} \]

both for the WA and SA cases.

3. Dynamics

The evolutionary dynamics of the three strategies take place in the state space $S_3 = \{(x, y, z): x, y, z \ge 0,\ x + y + z = 1\}$. The three homogeneous states in which 100% of the population are R-players ($x = 1$), C-players ($y = 1$), and D-players ($z = 1$) correspond to the three vertices of the simplex $S_3$ (which we denote by R, C, and D, respectively). These are obviously fixed points for the replicator system Eq. (1). There are no other fixed points on the boundary of $S_3$ for non-degenerate cases. Indeed, on the edge C-D, $x = 0$ and $\dot{z} = (P_D - P_C) z (1 - z) = \sigma z (1 - z) > 0$, where $\sigma = c_1(1 - r_1/N)$ in the case of WA and $\sigma = c_1$ in the case of SA. Thus, the evolution on the edge C-D is unidirectional from C to D. On the edge R-C, $z = 0$, and on the edge D-R, $y = 0$, resulting in $\dot{y} = (P_C - P_R) y (1 - y) = c_2 (1 - r_2/N) y (1 - y)$ and $\dot{x} = (P_R - P_D) x (1 - x) = [c_2 (r_2 - 1) - \sigma] x (1 - x)$, respectively. The evolution on both edges is unidirectional and its direction depends on the relative magnitudes of $r_2$ and $N$, and of $c_2 (r_2 - 1)$ and $\sigma$, respectively.

To analyze the dynamics in the interior of $S_3$, let us introduce a new variable $f = x/(1 - z)$, which represents the fraction of contributors in the PGG that are also rewarders. This yields

\[ \dot{f} = -\frac{xy}{(1-z)^{2}} (P_C - P_R) = -f(1 - f) F(z). \tag{7} \]

Substituting $x = f(1 - z)$ and Eq. (6) into $\dot{z} = z(P_D - \bar{P})$ yields

\[ \dot{z} = -z(1 - z)\left[c_2 (r_2 - 1) f - \sigma\right]. \tag{8} \]

3.1. The global attractor D

Supposing $c_2 (r_2 - 1) - \sigma < 0$, then the direction of evolution on the edge D-R is from R to D. Eq. (8) yields $\dot{z} > 0$ in the interior of $S_3$. Thus, there is no interior fixed point and all interior orbits converge to the vertex D, which is a global attractor (Fig. 1a). If $r_2 < N$, the direction of evolution on the edge R-C is from R to C; if $r_2 > N$, it is from C to R; and when $r_2 = N$, the edge R-C consists of unstable fixed points. We note that if $r_2 < 1$, then $c_2 (r_2 - 1) - \sigma < 0$ holds. In the boundary case that $c_2 (r_2 - 1) - \sigma = 0$, $\dot{z} = 0$ holds when $f = 1$ and thus, the edge D-R is a line of fixed points. If $r_2 < N$, the edge is separated into an unstable segment ($0 \le z < \hat{z}$) and a stable one ($\hat{z} < z \le 1$). Since $\dot{z} > 0$ holds in the interior of $S_3$, all interior orbits converge to the stable segment (Fig. 1b). If $r_2 \ge N$, then the edge D-R has no unstable segment. Random drift and occasional invasion of the missing C-player will eventually send the state within the stable segment to the vertex D, in the long run.

3.2. The global attractor R

Supposing $c_2 (r_2 - 1) - \sigma > 0$ and $r_2 > N$, then the direction of evolution on the edge D-R is from D to R, and from C to R on the edge R-C. The fact that $F(z) < 0$ in the open interval $(0,1)$ yields $\dot{x} > 0$ in the interior of $S_3$. Thus, there is no interior fixed point and all interior orbits converge to the vertex R, which is a global attractor (Fig. 2). If $r_2 = N$, then the edge R-C is a line of fixed points, which consists of an unstable segment ($0 \le x < x_{RC}$) and a stable one ($x_{RC} < x \le 1$), where $x_{RC}$ is given by $\sigma/[c_2 (r_2 - 1)]$ as a non-trivial solution of Eq. (8). The fact that all interior states satisfy $\dot{x} > 0$ leads the population to evolve towards the stable segment. Thus, random drift and occasional invasion of the missing D-player will eventually bring the population to the vertex R, in the long run.

3.3. The mixture equilibrium of the three strategies

Supposing that $c_2 (r_2 - 1) - \sigma > 0$ and $1 < r_2 < N$, the direction of evolution on the edge D-R is from D to R, and from R to C on the edge R-C. Thus, the three edges of $S_3$ form a heteroclinic cycle of a rock-scissors-paper type. We now have a unique interior root $\hat{z}$ of $F(z)$ and $0 < \hat{f} := \sigma/[c_2 (r_2 - 1)] < 1$. From Eqs. (7) and (8), we see that there is a unique interior fixed point $Q = (\hat{x}, \hat{y}, \hat{z})$, with

\[ \hat{x} = \hat{f}(1 - \hat{z}), \qquad \hat{y} = (1 - \hat{f})(1 - \hat{z}). \tag{9} \]

The mixture equilibrium, Q, is a center, i.e., it is neutrally stable and surrounded by closed orbits that fill the interior of $S_3$ (Fig. 3). This results because Eqs. (7) and (8) can be expressed in the form of a Hamiltonian system, $H$, and now $H$ has a strict maximum at the unique fixed point $(\hat{f}, \hat{z})$ corresponding to Q (see Appendix A.2 and [38] for details).

Given $c_1$, $r_1$, and $N$, which are all original parameters for the PGG, the location of Q can be determined by the remaining parameters, $c_2$ and $r_2$. According to Eq. (9), Q lies on the line $y = (1/\hat{f} - 1)x$, independent of the group size, $N$. As $N$ increases, Q moves toward the vertex D along this line, and $\hat{z} \to 1$ as $N \to \infty$. On the other hand, as $N$ decreases, Q moves in the opposite direction and $\hat{z}$ decreases to $2/r_2 - 1 > 0$, which occurs when $N = 2$. In other extreme cases, where $r_2 = 1$, $r_2 = N$, $c_2 (r_2 - 1) = \sigma$, and $c_2 = \infty$, Q arrives at the vertex D, the edges R-C, D-R, and C-D, respectively.

4. Discussion

Conflict between contributors and freeloaders in public goods interactions is inevitable. How can we avoid conflict between contributors and freeloaders? An effective solution is to set up a reward fund for cooperative behaviors. The key condition for the reward system to maintain cooperation with free riders in public goods games (PGGs) is given by

\[ c_2 (r_2 - 1) > \sigma, \tag{10} \]

where $\sigma = c_1(1 - r_1/N)$ in the case of weak altruism and $\sigma = c_1$ in the case of strong altruism. Eq. (10) means that the optimum group reward, i.e., the net reward $(r_2 - 1)c_2$ obtained when all contributors are rewarders, should exceed the cost for a contributor in the PGG, which is relaxed by a self-returning benefit of $r_1 c_1/N$ in the case of weak altruism. In infinite populations, it has been determined that peer-rewarding is a potent motivator, but only if reputation is important [16, 30]. However, in pool-rewarding, this is not the case. With such attractive rewards, cooperative investments in both the PGG and the reward fund can subsist, even when second-order freeloaders can dominate the rewarding system, i.e., for $r_2 < N$. In this case, the replicator dynamics exhibit a rock-scissors-paper cycle among the three strategies: defectors who never contribute (first-order freeloaders), cooperators who contribute only in the PGG (second-order freeloaders), and rewarders who contribute to both. The cyclical evolutionary scenario can be described as follows. If most players are rewarders, the reward system is actually a second-order social dilemma and thus cooperators spread. If cooperators are prevalent, it is better to become a defector due to the social dilemma. If most players are defectors, the number of beneficiaries of the reward is usually small enough to subvert cooperator dominance over rewarders, and thus the number of rewarders increases.
If the number of rewarders increases sufficiently, then the second-order dilemma returns. In this scenario, traditional defectors play a pivotal role in maintaining the cyclic domination among the three strategies. The moderate advantage defectors have over cooperators, given by $\sigma$, prevents the second-order dilemma from eliminating rewarders and then ensures that rewarders, not cooperators, dominate.

Global environmental and energy issues often appear to be compulsory public goods projects in which, in the short term, cooperation yields only very little benefit and the social optimum is not to cooperate. Such a situation is not a social dilemma, and has thus remained outside the scope of studies on the evolution of cooperation in large groups. In our model, this may correspond to the case where $0 \le r_1 < 1$. We remark that the results shown hold even when $0 \le r_1 < 1$, and thus pool-rewarding is applicable to a broader range of public goods interactions.

We note that in the extreme case where $r_1 = 0$, our model is significantly similar to an earlier public goods game with optional participation [37, 38, 39]. Indeed, the PGG degenerates into a game in which there is no longer any benefit from the contribution $c_1$. Each player therefore seems to have the option to avoid the participation fee of $c_1$, instead of taking part in another PGG with a cost of $c_2$ and a multiplier of $r_2$. This is just an implementation of the inverse form of the loner’s option.

A fascinating extension of this work is to consider second-order sanctions [13, 15, 24, 32]. Indeed, in our model, it looks practical for the rewarding system to mete out punishment on cooperators (second-order freeloaders) by reducing their rewards [12]. Let us see how, for instance, reducing rewards to cooperators by $a$% changes the dynamics. According to preliminary numerical simulations, the existing interior fixed point Q is destabilized (Fig. 4), and for discount rates $a$ higher than a threshold value, the population can converge to a state of 100% rewarders, irrespective of the initial conditions (Fig. 4b). As $a$ increases across the threshold, a new mixture equilibrium P, of cooperators and rewarders, enters the state space $S_3$ and is unstable within the rewarder-cooperator boundary (see Appendix A.3 for details). If defectors (first-order freeloaders) are absent, the population cannot avoid the resulting coordination problem: depending on the initial condition, the population evolves to become either 100% rewarders or 100% cooperators. Otherwise, interestingly, the population can make an end run around the bistability and establish the social optimum. It would be an intriguing issue for future research to analyze theoretically the result that reward-based cooperation necessarily becomes globally stable whenever it cannot be invaded by second-order freeloaders. By contrast, in the case of pool-punishment, punishment-based cooperation can never become globally stable, even if second-order sanctions are assumed, because a state of 100% first-order freeloaders remains stable [44].

One important issue we left out is the effects of economies and diseconomies of scale on the provision of sanctions. So far we have focused on linear cost-benefit functions for rewarding, whereby any group of rewarders generates the same per capita group benefit. According to Mathew and Boyd [33], the existing interior fixed point of the optional public goods game becomes an attractor for decreasing returns and a repeller for increasing returns. In practice, the rich dynamics afforded by scale would provide many options for the proper design of sanctioning systems to support the evolution of cooperation.

Appendix

A.1. The strongly altruistic rewarding

We here turn to a strongly altruistic variant of pool-rewarding, in which the rewards resulting from an R-player will be shared among other contributors only.
We assume that if there exists no other contributor, the investment to the incentive from a single R-player will be exactly refunded to her. The expected reward for a C-player turns into

\[ P_C^{2} = r_2 c_2 \frac{x}{1-z} \left(1 - z^{N-1}\right), \]

and that for an R-player is reduced from $P_C^{2}$ by the expected incentive cost $c_2 (1 - z^{N-1})$. Eqs. (7) and (8) turn into

\[ \dot{f} = -c_2 f (1 - f)\left(1 - z^{N-1}\right), \]
\[ \dot{z} = -z(1 - z)\left[c_2 (r_2 - 1) f \left(1 - z^{N-1}\right) - \sigma\right]. \]

Since $\dot{f}$ is negative in the interior of the state space $S_3$, int $S_3$, there is no interior fixed point. If $c_2 (r_2 - 1) - \sigma \le 0$, then $\dot{z} > 0$ holds in int $S_3$, and thus, all interior orbits converge to the vertex D.

If $c_2 (r_2 - 1) - \sigma > 0$, the system has a new equilibrium at

\[ (f, z) = \left(1, \left(1 - \frac{\sigma}{c_2 (r_2 - 1)}\right)^{\frac{1}{N-1}}\right) \]

on the edge D-R, which is a source. The vertex D is a sink, while the vertex R still remains a saddle. We consider the $z$-isocline, that is, the set where $\dot{z} = 0$: in int $S_3$, this is the set where

\[ f = \frac{\sigma}{c_2 (r_2 - 1)\left(1 - z^{N-1}\right)}. \]

The interior component forms a curve that connects the new fixed point and the point $(f, z) = \left(\frac{\sigma}{c_2 (r_2 - 1)}, 0\right)$ on the edge R-C, and divides int $S_3$ into two regions: one region where $\dot{z} < 0$ and the other where $\dot{z} > 0$. The latter includes the vicinity of the edge C-D given by $x = 0$. Since $\dot{f} < 0$ holds in int $S_3$, any interior orbit that starts in a state with $\dot{z} < 0$ has to travel to the region where $\dot{z} > 0$. Hence, all interior orbits converge to the vertex D.

A.2. The Hamiltonian System

Divide the right-hand side of Eqs. (7) and (8) by the function $f(1 - f)z(1 - z)$, which is positive for any $(f, z)$ in the interior of the unit square $[0,1]^2$. Hence,

\[ \dot{f} = \frac{-F(z)}{z(1 - z)} =: -g(z), \qquad \dot{z} = \frac{\sigma - c_2 (r_2 - 1) f}{f(1 - f)} =: l(f). \]

This transformation corresponds to a change in velocity and does not affect the orbits. Let us introduce $H(f, z) := G(z) + L(f)$, where $G(z)$ and $L(f)$ are primitives of $g(z)$ and $l(f)$, respectively:

\[ G(z) = c_2 \left(1 - \frac{r_2}{N}\right) \log z + c_2 (r_2 - 1) \log(1 - z) + R(z), \]
\[ L(f) = \sigma \log f + \left[c_2 (r_2 - 1) - \sigma\right] \log(1 - f), \]

with $R(z)$ bounded on $[0,1]$. Thus, we obtain the Hamiltonian system

\[ \dot{f} = -\frac{\partial H}{\partial z}, \qquad \dot{z} = \frac{\partial H}{\partial f}. \]

Because the system is conservative and the Hamiltonian attains a strict global maximum at $(\hat{f}, \hat{z})$ if $c_2 (r_2 - 1) - \sigma > 0$ and $1 < r_2 < N$, the interior equilibrium Q is a stable point surrounded by closed orbits. Indeed, all interior orbits are closed: $G(z) \to -\infty$ as $z \to 0, 1$ if $1 < r_2 < N$, and $L(f) \to -\infty$ as $f \to 0, 1$ if $0 < \sigma < c_2 (r_2 - 1)$. Hence, $H \to -\infty$ uniformly near the boundary of $[0,1]^2$ and thus all constant level sets of $H$ are closed curves around $(\hat{f}, \hat{z})$. The solutions have to remain on the constant level sets and thus return to their starting points.

A.3. The second-order sanctioning

We examine an extended model in which rewards for cooperators (second-order freeloaders) will be reduced by $100\alpha$% ($0 \le \alpha \le 1$), under the assumptions that $c_2 (r_2 - 1) > \sigma$ and $1 < r_2 < N$. In the extension, the expected reward for a cooperator is given by

\[ P_C^{2} = (1 - \alpha) r_2 c_2 \left(1 - \frac{1 - z^{N}}{N(1 - z)}\right) \frac{x}{1 - z}, \]

and Eqs. (7) and (8) turn into

\[ \dot{f} = -f(1 - f)\left[F(z) - \alpha\left(c_2 (r_2 - 1) + F(z)\right) f\right], \]
\[ \dot{z} = -z(1 - z)\left[c_2 (r_2 - 1) f - \sigma - \alpha\left(c_2 (r_2 - 1) + F(z)\right) f (1 - f)\right]. \]

In the interior of $S_3$, there exists at most one fixed point $Q = (f_Q, z_Q)$ such that

\[ f_Q = \frac{\sigma}{\alpha\sigma + (1 - \alpha) c_2 (r_2 - 1)} \quad \text{and} \quad F(z_Q) = c_2 (r_2 - 1) \frac{\alpha f_Q}{1 - \alpha f_Q}. \]

The fact that $F(z)$ is monotonically decreasing and $F(z_Q) \ge 0$ yields that $0 < z_Q \le \hat{z}$, where $\hat{z}$ is the unique solution of $F(z) = 0$. $f_Q$ increases and $z_Q$ decreases with increasing $\alpha$. This implies that as $\alpha$ increases, Q moves towards the edge R-C. As $\alpha$ crosses a threshold $\alpha_P$ given by

\[ \alpha_P = \frac{F(0)}{c_2 (r_2 - 1) + F(0)}, \]

a new equilibrium with

\[ (f, z) = \left(\frac{N - r_2}{\alpha r_2 (N - 1)}, 0\right) \]

enters the edge R-C through the vertex R, which then turns into a sink. The boundary equilibrium, P, is a saddle point, unstable within the edge and stable to invasion of defectors. As $\alpha$ further increases, P moves towards the vertex C, and when $\alpha$ crosses another threshold $\alpha_Q$ given by

\[ \alpha_Q = \frac{F(0)}{\sigma + F(0)}, \]

Q exits $S_3$ through P, which then turns into a source. For larger values of $\alpha$, $S_3$ has no interior equilibrium but P still remains within the edge. Preliminary numerical simulations imply that Q is a source for $\alpha > 0$, and that all interior orbits converge, if $0 < \alpha < \alpha_P$, to a heteroclinic cycle on the boundary of $S_3$, and if $\alpha_P < \alpha \le 1$, to the vertex R.

References

[1] R.M. Dawes, Social dilemmas, Annu. Rev. Psychol. 31 (1980) 169–193.
[2] P. Kollock, Social dilemmas: the anatomy of cooperation, Annu. Rev. Sociol. 24 (1998) 183–214.
[3] J.O. Ledyard, Public goods: a survey of experimental research, in: J.H. Kagel, A.E. Roth (Eds.), The Handbook of Experimental Economics, Princeton University Press, Princeton, NJ, 1995, pp. 111–194.
[4] R.L. Trivers, The evolution of reciprocal altruism, Q. Rev. Biol. 46 (1971) 35–57.
[5] R. Axelrod, W.D. Hamilton, The evolution of cooperation, Science 211 (1981) 1390–1396.
[6] M.A. Nowak, K. Sigmund, Evolution of indirect reciprocity by image scoring, Nature 393 (1998) 573–577.
[7] M. Milinski, D. Semmann, H.-J. Krambeck, Reputation helps to solve the 'tragedy of the commons', Nature 415 (2002) 424–426.
[8] M.A. Nowak, R.M. May, Evolutionary games and spatial chaos, Nature 359 (1992) 826–829.
[9] T. Killingback, M. Doebeli, N. Knowlton, Variable investment, the Continuous Prisoner's Dilemma, and the origin of cooperation, Proc. R. Soc. B. 266 (1999) 1723–1728.
[10] D.S. Wilson, E. Sober, Reintroducing group selection to the human behavioral sciences, Behav. Brain Sci. 17 (1994) 585–654.
[11] A. Traulsen, M.A. Nowak, Evolution of cooperation by multilevel selection, Proc. Natl. Acad. Sci. U.S.A. 103 (2006) 10952–10955.
[12] P. Oliver, Rewards and punishments as selective incentives for collective action: theoretical investigations, Am. J. Sociol. 85 (1980) 1356–1375.
[13] R. Axelrod, An evolutionary approach to norms, Am. Polit. Sci. Rev. 80 (1986) 1095–1111.
[14] T. Yamagishi, The provision of a sanctioning system as a public good, J. Pers. Soc. Psychol. 51 (1986) 110–116.
[15] R. Boyd, P.J. Richerson, Punishment allows the evolution of cooperation (or anything else) in sizable groups, Ethol. Sociobiol. 13 (1992) 171–195.
[16] K. Sigmund, C. Hauert, M.A. Nowak, Reward and punishment, Proc. Natl. Acad. Sci. U.S.A. 98 (2001) 10757–10762.
[17] E. Fehr, S. Gächter, Altruistic punishment in humans, Nature 415 (2002) 137–140.
[18] J. Andreoni, W.T. Harbaugh, L. Vesterlund, The carrot or the stick: rewards, punishments, and cooperation, Am. Econ. Rev. 93 (2003) 893–902.
[19] A. Gardner, S.A. West, Cooperation and punishment, especially in humans, Am. Nat. 164 (2004) 753–764.
[20] J. Fowler, Altruistic punishment and the origin of cooperation, Proc. Natl. Acad. Sci. U.S.A. 102 (2005) 7047–7049.
[21] C. Hauert, A. Traulsen, H. Brandt, M.A. Nowak, K. Sigmund, Via freedom to coercion: the emergence of costly punishment, Science 316 (2007) 1905–1907.
[22] M. Sefton, R. Shupp, J. Walker, The effects of rewards and sanctions in provision of public goods, Econ. Inquiry 45 (2007) 671–690.
[23] K. Sigmund, Punish or Perish? Retaliation and collaboration among humans, Trends Ecol. Evol. 22 (2007) 593–600.
[24] T. Kiyonari, P. Barclay, Cooperation in the social dilemma: free riding may be thwarted by second-order reward rather than by punishment, J. Pers. Soc. Psychol. 95 (2008) 826–842.
[25] M. Shinada, T. Yamagishi, Bringing back Leviathan into social dilemmas, in: A. Biel, D. Eek, T. Gärling (Eds.), New Issues and Paradigms in Research on Social Dilemmas, Springer-Verlag, Berlin, Germany, 2008, pp. 93–123.
[26] H. De Silva, C. Hauert, A. Traulsen, K. Sigmund, Freedom, enforcement, and the social dilemma of strong altruism, J. Evol. Econ. 20 (2009) 203–217.
[27] M. Nakamaru, U. Dieckmann, Runaway selection for cooperation and strict-and-severe punishment, J. Theor. Biol. 257 (2009) 1–8.
[28] R. Boyd, H. Gintis, S. Bowles, Coordinated punishment of defectors sustains cooperation and can proliferate when rare, Science 328 (2010) 617–620.
[29] P.A.I. Forsyth, C. Hauert, Public goods games with reward in finite populations, J. Math. Biol. (2010) Published Online First: 24 September 2010. doi: 10.1007/s00285-010-0363-7
[30] C. Hauert, Replicator dynamics of reward & reputation in public goods games, J. Theor. Biol. 267 (2010) 22–28.
[31] C. Hilbe, K. Sigmund, Incentives and opportunism: from the carrot to the stick, Proc. R. Soc. B. 277 (2010) 2427–2433.
[32] K. Sigmund, H. De Silva, C. Hauert, A. Traulsen, Social learning promotes institutions for governing the commons, Nature 466 (2010) 861–863.
[33] S. Mathew, R. Boyd, When does optional participation allow the evolution of cooperation?, Proc. R. Soc. B. 276 (2009) 1167–1174.
[34] J.H. Orbell, R.M. Dawes, Social welfare, cooperator’s advantage, and the option of not playing the game, Am. Sociol. Rev. 58 (1993) 787–800.
[35] C.A. Aktipis, When to walk away and when to stay: cooperation evolves when agents can leave unproductive partners and groups, J. Theor. Biol. 231 (2004) 249–260.
[36] T. Sasaki, T. Unemi, Probabilistic participation in public goods games, Proc. R. Soc. B. 274 (2007) 2639–2642.
[37] C. Hauert, S. De Monte, J. Hofbauer, K. Sigmund, Volunteering as Red Queen mechanism for cooperation in public goods games, Science 296 (2002) 1129–1132.
[38] C. Hauert, S. De Monte, J. Hofbauer, K. Sigmund, Replicator dynamics for optional public goods games, J. Theor. Biol. 218 (2002) 187–194.
[39] D. Semmann, H.-J. Krambeck, M. Milinski, Volunteering leads to rock-paper-scissors dynamics in a public goods game, Nature 425 (2003) 390–393.
[40] J. Hofbauer, K. Sigmund, Evolutionary Games and Population Dynamics, Cambridge Univ. Press, Cambridge, 1998.
[41] K. Sigmund, M.A. Nowak, Evolutionary dynamics of biological games, Science 303 (2004) 793–798.
[42] J.A. Fletcher, M. Doebeli, A simple and general explanation for the evolution of altruism, Proc. R. Soc. B. 276 (2009) 13–19.
[43] J.A. Fletcher, M. Zwick, The evolution of altruism: game theory in multilevel selection and inclusive fitness, J. Theor. Biol. 245 (2007) 26–36.
[44] K. Sigmund, C. Hauert, A. Traulsen, H. De Silva, Social control and the social contract: the emergence of sanctioning systems for collective action, Dyn. Games Appl. 1 (2011) 149–171.

Figure Captions

Figure 1. Defectors (first-order freeloaders) prevail. Oscillations do not occur and the interior state space has no fixed point. (a) All interior states evolve towards the vertex D. (b) In the boundary case that $c_2 (r_2 - 1) - \sigma = 0$, the edge D-R is a line of fixed points. All interior orbits converge to a stable (lower) segment of the edge. Random drift and occasional invasion of the missing C-player will eventually send the state to the vertex D. Parameters: $N = 5$; $r_1 = 3$; $c_2 = 1$; $r_2 = 1.2$ (a) or $1.4$ (b); $\sigma = 0.4$; and $c_1 = 1$ (in the case of WA), $c_1 = 0.4$ (in the case of SA).

Figure 2. Rewarders prevail. Oscillations do not occur and all interior states evolve towards the vertex R. The interior state space has no fixed point. Parameters: $N = 5$; $r_1 = 3$; $c_2 = 1$; $r_2 = 5.5$; $\sigma = 0.4$; and $c_1 = 1$ (in the case of WA), $c_1 = 0.4$ (in the case of SA).

Figure 3. Rock-scissors-paper cycles. All three corners of the simplex $S_3$ are saddle points and the boundary of $S_3$ represents a heteroclinic cycle. The interior of $S_3$ has a unique fixed point $Q$, which is a center surrounded by closed orbits. Parameters: $N = 5$; $r_1 = 3$; $c_2 = 1$; $r_2 = 3$; $\sigma = 0.4$; and $c_1 = 1$ (in the case of WA), $c_1 = 0.4$ (in the case of SA).

Figure 4. The effects of second-order sanctions. (a) The existing interior fixed point $Q$ turns into a repeller by cutting off $a$% rewards for cooperators. The population converges to a heteroclinic cycle on the boundary of $S_3$. (b) For a sufficiently high $a$, the vertex R can be a global attractor. At the same time, $S_3$ has a boundary fixed point $P$, which divides the basins of attraction of rewarders and cooperators on the edge R-C and is stable if there is an invasion of defectors. Parameters: $N = 5$; $r_1 = 3$; $c_2 = 1$; $r_2 = 3$; $\sigma = 0.4$; and $c_1 = 1$ (in the case of WA), $c_1 = 0.4$ (in the case of SA). The rewards are cut by the following percentages: (a) $a = 10$ and (b) $a = 20$.

[Figure 1: panels (a) and (b); simplex with vertices R, D, C]
[Figure 2: simplex with vertices R, D, C]
[Figure 3: simplex with vertices R, D, C and interior fixed point Q]
[Figure 4: panels (a) and (b); simplex with vertices R, D, C and fixed points P and Q]
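The parameter sets listed in the figure captions can be checked numerically against the closed-form expressions in the text. The following Python sketch evaluates the contributor cost σ, the function F(z) of Eq. (5), the interior fixed point Q of Eq. (9), and the two second-order sanction thresholds of Appendix A.3. The function names (`sigma`, `F`, `z_hat`, `interior_fixed_point`, `sanction_thresholds`) are hypothetical, and the bisection root-finder is an assumption of this sketch, not a method described in the paper.

```python
def sigma(c1, r1, N, weak=True):
    """Net cost of contributing to the PGG: c1*(1 - r1/N) for WA, c1 for SA."""
    return c1 * (1.0 - r1 / N) if weak else float(c1)


def F(z, c2, r2, N):
    """Eq. (5): expected advantage of a C-player over an R-player."""
    # (1 - z**N) / (1 - z) equals 1 + z + ... + z**(N-1); the sum form
    # avoids the 0/0 at z = 1.
    geometric_sum = sum(z**k for k in range(N))
    return c2 * (1.0 - r2 * geometric_sum / N)


def z_hat(c2, r2, N, tol=1e-12):
    """Unique root of F on (0, 1), which exists only for 1 < r2 < N.

    Plain bisection (an illustrative choice): F is decreasing in z,
    with F(0) > 0 and F(1) < 0 under the stated condition.
    """
    if not 1.0 < r2 < N:
        raise ValueError("an interior root requires 1 < r2 < N")
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid, c2, r2, N) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def interior_fixed_point(c1, r1, c2, r2, N, weak=True):
    """Eq. (9): Q = (x, y, z) with f_hat = sigma / (c2*(r2 - 1))."""
    s = sigma(c1, r1, N, weak)
    net_reward = c2 * (r2 - 1.0)
    if not 0.0 < s < net_reward:
        raise ValueError("Q requires 0 < sigma < c2*(r2 - 1), cf. Eq. (10)")
    f_hat = s / net_reward
    z = z_hat(c2, r2, N)
    return f_hat * (1.0 - z), (1.0 - f_hat) * (1.0 - z), z


def sanction_thresholds(c1, r1, c2, r2, N, weak=True):
    """Appendix A.3: (alpha_P, alpha_Q) for the reward cut to cooperators."""
    s = sigma(c1, r1, N, weak)
    F0 = F(0.0, c2, r2, N)
    net_reward = c2 * (r2 - 1.0)
    return F0 / (net_reward + F0), F0 / (s + F0)


if __name__ == "__main__":
    # Figure 3 / Figure 4 parameters, weak-altruism case.
    params = dict(c1=1.0, r1=3.0, c2=1.0, r2=3.0, N=5)
    print("Q =", interior_fixed_point(**params))
    print("(alpha_P, alpha_Q) =", sanction_thresholds(**params))
```

With the Figure 4 parameters, the computed α_P lies between the two reward cuts used in the caption (10% and 20%), which is consistent with panel (a) showing a heteroclinic cycle and panel (b) showing convergence to the vertex R.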