Do Anti-doping Policies Trigger End-game Effect?


Jakub Sindelar∗
Collegio Carlo Alberto, Università degli Studi di Torino

March 6, 2017

Second draft. Do NOT quote.

Abstract

It has previously been argued that older athletes might be particularly prone to taking performance-enhancing drugs, since the economic costs of doping become negligible as athletes approach the end of their careers. This paper takes that question seriously and tries to answer it both theoretically and empirically. My model shows that a rational athlete's use of doping may rise steadily over time and reach its maximum in the last round (year). The empirical part uses two main sources of data: publicly available data on doping in track and field, and a leaked IAAF database of blood samples. All estimates show a significant correlation between age and doping. It is argued that the effect itself is not strong enough to justify a radical policy change.

1 Introduction

Doping in sport is a growing academic topic. Cases of doping and corruption in sport have occupied the front pages of all major newspapers and other media. Corruption scandals, such as the one reported by German national television ARD¹, undermine public confidence in anti-doping and show that the problem is no longer confined to athletes. This creates a challenge for social scientists to rethink the current anti-doping system and make it more effective and efficient.

∗ Ph.D. candidate in the international program in Institutions, Economics and Law. Email: [email protected], [email protected]
¹ ARD (2014). Geheimsache Doping - Wie Russland seine Sieger macht. TV Documentary.

In economics, most research on doping deals with two-player, one-shot games of the prisoner's dilemma type, in which athletes choose whether or not to dope (Breivik, 1992; Berentsen, 2002; Haugen, 2004; Bervoets et al., 2014). Recently, scholars have started to apply different models: Emrich & Pierdzioch (2013) highlighted the problem of international coordination of anti-doping policies; Berentsen et al. (2008) showed the importance of whistleblowing for reducing the frequency of cheating; Eber (2002) sees the problem in the insufficient credibility of the current anti-doping system; and Buechel et al. (2013) stress the need for transparent doping tests. Even though the scope of the doping literature is expanding, many policy issues have not yet been discussed and analyzed.

This paper takes a closer look at one question that stems from the current system of penalties, which is almost solely based on bans. As already noted in the literature (Maennig, 2002), such a system can distort the level playing field because it may not deter older athletes who are at the end of their careers. The basic idea is as follows: at the end of her career, a rational athlete will not consider the threat of punishment (being banned from competition) in her doping choice, i.e. she will have no costs of doping other than direct monetary costs (obtaining substances) and indirect costs such as future health deterioration. This paper tries to shed light on the end-game effect both theoretically and empirically. The motivation is the striking lack of both theoretical and empirical understanding of this phenomenon.

The first part of this paper introduces a theoretical model of a single athlete who chooses his optimal doping effort over time. An athlete in the last round (year) of his career cares only about the present and dopes maximally with respect to the probability of being caught. In the periods preceding the last one he has to consider possible future losses.
An increasing natural performance discourages athletes from misbehaving in the present period, since it raises the future costs of doping. The model predicts that doping effort grows throughout the career at an increasing pace (with exceptions in certain model configurations). Time has two effects on the optimal level of doping. The first is the aforementioned end-game effect and the second is what I call the "declining natural performance effect". The latter can be understood as follows: while natural performance is still growing (when the athlete is young), the possible future losses from being banned are higher than when his natural performance is declining.

The second part of the paper takes the theoretical prediction and tests it on two alternative databases. The first consists of positive doping tests in track and field and the other is the leaked IAAF database of blood samples. My estimates show that older athletes test positive more often on average and also have more extreme blood values. Even though the blood-test database allowed me to run a fixed-effects regression, one has to be careful with a causal interpretation of the results; the hypothesis that the current anti-doping system triggers the end-game effect therefore cannot be empirically proven. Nevertheless, the results are still of interest to policy makers, since they can help optimize testing procedures, and they are an important addition to academic research on doping.

2 The Model

In order to show how age affects doping behavior, I present a simple intertemporal choice model with a rational representative agent who optimizes his level of doping throughout his career. Assume a risk-neutral athlete in a non-strategic situation (he can estimate precisely the effects of doping on his profits) who faces a 1-year ban in case of a positive doping test. In each round $t$ (year) he can choose a level of doping $d_t \in [0, 1]$ to enhance his natural performance $P_n$, such that the final performance is given by $P_n (1 + d_t \varepsilon)$, where $\varepsilon \in [1, \infty)$ stands for the effect of doping on his final performance. For a given level of natural performance $P_n$ there is a level of profit $\tilde{\pi} = f(P_n)$ that would be earned without taking performance-enhancing drugs (PEDs). Now, without loss of generality, I assume that $P_n = \tilde{\pi}$. The athlete's possible earnings are therefore given by $\tilde{\pi}_t (1 + d_t \varepsilon)$, where the coefficient $\varepsilon$ now stands directly for the effect of doping on earnings.

Now, let's model the costs of doping. The athlete has no direct costs of obtaining doping substances and no other costs (mainly health deterioration). He may only lose current and future profits when tested positive. The choice to dope depends not only on the costs of doping but also on the probability of being caught $r \in [0, 1]$, which is defined as $Prob(caught \mid d = 1)$, i.e. the chance of being caught when the athlete dopes maximally ($d_t = 1$). Doping creates possible costs in two periods. The athlete can lose his profits in the period he is competing in and also in the following period while serving a ban. His aim is to choose a level of doping that takes all the current and future expected losses into consideration. As is usually assumed, he also faces a discount factor $\delta$. The expected cost of doping is driven by the product of the probability of being caught $r$ and the level of doping $d$, so an athlete doping at level $d_t$ is caught with probability $d_t r$. The decision he faces takes the following form:

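$$\max_{d_t} E\pi_t = \overbrace{(1 + d_t \varepsilon)\,\tilde{\pi}_t\,(1 - d_t r)}^{\text{time of decision } t} + \underbrace{(1 - d_t r)\,\delta\,\overbrace{\left[(1 + d_{t+1} \varepsilon)\,\tilde{\pi}_{t+1}\,(1 - d_{t+1} r)\right]}^{\text{time of decision } t+1}}_{\text{exp. future costs } (t+1) \text{ given } d_t} \qquad (1)$$

Alternatively, in a form stressing the negative effect of doping on both current and future expected profits:

$$\max_{d_t} E\pi_t = (1 - d_t r)\left[(1 + d_t \varepsilon)\,\tilde{\pi}_t + \delta\left[(1 + d_{t+1} \varepsilon)\,\tilde{\pi}_{t+1}\,(1 - d_{t+1} r)\right]\right] \qquad (2)$$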

The first part of equation 2 shows the expected profit from competition at the time of decision and the second part depicts the possible loss of profit in the next period (if the athlete is banned). The choice variable is $d_t$, which affects his performance at time $t$ and therefore enhances his "natural profit" $\tilde{\pi}_t$ by making him better relative to others. On the other hand, a higher $d_t$ also comes with higher costs. Doping at time $t$ plays a crucial role in determining the expected profit from the next round ($t+1$): the higher $d_t$ is, the lower is the expected return. When the function is maximized, the optimal level of doping at time $t$ takes the form:

$$d_t = \overbrace{\frac{\delta\,(d_{t+1} r + d_{t+1}^2 r - d_{t+1} \varepsilon - 1)}{2}\,\frac{\tilde{\pi}_{t+1}}{\tilde{\pi}_t}}^{\beta} + \overbrace{\frac{\varepsilon - r}{2r}}^{d_F^*} \qquad (3)$$

The first observation is that if an athlete finds himself in the last period of his career, i.e. $\tilde{\pi}_{t+1} = 0$, the level of doping $d_F^*$ is simply $\frac{\varepsilon - r}{2r}$, which implies that for nonnegative levels of doping $\varepsilon \geq r$ has to hold. This condition always holds, since it is assumed that doping has a positive effect on performance and therefore on income ($\varepsilon > 1$). If either $d_{t+1} < 1$ or $r < 1$, the rate of doping is highest in the last round and in all previous rounds the optimal level of doping is below that level, because the first part of equation 3 (marked as $\beta$) is negative and represents a deviation from the last round's doping level.²

The ratio of current and future natural profits $\tilde{\pi}$ is an important determinant of $d_t$. The higher $\tilde{\pi}_{t+1}$, the lower is the optimal level of doping at time $t$. This stems from the fact that $\beta < 0$. Intuitively, if an athlete knows that next year he can be more successful than in the present year, doping now is quite risky, not because he can lose this year's profits, but because he can lose the high profits of the following one. Higher future levels of doping $d_{t+1}$ lead, ceteris paribus, to lower current PED consumption if $d_{t+1} < \frac{\varepsilon - r}{2r}$, which, as was shown, always holds given that either condition $d_{t+1} < 1$ or $r < 1$ is satisfied.³

The model gives one counter-intuitive result: in certain situations a higher probability of being caught $r$ leads to higher levels of doping. Since a higher $r$ means lower doping in $t+1$, it also means lower costs of doping at time $t$. This is obvious when looking at the last period, where growth in $r$ always translates into higher deterrence⁴, i.e. a lower $d_{t+1}$, which in turn lowers the "future" costs of doping for the previous period.

The model has shown that there are two effects leading to higher levels of doping in the later stages of an athlete's career. The first is the end-game effect given by $d_F^*$ and the second is the declining natural performance effect, given by $\tilde{\pi}_{t+1} / \tilde{\pi}_t$. The faster one's natural performance is declining, the higher the predicted levels of doping.
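To see how equation 3 shapes a whole career, the recursion can be evaluated backward from the last period. The following is a minimal sketch and not part of the original analysis: the function name `doping_path`, the parameter values and the profit profile are illustrative assumptions, and doping levels are clipped to [0, 1] because the model restricts the level of doping to that interval.

```python
def clip(x):
    """Keep a doping level inside the admissible interval [0, 1]."""
    return max(0.0, min(1.0, x))


def doping_path(profits, eps, r, delta):
    """Evaluate equation 3 backward from the final career period.

    profits: natural profits per period (hypothetical values).
    eps, r, delta: doping effect, detection probability, discount factor.
    Returns the doping level for each period.
    """
    d_final = clip((eps - r) / (2 * r))  # end-game level d*_F
    path = [d_final]
    # Walk from the second-to-last period back to the first one.
    for t in range(len(profits) - 2, -1, -1):
        d_next = path[0]
        beta = (delta * (d_next * r + d_next**2 * r - d_next * eps - 1) / 2
                * profits[t + 1] / profits[t])
        path.insert(0, clip(beta + (eps - r) / (2 * r)))
    return path


# Assumed example: profits rise early in the career and decline later.
profits = [1.0, 1.2, 1.3, 1.2, 1.0, 0.8]
path = doping_path(profits, eps=1.2, r=0.5, delta=0.9)
for year, d in enumerate(path, start=1):
    print(f"year {year}: d = {d:.3f}")
```

With these assumed values the last period has the highest doping level, as the model predicts. Intermediate periods need not rise monotonically, in line with the exceptions noted for certain model configurations.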

3 Empirical Research

Currently there are two empirical clues suggesting that older athletes tend to dope more than their younger opponents. The first piece of evidence comes from Dilger and Tolsdorf (2005), who looked at 64 top 100m runners and concluded that athletes who have tested positive are significantly older (by about 3 years). Coupé and Gergaud (2012) provided newer evidence based on the UCI's "index of suspicion". Data for 200 cyclists showed a correlation between a cyclist's age and the index of suspicion at first, but when more control variables were added, the correlation disappeared. Unfortunately, neither paper had a dataset large and reliable enough to draw any policy-relevant conclusions. Wu et al. (2016) conducted an experimental study on doping with four different treatments: no punishment, a fine, a ban and a superannuation scheme. As predicted by the theory, the highest prevalence of dopers was measured in the last round of the ban treatment. The most effective deterrent was the so-called superannuation scheme.

² Proof: In order for the first part of equation 3 to be negative, $d_{t+1} r + d_{t+1}^2 r - d_{t+1} \varepsilon - 1 < 0$ has to hold. If either $d_{t+1} < 1$ or $r < 1$, then $d_{t+1}^2 r - d_{t+1} \varepsilon \leq 0$ and $d_{t+1} r < 1$, which implies that the whole expression is negative.
³ Proof: In order for this statement to hold, the first derivative of $d_t$ with respect to $d_{t+1}$ has to be negative: $\frac{\partial d_t}{\partial d_{t+1}} = \frac{\delta}{2} (2 d_{t+1} r + r - \varepsilon) \frac{\tilde{\pi}_{t+1}}{\tilde{\pi}_t} < 0 \implies d_{t+1} < \frac{\varepsilon - r}{2r}$.
⁴ Since $r \in [0, 1]$ and $d_t \in [0, 1]$, in certain situations a lower $r$ does not change the outcome, since doping is already at its maximum.

If we want to understand the relationship between age and doping, we cannot just look at the age distribution of positive doping tests. The results might be blurred by the fact that older athletes may be tested more or less often, may end up in the top rankings less often (and therefore be chosen for testing less often), and their overall prevalence in competitions presumably differs from the average. The question therefore has to be asked differently. I raise two complementary empirical questions:

1. Is it more probable that a doping test will be positive for older athletes?
2. Are extreme blood values more prevalent among older athletes?

The best way to answer the first question is to have an individualized dataset consisting of all doping tests in a given sport discipline for a certain time period and simply compute the prevalence of positive doping tests in each age group. Formally:

$$Prevalence_i = \frac{Positive_i}{Tested_i}$$

$Prevalence_i$ stands for the share of positive doping tests in age group $i$. Unfortunately, the publicly available data do not allow it to be computed precisely, because there is no comprehensive database of all doping tests. The best public data currently available consist of a list of athletes tested by the IAAF (International Association of Athletics Federations) and USADA (United States Anti-Doping Agency) in the years 2009-2012 and a list of positive doping findings in athletics in the period 2004-2012 (details on the structure of the data are discussed in the following section). Answering the second empirical question is possible with a leaked IAAF blood samples database. First, I provide descriptive statistics to shed light on the structure of my data; then I estimate the prevalence of positives among age groups with a standardization technique commonly known as the Standardized Mortality Ratio (SMR); and in the final part I estimate regression models.
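The standardization step mentioned above can be illustrated with a few lines of Python. This is a sketch under stated assumptions rather than the paper's own code: it takes the expected number of positives in an age group to be the total number of positives multiplied by that group's share among tested athletes, and the function name `smr` is hypothetical. The counts are the 18y, 19y and 26y rows and the column totals (417 positives, 2624 IAAF-tested athletes) from the paper's table of samples by age group.

```python
def smr(observed, tested_in_group, total_positives, total_tested):
    """Standardized ratio: observed positives over expected positives.

    Expected positives assume doping is spread across age groups in
    proportion to how many athletes of each age were tested.
    """
    expected = total_positives * tested_in_group / total_tested
    return observed / expected


TOTAL_POSITIVES = 417   # positives aged 18-49
TOTAL_TESTED = 2624     # IAAF-tested athletes aged 18-49

# age -> (positive cases, IAAF-tested athletes)
rows = {18: (8, 63), 19: (20, 113), 26: (25, 237)}

ratios = {age: smr(pos, tested, TOTAL_POSITIVES, TOTAL_TESTED)
          for age, (pos, tested) in rows.items()}
for age, value in ratios.items():
    print(f"age {age}: SMR = {value:.2f}")
```

Under this assumption the sketch reproduces the reported IAAF-based ratios for these rows (0.80, 1.11 and 0.66). Values above 1 mean that an age group has more positives than its share of testing would suggest; values below 1 mean fewer.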

3.1 Data

This paper exploits two alternative sources of data. First, an analysis of positive doping cases is made; subsequently, the leaked IAAF blood-values database is explored. The following paragraphs deal with the first set of data, positive doping tests.

For the purpose of analyzing the correlation between age and positive doping tests, four datasets were employed. www.dopinglist.com is currently the most comprehensive database of doping cases; for the purpose of this research I have collected 1164 cases in track and field from the years 2004-2014. Those are cases recorded across all NADOs and anti-doping agencies.⁵ To standardize the distribution of positives I have used two sources. The main dataset for this purpose comprises 3276 out-of-competition (OOC) tests by the IAAF anti-doping program in the years 2008-2012. The secondary dataset contains a list of 2029 athletes tested OOC and in competition (IC) by USADA.⁶ ⁷ The publicly provided data do not contain information on the age of athletes; I had to gather it separately using the IAAF athletes' database, the official IAAF database containing information on circa 110,000 athletes.⁸ The number of athletes in all datasets was reduced in the process of assigning age information, because not all tested and positively tested athletes were found in the official IAAF database. In the end, 417 positives in track and field from the years 2004-2012 were taken into consideration, along with 2635 athletes tested by the IAAF in the years 2009-2012 and 1598 tested by the USADA testing program.

There are several issues with the data used. The main issue is that the subsample of all tested athletes is not random, therefore a potential bias has to be considered. Throughout my analysis I take the IAAF testing sample as the main one and check the validity of my results with the control sample by USADA.
Another issue is that information on the number of tests per athlete is available only for the USADA database (for each athlete, the number of tests in a given year). The IAAF divides tested athletes into two groups, with 1-3 and 4+ tests a year. A bias could emerge if athletes of certain age groups are tested more often. The USADA data, and partly also the IAAF data, can help us to identify this bias.

⁵ Not all positive doping cases are in the database; some NADOs and countries have strict privacy laws that do not allow doping cases to be revealed publicly.
⁶ source: http://www.usada.org/
⁷ source: http://www.iaaf.org/about-iaaf/documents/anti-doping
⁸ source: http://www.iaaf.org/athletes

Table 1 summarizes the available data; it shows the number of observations, their share in the sample and a Standardized Mortality Ratio (SMR). The SMR is commonly used to express the mortality of a study cohort with respect to the general population. In my case the SMR is an indicator of the prevalence of positive doping tests in a study cohort ($age_i$). Formally:

$$SMR_i = \frac{Observed_i}{Expected_i}$$

$Observed_i$ stands for the number of observed doping cases (from the official data), while $Expected_i$ is the number of cases that would occur if the prevalence of doping were evenly spread among age groups; this was computed taking testing numbers (by the IAAF or USADA) into consideration. Even though one can observe that younger and older age groups (above 30) have on average a higher SMR, the small number of positive doping cases in most age groups does not allow us to draw any conclusions.

Now I look at the whole sample to see whether there is any difference in means, i.e. whether those who tested positive are on average older. The mean age of positive doping tests in my sample is 27.56, the mean of IAAF-tested athletes is 26.98 and the mean of the USADA sample is 25.91, i.e. there is a difference of about 1 year in mean age between the two testing samples. A two-sample difference-in-means test shows a highly significant difference between Positives and the USADA sample (p-value = 1.104e-07), a weaker but still significant difference between Positives and the IAAF sample (p-value = 0.02269), and also a very significant difference between the USADA and IAAF samples (p-value = 1.1078e-09). The last comparison is crucial for understanding whether the two samples are interchangeable or significantly different from each other. If both subsamples are similar, it adds credibility to my estimates.

A closer look at the data structure reveals that there is a bias towards the youngest age groups. As can be seen in Table 1, there is an anomaly in the 19y age group in both the IAAF and USADA samples; in the latter case 19y athletes are tested about 3 times more than 18y and 6 times more than 20y. This is due to intensive testing during junior championships.

Table 1: Samples Distribution by Age-group

AGE    Positive  Pos. %   IAAF   IAAF %   SMR (IAAF)  USADA  USADA %  SMR (USADA)
18     8         1.92%    63     2.40%    0.80        71     4.57%    0.42
19     20        4.80%    113    4.31%    1.11        198    12.73%   0.38
20     12        2.88%    70     2.67%    1.08        31     1.99%    1.44
21     20        4.80%    109    4.15%    1.15        44     2.83%    1.70
22     25        6.00%    140    5.34%    1.12        78     5.02%    1.20
23     26        6.24%    163    6.21%    1.00        101    6.50%    0.96
24     28        6.71%    186    7.09%    0.95        116    7.46%    0.90
25     30        7.19%    220    8.38%    0.86        134    8.62%    0.83
26     25        6.00%    237    9.03%    0.66        122    7.85%    0.76
27     33        7.91%    225    8.57%    0.92        122    7.85%    1.01
28     30        7.19%    197    7.51%    0.96        102    6.56%    1.10
29     19        4.56%    171    6.52%    0.70        86     5.53%    0.82
30     31        7.43%    152    5.79%    1.28        71     4.57%    1.63
31     17        4.08%    116    4.42%    0.92        68     4.37%    0.93
32     20        4.80%    104    3.96%    1.21        47     3.02%    1.59
33     12        2.88%    91     3.47%    0.83        36     2.32%    1.24
34     11        2.64%    52     1.98%    1.33        33     2.12%    1.24
35     7         1.68%    47     1.79%    0.94        22     1.41%    1.19
36     10        2.40%    43     1.64%    1.46        16     1.03%    2.33
37     7         1.68%    35     1.33%    1.26        14     0.90%    1.86
38     8         1.92%    37     1.41%    1.36        10     0.64%    2.98
39     2         0.48%    16     0.61%    0.79        8      0.51%    0.93
40     1         0.24%    16     0.61%    0.39        3      0.19%    1.24
41     3         0.72%    6      0.23%    3.15        2      0.13%    5.59
42     3         0.72%    6      0.23%    3.15        1      0.06%    11.19
43     3         0.72%    2      0.08%    9.44        3      0.19%    3.73
44     1         0.24%    3      0.11%    2.10        2      0.13%    1.86
45     2         0.48%    1      0.04%    12.59       4      0.26%    1.86
46     0         0.00%    0      0.00%    0.00        3      0.19%    0.00
47     2         0.48%    1      0.04%    12.59       3      0.19%    2.49
48     1         0.24%    1      0.04%    6.29        2      0.13%    1.86
49     0         0.00%    1      0.04%    0.00        2      0.13%    0.00
Total  417       100.00%  2624*  100.00%              1555*  100.00%

*Sum of 18-49y range, not all observations.

Comparing my datasets restricted to observations above age 20 reveals a different picture. Table 2 summarizes all correlation coefficients and t-tests for both the restricted samples (20+) and the samples using all observations. The t-test clearly shows that without the outliers in the 19y group there is no difference in means ($\mu_{iaaf} = 27.5884$; $\mu_{usada} = 27.5876$); this has also moved both their p-values with respect to Positive close to each other.

Table 2: Analysis of Samples Used
Correlation Matrix 20+ USADA IAAF

0.856346754

Positive

0.901954964

All

IAAF

USADA

IAAF

0.902060483 0.897778459

0.929970542

0.907646277

Two samples difference T-test p-value IAAF

0.996141491

Positive

0.036358391

1.1078E-09 0.025907911

9.95644E-06

0.226082345

Here I can conclude that there is a significant difference between the mean ages of those who are tested and those who fail doping tests.

The second main source of data is the leaked blood samples database from the IAAF that was used by the Sunday Times and in the ZDF/ARD documentary. Its validity was confirmed by report no. 2 of WADA's independent commission, which also reveals that the database comes from Giuseppe Fischetto, who worked as a delegate for the IAAF and European Athletics (EA) in the years 2008 to 2012. The dataset contains mainly tests made by the IAAF but also by NADOs, event organizers and WADA.⁹ The database covers the period between 2001 and 2012 with 12360 blood samples of both genders, 6626 for men and 5740 for women. Pre-competition tests are predominant (N=9336), followed by out-of-competition tests (N=1068) and in-competition tests (N=447). 6438 tests were conducted in endurance disciplines and 3275 in non-endurance disciplines. For the rest, the discipline was not specified. The original database contained the names of athletes; however, the names were deleted during the handover of the database, and afterwards an ID was created for each anonymous athlete so that panel regressions can be estimated.

The independent commission compared the database with data from ADAMS (the Anti-Doping Administration & Management System), into which the IAAF started to enter data in early 2009. Of all samples, 41.5% were matched within ADAMS. However, several samples from early 2009 were not matched, since blood analysis at these events was not performed in compliance with ABP (Athlete Biological Passport) guidelines. From the World Championships in Berlin on 11th August 2009 onwards, 97.3% of samples were matched in ADAMS. The number of unmatched samples since 2009 is less than 141. The unmatched samples can be caused by data entry errors such as a wrong discipline, therefore the actual number cannot be determined precisely. WADA experts also found 35 samples in ADAMS for which the IAAF was the testing authority and which were not found in the database.¹⁰

⁹ Authenticity of the database can be proved in person to anyone who is interested, upon request.

There are several important blood measures stored for each blood sample: haematocrit level (HCT), hemoglobin (HGB) and the percentage of reticulocytes (RET%). To determine suspicious blood values the so-called OFF-score was developed; it incorporates the information of HGB and RET% into one number. A very low or very high value is an indicator of blood doping. The "normal" value of an OFF-score lies between 80 and about 110. Values higher than 110 indicate recent use of blood doping (in preparation for the competition). A low OFF-score is a sign that the body is getting back to its natural equilibrium by lowering the production of red blood cells.

Since most of the observations in the database are pre-competition tests, I put emphasis on those and check the correlation between age and OFF-score. The average age of athletes with an OFF-score higher than or equal to 110 is 26.66 (N=917), while for the rest it is 25.67 (N=8419, p-value = 2.228e-08). For OFF-scores higher than or equal to 120 the average age is even higher, 27.26 (N=359, p-value = 5.467e-09). Table 3 provides information for each cohort. The following statistics are computed: the number of samples with an OFF-score higher than 110 (off>110), the number of samples with an OFF-score higher than 120 (off>120), the number of samples in the respective cohort (N), the share of samples with respect to the whole population (Share N), and the last four columns show the age distribution of samples with high OFF-scores and the respective standardized mortality ratios. It is easily recognizable from the table that high OFF-scores are positively correlated with age. The only group where high OFF-scores are less prevalent than average is the youngest (18-24). In the oldest age group (36-40) the prevalence of extreme blood profiles is 77% (>110) and even 130% (>120) higher compared to the population average.

¹⁰ Those samples were probably lost when data were manually attached to the spreadsheet.

Table 3: OFF-scores and age

AGE    off>110  off>120  N     Share N  Share 110  Share 120  SMR 110  SMR 120
18     7        2        159   2.00%    0.92%      0.62%      0.46     0.31
19     32       11       607   7.63%    4.21%      3.42%      0.55     0.45
20     26       9        284   3.57%    3.42%      2.80%      0.96     0.78
21     36       15       451   5.67%    4.74%      4.66%      0.84     0.82
22     57       17       648   8.15%    7.50%      5.28%      0.92     0.65
23     47       15       507   6.37%    6.18%      4.66%      0.97     0.73
24     56       15       563   7.08%    7.37%      4.66%      1.04     0.66
25     66       31       599   7.53%    8.68%      9.63%      1.15     1.28
26     50       26       577   7.25%    6.58%      8.07%      0.91     1.11
27     44       12       591   7.43%    5.79%      3.73%      0.78     0.50
28     59       30       516   6.49%    7.76%      9.32%      1.20     1.44
29     57       27       473   5.95%    7.50%      8.39%      1.26     1.41
30     39       19       405   5.09%    5.13%      5.90%      1.01     1.16
31     43       18       369   4.64%    5.66%      5.59%      1.22     1.20
32     23       11       279   3.51%    3.03%      3.42%      0.86     0.97
33     26       13       222   2.79%    3.42%      4.04%      1.23     1.45
34     17       11       184   2.31%    2.24%      3.42%      0.97     1.48
35     20       10       148   1.86%    2.63%      3.11%      1.41     1.67
36     23       14       110   1.38%    3.03%      4.35%      2.19     3.14
37     14       7        87    1.09%    1.84%      2.17%      1.68     1.99
38     7        3        47    0.59%    0.92%      0.93%      1.56     1.58
39     5        3        41    0.52%    0.66%      0.93%      1.28     1.81
40     2        1        16    0.20%    0.26%      0.31%      1.31     1.54
41     4        2        19    0.24%    0.53%      0.62%      2.20     2.60
18-24  261      84       3219  40.47%   34.34%     26.09%     0.85     0.64
25-30  315      145      3161  39.74%   41.45%     45.03%     1.04     1.13
31-35  129      63       1202  15.11%   16.97%     19.57%     1.12     1.29
36-40  51       28       301   3.78%    6.71%      8.70%      1.77     2.30

The first conclusion based on the blood-samples database is that athletes with high OFF-scores are significantly older, by approximately 1 year, and the prevalence of extreme blood values is higher in older age groups. Both sources of data point to the conclusion that doping is more common in older age groups; however, the validity of this result has to be checked by regression analysis that also considers control variables.
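For readers unfamiliar with the OFF-score, the sketch below shows how such a score is typically computed and compared with the 110 and 120 thresholds used above. It relies on the definition commonly given in the anti-doping literature (Gore et al., 2003), OFF-score = HGB − 60·√RET%, with HGB measured in g/L. This formula and the sample values are not taken from the leaked database; they are stated here as assumptions, and the helper names are hypothetical.

```python
import math


def off_score(hgb_g_per_l, ret_percent):
    """OFF-score from hemoglobin (g/L) and reticulocyte percentage."""
    return hgb_g_per_l - 60.0 * math.sqrt(ret_percent)


def flag(score, thresholds=(110.0, 120.0)):
    """Return the highest threshold exceeded by the score, or None."""
    exceeded = [t for t in thresholds if score > t]
    return max(exceeded) if exceeded else None


# Hypothetical samples: (HGB in g/L, RET%).
samples = [(150.0, 1.00), (165.0, 0.64), (170.0, 0.49)]
scores = [off_score(hgb, ret) for hgb, ret in samples]
flags = [flag(s) for s in scores]
for (hgb, ret), s, f in zip(samples, scores, flags):
    print(f"HGB={hgb:.0f} g/L, RET%={ret:.2f} -> OFF-score {s:.1f}, above: {f}")
```

A high hemoglobin value combined with a low reticulocyte percentage pushes the score up, which is why the score is used as a marker of recent blood manipulation.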

3.2 Regression Analysis

So far I have found that athletes who tested positive or have abnormal blood values are on average approximately 1 year older. Regressions allow me to look deeper into the age-doping relationship and control for other variables that might affect doping attitude. In the first part of this section I run a regression using the public data on tests and positives. Regressions based on the leaked blood database follow.

3.2.1 Public data

In order to run a regression, I use the IAAF testing data and combine them with the data on positives into one database. This strategy has several drawbacks. The most important one is that the IAAF sample of tested athletes is not the whole population, so I have to assume that the testing sample is representative of it. Secondly, the share of positives is obviously much higher than in reality, which can affect the size of the estimated coefficients. The data are restricted to age groups 20+ to avoid the bias caused by excessive testing of junior athletes. The period 2009-2013 was chosen so that the data coincide in terms of time. The reduction leaves 274 positives and 3170 tested athletes. Numbers of tests per athlete were not added, since they are known only for the tested athletes. A possible effect of adding the number of tests to the analysis will be discussed later. Both the IAAF sample and Positives contain information on the gender of athletes and their nationality, and a set of dummy variables for each year.

The variables included in the following regression are $Positive_i$ (1 = positive, 0 = not), $AGE_i$, the age of an athlete, $Men_i$, which is 1 for a man and 0 otherwise, and $y20xx$, a dummy for each year. Dummies for all countries were added too.¹¹ A simple OLS or Logit estimate of equation 4 gives no significant result for AGE; looking at Table 1, one can observe that the relationship might rather be nonlinear, with peaks at the beginning and also at the end of the age distribution. The model in equation 5 therefore adds a squared term for AGE and control variables.

$$Positive_i = \alpha + \beta AGE_i \qquad (4)$$

$$Positive_i = \alpha + \beta_1 AGE_i + \beta_2 AGE_i^2 + \gamma Men_i + \sum_{j=1}^{4} \delta_j Year_j + \sum_{k=1}^{128} \phi_k Country_k \qquad (5)$$

Table 4 reports the results. Both the linear and quadratic terms are significant, and the results predict that both younger and older athletes are more often caught in the anti-doping net. This effect is stronger for older athletes, e.g. for the 37y age group the probability of being positive was 80% higher than for the 27y age group.

¹¹ Other control variables might extend the analysis. Data distinguishing out-of-competition (OOC) and in-competition tests could also play a role; unfortunately, my dataset of athletes tested by the IAAF is based solely on OOC tests. The relative performance of an athlete or her relative ranking would be another interesting extension.

Table 4
Dependent variable: Positive

          (1) OLS        (2) Logit     (3) Probit
const     0.3934*        1.480         0.7907
          (0.2073)       (1.721)       (0.9035)
AGE       −0.02506**     −0.2597**     −0.1447**
          (0.008142)     (0.1111)      (0.05883)
sq AGE    0.0004477**    0.004610**    0.002559**
          (0.0001345)    (0.001773)    (0.0009495)
Men       0.01312        0.3371**      0.1780**
          (0.009069)     (0.1502)      (0.07548)
y2009     −0.04827**     −0.7969**     −0.4192**
          (0.01528)      (0.2376)      (0.1202)
y2010     −0.05322**     −0.7776**     −0.4246**
          (0.01377)      (0.2084)      (0.1080)
y2011     −0.08365**     −1.293**      −0.6548**
          (0.01383)      (0.2505)      (0.1214)
y2012     −0.06018**     −0.8554**     −0.4480**
          (0.01266)      (0.1901)      (0.09806)
n         3127           3127          3127
R̄²        0.2301         0.1130        0.1138
ℓ         184            −748.8        −748.1

Standard errors in parentheses
Country dummy variables are not shown in the table
* indicates significance at the 10 percent level
** indicates significance at the 5 percent level
For logit and probit, R² is McFadden's pseudo-R²

The results obtained here will be decomposed in the appendix.
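One way to read the quadratic specification in equation 5 is to locate the age at which the estimated age profile turns from falling to rising, which for coefficients $\beta_1$ and $\beta_2$ lies at $-\beta_1 / (2\beta_2)$. The sketch below applies this standard formula to the AGE and sq AGE coefficients reported in Table 4. The function name is hypothetical and the calculation is an illustration added here, not a result reported in the paper.

```python
def turning_point(beta_age, beta_age_sq):
    """Age at which a quadratic profile b1*age + b2*age**2 is minimal."""
    return -beta_age / (2.0 * beta_age_sq)


# (AGE, sq AGE) coefficients as reported in Table 4.
coefficients = {
    "OLS": (-0.02506, 0.0004477),
    "Logit": (-0.2597, 0.004610),
    "Probit": (-0.1447, 0.002559),
}

turning_ages = {model: turning_point(b1, b2)
                for model, (b1, b2) in coefficients.items()}
for model, age in turning_ages.items():
    print(f"{model}: age profile bottoms out at about {age:.1f} years")
```

All three specifications place the minimum at roughly 28 years, which is consistent with the U-shaped pattern described above: the estimated probability of a positive test is higher for both younger and older athletes.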

3.3 Regression on leaked IAAF database

The remaining empirical strategy is to run a regression on the leaked IAAF blood data. To check the correlation between age and doping attitude I use several approaches: first, I treat the data as cross-sectional, and second, I create an unbalanced panel dataset from all the individuals tested more than once.

The OFF-score is the main measure used in the ABP and combines information about hemoglobin and reticulocytes into one number. A "normal" OFF-score lies approximately between 80 and 115.¹² Athletes with values outside this range can be considered, with a high level of certainty, as doped. For cut-off points of 116.7 for men and 104.4 for women (1% probability of a false positive, Gore et al. 2003), the database contains 940 samples with OFF-scores higher than the cut-off point, a significant number.

To run a cross-section regression I restrict the database to normal or high OFF-scores (higher than or equal to 80) and take into consideration only pre-competition tests. A doper is expected to have a high OFF-score in a pre-competition test and a non-doper is expected to have a "normal" value; on the other hand, an ex-doper (for whom the competition at which she was tested is not of main interest) will exhibit a low OFF-score. Taking all the observations into account would therefore lead to an underestimation of the true effects. This restriction leaves 4490 observations.

$$OFFscore_i = \alpha + \beta_1 AGE_i + \gamma Men_i + \sum_{j=1}^{6} \delta_j Year_j + \sum_{k=1}^{218} \phi_k Country_k \qquad (6)$$

The following table contains four variations of the model in equation 6: (1) and (2) are regressions without controlling for year of test (Year_i) and country, while (3) and (4) add those variables (not reported in the table).

12 The range can vary for each individual and situation.

OLS estimates. Dependent variable: OFFscore


const

AGE

(1)

(2)

(3) controls

(4) controls

89.90∗∗

87.43∗∗

78.38∗∗

76.36∗∗

(1.028)

(1.888)

(5.987)

(6.175)

0.1394∗∗

0.3185∗∗

0.1854∗∗

0.3295∗∗

(0.4131)

(0.4131)

(0.4119)

(0.4119)

−0.003143

sq AGE

Endurance

M

n ¯2 R `

−0.002512

(0.03726)

(0.1209)

(0.03670)

(0.1145)

1.544∗∗

1.563∗∗

1.817∗∗

1.830∗∗

(0.3964)

(0.3965)

(0.4344)

(0.4345)

3.851∗∗

3.862∗∗

6.696∗∗

6.706∗∗

(8.254)

(8.254)

4490

4490

4490

4490

0.0263

0.0266

0.1739

0.1741

−1.777e+04

−1.777e+04

−1.73e+04

−1.729e+04

Standard errors in parentheses. * indicates significance at the 10 percent level; ** indicates significance at the 5 percent level.

As in the previous estimates, the coefficient for age is highly significant: increasing age by one year increases the OFF-score by circa 0.2, i.e. a 20-year difference between two athletes translates into a 4-point difference in OFF-score. Athletes in endurance disciplines are predicted to have OFF-scores almost 2 points higher. Men generally have higher OFF-scores; however, this difference cannot be attributed to doping attitude but rather to physiological differences.

Using athletes' IDs I can create a panel of those athletes who were tested at least twice. This leaves 4819 observations for 1332 athletes. The following table reports pooled OLS estimates (columns 1-3) and fixed effects estimates (columns 4-6). Columns 7-8 show coefficients estimated only for pre-competition tests (3698 observations), and the last two columns take into account only pre-competition tests with OFF-scores higher than 80 (2352 observations).

Dependent variable: OFFscore


const

AGE

(1)

(2)

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

Pooled OLS

Pooled OLS

Pooled OLS

FE

FE

FE

FE-Pre

FE-Pre

FE-Pre80

FE-Pre80

84.73∗∗

85.81∗∗

63.58∗∗

79.98∗∗

105.4∗∗

81.59∗∗ const

66.99∗∗

111.3∗∗

85.03∗∗

110.0∗∗

(1.570)

(7.263)

(6.723)

(4.141)

(12.95)

(15.26)

(4.058)

(12.68)

(4.327)

(13.33)

0.1345

∗∗

(0.05767)

sq AGE



−1.293

0.8392

(0.9348)

(0.1537)

−0.01078

0.03587∗∗

0.8392∗∗

0.06408∗∗

0.03607∗∗

(0.008668)

(0.01735)

(0.01705)

(0.01739)

(0.01819)

(0.009472)

∗∗

−2.586

∗∗

(0.9508)

0.001447

M

∗∗

−1.628

(0.4853)

OUT

∗∗

(0.1544)

0.8254

(0.5300)

IN

`

∗∗

0.3117

0.05399

PRE

n ¯2 R



(0.9420)

0.5348

(0.1635)

−1.397 (0.9876)

∗∗

3.242

7.868

(0.6689)

(0.7307)

−11.41

−10.37

(8.925)

(7.886)

0.5312

0.4916

(1.799)

(1.630)

15.51∗∗

14.22

(0.5148)

(15.62)

4819

4819

4819

4819

4819

4819

3698

3698

2352

2352

0.0009

0.0007

0.1653

0.0012

0.0024

0.0387

0.0117

0.0170

0.0069

0.0094

−2.115e+04

−2.115e+04

−2.071e+04

−1.87e+04

−1.87e+04

−1.861e+04

−1.366e+04

−1.365e+04

−8204

−8201

Every single estimate shows a positive relationship between age and OFF-score. The linear coefficient for age varies between 0.14 for pooled OLS and 0.84 for fixed effects on pre-competition tests only. Estimates in columns 1-8 are all expected to be somewhat biased: the nature of the OFF-score makes the results tricky to read, since not just very high but also very low OFF-scores are a sign of doping. Columns 9 and 10 therefore report fixed effects models with observations limited to pre-competition tests and OFF-scores above 80. In this region an increase in the OFF-score increases the probability that the athlete was taking PEDs to improve his performance. A one-year increase in age raised the OFF-score by approximately 0.5 points, i.e. a 10-year difference between athletes meant 5 more OFF-score points on average for the older one. The drawback of looking only at pre-competition tests and OFF-scores higher than 80 is the reduction in usable observations.

The correlation between age and doping seems to be established with a high level of confidence; however, self-selection bias can undermine the belief in a causal relationship. Only some athletes continue into their thirties, and it can be argued that this selection is not random but rather a self-selection by those for whom continuing is a profitable choice. Those who dope in their twenties might be more successful and thus more willing to continue their professional career. Despite the issue with causality, the estimates are still relevant for policy makers.
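The fixed effects columns rely only on variation within each athlete over time. As a rough sketch of that idea, and not the estimator actually used for the table, the snippet below demeans age and OFF-score by athlete and computes the slope on the demeaned data. The function name `within_slope`, the athlete IDs and all values are made up, and a single regressor is used for brevity.

```python
from collections import defaultdict


def within_slope(records):
    """Fixed-effects (within) slope of OFF-score on age for a single regressor.

    records: iterable of (athlete_id, age, off_score) tuples.
    """
    by_athlete = defaultdict(list)
    for athlete_id, age, score in records:
        by_athlete[athlete_id].append((age, score))

    num = den = 0.0
    for obs in by_athlete.values():
        if len(obs) < 2:
            continue  # athletes tested once carry no within-athlete variation
        mean_age = sum(a for a, _ in obs) / len(obs)
        mean_score = sum(s for _, s in obs) / len(obs)
        for a, s in obs:
            num += (a - mean_age) * (s - mean_score)
            den += (a - mean_age) ** 2
    return num / den


# Hypothetical panel: athletes differ in level, but each gains 0.5 points per year.
panel = [
    ("A", 24, 90.0), ("A", 26, 91.0), ("A", 28, 92.0),
    ("B", 30, 100.0), ("B", 32, 101.0),
    ("C", 27, 85.0),  # tested once, contributes nothing to the within estimate
]
slope = within_slope(panel)
```

A pooled regression on the same records would also pick up the level differences between athletes, which is the omitted-variable problem the fixed effects specification is meant to reduce.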


4

Discussion and conclusions

This paper provided a theoretical analysis of an athlete's doping choice with respect to his age and an extensive empirical analysis of the matter. A rational sportsman at the end of his career does not have to think about the future economic costs of his behavior; he will therefore dope as much as possible with respect to both the probability of being caught and the effect doping has on his income. In rounds preceding the final round (year) his decision to dope is heavily affected by his prospects: higher expected future income makes current doping more costly. The extent to which an athlete abuses PEDs is expected to grow as he approaches the end of his professional career.

There are a few problems in the empirical analysis that need to be addressed here. Reverse causality can be an issue in the first part of my analysis: an increase in the rate of positive doping samples in one cohort is expected to be followed by an increase in testing in that cohort, so the distribution of tests, and also of positive tests, is not random. To tackle this issue I decompose my empirical results in the appendix and conclude that the share of positives in older age groups was, if anything, underestimated. All my empirical findings show that older athletes are relatively more often doped; however, one has to be cautious when talking about causality. Even though a fixed effects estimate can reduce the omitted variable bias, a self-selection bias still occurs, since only some athletes choose to compete in their 30s. It can be caused by the fact that those involved in doping are more willing to prolong their career because of higher expected returns. Another selection bias stems from the fact that athletes are not randomly chosen for anti-doping sample collection. To account for that, more personal data would be necessary to estimate the choice model.

Even though causality cannot be identified with a high level of certainty, it is still worth estimating the sign of the correlation and its magnitude. The sign of the correlation coefficient is identical across all estimates; however, the magnitude of the correlation varies. There were 29% more highly suspicious blood samples in the age group 31-35 than on average, and even 230% more in the oldest one (36-40). A weaker, but still positive, relationship can be found in public data on positive doping tests in athletics. The regression estimates, controlling for gender, type of discipline and country, give similar results. The age of athletes seems to be a good predictor of their doping attitude; thus it is reasonable to use this information to improve anti-doping policies in a cost-effective way.


Maennig (2002), who pointed to the end-game problem, also discussed possible solutions to it. Implementing sufficiently high financial penalties instead of bans could better discourage athletes from doping regardless of their age. Another proposal is a conditional superannuation scheme, also discussed by Maennig (2002) and experimentally studied by Wu et al. (2016). In this system, athletes would be obliged to pay a share of their prize money into a special fund. The money accumulated in the fund would be paid back to an athlete after the end of her career if she never tested positive. In my view, the estimated end-game effect is not large enough to make the introduction of a complementary anti-doping policy, such as the superannuation scheme proposed by Maennig (2002), necessary. However, such a policy might still help to overcome the scientific delay that anti-doping has in detecting PEDs directly in athletes' blood or urine.

In 2015, the World Anti-Doping Agency changed the recommended ban length from 2 to 4 years for the first rule violation.¹³ In the case of a second violation the recommended punishment is still a lifetime ban. The idea behind extending the ban to 4 years is to further improve deterrence; however, the problem of the end-game effect may instead be magnified. The longer a doping ban is, the higher the inter-generational doping inequality that can be expected.

To understand the effect that different anti-doping rules might have, much more work has to be done. Experimental economics, more sophisticated game-theoretical models or multi-agent systems could give more insights. More empirical work on doping in sports is also needed, but for this to happen the cooperation of sport organizations and anti-doping agencies is necessary.

5

Appendix

5.0.1

Decomposition of Empirical Results

To better understand the meaning of the estimates based on public data and their possible biases, I decompose the results. First, let us write down a decomposition of the share of positive doping tests for an age group i:

$$
s_i = \frac{A_i \, T_i \, d_i \, p_i}{A_i} = T_i \, d_i \, p_i \tag{7}
$$

Equation 7 expresses the share of positive doping tests in cohort i, $s_i$, in terms of the number of athletes $A_i$, the number of tests per athlete $T_i$, the ratio of doped sportsmen in the given age group $d_i$, and the probability of a positive doping test when doped $p_i$. The known variables are $s_i$, which is predicted by my empirical model, and $T_i$, which is available as a raw estimate from USADA data. Both $d_i$ and $p_i$ are unknown.

13 Source: https://www.wada-ama.org/en/resources/the-code/world-anti-doping-code

Figure 1 shows the distribution of tests per athlete per cohort as found in the USADA database. The horizontal axis contains information on both the number of tests (top) and the cohort (bottom). At first glance the curve is concave and grows until athletes are circa 30 years old. It starts at approximately 1.5 tests per athlete at the beginning of a career and rises to 3.5 to 4 at the time when natural performance should be diminishing (30 to 38 years of age). The surprisingly non-decreasing second part of the graph can be explained by the fact that only those athletes who are good enough (or doped enough) to compete stay in top athletics. Age groups above 39 were not added since they consist of only a few observations.

Figure 1: Distribution of tests per athlete per cohort from USADA database.

A look at the graph shows that testing numbers for the older and middle age groups do not differ significantly, i.e. differences in testing cannot explain why older or younger athletes are estimated to have more positive doping tests. If the probability that a doped athlete tests positive does not depend on age, then in order to make my estimates irrelevant, the curve in figure 1 would have to be U-shaped. I will show this formally using equation 7. Let us introduce another age group m such that $s_i / s_m = \alpha$. We can now decompose the determinants of the relative difference between age groups:

$$
\alpha = \frac{s_i}{s_m} = \frac{T_i}{T_m} \cdot \frac{d_i}{d_m} \cdot \frac{p_i}{p_m} \tag{8}
$$

Now assume that $s_i / s_m > 1$, i.e. $\alpha > 1$, and that $T_i / T_m < 1$. Two possibilities emerge in order to satisfy all the conditions.

1. If $\frac{p_i}{p_m} = 1 \Rightarrow \frac{T_i}{T_m} \cdot \frac{d_i}{d_m} > 1$, so $\frac{d_i}{d_m} > \frac{T_m}{T_i}$.

If age group i has a higher estimated prevalence of positive tests than group m and there is no difference in the probability of being caught while doped, then the ratio of dopers between groups i and m has to be higher than the ratio of tests between m and i ($T_m / T_i$). Simply put, if 28-year-olds are tested 2 times more than 38-year-olds, the prevalence of doping has to be more than 2 times higher in the older age group. Since older athletes are, according to the data, not tested less than younger ones, i.e. $T_i / T_m \approx 1$ and $s_i / s_m > 1$, it has to hold, given the assumption $p_i / p_m = 1$, that $d_i / d_m > 1$. For 38- vs. 28-year-old athletes $s_i / s_m \approx 2$, therefore $d_i / d_m \approx 2$. Obviously, if the assumption holds and the estimates are correct, then $\alpha$ also correctly corresponds to the prevalence of doping. More striking is the estimate for young athletes at the beginning of their career (IAAF data). For 18-year-olds $s_{18} / s_{28} \approx 2$ but $T_{28} / T_{18} \approx 3$; this implies that for equation 7 to hold $d_{18} / d_{28} \approx 6$, i.e. 18-year-olds would have to dope approximately 6 times more than 28-year-olds.

2. If $\frac{d_i}{d_m} = 1 \Rightarrow \frac{T_i}{T_m} \cdot \frac{p_i}{p_m} > 1$, so $\frac{p_i}{p_m} > \frac{T_m}{T_i}$.

The other option is that both age groups have the same prevalence of doping and the perceived higher prevalence of positive doping tests is due to a different probability of testing positive. If 38- and 28-year-old athletes have the same share of dopers, the probability of being caught has to be higher in group i, and the ratio $p_i / p_m$ has to exceed the proportion of tests in group m over group i. As in the previous example, since $T_i / T_m \approx 1$ and the estimated ratio of positive-test shares is $s_i / s_m \approx 2$, it has to hold that $p_i / p_m \approx 2$ in order to make the estimated coefficients incorrect.

The decomposition has shown that when the number of tests per athlete is added to the analysis, assuming that the data are representative, the predicted shape of the age-doping distribution does not change. Nevertheless, the estimated $\alpha$ may be a product of variation in $p_i / p_m$ more than of variation in $d_i / d_m$; any inference about the prevalence of doping among different age groups therefore has to be made with this in mind. It should be noted that differences in $p_i$ would have to be quite large in order to create a non-upward-sloping age distribution of doped athletes. My empirical estimates seem to be quite robust; however, any inference should not go beyond the relationship between age and positive doping tests.
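The arithmetic behind the two cases can be checked with a short sketch. The function name `implied_ratio` is hypothetical; the inputs are the approximate ratios quoted in the text, and the function simply rearranges equation 8 to solve for whichever ratio is left unknown.

```python
def implied_ratio(alpha, tests_ratio, other_ratio=1.0):
    """Solve equation 8 for the remaining ratio.

    alpha       -- s_i / s_m, relative share of positive tests
    tests_ratio -- T_i / T_m, relative number of tests per athlete
    other_ratio -- the ratio assumed known (p_i/p_m in case 1, d_i/d_m in case 2)
    """
    return alpha / (tests_ratio * other_ratio)


# Case 1, 38 vs. 28 years: alpha ~ 2, T_i/T_m ~ 1, p_i/p_m = 1, so d_i/d_m ~ 2.
d_38_28 = implied_ratio(alpha=2.0, tests_ratio=1.0)

# Case 1, 18 vs. 28 years: T_28/T_18 ~ 3 means T_18/T_28 ~ 1/3, so d_18/d_28 ~ 6.
d_18_28 = implied_ratio(alpha=2.0, tests_ratio=1.0 / 3.0)

# Case 2, 38 vs. 28 years: d_i/d_m = 1, so the same arithmetic gives p_i/p_m ~ 2.
p_38_28 = implied_ratio(alpha=2.0, tests_ratio=1.0, other_ratio=1.0)
```

The same observed ratio of positive-test shares is thus compatible with either a doubled prevalence of doping or a doubled detection probability, which is why the decomposition alone cannot separate the two.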

References

[1] Berentsen, A. (2002). The economics of doping. European Journal of Political Economy, 18(1), 109-127.
[2] Bervoets, S., Decreuse, B., & Faure, M. (2014). On doping and recovery (No. 1441). Aix-Marseille School of Economics, Marseille, France.
[3] Breivik, G. (1992). Doping games: A game theoretical exploration of doping. International Review for the Sociology of Sport, 27(3), 235-253.
[4] Coupe, T., & Gergaud, O. (2012). Suspicious blood and performance in professional cycling. Journal of Sports Economics, 1527002512441481.
[5] Dilger, A., & Tolsdorf, F. (2005). Doping als Wettkampfphänomen. In H.-D. Horch, J. Heydel, & A. Sierau (Eds.), Events im Sport: Marketing, Management, Finanzierung. Beiträge des 3. Deutschen Sportökonomie-Kongresses (pp. 269-79). Cologne, Germany: Institute for Sports Economics and Sports Management.
[6] Emrich, E., & Pierdzioch, C. (2013). A note on the international coordination of antidoping policies. Journal of Sports Economics.
[7] Gore, C. J., Parisotto, R., Ashenden, M. J., Stray-Gundersen, J., Sharpe, K., Hopkins, W., ... & Hahn, A. G. (2003). Second-generation blood tests to detect erythropoietin abuse by athletes. Haematologica, 88(3), 333-344.
[8] Haugen, K. K. (2004). The performance-enhancing drug game. Journal of Sports Economics, 5(1), 67-86.
[9] Maennig, W. (2002). On the economics of doping and corruption in international sports. Journal of Sports Economics, 3(1), 61-89.
[10] Wu, Q., Bayer, R., & Lenten, B. L. (2016). A Comparison of Anti-Doping Measures in Sporting Contests, working paper (No. 2016-11).
