Pearson residual and efficiency of parameter estimates in generalized linear model


Journal of Statistical Planning and Inference 141 (2011) 1014–1020

Contents lists available at ScienceDirect

Journal of Statistical Planning and Inference journal homepage: www.elsevier.com/locate/jspi

Jing Xu a,*, Michael LaValley b

a Millennium Pharmaceuticals, Inc., The Takeda Oncology Company, 35 Landsdowne Street, Cambridge, MA 02139, USA
b Department of Biostatistics, Boston University, 801 Massachusetts Avenue, Boston, MA 02118, USA

Article info

Article history: Received 1 June 2009; received in revised form 30 August 2010; accepted 2 September 2010; available online 9 September 2010.

Keywords: Empirical robust sandwich variance estimator; Generalized linear model; Heteroscedasticity consistent covariance estimator; Over-dispersed data; Pearson residual; Quasi-likelihood inference

Abstract

We demonstrate that the efficiency of regression parameter estimates in the generalized linear model can be expressed as a function of Pearson residuals and likelihood based information. The relationship provides an easy way to derive sandwich variance estimators on $\hat\beta$ for a specific distribution within the exponential family. In generalized linear models, the correlation between Pearson residual and Fisher information can be used to predict the error ratio of quasi-likelihood variance versus sandwich variance when the sample size is sufficiently large. The derived theory can help to determine which conventional approach to use in the generalized linear model for certain types of data analysis, such as analyzing heteroscedastic data in linear regression or analyzing over-dispersed data for single-parameter families of distributions. The results from reanalysis of a clinical trial data set are used to illustrate issues explored in the paper. © 2010 Elsevier B.V. All rights reserved.

1. Introduction

In the generalized linear model, maximum likelihood (MLE), quasi-likelihood (QL) and sandwich variance (also known as the robust covariance or empirical covariance) are conventional ways to compute the variance of regression parameter estimates. The efficiency of regression parameter estimates using these conventional approaches has been studied under certain specific distribution assumptions, such as normal, Poisson and binomial distributions (Breslow, 1990; Cox, 1983; Kauermann and Carroll, 2001; Moore, 1986; Nelder and Lee, 1992; Moore and Tsiatis, 1991; Mancl and Leroux, 1996). When there is no over- or under-dispersion and the model is correctly specified, the MLE should be used, as it is asymptotically the most efficient approach (Cox, 1983; Moore, 1986); otherwise, the QL or sandwich variance approaches should be used.

In a simulation study, Breslow (1990) recommended using the sandwich variance based empirical score test in over-dispersed Poisson regression problems where the true mean/variance relationship is unknown and the sample size is sufficiently large. The study also showed that the sandwich variance estimators generally have a larger variance than likelihood and quasi-likelihood variance estimates, which affects the efficiency of regression parameter estimates when the sample size is small. Kauermann and Carroll (2001) confirmed this notion and gave a theoretical justification: highly variable estimates of the sandwich variance lead to confidence intervals for the regression parameters that under-cover.

* Corresponding author. Tel.: +1 617 374 7754; fax: +1 617 551 4980.

E-mail addresses: [email protected] (J. Xu), [email protected] (M. LaValley).
0378-3758/$ - see front matter © 2010 Elsevier B.V. All rights reserved. doi:10.1016/j.jspi.2010.09.003


In some clinical trials and epidemiology studies, it is of interest to compare the person-time rates between an active group and a control group, for which both the number of recurrent events and the follow-up time are random variables and the homogeneous Poisson assumption (Lachin, 2000) does not hold, i.e., the data will be either over- or under-dispersed relative to model assumptions. Therefore, when a Poisson regression model is used, quasi-likelihood or sandwich variance inference should be used.

As pointed out by Kauermann and Carroll (2001), for the generalized linear model, variance estimates have two different sources of stochastic variation. The first source is estimation of the dispersion parameter, if it is unknown; the second source is the use of plug-in estimates, which are used if the variance function depends on the mean. Kauermann and Carroll's study considered cases related to the second source by comparing the relative efficiency of quasi-likelihood variance versus sandwich variance. In this paper, we investigate issues related to the first source from a Pearson residual perspective, which can provide an explanation of simulation results. We will derive a generic error ratio (Moore and Tsiatis, 1991) or variance ratio relationship for $\hat\beta$ with different conventional variances, which covers the exponential family. We demonstrate that the correlation between the Pearson residual and the Fisher information can be used to determine the error ratio of QL versus sandwich variance estimates when the sample size is sufficiently large. The motivation of our research started from using the Poisson regression model on over- or under-dispersed person-time data, with results that cover a much broader scope on efficient inference when using a generalized linear model.

2. Base Fisher information

To proceed, we first define what we mean by 'base Fisher information' and show that different variance estimators in the generalized linear model can be expressed as functions of this base Fisher information. Suppose that $y_i$ is a scalar, $x_i^t$ is a $1 \times p$ vector of covariates, and $\beta$ is a $p \times 1$ vector of regression coefficients. In the generalized linear model, we model a function of the mean response on a set of predictors, and the parameter of interest is $\beta$:

$$g\{E(y_i \mid x)\} = g(\mu_i) = x_i^t \beta,$$

where $g(\cdot)$ is the link function. The model assumes the variance of $Y_i$ is a function of its expectation, which is estimated from the data:

$$\operatorname{var}(y_i) = \phi V(\mu_i), \tag{1}$$

where $\phi$ is the (over) dispersion parameter, and $V(\mu_i)$ is a variance function (McCullagh and Nelder, 1989).

For the likelihood approach, we assume that all parameters, including $\phi$, are fully specified or can be estimated by maximum likelihood. A consistent estimator of $\beta$ is given by the score equation

$$s(\beta) = \frac{1}{\phi} \sum_{i=1}^{n} \left(\frac{\partial \mu_i}{\partial \beta}\right) V(\mu_i)^{-1} (Y_i - \mu_i) = 0, \tag{2}$$

and the variance of the $\hat\beta$ estimate is

$$\operatorname{var}(\hat\beta) = \left\{ \sum_{i=1}^{n} \left(\frac{\partial \mu_i}{\partial \beta}\right) \operatorname{var}(Y_i)^{-1} \left(\frac{\partial \mu_i}{\partial \beta}\right)^{t} \right\}^{-1}. \tag{3}$$

From Eqs. (1) and (3), and defining $D_i = \partial \mu_i / \partial \beta$, the likelihood (hereby denoted by L) variance of $\hat\beta$ is of the form

$$\operatorname{var}_L(\hat\beta) = \phi \cdot \left\{ \sum_{i=1}^{n} \left( D_i' V(\mu_i)^{-1} D_i \right) \right\}^{-1} = \phi \cdot \left( \sum_{i=1}^{n} z_i \right)^{-1}, \tag{4}$$

where $z_i = D_i' V(\mu_i)^{-1} D_i$ represents the individual contribution from subject $i$ to the Fisher information under the generalized linear model when $\phi = 1$. We define $z_i$ to be the base Fisher information from each subject. For Poisson or binomial regression, $\operatorname{var}_L(\hat\beta)$ equals the inverse of the sum of the base Fisher information across subjects.
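As an illustration of this definition, the following minimal sketch computes the base Fisher information for a Poisson model with log link, where $D_i = \mu_i x_i$ and $V(\mu_i) = \mu_i$, so that $z_i = \mu_i x_i x_i^t$. The function name `base_fisher_information`, the toy design matrix and the coefficient values are assumptions made for illustration only; they do not come from the paper.

```python
import numpy as np

def base_fisher_information(X, beta):
    """Per-subject base Fisher information z_i for a Poisson GLM with log link.

    Returns an array of shape (n, p, p) whose i-th slice is
    z_i = D_i V(mu_i)^{-1} D_i^T, with D_i = mu_i * x_i and V(mu_i) = mu_i.
    """
    mu = np.exp(X @ beta)            # means mu_i = exp(x_i' beta), shape (n,)
    D = mu[:, None] * X              # row i holds d mu_i / d beta
    # Outer product of each row of D with itself, scaled by 1 / V(mu_i).
    return np.einsum("i,ij,ik->ijk", 1.0 / mu, D, D)

# Hypothetical toy design: intercept plus one binary covariate.
X = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
beta = np.array([0.5, -0.4])

z = base_fisher_information(X, beta)
var_L = np.linalg.inv(z.sum(axis=0))   # Eq. (4) with phi = 1
```

Summing the slices of `z` gives the familiar $X' W X$ matrix with $W = \operatorname{diag}(\mu_i)$, which is why the likelihood variance in Eq. (4) reduces to the inverse of that sum when $\phi = 1$.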

For a model based quasi-likelihood (hereby denoted by Q) approach, only the mean function is fully specified. When over-dispersion is present, the variance–mean relation no longer holds for a single-parameter distribution and the likelihood cannot be correctly specified for $\phi$. Instead, $\phi$ is typically estimated with a moment estimator (McCullagh and Nelder, 1989):

$$\hat\phi_Q = \frac{1}{n-q} \sum_{i=1}^{n} \frac{(y_i - \hat\mu_i)^2}{v(\hat\mu_i)} = \frac{1}{n-q} \sum_{i=1}^{n} e_i^2, \tag{5}$$

where $n$ represents the sample size, $q$ is the number of parameters in the model, and $e_i^2$ is the square of the Pearson residual for subject $i$. A consistent estimator of $\beta$ is still obtained from Eq. (2), now a semi-parametric weighted least squares quasi-likelihood score equation (McCullagh and Nelder, 1989; Wedderburn, 1974); but the $\operatorname{var}(\hat\beta)$ estimate is obtained by

$$\widehat{\operatorname{var}}_Q(\hat\beta) = \hat\phi_Q \cdot \left( \sum_{i=1}^{n} z_i \right)^{-1}, \tag{6}$$


where $\hat\phi_Q$ is from Eq. (5). In both likelihood and quasi-likelihood approaches, $\widehat{\operatorname{var}}(\hat\beta)$ is only consistent when $\operatorname{var}(Y_i) = V(\mu_i)\phi$ is correct, which requires that $\phi$ be correctly specified or estimated. With over- or under-dispersed data, the $\hat\beta$ estimate is not affected since it is independent of $\hat\phi$ in the (quasi-) score equation (2). However, the $\operatorname{var}(\hat\beta)$ estimate may no longer be valid if $\phi$ cannot be correctly specified or estimated in the likelihood approach. The moment based $\hat\phi_Q$ in the quasi-likelihood approach works well for most, but not all, over- or under-dispersed data, with possible loss of efficiency (Cox, 1983).

The sandwich variance (hereby denoted by S) inference also obtains a consistent $\beta$ estimator from Eq. (2), but it obtains the GEE (Liang and Zeger, 1986) type of $\operatorname{var}_S(\hat\beta)$ as follows: assume the same GLM setting, then under certain regularity conditions,

$$\operatorname{var}_S(\hat\beta) = \lim_{n \to \infty} \frac{1}{n} \Lambda_0^{-1} \Lambda_1 \Lambda_0^{-1}, \tag{7}$$

where

$$\Lambda_0 = \frac{1}{n} \left\{ \sum_{i=1}^{n} D_i' V(y_i)^{-1} D_i \right\} = \frac{1}{n\phi} \sum_{i=1}^{n} z_i \tag{8}$$

and

$$\Lambda_1 = \frac{1}{n} \left\{ \sum_{i=1}^{n} D_i' V(y_i)^{-1} (y_i - \mu_i)^2 V(y_i)^{-1} D_i \right\}. \tag{9}$$
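The three variance estimators in Eqs. (4), (6) and (7)–(9) can be sketched side by side. The example below is a hedged illustration for a Poisson model with log link, where $z_i = \mu_i x_i x_i^t$ and $e_i^2 = (y_i - \mu_i)^2 / \mu_i$; the function name `poisson_variances` and the toy data are hypothetical, and the fitted means are assumed to have been estimated already. The toy counts are chosen so that every squared Pearson residual equals 1, in which case the sandwich and likelihood variances coincide.

```python
import numpy as np

def poisson_variances(X, y, mu):
    """Likelihood, quasi-likelihood and sandwich variances for a Poisson GLM
    with log link, evaluated at fitted means mu (assumed already estimated)."""
    n, q = X.shape
    e2 = (y - mu) ** 2 / mu                    # squared Pearson residuals
    sum_z = X.T @ (mu[:, None] * X)            # sum of z_i = X' diag(mu) X
    sum_e2z = X.T @ ((e2 * mu)[:, None] * X)   # sum of e_i^2 z_i
    bread = np.linalg.inv(sum_z)

    phi_Q = e2.sum() / (n - q)                 # moment estimator, Eq. (5)
    var_L = bread                              # Eq. (4) with phi = 1
    var_Q = phi_Q * bread                      # Eq. (6)
    var_S = bread @ sum_e2z @ bread            # finite-sample form of Eqs. (7)-(9)
    return var_L, var_Q, var_S, phi_Q

# Hypothetical toy data: two groups whose fitted means are the group averages.
X = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
y = np.array([2.0, 6.0, 6.0, 12.0])
mu = np.array([4.0, 4.0, 9.0, 9.0])

var_L, var_Q, var_S, phi_Q = poisson_variances(X, y, mu)
```

The sandwich line uses the fact that the factors of $n$ and $\phi$ in Eqs. (7)–(9) cancel, leaving $(\sum z_i)^{-1} (\sum e_i^2 z_i) (\sum z_i)^{-1}$.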

A condensed matrix version of $\operatorname{var}_S(\hat\beta)$ can be found in Lachin (2000, pp. 501–502). For sandwich variance inference, $\operatorname{var}(y_i) = \phi V(\mu_i)$ is a 'working variance' function, where $\phi$ can either be specified or estimated by the likelihood approach or a model based quasi-likelihood approach. This means that $\operatorname{var}(Y_i \mid X_i)$ may or may not equal $\phi V(\mu_i)$. When it does, the 'working variance' assumption is correct and the sandwich variance approach maintains high efficiency. The empirical sandwich variance estimators are consistent even when the working variance assumption is not correct. The price paid is that the sandwich variance estimates can be highly variable.

3. A generic error ratio relationship in generalized linear model

With expressions and definitions from Section 2, we show that error ratios, or variance ratios, on the regression parameter $(\hat\beta)$ using conventional sandwich variance, quasi-likelihood and maximum likelihood variance approaches can be established in the generalized linear model as functions of $z_i$ and $e_i^2$. First, from expressions (4) and (6), the error ratio (ER) of the likelihood approach versus the quasi-likelihood approach has the form

$$\mathrm{ER}(\hat\beta_L, \hat\beta_Q) = \phi / \hat\phi_Q. \tag{10}$$

Expression (10) is not meaningful when either the maximum likelihood or the quasi-likelihood approach does not maintain correct test size. Error ratios involving sandwich variance are given by the following two theorems (proofs are given in the Appendix):

Theorem 1. Under the generalized linear model setting, assume $\phi$ is estimated by Eq. (5) and the sample size is not small. The ER of the quasi-likelihood variance (denoted by Q) with respect to the sandwich variance (denoted by S) can be expressed as

$$\mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \frac{\left(\frac{1}{n}\sum_{i=1}^{n} e_i^2\right)\left(\sum_{i=1}^{n} z_i\right)}{\sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} e_i^2 \cdot z_j}{\sum_{i=1}^{n}\sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{\left(\frac{1}{n}\sum_{i=1}^{n} e_i^2\right)\left(\frac{1}{n}\sum_{i=1}^{n} z_i\right)}{\frac{1}{n}\sum_{i=1}^{n} e_i^2 \cdot z_i}, \tag{11}$$

and asymptotically, it has the form

$$\text{Asy. } \mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \frac{E(e^2) \cdot E(z)}{E(e^2 \cdot z)}. \tag{12}$$

Theorem 2. Assume the same setting and regularity conditions as in Theorem 1. The ER of the likelihood based variance (denoted by L) with respect to the sandwich variance (denoted by S) can be expressed as

$$\mathrm{ER}(\hat\beta_L, \hat\beta_S) = \frac{\phi \sum_{i=1}^{n} z_i}{\sum_{i=1}^{n} e_i^2 \cdot z_i}, \tag{13}$$

and asymptotically, it has the form

$$\text{Asy. } \mathrm{ER}(\hat\beta_L, \hat\beta_S) = \frac{E(\phi) \cdot E(z)}{E(e^2 \cdot z)}, \tag{14}$$


where $E(e^2) = \lim_{n \to \infty} \left( \frac{1}{n} \sum_{i=1}^{n} e_i^2 \right)$, $E(z) = \lim_{n \to \infty} \left( \frac{1}{n} \sum_{i=1}^{n} z_i \right)$, and $E(e^2 \cdot z) = \lim_{n \to \infty} \left( \frac{1}{n} \sum_{i=1}^{n} e_i^2 \cdot z_i \right)$. In Theorem 2, $\phi$ is either fixed or estimated by maximum likelihood.

Since the variance estimators involved in the theorems are derived under certain regularity conditions, the above expressions only hold when the sample size is sufficiently large. Monte Carlo simulation studies were conducted to test the theorems above. Due to space constraints, detailed simulation results are included in a technical report which is available upon request from the authors. Some simulation results will be highlighted in the following discussions when pertinent.

4. Applications of the generic error ratio relationship in generalized linear model

Theorem 2 provides a way to derive empirical robust variance estimators for a specific exponential family distribution, which is much simpler than using expressions (7)–(9), as will be exemplified in the following subsections for the linear, Poisson and logistic regression models, respectively. In Theorem 1, expressions (11) and (12) take the form of the algebraic and expectation versions of Chebychev's Covariance Inequality in terms of the $e_i^2$'s and $z_i$'s (Tong, 1980; Casella and Berger, 2002; Herman et al., 2000), respectively, which enables us to study the (asymptotic) error ratio of the quasi-likelihood variance with respect to the sandwich variance using the correlation relationship of the $e_i^2$'s and $z_i$'s.

4.1. Applications in linear regression model

When the dependent variables $Y_i$ are normally distributed, the generalized linear model reduces to linear regression with identity link, variance function $v(\mu_i) = 1$, and $\phi = \sigma^2$. From expression (13) and Eq. (4), we have

$$\operatorname{var}_S(\hat\beta) = \mathrm{ER}(\hat\beta_L, \hat\beta_S)^{-1} \operatorname{var}_L(\hat\beta) = \frac{\sum_{i=1}^{n} e_i^2 z_i}{\phi \sum_{i=1}^{n} z_i} \cdot \phi \left( \sum_{i=1}^{n} z_i \right)^{-1} = \left( \sum_{i=1}^{n} z_i \right)^{-1} \left( \sum_{i=1}^{n} e_i^2 z_i \right) \left( \sum_{i=1}^{n} z_i \right)^{-1}.$$

For linear regression, $\operatorname{var}_L(\hat\beta) = \phi \left( \sum_{i=1}^{n} z_i \right)^{-1} = \sigma^2 (X'X)^{-1}$, i.e., $\left( \sum_{i=1}^{n} z_i \right)^{-1} = (X'X)^{-1}$, and $\sum_{i=1}^{n} e_i^2 z_i = X' \operatorname{diag}(e_i^2) X$. Therefore, a sandwich variance estimator for linear regression has the following form:

$$\operatorname{var}_S(\hat\beta) = (X'X)^{-1} \{ X' \operatorname{diag}(e_i^2) X \} (X'X)^{-1}.$$

Since $v(\mu_i) = 1$, and $e_i^2 = (y_i - \hat\mu_i)^2 \{ v(\hat\mu_i) \}^{-1} = (y_i - \hat\mu_i)^2$, we have

$$\widehat{\operatorname{var}}_S(\hat\beta) = (X'X)^{-1} \{ X' \operatorname{diag}(y_i - \hat\mu_i)^2 X \} (X'X)^{-1}, \tag{15}$$

where $\hat\mu_i = X_i' \hat\beta = X_i' (X'X)^{-1} X'Y$. This estimator derived from Theorem 2 has exactly the same form as the heteroscedasticity-consistent covariance estimator first presented by White (1980), who obtained it through more complicated derivations and proofs.

For the simple linear regression model, $Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i$, the least squares approach is the same as the quasi-likelihood approach, where $z = X'X$ and $\hat\sigma^2 = \frac{1}{n-2} \sum_{i=1}^{n} e_i^2 = \frac{1}{n-2} \sum_{i=1}^{n} (y_i - \hat\mu_i)^2$. Both the theory and simulation results show that: (1) if all model assumptions are met, the $e_i^2$'s are random error and are not correlated with the $z_i$'s, then $\mathrm{ER}(\hat\beta_Q, \hat\beta_S) \to 1$, and $\mathrm{ER}(\hat\beta_L, \hat\beta_Q)$ also $\to 1$ when the sample size is sufficiently large; and (2) when the $e_i^2$'s are positively correlated with the $z_i$'s, such as under a heteroscedastic variance case, by Chebychev's Covariance Inequality (Casella and Berger, 2002; Herman et al., 2000), $\mathrm{ER}(\hat\beta_Q, \hat\beta_S) \le 1$ when the sample size is sufficiently large. This loss of efficiency is a price paid by the sandwich variance approach to keep the right test size under heteroscedasticity. When the sample size is small, Theorems 1 and 2 do not apply, but Kauermann and Carroll's (2001) research provided small sample properties, with conclusions as summarized in Section 1.

4.2. Applications in Poisson and logistic regression models

When the dependent variables $Y_i$ follow a Poisson distribution, the regression model has the log link, variance function $v(\mu_i) = \mu_i$, and $\phi = 1$. By the same arguments used for deriving expression (15), we have

$$\operatorname{var}_S(\hat\beta) = \left( \sum_{i=1}^{n} z_i \right)^{-1} \left( \sum_{i=1}^{n} e_i^2 z_i \right) \left( \sum_{i=1}^{n} z_i \right)^{-1}.$$

For Poisson regression, $\operatorname{var}_L(\hat\beta) = \left( \sum_{i=1}^{n} z_i \right)^{-1} = (X'WX)^{-1}$, i.e., $\sum_{i=1}^{n} z_i = X'WX$, where $W$ is a diagonal matrix of iterative weights from iteratively re-weighted least squares, which is equivalent to Fisher scoring and leads to the maximum likelihood estimate (McCullagh and Nelder, 1989), i.e., $W = \{ \operatorname{diag}(w_{ii}) \}_{n \times n} = \{ \operatorname{diag}(\hat\mu_i) \}_{n \times n}$, where $\hat\mu_i$ denotes the fitted value based on the current parameter estimates. From expression (5), $e_i^2 = (y_i - \hat\mu_i)^2 \{ v(\hat\mu_i) \}^{-1} = (y_i - \hat\mu_i)^2 (\hat\mu_i)^{-1}$. Thus, we have

$$\sum_{i=1}^{n} e_i^2 z_i = X' \operatorname{diag}(e_i^2) W X = X' \operatorname{diag}\left\{ \frac{(y_i - \hat\mu_i)^2}{\hat\mu_i} \right\} \operatorname{diag}(\hat\mu_i) X = X' \operatorname{diag}(y_i - \hat\mu_i)^2 X,$$


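therefore, the sandwich variance estimator for Poisson regression has the following form:

$$\widehat{\operatorname{var}}_S(\hat\beta) = \{ X' \operatorname{diag}(\hat\mu_i) X \}^{-1} \{ X' \operatorname{diag}(y_i - \hat\mu_i)^2 X \} \{ X' \operatorname{diag}(\hat\mu_i) X \}^{-1}. \tag{16}$$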
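Expression (16) can be sketched in a few lines. The example below fits a log-link Poisson regression by iteratively re-weighted least squares with weights $w_{ii} = \hat\mu_i$ and then assembles the sandwich variance. The function names `fit_poisson_irls` and `poisson_sandwich`, the starting values, and the toy counts are assumptions made for illustration; this is a minimal sketch rather than the authors' implementation.

```python
import numpy as np

def fit_poisson_irls(X, y, n_iter=100, tol=1e-10):
    """Fit a log-link Poisson regression by iteratively re-weighted least squares."""
    mu = y + 0.5                          # simple positive starting values (assumption)
    eta = np.log(mu)
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        working = eta + (y - mu) / mu     # working response
        XtW = X.T * mu                    # X' diag(mu): weights w_ii = mu_i
        beta_new = np.linalg.solve(XtW @ X, XtW @ working)
        eta = X @ beta_new
        mu = np.exp(eta)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta, mu

def poisson_sandwich(X, y, mu):
    """Eq. (16): {X' diag(mu) X}^{-1} {X' diag((y - mu)^2) X} {X' diag(mu) X}^{-1}."""
    bread = np.linalg.inv((X.T * mu) @ X)
    meat = (X.T * (y - mu) ** 2) @ X
    return bread @ meat @ bread

# Hypothetical counts with an intercept-only design.
y = np.array([0.0, 1.0, 1.0, 2.0, 6.0])
X = np.ones((y.size, 1))

beta_hat, mu_hat = fit_poisson_irls(X, y)
var_S = poisson_sandwich(X, y, mu_hat)
```

For an intercept-only design the fitted mean is the sample mean $\bar{y}$, so expression (16) collapses to $\sum (y_i - \bar{y})^2 / (n \bar{y})^2$, which gives a quick hand check of the code.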

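When the dependent variables $Y_i$ follow a binomial distribution, the regression model has the logit link, variance function $v(\mu_i) = \mu_i (n_i - \mu_i)/n_i = n_i \pi_i (1 - \pi_i)$, and $\phi = 1$. Following the same procedures as for Poisson regression, and since for logistic regression $W = \{ \operatorname{diag}(w_{ii}) \}_{n \times n} = \{ \operatorname{diag}(\hat\mu_i (n_i - \hat\mu_i)/n_i) \}_{n \times n} = \{ \operatorname{diag}(n_i \hat\pi_i (1 - \hat\pi_i)) \}_{n \times n}$, the sandwich variance estimator for logistic regression has the following form:

$$\widehat{\operatorname{var}}_S(\hat\beta) = [ X' \operatorname{diag}\{ n_i \hat\pi_i (1 - \hat\pi_i) \} X ]^{-1} \{ X' \operatorname{diag}(y_i - n_i \hat\pi_i)^2 X \} [ X' \operatorname{diag}\{ n_i \hat\pi_i (1 - \hat\pi_i) \} X ]^{-1}, \tag{17}$$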

where $\hat\pi_i = \hat\mu_i / n_i$ denotes the fitted value based on the last iteration from iteratively re-weighted least squares in calculating $\operatorname{var}_L(\hat\beta)$. Both Eqs. (16) and (17) derived from Theorem 2 have exactly the same forms as presented in Simonoff (2003, sections 5.3 and 9.5), where no derivations are given.

As pointed out in Section 1, Theorem 1 provides some insight on the efficiency of parameter estimates. For Poisson regression, $z = X'WX$, where $w_{ii} = \hat\mu_i$, while $e_i^2 = (y_i / \hat\mu_i^{1/2} - \hat\mu_i^{1/2})^2 = (y_i - \hat\mu_i)^2 / \hat\mu_i$. For a fitted model, the $e_i^2$'s consist of both random error and the error associated with over-dispersion, which can no longer be fully covered by $\left( \sum_{i=1}^{n} z_i \right)^{-1}$; thus they are correlated. In the presence of over- or under-dispersion, quasi-likelihood often leads to tests that are appropriately more conservative (Armitage and Colton, 1998). As verified by simulation results, the $z_i$'s and $e_i^2$'s tend to be negatively correlated in such cases. Thus, from ER expression (11), and by Chebychev's Inequality (Casella and Berger, 2002; Herman et al., 2000), if the $z_i$'s and $e_i^2$'s are negatively correlated, $\mathrm{ER}(\hat\beta_Q, \hat\beta_S) \ge 1$, i.e., $\operatorname{var}_Q(\hat\beta) \ge \operatorname{var}_S(\hat\beta)$. This leads to better test levels and better efficiency for $\hat\beta$ with sandwich variance compared to using the QL approach in analyzing over-dispersed data when both possess the right test sizes. In contrast, when the QL approach underestimates $\phi$ for over- or under-dispersed data, which occurs sometimes, the $z_i$'s and $e_i^2$'s are positively correlated, and $\mathrm{ER}(\hat\beta_Q, \hat\beta_S) \le 1$. Simulation results show that for either positively or negatively correlated cases, Theorem 1 holds both under moderate sample sizes and asymptotically.

5. An example

To apply the theory, we re-analyze the recurrent serious infection data from a double-blind clinical trial, which was conducted to test the effectiveness of gamma interferon on chronic granulomatous disease (CGD) (Fleming and Harrington, 1991). This set of data was originally analyzed using the extended Cox regression model for hazard ratio inference on time to multiple serious infections, and it can be analyzed by Poisson regression for relative risk (RR) inference on the annualized serious infection rates. Table 1 presents a descriptive summary of the data and the analysis results. To test the hypothesis that $\gamma$ interferon may lower the serious infection risk, Poisson regression is used. Since none of the covariates are significant factors, the model contains an intercept and treatment effect only. It can be seen that the $\gamma$ interferon group had a significantly lower relative risk of 0.68 compared to the placebo group in terms of annualized serious infection rates, and the data are under-dispersed relative to the Poisson distribution assumption as $\hat\phi_Q = 0.7422$.

The estimated correlation between the $e_i^2$'s and $z_i$'s is 0.22, which implies that $\phi$ is underestimated and $\mathrm{ER}(\hat\beta_Q, \hat\beta_S)$ should be $< 1$, as discussed in Section 4.2. Indeed, from Table 1, $\widehat{se}_S(\hat\beta) = 0.1184$ and p-value = 0.0009 from the sandwich variance approach, which are bigger than those from the quasi-likelihood approach, as predicted by Theorem 1.

Table 1
CGD trial serious infection data and Poisson regression results.

Variable                     γ interferon   Placebo
N per group                  63             65
Median F-up (yr.)            0.786          0.813
Total F-up (yr.)             51.89          50.72
Serious infection            83             120
Annualized infection rate    1.60           2.37

CGD data: Poisson regression on treatment effect of γ interferon vs. placebo; correlation between ($e_i^2$'s, $z_i$'s) = 0.22, $\hat\phi_Q = 0.7422$.

Method             β̂_trt     se(β̂)    RR (95% CI)          P-value   χ² score
MLE (LR test)      -0.3915   0.1415   0.68 (0.51, 0.89)    0.0057    7.66
QL (LR test)       -0.3915   0.1050   0.68 (0.55, 0.83)    0.0002    13.9
SW (gen. score)    -0.3915   0.1184   0.68 (0.53, 0.85)    0.0009    10.94

6. Discussion

As stated in Section 1, in this research we studied the impact of estimation of the dispersion parameter on the efficiency of parameter estimates for the generalized linear model. The results show that a general relationship exists via the foundational structure components of the model: the Fisher information $z$, the dispersion parameter $\phi$, and the Pearson residual $e^2$. $z$ contains information about variability inherent to the variance and link functions, including the variability related to the covariate structure in the model. The likelihood based $\hat\phi$ or the moment based $e^2$'s are used to estimate the random dispersion. If all model assumptions are met, the $e_i^2$'s are uncorrelated with the $z_i$'s, and the moment based $\hat\phi_Q$ equals the likelihood based $\hat\phi_L$; thus, different approaches yield similar efficiency. Otherwise, the $e_i^2$'s consist of both random dispersion and the dispersion associated with the departure from certain model assumptions, such as the homoscedasticity assumption for linear regression, and the no over-dispersion assumption for Poisson or logistic regression; thus, the $e_i^2$'s and $z_i$'s will be correlated, resulting in efficiency differences in regression parameter estimates via different approaches.

One of the applications of Theorem 2 is to find an easy way of deriving sandwich variance estimators on $\hat\beta$ for a specific distribution within the exponential family; and an application of Theorem 1 is to use the correlation between Pearson residual and Fisher information to predict the error ratio of quasi-likelihood variance versus sandwich variance when the sample size is sufficiently large in the generalized linear model. Simulation results show that when Theorem 1 is in effect, the ER conclusions demonstrated by Breslow (1990) for Poisson regression as summarized in Section 1 also apply to linear or logistic regression models. Simulations also demonstrated that expression (11) only applies to using Pearson residuals, but not to using deviance residuals. This is due to the fact that deviance residuals equal the Pearson residuals plus some higher order terms, and the higher order terms may alter the relationship between the $e_i^2$'s and $z_i$'s.
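Returning to the example of Section 5, the empirical error ratio of the quasi-likelihood variance to the sandwich variance can be read off from the standard errors reported in Table 1. The short sketch below only squares the ratio of the two reported standard errors; the variable names are hypothetical, and no quantity beyond the table values is used.

```python
# Standard errors of the treatment coefficient as reported in Table 1.
se = {"MLE": 0.1415, "QL": 0.1050, "SW": 0.1184}

# Empirical error ratio: quasi-likelihood variance over sandwich variance.
er_qs = (se["QL"] / se["SW"]) ** 2
```

The resulting ratio is below 1, in line with the direction stated in Section 5 for positively correlated $e_i^2$'s and $z_i$'s.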

Acknowledgements

The authors thank the anonymous reviewers whose comments and suggestions have helped improve the quality of this paper.

Appendix A. Proof of Theorem 1

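Proof. By definition,

$$\mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \operatorname{var}_Q(\hat\beta) / \operatorname{var}_S(\hat\beta).$$

From Eqs. (3) and (7)–(9),

$$\mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \frac{\left( \sum_{i=1}^{n} D_i' \hat V_i^{-1} D_i \right)^{-1}}{\left( \sum_{i=1}^{n} D_i' \hat V_i^{-1} D_i \right)^{-1} \left\{ \sum_{i=1}^{n} D_i' \hat V_i^{-1} (y_i - \hat\mu_i)^2 \hat V_i^{-1} D_i \right\} \left( \sum_{i=1}^{n} D_i' \hat V_i^{-1} D_i \right)^{-1}}.$$

After canceling out common factors in the numerator and denominator,

$$\mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \frac{\sum_{i=1}^{n} D_i' \hat V_i^{-1} D_i}{\sum_{i=1}^{n} D_i' \hat V_i^{-1} (y_i - \hat\mu_i)^2 \hat V_i^{-1} D_i},$$

where $\hat V_i = \hat V(y_i \mid x_i) = \hat\phi V(\hat\mu_i)$, and $\hat\phi$ is regarded as a scalar. Then

$$\mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \frac{\hat\phi \cdot \left( \sum_{i=1}^{n} D_i' V(\hat\mu_i)^{-1} D_i \right)}{\sum_{i=1}^{n} D_i' V(\hat\mu_i)^{-1} (y_i - \hat\mu_i)^2 V(\hat\mu_i)^{-1} D_i}.$$

From Eq. (5), $\hat\phi \approx \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat\mu_i)^2 \{ v(\hat\mu_i) \}^{-1}$, thus

$$\mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \frac{\left( \sum_{i=1}^{n} (y_i - \hat\mu_i)^2 V(\hat\mu_i)^{-1} \right) \left( \sum_{i=1}^{n} D_i' V(\hat\mu_i)^{-1} D_i \right)}{n \left( \sum_{i=1}^{n} D_i' V(\hat\mu_i)^{-1} (y_i - \hat\mu_i)^2 V(\hat\mu_i)^{-1} D_i \right)} = \frac{\left( \sum_{i=1}^{n} (y_i - \hat\mu_i)^2 V(\hat\mu_i)^{-1} \right) \left( \sum_{i=1}^{n} D_i' V(\hat\mu_i)^{-1} D_i \right)}{n \left( \sum_{i=1}^{n} (y_i - \hat\mu_i)^2 V(\hat\mu_i)^{-1} D_i' V(\hat\mu_i)^{-1} D_i \right)}.$$

The multiplication by $n$ in the denominator is equivalent to a double summation over the same index; the multiplication of two summations in the numerator is equivalent to a double summation over different indexes. With $e_i^2$ as defined in Eq. (5) and $z_i$ as defined in Section 2,

$$\mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \frac{\left( \sum_{i=1}^{n} e_i^2 \right) \left( \sum_{i=1}^{n} z_i \right)}{n \cdot \left( \sum_{i=1}^{n} e_i^2 \cdot z_i \right)} = \frac{\sum_{i=1}^{n} \sum_{j=1}^{n} e_i^2 \cdot z_j}{\sum_{i=1}^{n} \sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{\left( \frac{1}{n} \sum_{i=1}^{n} e_i^2 \right) \left( \sum_{i=1}^{n} z_i \right)}{\sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{\left( \frac{1}{n} \sum_{i=1}^{n} e_i^2 \right) \left( \frac{1}{n} \sum_{i=1}^{n} z_i \right)}{\frac{1}{n} \sum_{i=1}^{n} e_i^2 \cdot z_i}.$$

By the Strong Law of Large Numbers,

$$\text{Asy. } \mathrm{ER}(\hat\beta_Q, \hat\beta_S) = \lim_{n \to \infty} \frac{\left( \frac{1}{n} \sum_{i=1}^{n} e_i^2 \right) \left( \frac{1}{n} \sum_{i=1}^{n} z_i \right)}{\frac{1}{n} \sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{\lim_{n \to \infty} \left( \frac{1}{n} \sum_{i=1}^{n} e_i^2 \right) \cdot \lim_{n \to \infty} \left( \frac{1}{n} \sum_{i=1}^{n} z_i \right)}{\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{E(e^2) \cdot E(z)}{E(e^2 \cdot z)}. \qquad \square$$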
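The final form of expression (11) is easy to explore numerically when each $z_i$ is a scalar (a model with a single regression parameter). The sketch below, in plain Python, evaluates the ratio of means and shows how its position relative to 1 follows the sign of the association between the $e_i^2$'s and $z_i$'s. The function name `error_ratio_qs` and the toy values are hypothetical, and $n - q$ is approximated by $n$ as in the proof.

```python
def error_ratio_qs(e2, z):
    """Expression (11) for scalar z_i: mean(e2) * mean(z) / mean(e2 * z)."""
    n = len(e2)
    mean_e2 = sum(e2) / n
    mean_z = sum(z) / n
    mean_e2z = sum(a * b for a, b in zip(e2, z)) / n
    return mean_e2 * mean_z / mean_e2z

# Hypothetical scalar base Fisher information values.
z = [1.0, 2.0, 3.0, 4.0]

# Squared Pearson residuals that rise with z (positive association) ...
er_pos = error_ratio_qs([0.5, 1.0, 1.5, 2.0], z)
# ... that fall with z (negative association) ...
er_neg = error_ratio_qs([2.0, 1.5, 1.0, 0.5], z)
# ... and that are constant, so there is no association at all.
er_flat = error_ratio_qs([1.0, 1.0, 1.0, 1.0], z)
```

With positive association the ratio falls below 1, with negative association it exceeds 1, and with constant residuals it equals 1, matching the use of Chebychev's Covariance Inequality in Section 4.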


Appendix B. Proof of Theorem 2

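Proof. By definition,

$$\mathrm{ER}(\hat\beta_L, \hat\beta_S) = \operatorname{var}_L(\hat\beta) / \operatorname{var}_S(\hat\beta).$$

Here $\phi$ in Eq. (1) is assumed to be fixed or estimated by maximum likelihood. Referring to Eqs. (3) and (7)–(9),

$$\mathrm{ER}(\hat\beta_L, \hat\beta_S) = \frac{\left( \sum_{i=1}^{n} D_i' \hat V_i^{-1} D_i \right)^{-1}}{\left( \sum_{i=1}^{n} D_i' \hat V_i^{-1} D_i \right)^{-1} \left\{ \sum_{i=1}^{n} D_i' \hat V_i^{-1} (y_i - \hat\mu_i)^2 \hat V_i^{-1} D_i \right\} \left( \sum_{i=1}^{n} D_i' \hat V_i^{-1} D_i \right)^{-1}}.$$

After canceling out common factors,

$$\mathrm{ER}(\hat\beta_L, \hat\beta_S) = \frac{\sum_{i=1}^{n} D_i' \hat V_i^{-1} D_i}{\sum_{i=1}^{n} D_i' \hat V_i^{-1} (y_i - \hat\mu_i)^2 \hat V_i^{-1} D_i}.$$

With $\hat V_i$ as defined in Eq. (1),

$$\mathrm{ER}(\hat\beta_L, \hat\beta_S) = \frac{\phi \cdot \left( \sum_{i=1}^{n} D_i' V(\hat\mu_i)^{-1} D_i \right)}{\sum_{i=1}^{n} V(\hat\mu_i)^{-1} (y_i - \hat\mu_i)^2 D_i' V(\hat\mu_i)^{-1} D_i},$$

and, following $e_i^2$ as defined in Eq. (5) and $z_i$ as defined in Section 2,

$$\mathrm{ER}(\hat\beta_L, \hat\beta_S) = \frac{\phi \sum_{i=1}^{n} z_i}{\sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{\sum_{i=1}^{n} \phi \cdot z_i}{\sum_{i=1}^{n} e_i^2 \cdot z_i}.$$

By the Strong Law of Large Numbers,

$$\text{Asy. } \mathrm{ER}(\hat\beta_L, \hat\beta_S) = \lim_{n \to \infty} \frac{\phi \sum_{i=1}^{n} z_i}{\sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{\phi \lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} z_i}{\lim_{n \to \infty} \frac{1}{n} \sum_{i=1}^{n} e_i^2 \cdot z_i} = \frac{\phi \cdot E(z)}{E(e^2 \cdot z)}. \qquad \square$$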
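As a numerical companion to Theorem 2, the sketch below assembles the sandwich variance for linear regression from the per-subject terms $z_i = x_i x_i^t$ and $e_i^2 = (y_i - \hat\mu_i)^2$, which is the route taken in Section 4.1 to reach expression (15). The function name `sandwich_linear` and the toy data are assumptions made for illustration only.

```python
import numpy as np

def sandwich_linear(X, y):
    """Sandwich variance for linear regression built from per-subject terms
    z_i = x_i x_i' and e_i^2 = (y_i - mu_i)^2, following Theorem 2."""
    beta = np.linalg.solve(X.T @ X, X.T @ y)   # least-squares estimate
    e2 = (y - X @ beta) ** 2                   # squared residuals, since v(mu) = 1
    p = X.shape[1]
    sum_z = np.zeros((p, p))
    sum_e2z = np.zeros((p, p))
    for x_i, e2_i in zip(X, e2):
        z_i = np.outer(x_i, x_i)               # base Fisher information of subject i
        sum_z += z_i
        sum_e2z += e2_i * z_i
    bread = np.linalg.inv(sum_z)
    return bread @ sum_e2z @ bread             # (sum z)^-1 (sum e^2 z) (sum z)^-1

# Hypothetical toy data: intercept plus one covariate.
X = np.column_stack([np.ones(5), np.arange(5.0)])
y = np.array([1.0, 2.1, 2.9, 4.5, 4.4])

var_S = sandwich_linear(X, y)
```

The explicit loop is deliberately unvectorized so that each $z_i$ and $e_i^2 z_i$ term is visible; the result agrees with the matrix form $(X'X)^{-1} \{ X' \operatorname{diag}(y_i - \hat\mu_i)^2 X \} (X'X)^{-1}$ of expression (15).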

References

Armitage, P., Colton, T., 1998. Encyclopedia of Biostatistics, vol. 2. John Wiley and Sons, New York, NY.
Breslow, N., 1990. Tests of hypotheses in overdispersed Poisson regression and other quasi-likelihood models. Journal of the American Statistical Association 85, 565–571.
Casella, G., Berger, R.L., 2002. Statistical Inference, second ed. Duxbury Advanced Series.
Cox, D.R., 1983. Some remarks on overdispersion. Biometrika 70, 274–296.
Fleming, T., Harrington, D., 1991. Counting Processes and Survival Analysis. John Wiley & Sons Publishers.
Herman, J., Kucera, R., Simsa, J., 2000. Equations and Inequalities. Springer-Verlag, translated by K. Dilcher.
Kauermann, G., Carroll, R.J., 2001. A note on the efficiency of sandwich covariance matrix estimation. Journal of the American Statistical Association 96, 1387–1398.
Lachin, J.M., 2000. Biostatistical Methods, The Assessment of Relative Risks. John Wiley & Sons, Inc., New York.
Liang, K.Y., Zeger, S.L., 1986. Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22.
Mancl, L.A., Leroux, B.G., 1996. Efficiency of regression estimates for clustered data. Biometrics 52, 500–511.
McCullagh, P., Nelder, J.A., 1989. Generalized Linear Models, second ed. University Press, Cambridge.
Moore, D.F., 1986. Asymptotic properties of moment estimators for overdispersed counts and proportions. Biometrika 73, 583–588.
Moore, D.F., Tsiatis, A., 1991. Robust estimation of the variance in moment methods for extra-binomial and extra-Poisson variation. Biometrics 47, 383–401.
Nelder, J.A., Lee, Y., 1992. Likelihood, quasi-likelihood and pseudolikelihood: some comparisons. Journal of Royal Statistical Society 54, 273–284.
Simonoff, J., 2003. Analyzing Categorical Data. Springer-Verlag, New York, Inc.
Tong, Y.L., 1980. Probability Inequalities in Multivariate Distributions. Academic Press, New York.
Wedderburn, R.W.M., 1974. Quasilikelihood functions, generalized linear models and the Gauss–Newton method. Biometrika 61, 439–447.
White, H., 1980. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica 48, 817–838.
