Abstract
Quantile regression has become a powerful complement to the usual mean regression. A simple approach to using quantile regression in marginal analysis of longitudinal data is to assume working independence. However, this may incur a loss of efficiency. On the other hand, correctly specifying a working correlation in quantile regression can be difficult. We propose a new quantile regression model by combining multiple sets of unbiased estimating equations. This approach can account for correlations between the repeated measurements and produce more efficient estimates. Because the objective function is discrete and non-convex, we propose induced smoothing for fast and accurate computation of the parameter estimates, as well as their asymptotic covariance, using Newton-Raphson iteration. We further develop a robust quantile rank score test for hypothesis testing. We show that the resulting estimate is asymptotically normal and more efficient than the simple estimate using working independence. Extensive simulations and a real data analysis show the usefulness of the method.
References
Brown, B.M., Wang, Y.G.: Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92, 149–158 (2005)
Brown, B.M., Wang, Y.G.: Induced smoothing for rank regression with censored survival times. Stat. Med. 26, 828–836 (2007)
Chen, K., Ying, Z., Zhang, H., Zhao, L.: Analysis of least absolute deviation. Biometrika 95, 107–122 (2008)
Crowder, M.: On the use of a working correlation matrix in using generalized linear models for repeated measures. Biometrika 82, 407–410 (1995)
Diggle, P.J., Heagerty, P.J., Liang, K.Y., Zeger, S.L.: Analysis of Longitudinal Data. Oxford University Press, Oxford (2002)
Fu, L., Wang, Y.G.: Quantile regression for longitudinal data with a working correlation model. Comput. Stat. Data Anal. 56, 2526–2538 (2012)
Hall, P., Sheather, S.J.: On the distribution of a studentized quantile. J. R. Stat. Soc. B 50, 381–391 (1988)
Hansen, L.P.: Large sample properties of generalized method of moments estimators. Econometrica 50, 1029–1054 (1982)
He, X., Fu, B., Fung, W.K.: Median regression of longitudinal data. Stat. Med. 22, 3655–3669 (2003)
Hendricks, W., Koenker, R.: Hierarchical spline models for conditional quantiles and the demand for electricity. J. Am. Stat. Assoc. 87, 58–68 (1992)
Hunter, D.R., Lange, K.: Quantile regression via an MM algorithm. J. Comput. Graph. Stat. 9, 60–77 (2000)
Johnson, L.M., Strawderman, R.L.: Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96, 577–590 (2009)
Jung, S.: Quasi-likelihood for median regression models. J. Am. Stat. Assoc. 91, 251–257 (1996)
Koenker, R.: Quantile regression for longitudinal data. J. Multivar. Anal. 91, 74–89 (2004)
Koenker, R.: Quantile Regression. Cambridge University Press, Cambridge (2005)
Koenker, R., Bassett, G.: Regression quantiles. Econometrica 46, 33–50 (1978)
Kocherginsky, M., He, X., Mu, Y.: Practical confidence intervals for regression quantiles. J. Comput. Graph. Stat. 14, 41–55 (2005)
Liang, K.Y., Zeger, S.L.: Longitudinal data analysis using generalized linear models. Biometrika 73, 13–22 (1986)
Li, H., Yin, G.: Generalized method of moments estimation for linear regression with clustered failure time data. Biometrika 96, 293–306 (2009)
Mu, Y., Wei, Y.: A dynamic quantile regression transformation model for longitudinal data. Stat. Sin. 19, 1137–1153 (2009)
Nelder, J.A., Mead, R.: A simplex method for function minimization. Comput. J. 7, 308–313 (1965)
Qu, A., Li, R.: Nonparametric modeling and inference function for longitudinal data. Biometrics 62, 379–391 (2006)
Qu, A., Lindsay, B., Li, B.: Improving generalised estimating equations using quadratic inference functions. Biometrika 87, 823–836 (2000)
Tang, C.Y., Leng, C.: Empirical likelihood and quantile regression in longitudinal data analysis. Biometrika 98, 1001–1006 (2011)
Wang, H.: Inference on quantile regression for heteroscedastic mixed models. Stat. Sin. 19, 1247–1261 (2009)
Wang, Y.G., Carey, V.: Working correlation structure misspecification, estimation and covariate design: implications for generalised estimating equations performance. Biometrika 90, 29–41 (2003)
Wang, H., Fygenson, M.: Inference for censored quantile regression models in longitudinal studies. Ann. Stat. 37, 756–781 (2009)
Wang, H., He, X.: Detecting differential expressions in GeneChip microarray studies: a quantile approach. J. Am. Stat. Assoc. 102, 104–112 (2007)
Wang, H., Zhu, Z., Zhou, J.: Quantile regression in partially linear varying coefficient models. Ann. Stat. 37, 3841–3866 (2009)
Wei, Y., He, X.: Conditional growth charts (with discussion). Ann. Stat. 34, 2069–2131 (2006)
Xue, L., Qu, A., Zhou, J.: Consistent model selection for marginal generalized additive model for correlated data. J. Am. Stat. Assoc. 105, 1518–1530 (2010)
Acknowledgements
We thank the associate editor and two referees whose comments have led to a much improved paper.
Additional information
C. Leng’s research is supported in part by NUS academic research grants. W. Zhang’s research is supported by the NSF of China (Nos. 11271347, 11171321).
Appendix
To prove the theorems, we first give a set of regularity conditions. For any matrix \(A\), \(\|A\|\) denotes the largest singular value of \(A\). We mainly follow Johnson and Strawderman (2009) for the proof and make the following standard assumptions (Koenker 2005).
Assumption A.1
The dimension \(p\) of the covariates \(x_{ij}\) is fixed; \(m\rightarrow\infty\) and \(\max\{n_{i}\}\) is bounded. The distribution functions \(F_{ij}(z)=P(y_{ij}-x_{ij}^{T}\beta_{\tau}\le z|x_{ij})\) are absolutely continuous, with continuous densities \(f_{ij}\) that, together with their first derivatives, are uniformly bounded away from 0 and ∞ at the point 0, for \(i=1,\ldots,m\); \(j=1,\ldots,n_{i}\).
Assumption A.2
The true \(\beta_{\tau}\) is in the interior of a compact set \(\Theta\subset\mathbb{R}^{p}\).
Assumption A.3
There exist finite matrices \(A_{lk}\) (\(l,k=1,\ldots,a\)) and nonsingular matrices \(G_{l}(\beta_{\tau})\), \(l=1,\ldots,a\), such that
(1) \(\lim_{m\rightarrow\infty} \frac{1}{m}\sum_{i=1}^{m}x_{i}^{T}\varGamma_{i}M_{li}M^{T}_{ki}\varGamma_{i}x_{i}=A_{lk},\ l,k= 1, \ldots,a\).
(2) \(\lim_{m\rightarrow\infty} \frac{1}{m}\sum_{i=1}^{m}x_{i}^{T}\varGamma_{i}M_{li}\varGamma_{i}x_{i}=G_{l}(\beta_{\tau}),\ l=1,\ldots,a\).
(3) \(\lim_{m\rightarrow\infty}\frac{1}{\sqrt{m}} \max\|x_{ij}\|=0\).
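The limits in Assumption A.3 are averages over subjects, so their finite-\(m\) counterparts can be computed directly from a design. The following is a minimal sketch rather than the authors' code. The function name, the simulated covariates, the choice of \(\varGamma_{i}\) as a diagonal matrix holding an assumed density value of 0.4, and the two basis matrices (an identity and a matrix with ones on the first off-diagonals) are all illustrative assumptions, since this appendix does not specify them.

```python
import numpy as np

def a3_sample_moments(x_list, gamma_list, basis_list):
    """Finite-m versions of the three limits in Assumption A.3.

    x_list[i]     : (n_i, p) covariate matrix x_i
    gamma_list[i] : (n_i, n_i) matrix Gamma_i (assumed diagonal in this sketch)
    basis_list[i] : list of a basis matrices M_{li}, each (n_i, n_i)
    """
    m = len(x_list)
    a = len(basis_list[0])
    p = x_list[0].shape[1]
    A = np.zeros((a, a, p, p))   # A[l, k] approximates A_{lk}
    G = np.zeros((a, p, p))      # G[l] approximates G_l(beta_tau)
    max_norm = 0.0
    for x, gam, Ms in zip(x_list, gamma_list, basis_list):
        xg = x.T @ gam           # x_i^T Gamma_i
        gx = gam @ x             # Gamma_i x_i
        for l in range(a):
            G[l] += xg @ Ms[l] @ gx / m
            for k in range(a):
                A[l, k] += xg @ Ms[l] @ Ms[k].T @ gx / m
        max_norm = max(max_norm, float(np.linalg.norm(x, axis=1).max()))
    # Third return value is max ||x_ij|| / sqrt(m), the quantity in condition (3).
    return A, G, max_norm / np.sqrt(m)

# Hypothetical design: intercept plus a covariate with a subject-level component.
rng = np.random.default_rng(0)
m, n = 200, 4
M1 = np.eye(n)
M2 = np.eye(n, k=1) + np.eye(n, k=-1)
x_list = [np.column_stack([np.ones(n), rng.normal() + rng.normal(size=n)])
          for _ in range(m)]
gamma_list = [0.4 * np.eye(n) for _ in range(m)]   # assumed density value at 0
basis_list = [[M1, M2] for _ in range(m)]

A, G, ratio = a3_sample_moments(x_list, gamma_list, basis_list)
print(G[0], G[1], ratio, sep="\n")
```

In this design the covariate shares a subject-level component across occasions. With a covariate that is independent across occasions, the average defining \(G_{2}\) under the off-diagonal basis would be close to singular, so the nonsingularity requirement in Assumption A.3 is a real restriction on the design and basis.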
Proof of Theorem 1
Without loss of generality, we consider the lth component of S(β) and let \(\beta=\beta_{\tau}+\delta/\sqrt{m}\),
where \(0<\delta<\infty\), \(\varepsilon_{i}=y_{i}-x_{i}\beta_{\tau}\) and \(Z_{i}=I(\varepsilon_{i}<x_{i}\delta/\sqrt{m})-I(\varepsilon_{i}<0)\). For the second term, write
By Assumption A.3 (1) and (2), we have
By the Cauchy–Schwarz inequality and Assumption A.3, for all \(\zeta\in\mathbb{R}^{p}\) with \(\zeta^{T}\zeta=1\),
Therefore combining (9)–(11), we have
where \(G_{m}(\beta_{\tau})=(G_{m,1}^{T}(\beta_{\tau}),\ldots ,G_{m,a}^{T}(\beta_{\tau}))^{T}\) with \(G_{m,l}(\beta_{\tau})=\frac{1}{m}\sum_{i=1}^{m} x_{i}^{T}\varGamma_{i}M_{li}\varGamma_{i}x_{i}\), l=1,…,a.
Let \(S^{*}(\beta)=S(\beta_{\tau})-G_{m}(\beta_{\tau})(\beta-\beta_{\tau})\) and \(Q^{*}_{m}(\beta)=\{S^{*}(\beta)\}^{T}\{\varSigma^{*}_{m}(\beta)\}^{-1}S^{*}(\beta)\) where \(\varSigma^{*}_{m}(\beta)=\frac{1}{m}\sum_{i=1}^{m}S_{i}^{*}\cdot\{S_{i}^{*}\}^{T}-S^{*}(\beta )\{S^{*}(\beta)\}^{T}\) with \(S_{i}^{*}(\beta)=S_{i}(\beta_{\tau})-G_{m,i}(\beta-\beta_{\tau}), i=1,\ldots,a\). We then have that
for any fixed t>0. By (1) of Assumption A.3 and the boundedness of \(\psi_{\tau}(y_{i}-x_{i}'\beta_{\tau})\psi^{T}_{\tau}(y_{i}-x_{i}'\beta_{\tau})\), we have
for any fixed t>0, in probability, where \(\varSigma(\beta_{\tau})= \mathit{cov}(\sqrt {m}S(\beta_{\tau}))\).
From (12) and the definition of \(S^{*}(\beta)\), we can see that \(S(\beta)\) is asymptotically equivalent to \(S^{*}(\beta)\). Thus, in a neighborhood of \(\beta_{\tau}\), the objective function \(Q_{m}(\beta)\) is asymptotically equivalent to the smoothed objective function \(Q_{m}^{*}(\beta)\) at the rate of \(1/m\). We then conclude that the minimizer of \(Q_{m}(\beta)\) in a neighborhood of \(\beta_{\tau}\) also minimizes \(Q_{m}^{*}(\beta)\) asymptotically. Since \(\hat{\beta}\) minimizes \(Q_{m}(\beta)\), and equivalently \(Q_{m}^{*}(\beta )\), we obtain that
The second derivative matrix is asymptotically positive definite, which guarantees a unique minimum. Since \(\hat{\beta}\) satisfies \(\partial Q_{m}^{*}(\beta)/\partial\beta |_{\hat{\beta}}=0\) and \(Q_{m}(\beta_{\tau})=Q_{m}^{*}(\beta_{\tau})\), and by the continuity of \(\partial Q_{m}^{*}(\beta)/\partial\beta\) at \(\beta_{\tau}\), \(\hat{\beta}\) converges to \(\beta_{\tau}\) in probability as \(m\rightarrow\infty\).
Since \(m^{1/2}S(\beta_{\tau})\) converges in distribution to a zero-mean normal distribution with variance-covariance matrix \(\varSigma(\beta_{\tau})\), letting \(G(\beta_{\tau})=\lim_{m\rightarrow\infty}G_{m}(\beta_{\tau})\) and by Slutsky’s theorem, we have
□
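The quadratic objective used in the proof has the form \(S^{T}\varSigma^{-1}S\), with \(\varSigma\) the centered second-moment matrix of the per-subject contributions. The sketch below evaluates an objective of that form on simulated data. It is an illustration under stated assumptions, not the authors' implementation: the per-subject score is taken to be the stacked blocks \(x_{i}^{T}M_{li}\psi_{\tau}(y_{i}-x_{i}\beta)\) with \(\psi_{\tau}(u)=\tau-I(u<0)\), the weight matrices \(\varGamma_{i}\) are set to the identity, and the function names, basis matrices, and data-generating model are hypothetical.

```python
import numpy as np

def psi_tau(u, tau):
    """Quantile score psi_tau(u) = tau - I(u < 0), applied elementwise."""
    return tau - (u < 0).astype(float)

def subject_scores(beta, y_list, x_list, basis_list, tau):
    """Row i holds S_i(beta): the blocks x_i^T M_{li} psi_tau(y_i - x_i beta), stacked over l."""
    rows = []
    for y, x, Ms in zip(y_list, x_list, basis_list):
        psi = psi_tau(y - x @ beta, tau)
        rows.append(np.concatenate([x.T @ M @ psi for M in Ms]))
    return np.asarray(rows)                      # shape (m, a * p)

def quadratic_objective(beta, y_list, x_list, basis_list, tau):
    """Q(beta) = S^T Sigma^{-1} S with Sigma = mean(S_i S_i^T) - S S^T."""
    S_i = subject_scores(beta, y_list, x_list, basis_list, tau)
    S = S_i.mean(axis=0)
    Sigma = S_i.T @ S_i / S_i.shape[0] - np.outer(S, S)
    return float(S @ np.linalg.solve(Sigma, S))

# Hypothetical longitudinal data: a shared subject effect induces correlation.
rng = np.random.default_rng(1)
m, n, tau = 300, 4, 0.5
beta_true = np.array([1.0, 2.0])
M1 = np.eye(n)
M2 = np.eye(n, k=1) + np.eye(n, k=-1)
y_list, x_list, basis_list = [], [], []
for _ in range(m):
    x = np.column_stack([np.ones(n), rng.normal(size=n)])
    eps = rng.normal() + rng.normal(size=n)      # median zero, correlated within subject
    x_list.append(x)
    y_list.append(x @ beta_true + eps)
    basis_list.append([M1, M2])

q_true = quadratic_objective(beta_true, y_list, x_list, basis_list, tau)
q_far = quadratic_objective(beta_true + np.array([1.0, 0.0]),
                            y_list, x_list, basis_list, tau)
print(q_true, q_far)
```

The objective is piecewise constant in \(\beta\) because \(\psi_{\tau}\) is a step function, which is the reason the appendix replaces it with a smoothed version before applying derivative-based arguments.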
Proof of Theorem 2
Without loss of generality, we consider the \(l\)th component of \(\tilde{S}\), \(\tilde{S}_{(l)}(\beta)=E_{\vartheta}S_{(l)}(\beta+m^{-1/2}\varOmega^{1/2}\vartheta)\), where \(\vartheta\sim N(0,I_{p})\). Then by the differentiability of \(\tilde {S}_{(l)}(\beta)\) and Taylor expansion, for all \(\|\delta\|\le C\) for some finite constant \(C\), we have
where \(E\tilde{S}'_{(l)}(\beta_{\tau})=\frac{1}{m}\sum_{i=1}^{m} x_{i}^{T}\varGamma_{i}M_{li}D_{i}x_{i}\) and \(D_{i}\) is an \(n_{i}\times n_{i}\) diagonal matrix with elements \(E_{\varepsilon_{ij}}\phi(\sqrt{m}\frac{\varepsilon_{ij}}{r_{ij}})\frac{\sqrt{m}}{r_{ij}}\). Notice that
where \(\int\phi(x)f_{ij}(0)dx=f_{ij}(0)\) and \(|{\frac{r_{ij}}{\sqrt{m}}}\int\!\phi(x)f_{ij}(w^{*})x dx| \le M{\frac{r_{ij}}{\sqrt{m}}}\int|x|\phi (x)dx\rightarrow0\). Thus \(E\tilde{S}'_{(l)}(\beta_{\tau})=G_{m,l}(\beta_{\tau})+o(1)\).
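The limit of the diagonal elements of \(D_{i}\) can be checked numerically. The sketch below is only an illustration: it assumes a standard normal error density \(f\), a fixed value \(r=1.5\), and a plain Riemann sum, none of which come from the paper. After the substitution \(\varepsilon=rx/\sqrt{m}\), the expectation becomes \(\int\phi(x)f(rx/\sqrt{m})dx\), which for this particular \(f\) equals \(1/\sqrt{2\pi(1+r^{2}/m)}\) and tends to \(f(0)\).

```python
import numpy as np

def std_normal_pdf(x):
    """Standard normal density phi(x)."""
    return np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def smoothed_density_at_zero(m, r, f, half_width=10.0, num=20001):
    """Approximate E[phi(sqrt(m) eps / r) sqrt(m) / r] for eps with density f.

    Uses the substitution eps = r x / sqrt(m), so the expectation equals the
    integral of phi(x) f(r x / sqrt(m)) dx, evaluated here by a Riemann sum.
    """
    x = np.linspace(-half_width, half_width, num)
    dx = x[1] - x[0]
    return float(np.sum(std_normal_pdf(x) * f(r * x / np.sqrt(m))) * dx)

r = 1.5                                   # illustrative value of r_ij
f0 = float(std_normal_pdf(0.0))           # target f(0) for the assumed density
sample_sizes = (10, 100, 10_000)
values = {m: smoothed_density_at_zero(m, r, std_normal_pdf) for m in sample_sizes}
closed = {m: 1.0 / np.sqrt(2.0 * np.pi * (1.0 + r**2 / m)) for m in sample_sizes}
for m in sample_sizes:
    print(m, values[m], closed[m], f0 - values[m])
```

For this density the gap \(f(0)-E\{\cdot\}\) is of order \(r^{2}/m\), smaller than the \(r_{ij}/\sqrt{m}\) bound used in the proof, because the first-order term vanishes by the symmetry of \(\phi\).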
Following the proof of Theorem 1, if
holds in probability, then
and thus \(\tilde{\beta}\) converges to \(\beta_{\tau}\) in probability.
To see that (14) holds, write
where \(\phi_{\varOmega}(\cdot)\) denotes the pdf of \(\varOmega^{1/2}\vartheta\). Let \(K_{m}(u,\beta_{\tau})=\|S(\beta_{\tau}+m^{-1/2}u)-S(\beta_{\tau})-m^{-1/2}G(\beta_{\tau})u\|\). Then, since \(\int_{\mathbb{R}^{p}}u\phi_{\varOmega}(u)du=0\), the triangle inequality implies
for any \(\epsilon_{m}>0\). By Assumption A.3 and the proof of (12), it is easy to see that
for any positive sequence \(d_{m}\rightarrow0\). Suppose \(\epsilon_{m}=o(m^{1/2})\); then taking \(b=\beta_{\tau}+m^{-1/2}u\) and \(d_{m}=m^{-1/2}\epsilon_{m}\), (16) implies
An easy calculation, in combination with (17), now shows that the first integral on the right-hand side of the inequality in (15) converges in probability to zero, even if \(\epsilon_{m}\rightarrow\infty\). With regard to the second term on the right-hand side of (15), we may use the definition of \(K_{m}(\cdot,\beta_{\tau})\) and the triangle inequality to write \(\sqrt{m}\int_{\|u\|> \epsilon _{m}}K_{m}(u,\beta_{\tau})\phi_{\varOmega}(u)du\le A_{1}+A_{2}\), where
For all \(\beta\in\Theta\), \(\|S(\beta)\|\le A\) for some positive constant \(A<\infty\) by Assumptions A.2 and A.3; hence \(A_{1}\le 2Am^{1/2}\cdot P(\|\varOmega^{1/2}\vartheta\|>\epsilon_{m})\rightarrow0\) as \(m\rightarrow\infty\). Similarly, \(\int_{\|u\|>\epsilon_{m}}\|u\|\phi_{\varOmega}(u)du\rightarrow0\). Therefore, the second integral on the right-hand side of the inequality in (15) also converges in probability to zero. It follows that (14) converges in probability to zero as \(m\rightarrow\infty\).
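The bound on \(A_{1}\) requires \(\epsilon_{m}\) to grow fast enough that the Gaussian tail probability beats the factor \(m^{1/2}\), while still satisfying \(\epsilon_{m}=o(m^{1/2})\). As a hedged illustration, the sketch below takes \(p=2\), \(\varOmega=I_{2}\) and \(\epsilon_{m}=m^{1/4}\); these are choices made here for the example, not ones stated in the paper. With \(p=2\), \(\|\vartheta\|^{2}\) is chi-square with two degrees of freedom, so the tail probability is exactly \(\exp(-\epsilon_{m}^{2}/2)\).

```python
import math

def scaled_tail(m):
    """sqrt(m) * P(||theta|| > eps_m) for theta ~ N(0, I_2) and eps_m = m**0.25.

    For p = 2, ||theta||^2 is chi-square with 2 degrees of freedom, so
    P(||theta|| > e) = exp(-e**2 / 2) exactly.
    """
    eps_m = m ** 0.25
    return math.sqrt(m) * math.exp(-eps_m**2 / 2.0)

tail_values = [scaled_tail(m) for m in (10**2, 10**4, 10**6)]
for m, v in zip((10**2, 10**4, 10**6), tail_values):
    # eps_m / sqrt(m) = m**(-0.25) also tends to zero, so eps_m = o(m^{1/2}).
    print(m, m ** -0.25, v)
```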
The asymptotic normality of \(\tilde{\beta}\) is obtained directly following the proof of Theorem 1. The proof is completed. □
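The smoothed estimating function in Theorem 2 is an expectation over the Gaussian perturbation \(\vartheta\), and for a single indicator this expectation has a closed form in terms of the standard normal distribution function \(\Phi\). The sketch below compares that closed form with a Monte Carlo average. It is an illustration under assumptions: the score is taken to be \(\psi_{\tau}(u)=\tau-I(u<0)\), \(r\) is taken to be \(\sqrt{x^{T}\varOmega x}\) (our reading of \(r_{ij}\), which this appendix does not define), and all numerical values and function names are hypothetical.

```python
import math
import numpy as np

def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def smoothed_psi(y, x, beta, omega, m, tau):
    """Closed form of E_theta[tau - I(y - x^T(beta + m^{-1/2} Omega^{1/2} theta) < 0)]."""
    r = math.sqrt(float(x @ omega @ x))          # sd of x^T Omega^{1/2} theta
    resid = float(y - x @ beta)
    return tau - normal_cdf(-math.sqrt(m) * resid / r)

def monte_carlo_psi(y, x, beta, omega, m, tau, draws, rng):
    """Monte Carlo version of the same expectation."""
    chol = np.linalg.cholesky(omega)             # chol @ theta has the same law as Omega^{1/2} theta
    theta = rng.standard_normal((draws, len(beta)))
    perturbed = beta + (theta @ chol.T) / math.sqrt(m)
    return tau - float(np.mean(y - perturbed @ x < 0))

x = np.array([1.0, 0.5])
beta = np.array([1.0, 2.0])
omega = np.array([[1.0, 0.3], [0.3, 2.0]])       # illustrative positive definite matrix
y, m, tau = 2.1, 50, 0.25                        # residual y - x^T beta = 0.1

exact = smoothed_psi(y, x, beta, omega, m, tau)
approx = monte_carlo_psi(y, x, beta, omega, m, tau, 200_000,
                         np.random.default_rng(2))
print(exact, approx)
```

As \(m\) grows the closed form approaches the unsmoothed score \(\tau-I(y-x^{T}\beta<0)\) at any nonzero residual, while remaining differentiable in \(\beta\) for every finite \(m\), which is what permits Newton-Raphson iteration.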
Proof of Theorem 3
Following an argument similar to that in the proof of Theorem 1, under Assumptions A.1–A.3 and the null hypothesis \(H_{0}\), we obtain
where \(\hat{\alpha}=\arg\min_{\alpha}\tilde{Q}(\alpha)\) under the null hypothesis \(H_{0}\). Let
where \(M_{i}=(M_{1i}^{T},\ldots,M_{ai}^{T})^{T}\). Then, following an argument similar to that in the proof of Lemma A.2 in Wang and He (2007), for some constant \(C\),
A Taylor expansion of \(E\{r_{m}(t)\}\) around 0 gives
where the last step is due to the fact that \(Z^{T}\Delta=0\) by construction. Now, (20) together with (19) and (18) yields
Note that U (1) is an \(a(p-q)\)-dimensional vector. The asymptotic normality of U (1) then follows from the Lindeberg–Feller central limit theorem, which, together with an argument similar to that in the proof of Theorem 1, completes the proof. □
Cite this article
Leng, C., Zhang, W. Smoothing combined estimating equations in quantile regression for longitudinal data. Stat Comput 24, 123–136 (2014). https://doi.org/10.1007/s11222-012-9358-0
DOI: https://doi.org/10.1007/s11222-012-9358-0