Abstract
Longitudinal studies often entail categorical outcomes as primary responses. When dropout occurs, non-ignorability is frequently accounted for through shared parameter models (SPMs). In this context, several extensions from Gaussian to non-Gaussian longitudinal processes have been proposed. In this paper, we formulate an approach for non-Gaussian longitudinal outcomes in the framework of joint models. As an extension of SPMs, based on shared latent effects, we assume that the history of the response up to current time may have an influence on the risk of dropout. This history is represented by the current, expected, value of the response. Since the time a subject spends in the study is continuous, we parametrize the dropout process through a proportional hazard model. The resulting model is referred to as Generalized Linear Mixed Joint Model (GLMJM). To estimate model parameters, we adopt a maximum likelihood approach via the EM algorithm. In this context, the maximization of the observed data log-likelihood requires numerical integration over the random effect posterior distribution, which is usually not straightforward; under the assumption of Gaussian random effects, we compare Gauss-Hermite and Pseudo-Adaptive Gaussian quadrature rules. We investigate in a simulation study the behaviour of parameter estimates in the case of Poisson and Binomial longitudinal responses, and apply the GLMJM to a benchmark dataset.
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.References
Albert, P.S., Follmann, D.A.: Modeling repeated count data subject to informative dropout. Biometrics 56, 667–677 (2000)
Albert, P.S., Follmann, D.A., Wang, S.A., Suh, E.B.: A latent autoregressive model for longitudinal binary data subject to informative missingness. Biometrics 58, 631–642 (2002)
Andersen, P., Gill, R.: Cox’s regression model for counting processes: a large sample study. Ann. Stat. 10, 1100–1120 (1982)
Carlin, B.P., Louis, T.A.: Bayesian Methods for Data Analysis. Chapman & Hall/CRC Press, Boca Raton (2009)
Cox, D.R.: Regression models and life tables (with discussion). J. R. Stat. Soc., Ser. B, Stat. Methodol. 34, 187–220 (1972)
Cox, D.R., Hinkley, D.: Theoretical Statistics. Chapman & Hall, London (1974)
Geyer, J.C., Thompson, E.A.: Constrained Monte Carlo maximum likelihood for dependent data. J. R. Stat. Soc., Ser. B, Stat. Methodol. 54, 657–699 (1992)
Goldman, A., Carlin, B., Crane, L., Launer, C., Korvick, J., Deyton, L., Abrams, D.: Response of CD4+ and clinical consequences to treatment using ddI or ddC in patients with advanced HIV infection. J. Acquir. Immune Defic. Syndr. Human Retrovirol. 11, 161–169 (1996)
Hsieh, F., Tseng, Y.-K., Wang, J.-K.: Joint modeling of survival and longitudinal data: likelihood approach revisited. Biometrics 62, 1037–1043 (2006)
Jansen, I., Hens, N., Molenberghs, G., Aerts, M., Verbeke, G., Kenward, M.G.: The nature of sensitivity in missing not at random models. Comput. Stat. Data Anal. 50, 830–854 (2006)
Kronrod, A.S.: On the theory of quadratic pencils of selfadjoint operators. Dokl. Akad. Nauk SSSR 154, 283–286 (1964)
Louis, T.A.: Finding the observed information matrix when using the EM algorithm. J. R. Stat. Soc. B 44, 226–233 (1982)
Molenberghs, G., Verbeke, G.: Models for Discrete Longitudinal Data. Springer, New York (2005)
Molenberghs, G., Verbeke, G., Demetrio, C.G.B., Vieira, M.C.: A family of generalized linear models for repeated measures with normal and conjugate random effects. Stat. Sci. 25, 325–347 (2011)
Oakes, D.: Direct calculation of the information matrix via the EM algorithm. J. R. Stat. Soc., Ser. B, Stat. Methodol. 61, 479–482 (1999)
Press, W., Teukolsky, S., Vetterling, W., Flannery, B.: Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, New York (2007)
Rizopoulos, D.: Jm: an R package for the joint modelling of longitudinal and time-to-event data. J. Stat. Softw. 35, 9 (2010)
Rizopoulos, D.: Fast fitting of joint models for longitudinal and event time data using a pseudo-adaptive Gaussian quadrature rule. Comput. Stat. Data Anal. 56, 491–501 (2012)
Rizopoulos, D., Ghosh, P.: A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a time-to-event. Stat. Med. 30, 1366–1380 (2011)
Rizopoulos, D., Verbeke, G., Lesaffre, E.: Fully exponential Laplace approximations for the joint modelling of survival and longitudinal data. J. R. Stat. Soc., Ser. B, Stat. Methodol. 71, 637–654 (2009)
Rizopoulos, D., Verbeke, G., Molenberghs, G.: Multiple-imputation-based residuals and diagnostic plots for joint models of longitudinal and survival outcomes. Biometrics 66, 20–29 (2010)
Troxel, A.B., Ma, G., Heitjan, F.: Shared parameter models for the joint analysis of longitudinal data and event times. Stat. Sin. 14, 1221–1237 (2004)
Tsiatis, A.A., Davidian, M.: Joint modeling of longitudinal and time-to-event data: an overview. Stat. Sin. 14, 809–834 (2004)
Viviani, S., Rizopoulos, D., Alfó, M.: Local Sensitivity to non-ignorability in Joint Models. Stat. Model. (2012, major revision)
Wu, M., Carrol, R.: Estimation and comparison of changes in presence of informative right censoring by modelling the censoring process. Biometrics 45, 939–955 (1988)
Wulfsohn, M., Tsiatis, A.: A joint model for survival and longitudinal data measured with error. Biometrics 53, 330–339 (1997)
Author information
Authors and Affiliations
Corresponding author
Appendices
Appendix A: Score vectors and Hessian matrices for the Poisson and the Binomial joint model
As highlighted in Sect. 2, for parameter estimation in the GLMJMs, the EM algorithm requires the computation of the observed data score vector and Hessian matrix for the complete joint model. The score can assume the form:
where ω(θ,b i )=∂{logp(T i ,δ i |b i ;θ)+logp(y i |b i ;θ)+logp(b i ;θ)}/∂θ ⊤ denotes the complete data score vector. Therefore, it can be considered as the expected value of the complete data score vector with respect to the posterior distribution of the random effects. In the M-step, at the q-th iteration, equation (12) is solved with respect to θ, with p(b i |T i ,δ i ,y i ;θ) calculated by fixing θ=θ (q−1).
Since this work has focused on the longitudinal parameter estimation under non-ignorability, in this section we report the score vector and the Hessian matrices for the specific longitudinal processes considered (Poisson and Binomial distributions).
For the Poisson longitudinal distribution, the score vector for the fixed effects in the longitudinal model can be written as
while the Hessian matrix assumes the form
For the Binomial case, on the other hand, the score vector and the Hessian matrix are given by, respectively:
and
where
while
Appendix B: The pseudo-adaptive Gaussian quadrature
As it has been pointed out in Sect. 3, the computation of the observed data log-likelihood requires the use of quadrature rules, such as Gaussian quadrature. In this Appendix, we describe an alternative numerical integration approach based on the Pseudo adaptive Gauss-Hermite rule, proposed by Rizopoulos (2012) for standard JMs.
The standard Gauss-Hermite rule (GH) approximates the integral in (3) by a weighted sum of integrand evaluations at pre-specified abscissas, see e.g. Press et al. (2007). The quality of the approximation improves with the number of quadrature points. However, as the GH requires the evaluation of the integrand over the Cartesian product of the abscissas corresponding to each dimension in the random effect vector, the computational effort increases exponentially with the random effect dimension. Another limitation of the GH approach is that the corresponding approximation depends on the location of the quadrature points with respect to the main mass of the integrand, that may be far from each other. For instance, the random effect distribution may be far from Gaussianity.
To solve these problems, an adaptive quadrature approach may be used, that appropriately centres and scales the integrand at each iteration of the optimization algorithm. However, this kind of approach implies a substantial computational burden due to the location of the mode and the computation of the Hessian matrix of the random effect posterior distribution at each iteration.
To decrease the computational effort, we propose to use a pseudo-adaptive quadrature rule. This rule is based on the Bayesian Central Limit (BCL) theorem, see Cox and Hinkley (1974); the posterior distribution of the random effects can be approximated by a multivariate Gaussian distribution as the number of repeated measurements increases. The logarithm of this distribution can in fact be written as
It is clear from (13) that as n i increases, the predominant term becomes the longitudinal density logp(y i |b i ;β). According to the BCL and under general regularity conditions, as n i →∞, we have
where \(\tilde{\mathbf{b}}_{i} = \text{arg\,max}_{b}\{\log p(\mathbf{y}_{i} | \mathbf{b}_{i}; {\boldsymbol{\beta}})\}\) and \(\tilde{\mathbf{H}}_{i} = -\partial^{2} \* \log p(\mathbf{y}_{i} | \mathbf{b}_{i}; {\boldsymbol{\beta}}) / \partial\mathbf{b} \partial\mathbf{b}^{\mathsf{T}} |_{\mathbf{b} = \tilde{\mathbf{b}}}\). Practically, as n i increases, it is sufficient to re-center and re-scale the integrand for each subject using only the information coming from the longitudinal outcome model.
Applying this procedure to the joint model, we propose to first fit a generalized linear mixed model, extract the empirical Bayes estimates \(\tilde{\mathbf{b}}_{i} = \text{arg\,max}_{b}\{\log p(\mathbf{y}_{i} | \mathbf{b}_{i}; \tilde{\mathbf{\theta}}_{y})\}\) and the corresponding covariance matrix \(\tilde{\mathbf{H}}_{i}^{-1} = (\mathbf{B}_{i}\mathbf{B}_{i}^{T})^{-1}\); successively, we apply the following transformation to compute the score vector:
where q is the number of random effects, \(\sum_{t_{1}, \ldots, t_{q}}\) is a shorthand for \(\sum_{t_{1}=1}^{K}, \ldots, \sum_{t_{q}=1}^{K}\), K denoting the common number of quadrature points, b t =(b t1,…,b tq ) are the abscissas with corresponding weights π t , \(\tilde {r}_{t} = \tilde{\mathbf{b}}_{i} + \sqrt{2} \mathbf{B}_{i}^{-1} \mathbf{b}_{t}\) are the re-scaled abscissas and \(\tilde{\mathbf{\theta}}_{i}\) denotes the maximum likelihood estimates for the longitudinal model.
This procedure is very similar to the GH rule, but it is implemented only once, at the start of the EM algorithm, leading to faster estimation.
Rights and permissions
About this article
Cite this article
Viviani, S., Alfó, M. & Rizopoulos, D. Generalized linear mixed joint model for longitudinal and survival outcomes. Stat Comput 24, 417–427 (2014). https://doi.org/10.1007/s11222-013-9378-4
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11222-013-9378-4