Generalized linear mixed joint model for longitudinal and survival outcomes

Sara Viviani¹,
Marco Alfó¹ &
Dimitris Rizopoulos²

1647 Accesses
19 Citations
Explore all metrics

Abstract

Longitudinal studies often entail categorical outcomes as primary responses. When dropout occurs, non-ignorability is frequently accounted for through shared parameter models (SPMs). In this context, several extensions from Gaussian to non-Gaussian longitudinal processes have been proposed. In this paper, we formulate an approach for non-Gaussian longitudinal outcomes in the framework of joint models. As an extension of SPMs, based on shared latent effects, we assume that the history of the response up to current time may have an influence on the risk of dropout. This history is represented by the current, expected, value of the response. Since the time a subject spends in the study is continuous, we parametrize the dropout process through a proportional hazard model. The resulting model is referred to as Generalized Linear Mixed Joint Model (GLMJM). To estimate model parameters, we adopt a maximum likelihood approach via the EM algorithm. In this context, the maximization of the observed data log-likelihood requires numerical integration over the random effect posterior distribution, which is usually not straightforward; under the assumption of Gaussian random effects, we compare Gauss-Hermite and Pseudo-Adaptive Gaussian quadrature rules. We investigate in a simulation study the behaviour of parameter estimates in the case of Poisson and Binomial longitudinal responses, and apply the GLMJM to a benchmark dataset.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Power difference in a χ² test vs generalized linear mixed model in the presence of missing data – a simulation study

Article Open access 02 March 2020

A robust joint modeling approach for longitudinal data with informative dropouts

Article 09 March 2020

Bayesian multivariate nonlinear mixed models for censored longitudinal trajectories with non-monotone missing values

Article 26 October 2023

Discover the latest articles, news and stories from top researchers in related subjects.

Artificial Intelligence

References

Albert, P.S., Follmann, D.A.: Modeling repeated count data subject to informative dropout. Biometrics 56, 667–677 (2000)
Article MATH Google Scholar
Albert, P.S., Follmann, D.A., Wang, S.A., Suh, E.B.: A latent autoregressive model for longitudinal binary data subject to informative missingness. Biometrics 58, 631–642 (2002)
Article MATH MathSciNet Google Scholar
Andersen, P., Gill, R.: Cox’s regression model for counting processes: a large sample study. Ann. Stat. 10, 1100–1120 (1982)
Article MATH MathSciNet Google Scholar
Carlin, B.P., Louis, T.A.: Bayesian Methods for Data Analysis. Chapman & Hall/CRC Press, Boca Raton (2009)
Google Scholar
Cox, D.R.: Regression models and life tables (with discussion). J. R. Stat. Soc., Ser. B, Stat. Methodol. 34, 187–220 (1972)
MATH Google Scholar
Cox, D.R., Hinkley, D.: Theoretical Statistics. Chapman & Hall, London (1974)
Book MATH Google Scholar
Geyer, J.C., Thompson, E.A.: Constrained Monte Carlo maximum likelihood for dependent data. J. R. Stat. Soc., Ser. B, Stat. Methodol. 54, 657–699 (1992)
MathSciNet Google Scholar
Goldman, A., Carlin, B., Crane, L., Launer, C., Korvick, J., Deyton, L., Abrams, D.: Response of CD4+ and clinical consequences to treatment using ddI or ddC in patients with advanced HIV infection. J. Acquir. Immune Defic. Syndr. Human Retrovirol. 11, 161–169 (1996)
Article Google Scholar
Hsieh, F., Tseng, Y.-K., Wang, J.-K.: Joint modeling of survival and longitudinal data: likelihood approach revisited. Biometrics 62, 1037–1043 (2006)
Article MATH MathSciNet Google Scholar
Jansen, I., Hens, N., Molenberghs, G., Aerts, M., Verbeke, G., Kenward, M.G.: The nature of sensitivity in missing not at random models. Comput. Stat. Data Anal. 50, 830–854 (2006)
Article MATH MathSciNet Google Scholar
Kronrod, A.S.: On the theory of quadratic pencils of selfadjoint operators. Dokl. Akad. Nauk SSSR 154, 283–286 (1964)
MathSciNet Google Scholar
Louis, T.A.: Finding the observed information matrix when using the EM algorithm. J. R. Stat. Soc. B 44, 226–233 (1982)
MATH MathSciNet Google Scholar
Molenberghs, G., Verbeke, G.: Models for Discrete Longitudinal Data. Springer, New York (2005)
MATH Google Scholar
Molenberghs, G., Verbeke, G., Demetrio, C.G.B., Vieira, M.C.: A family of generalized linear models for repeated measures with normal and conjugate random effects. Stat. Sci. 25, 325–347 (2011)
Article MathSciNet Google Scholar
Oakes, D.: Direct calculation of the information matrix via the EM algorithm. J. R. Stat. Soc., Ser. B, Stat. Methodol. 61, 479–482 (1999)
Article MATH MathSciNet Google Scholar
Press, W., Teukolsky, S., Vetterling, W., Flannery, B.: Numerical Recipes: The Art of Scientific Computing. Cambridge University Press, New York (2007)
Google Scholar
Rizopoulos, D.: Jm: an R package for the joint modelling of longitudinal and time-to-event data. J. Stat. Softw. 35, 9 (2010)
Google Scholar
Rizopoulos, D.: Fast fitting of joint models for longitudinal and event time data using a pseudo-adaptive Gaussian quadrature rule. Comput. Stat. Data Anal. 56, 491–501 (2012)
Article MATH MathSciNet Google Scholar
Rizopoulos, D., Ghosh, P.: A Bayesian semiparametric multivariate joint model for multiple longitudinal outcomes and a time-to-event. Stat. Med. 30, 1366–1380 (2011)
Article MathSciNet Google Scholar
Rizopoulos, D., Verbeke, G., Lesaffre, E.: Fully exponential Laplace approximations for the joint modelling of survival and longitudinal data. J. R. Stat. Soc., Ser. B, Stat. Methodol. 71, 637–654 (2009)
Article MATH MathSciNet Google Scholar
Rizopoulos, D., Verbeke, G., Molenberghs, G.: Multiple-imputation-based residuals and diagnostic plots for joint models of longitudinal and survival outcomes. Biometrics 66, 20–29 (2010)
Article MATH MathSciNet Google Scholar
Troxel, A.B., Ma, G., Heitjan, F.: Shared parameter models for the joint analysis of longitudinal data and event times. Stat. Sin. 14, 1221–1237 (2004)
MATH MathSciNet Google Scholar
Tsiatis, A.A., Davidian, M.: Joint modeling of longitudinal and time-to-event data: an overview. Stat. Sin. 14, 809–834 (2004)
MATH MathSciNet Google Scholar
Viviani, S., Rizopoulos, D., Alfó, M.: Local Sensitivity to non-ignorability in Joint Models. Stat. Model. (2012, major revision)
Wu, M., Carrol, R.: Estimation and comparison of changes in presence of informative right censoring by modelling the censoring process. Biometrics 45, 939–955 (1988)
Article Google Scholar
Wulfsohn, M., Tsiatis, A.: A joint model for survival and longitudinal data measured with error. Biometrics 53, 330–339 (1997)
Article MATH MathSciNet Google Scholar

Download references

Author information

Authors and Affiliations

Department of Statistics, Sapienza University of Rome, P.zza Aldo Moro 15, 00185, Rome, Italy
Sara Viviani & Marco Alfó
Department of Biostatistics, Erasmus University Medical Center, 3000 CA, Rotterdam, The Netherlands
Dimitris Rizopoulos

Authors

Sara Viviani
View author publications
You can also search for this author in PubMed Google Scholar
Marco Alfó
View author publications
You can also search for this author in PubMed Google Scholar
Dimitris Rizopoulos
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Sara Viviani.

Appendices

Appendix A: Score vectors and Hessian matrices for the Poisson and the Binomial joint model

As highlighted in Sect. 2, for parameter estimation in the GLMJMs, the EM algorithm requires the computation of the observed data score vector and Hessian matrix for the complete joint model. The score can assume the form:

$$ \mathcal{S}({\boldsymbol{\theta}})= \sum _i \int\omega({\boldsymbol{\theta}},\mathbf{b}_i)p(\mathbf{b}_i \mid T_i, \delta_i,y_i;{\boldsymbol{\theta}})d\mathbf{b}_i, $$

(12)

where ω(θ,b _i)=∂{logp(T _i,δ _i|b _i;θ)+logp(y _i|b _i;θ)+logp(b _i;θ)}/∂θ ^⊤ denotes the complete data score vector. Therefore, it can be considered as the expected value of the complete data score vector with respect to the posterior distribution of the random effects. In the M-step, at the q-th iteration, equation (12) is solved with respect to θ, with p(b _i|T _i,δ _i,y _i;θ) calculated by fixing θ=θ ^(q−1).

Since this work has focused on the longitudinal parameter estimation under non-ignorability, in this section we report the score vector and the Hessian matrices for the specific longitudinal processes considered (Poisson and Binomial distributions).

For the Poisson longitudinal distribution, the score vector for the fixed effects in the longitudinal model can be written as

while the Hessian matrix assumes the form

For the Binomial case, on the other hand, the score vector and the Hessian matrix are given by, respectively:

and

where

$$\mathbf{A}_i = \frac{ [ 2 \mathbf{x}_i \exp \{{\boldsymbol{\beta}}^{\mathsf{T}} \mathbf{x}_i +\mathbf{b}_i^{\mathsf{T}}\mathbf{z}_i \} ] [ 1 - 2 \mathbf{x}_i \exp \{ {\boldsymbol{\beta}}^{\mathsf{T}} \mathbf{x}_i +\mathbf{b}_i^{\mathsf{T}}\mathbf{z}_i \} - 1/2 \mathbf{x}_i \exp \{ {\boldsymbol{\beta}}^{\mathsf{T}} \mathbf{x}_i +\mathbf{b}_i^{\mathsf{T}}\mathbf{z}_i \}^2 ]}{ [ 1+ \exp \{ {\boldsymbol{\beta}}^{\mathsf{T}} \mathbf{x}_i +\mathbf{b}_i^{\mathsf{T}}\mathbf{z}_i \} ]^4}, $$

while

$$\mathbf{B}_i = \frac{\exp \{{\boldsymbol{\beta}}^{\mathsf{T}} \mathbf{x}_i +\mathbf{b}_i^{\mathsf{T}}\mathbf{z}_i \}}{ [1+\exp \{ {\boldsymbol{\beta}}^{\mathsf{T}} \mathbf{x}_i +\mathbf{b}_i^{\mathsf{T}}\mathbf{z}_i \} ]^2}. $$

Appendix B: The pseudo-adaptive Gaussian quadrature

As it has been pointed out in Sect. 3, the computation of the observed data log-likelihood requires the use of quadrature rules, such as Gaussian quadrature. In this Appendix, we describe an alternative numerical integration approach based on the Pseudo adaptive Gauss-Hermite rule, proposed by Rizopoulos (2012) for standard JMs.

The standard Gauss-Hermite rule (GH) approximates the integral in (3) by a weighted sum of integrand evaluations at pre-specified abscissas, see e.g. Press et al. (2007). The quality of the approximation improves with the number of quadrature points. However, as the GH requires the evaluation of the integrand over the Cartesian product of the abscissas corresponding to each dimension in the random effect vector, the computational effort increases exponentially with the random effect dimension. Another limitation of the GH approach is that the corresponding approximation depends on the location of the quadrature points with respect to the main mass of the integrand, that may be far from each other. For instance, the random effect distribution may be far from Gaussianity.

To solve these problems, an adaptive quadrature approach may be used, that appropriately centres and scales the integrand at each iteration of the optimization algorithm. However, this kind of approach implies a substantial computational burden due to the location of the mode and the computation of the Hessian matrix of the random effect posterior distribution at each iteration.

To decrease the computational effort, we propose to use a pseudo-adaptive quadrature rule. This rule is based on the Bayesian Central Limit (BCL) theorem, see Cox and Hinkley (1974); the posterior distribution of the random effects can be approximated by a multivariate Gaussian distribution as the number of repeated measurements increases. The logarithm of this distribution can in fact be written as

(13)

It is clear from (13) that as n _i increases, the predominant term becomes the longitudinal density logp(y _i|b _i;β). According to the BCL and under general regularity conditions, as n _i→∞, we have

$$p( \mathbf{b}_i; T_i, \delta_i, \mathbf{y}_i; {\boldsymbol{\beta}}) \stackrel {p}{\rightarrow}\mathcal{N}\bigl( \tilde{\mathbf{b}}_i, \tilde{\mathbf{H}}_i^{-1}\bigr), $$

where $\tilde{\mathbf{b}}_{i} = \text{arg\,max}_{b}\{\log p(\mathbf{y}_{i} | \mathbf{b}_{i}; {\boldsymbol{\beta}})\}$ and $\tilde{\mathbf{H}}_{i} = -\partial^{2} \* \log p(\mathbf{y}_{i} | \mathbf{b}_{i}; {\boldsymbol{\beta}}) / \partial\mathbf{b} \partial\mathbf{b}^{\mathsf{T}} |_{\mathbf{b} = \tilde{\mathbf{b}}}$. Practically, as n _i increases, it is sufficient to re-center and re-scale the integrand for each subject using only the information coming from the longitudinal outcome model.

Applying this procedure to the joint model, we propose to first fit a generalized linear mixed model, extract the empirical Bayes estimates $\tilde{\mathbf{b}}_{i} = \text{arg\,max}_{b}\{\log p(\mathbf{y}_{i} | \mathbf{b}_{i}; \tilde{\mathbf{\theta}}_{y})\}$ and the corresponding covariance matrix $\tilde{\mathbf{H}}_{i}^{-1} = (\mathbf{B}_{i}\mathbf{B}_{i}^{T})^{-1}$; successively, we apply the following transformation to compute the score vector:

where q is the number of random effects, $\sum_{t_{1}, \ldots, t_{q}}$ is a shorthand for $\sum_{t_{1}=1}^{K}, \ldots, \sum_{t_{q}=1}^{K}$, K denoting the common number of quadrature points, b _t=(b _t1,…,b _tq) are the abscissas with corresponding weights π _t, $\tilde {r}_{t} = \tilde{\mathbf{b}}_{i} + \sqrt{2} \mathbf{B}_{i}^{-1} \mathbf{b}_{t}$ are the re-scaled abscissas and $\tilde{\mathbf{\theta}}_{i}$ denotes the maximum likelihood estimates for the longitudinal model.

This procedure is very similar to the GH rule, but it is implemented only once, at the start of the EM algorithm, leading to faster estimation.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Viviani, S., Alfó, M. & Rizopoulos, D. Generalized linear mixed joint model for longitudinal and survival outcomes. Stat Comput 24, 417–427 (2014). https://doi.org/10.1007/s11222-013-9378-4

Download citation

Received: 26 July 2012
Accepted: 18 January 2013
Published: 07 February 2013
Issue Date: May 2014
DOI: https://doi.org/10.1007/s11222-013-9378-4