In the survival analysis context, when an intervention either reduces a harmful exposure or introduces a beneficial treatment, it seems useful to quantify the gain in survival attributable to the intervention as an alternative to the reduction in risk. To accomplish this we introduce two new concepts, the attributable survival and attributable survival time, and study their properties. Our analysis includes comparison with the attributable risk function as well as hazard-based alternatives. We also extend the setting to the case where the intervention takes place at discrete points in time, and may either eliminate exposure or introduce a beneficial treatment in only a proportion of the available group. This generalization accommodates the more realistic situation where the treatment or exposure is dynamic. We apply these methods to assess the effect of introducing highly active antiretroviral therapy for the treatment of clinical AIDS at the population level.
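The distinction between a risk-based and a survival-based summary of an intervention can be sketched numerically. The snippet below is a minimal illustration under constant (exponential) hazards, not the estimators developed in the paper: `attributable_survival` is the gain in survival probability at time t from replacing the exposed hazard with the unexposed one, and `attributable_survival_time` accumulates that gain up to a horizon tau, a restricted-mean-style summary. The function names and the exponential-hazard assumption are ours.

```python
import math

def surv_exp(t, lam):
    """Survival function S(t) = exp(-lam * t) of an exponential lifetime."""
    return math.exp(-lam * t)

def attributable_survival(t, lam_exposed, lam_unexposed):
    """Gain in survival probability at time t if the intervention moves
    subjects from the exposed hazard to the unexposed hazard (illustrative
    definition under exponential hazards; the paper's estimand may differ)."""
    return surv_exp(t, lam_unexposed) - surv_exp(t, lam_exposed)

def attributable_survival_time(tau, lam_exposed, lam_unexposed, steps=10000):
    """Accumulated survival gain up to tau (a restricted-mean-style summary),
    computed by the trapezoidal rule."""
    h = tau / steps
    total = 0.0
    for i in range(steps):
        a, b = i * h, (i + 1) * h
        total += 0.5 * h * (attributable_survival(a, lam_exposed, lam_unexposed)
                            + attributable_survival(b, lam_exposed, lam_unexposed))
    return total
```

With hazards 0.5 (exposed) and 0.2 (unexposed), the survival gain at t = 1 is about 0.21, and the accumulated gain over five time units is about 1.32 extra time units of life.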
The conventional random effects model for meta-analysis of proportions approximates within-study variation using a normal distribution. Due to potential approximation bias, particularly for the estimation of rare events such as some adverse drug reactions, the conventional method is considered inferior to exact methods based on binomial distributions. In this paper, we compare two existing exact approaches, the beta-binomial (B-B) and the normal-binomial (N-B), through an extensive simulation study focused on the rare events commonly encountered in medical research. In addition, we implement the empirical ("sandwich") estimator of variance in the two models to improve the robustness of the statistical inferences. To our knowledge, this is the first application of the sandwich estimator of variance to meta-analysis of proportions. The simulation study shows that the B-B approach tends to have substantially smaller bias and mean squared error than N-B for rare events with occurrences under five percent, while N-B outperforms B-B for relatively common events. Use of the sandwich estimator of variance improves the precision of estimation for both models. We illustrate the two approaches by applying them to two published meta-analyses from the fields of orthopedic surgery and prevention of adverse drug reactions.
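The exact B-B marginal likelihood has a closed form: integrating the binomial likelihood over a Beta(a, b) random-effects distribution yields beta functions. The sketch below fits (a, b) by maximum likelihood with SciPy; the function names, the log-scale parameterization, and the Nelder-Mead optimizer are our choices, and the sandwich variance step is omitted.

```python
import numpy as np
from scipy.special import betaln
from scipy.optimize import minimize

def betabin_negloglik(params, events, n):
    """Negative log marginal likelihood of the beta-binomial model:
    x_i ~ Binomial(n_i, p_i) with p_i ~ Beta(a, b) integrated out exactly.
    The binomial coefficient is constant in (a, b) and is dropped."""
    a, b = np.exp(params)  # optimize on the log scale to keep a, b positive
    return -(betaln(events + a, n - events + b) - betaln(a, b)).sum()

def fit_betabin(events, n):
    """Fit (a, b) by maximum likelihood; return the pooled proportion a/(a+b)."""
    events, n = np.asarray(events, float), np.asarray(n, float)
    res = minimize(betabin_negloglik, x0=np.zeros(2), args=(events, n),
                   method="Nelder-Mead")
    a, b = np.exp(res.x)
    return a / (a + b)
```

Applied to a handful of hypothetical rare-event studies, the pooled proportion stays on the probability scale by construction, avoiding the normal approximation that motivates the comparison in the abstract.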
Systematic reviews of diagnostic tests often involve a mixture of case-control and cohort studies. The standard methods for evaluating diagnostic accuracy only focus on sensitivity and specificity and ignore the information on disease prevalence contained in cohort studies. Consequently, such methods cannot provide estimates of measures related to disease prevalence, such as population averaged or overall positive and negative predictive values, which reflect the clinical utility of a diagnostic test. In this paper, we propose a hybrid approach that jointly models the disease prevalence along with the diagnostic test sensitivity and specificity in cohort studies, and the sensitivity and specificity in case-control studies. In order to overcome the potential computational difficulties in the standard full likelihood inference of the proposed hybrid model, we propose an alternative inference procedure based on the composite likelihood. Such composite likelihood based inference does no...
Research Interests: Statistics and Prevalence
This paper describes the core features of the R package mmeta, which implements exact posterior inference for the odds ratio, relative risk, and risk difference given either a single 2 × 2 table or multiple 2 × 2 tables, when the risks within the same study are independent or correlated.
Diagnostic systematic review is a vital step in the evaluation of diagnostic technologies. In many applications, it involves pooling pairs of sensitivity and specificity of a dichotomized diagnostic test from multiple studies. We propose a composite likelihood (CL) method for bivariate meta-analysis in diagnostic systematic reviews. This method provides an alternative way to make inference on diagnostic measures such as sensitivity, specificity, likelihood ratios, and diagnostic odds ratio. Its main advantages over the standard likelihood method are the avoidance of the nonconvergence problem, which is nontrivial when the number of studies is relatively small, the computational simplicity, and some robustness to model misspecifications. Simulation studies show that the CL method maintains high relative efficiency compared to that of the standard likelihood method. We illustrate our method in a diagnostic review of the performance of contemporary diagnostic imaging technologies for d...
We have developed a statistical method named IsoDOT to assess differential isoform expression (DIE) and differential isoform usage (DIU) using RNA-seq data. Here isoform usage refers to relative isoform expression given the total expression of the corresponding gene. IsoDOT performs two tasks that cannot be accomplished by existing methods: to test DIE/DIU with respect to a continuous covariate, and to test DIE/DIU for one case versus one control. The latter task is not an uncommon situation in practice, e.g., comparing the paternal and maternal alleles of one individual or comparing tumor and normal samples of one cancer patient. Simulation studies demonstrate the high sensitivity and specificity of IsoDOT. We apply IsoDOT to study the effects of haloperidol treatment on the mouse transcriptome and identify a group of genes whose isoform usages respond to haloperidol treatment.
In a meta-analysis of diagnostic accuracy studies, the sensitivities and specificities of a diagnostic test may depend on the disease prevalence, since the severity and definition of disease may differ from study to study due to the design and the population considered. In this paper, we extend the bivariate nonlinear random effects model on sensitivities and specificities to jointly model the disease prevalence, sensitivities and specificities using trivariate nonlinear random-effects models. Furthermore, as an alternative parameterization, we also propose jointly modeling the test prevalence and the predictive values, which reflect the clinical utility of a diagnostic test. These models allow investigators to study the complex relationship among the disease prevalence, sensitivities and specificities, or among the test prevalence and the predictive values, which can reveal hidden information about test performance. We illustrate the two proposed approaches by reanalyzing data from a meta-analysis of radiological evaluation of lymph node metastases in patients with cervical cancer, and through a simulation study. The latter illustrates the importance of carefully choosing an appropriate normality assumption for the disease prevalence, sensitivities and specificities, or the test prevalence and the predictive values. In practice, it is recommended to use model selection techniques to identify the best-fitting model for making statistical inference. In summary, the proposed trivariate random effects models are novel and can be very useful in practice for meta-analysis of diagnostic accuracy studies.
Research Interests: Statistics, Statistical Analysis, Nonlinear dynamics, Magnetic Resonance Imaging, Biometry, Humans, Model Selection, Female, Meta Analysis, Radiography, Sensitivity, Public health systems and services research, Statistical models, Diagnostic Accuracy, Sensitivity and Specificity, Predictive value of tests, Diagnostic Tests, and Uterine Cervical Neoplasms
Often in randomized clinical trials and observational cohort studies, a non-negative continuously distributed response variable is measured in treatment and control groups. In the presence of true zeros for the response variable, a two-part zero-inflated log-normal model (which assumes that the data have a probability mass at zero and a continuous response for values greater than zero) is usually recommended. However, in some environmental health and human immunodeficiency virus (HIV) studies, quantitative assays for metabolites of toxicants, or quantitative HIV RNA measurements, are subject to left-censoring due to values falling below the limit of detection (LD). Here, a zero-inflated log-normal mixture model is often suggested, since true zeros are indistinguishable from left-censored values due to the LD. When the probabilities of true zeros in the two groups are not restricted to be equal, the information contributed by values falling below the LD is used only to estimate the probability of true zeros in the context of mixture distributions. We derive the required sample size to assess the effect of a treatment in the context of mixture models with equal and unequal variances, based on the left-truncated log-normal distribution. Methods for calculating statistical power are also presented. We calculate the required sample size and power for a recent study estimating the effect of oltipraz on reducing urinary levels of the hydroxylated metabolite aflatoxin M1 (AFM1) in a randomized, placebo-controlled, double-blind phase IIa chemoprevention trial in Qidong, China. A Monte Carlo simulation study is conducted to investigate the performance of the proposed methods.
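For orientation, the classical normal-theory two-sample formula that the mixture-model derivation generalizes can be written in a few lines. This is the uncensored building block only, not the paper's formula for left-truncated log-normal mixtures; the function name and defaults are ours.

```python
import math
from scipy.stats import norm

def sample_size_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Classical sample size per group for detecting a mean difference delta
    between two normal groups with common standard deviation sigma,
    using a two-sided level-alpha test at the given power."""
    z_a = norm.ppf(1 - alpha / 2)   # critical value for the two-sided test
    z_b = norm.ppf(power)           # quantile delivering the target power
    return math.ceil(2 * (z_a + z_b) ** 2 * sigma ** 2 / delta ** 2)
```

For a standardized effect of 0.5 at 80% power this gives the familiar 63 subjects per group; the censored-mixture derivation in the paper adjusts this kind of calculation to account for the mass at zero and the values falling below the LD.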
Research Interests: Statistics, Power, Humans, Computer Simulation, Statistical Power, Sample Size, Mixture models, Public health systems and services research, Drug evaluation, Food Contamination, Statistical models, Mixture Distribution, Sensitivity and Specificity, Monte Carlo Method, Cohort Studies, and Randomized Controlled Trials as Topic
To evaluate the probabilities of a disease state, ideally all subjects in a study should be diagnosed by a definitive diagnostic or gold standard test. However, since definitive diagnostic tests are often invasive and expensive, it is generally unethical to apply them to subjects whose screening tests are negative. In this article, we consider latent class models for screening studies with two imperfect binary diagnostic tests and a definitive categorical disease status measured only for those with at least one positive screening test. Specifically, we discuss one conditional-independence model and three homogeneous conditional-dependence latent class models, and assess the impact of misspecification of the dependence structure on the estimation of disease category probabilities using frequentist and Bayesian approaches. Interestingly, the three homogeneous-dependence models can provide identical goodness-of-fit but substantively different estimates for a given study. However, the parametric form of the assumed dependence structure itself is not 'testable' from the data, and thus the dependence structure modeling considered here can only be viewed as a sensitivity analysis concerning a more complicated non-identifiable model potentially involving a heterogeneous dependence structure. Furthermore, we discuss Bayesian model averaging, together with its limitations, as an alternative way to partially address this particularly challenging problem. The methods are applied to two cancer screening studies, and simulations are conducted to evaluate the performance of these methods. In summary, further research is needed to reduce the impact of model misspecification on the estimation of disease prevalence in such settings.
Research Interests: Statistics, Screening, Bayesian Inference, Humans, Computer Simulation, Maximum Likelihood, Female, Mammography, Prevalence, Middle Aged, Adult, Public health systems and services research, Statistical models, Diagnostic Test, Bayes Theorem, Breast Neoplasms, Latent class model, and Uterine Cervical Neoplasms
Likelihood-based approaches, which naturally incorporate left censoring due to a limit of detection, are commonly used to analyze censored multivariate normal data. However, the maximum likelihood estimator (MLE) typically underestimates variance parameters. The restricted maximum likelihood estimator (REML), which corrects this underestimation, cannot be easily extended to censored multivariate normal data. In light of the connection between REML and a Bayesian approach discovered in 1974 by Harville, this paper describes a Bayesian approach to censored multivariate normal data. The approach is justified through its link to REML via Laplace's approximation, and its performance is evaluated through a simulation study. We consider the Bayesian approach a valuable alternative because it yields less biased variance parameter estimates than the MLE, and because a sound REML extension is technically difficult when data are left censored.
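The bias that motivates this paper is easiest to see in the simplest uncensored univariate case: the MLE of a normal variance divides by n, while the REML-type estimator divides by n - 1. The two estimators below illustrate only that contrast; the paper's setting, censored multivariate normal data, is far more involved.

```python
import numpy as np

def var_mle(x):
    """Maximum likelihood variance estimator: divides by n, so it is biased
    downward, with E[var_mle] = (n - 1) / n * sigma^2."""
    x = np.asarray(x, float)
    return ((x - x.mean()) ** 2).sum() / len(x)

def var_reml(x):
    """REML-type (unbiased) estimator: divides by n - 1, correcting the
    downward bias that motivates the Bayesian alternative above."""
    x = np.asarray(x, float)
    return ((x - x.mean()) ** 2).sum() / (len(x) - 1)
```

For the data 1, 2, 3, 4 the MLE gives 1.25 while the unbiased estimator gives 5/3; the gap is exactly the (n - 1)/n factor.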
Research Interests: Algorithms, Statistics, Probability, Multivariate Analysis, Muscle strength, Computer Simulation, Mice, Animals, Bayesian methods, Duchenne Muscular Dystrophy, Public health systems and services research, Statistical models, Bayes Theorem, Multivariate Normal Distribution, Likelihood Functions, Statistical Distributions, Bayesian approach, and Censoring
To account for between-study heterogeneity in meta-analysis of diagnostic accuracy studies, bivariate random effects models have been recommended to jointly model the sensitivities and specificities. As study design and population vary, the definition of disease status or severity could differ across studies. Consequently, sensitivity and specificity may be correlated with disease prevalence. To account for this dependence, a trivariate random effects model has been proposed. However, that approach can only include cohort studies, which supply the information needed to estimate study-specific disease prevalence. In addition, some diagnostic accuracy studies select only a subset of samples to be verified by the reference test. It is known that ignoring unverified subjects may lead to partial verification bias in the estimation of prevalence, sensitivities, and specificities in a single study, but the impact of this bias on a meta-analysis has not been investigated. In this paper, we propose a novel hybrid Bayesian hierarchical model that combines cohort and case-control studies while correcting for partial verification bias. We investigate the performance of the proposed methods through a set of simulation studies. Two case studies are presented, assessing the diagnostic accuracy of gadolinium-enhanced magnetic resonance imaging in detecting lymph node metastases and of adrenal fluorine-18 fluorodeoxyglucose positron emission tomography in characterizing adrenal masses.
Melanoma cell lines and normal human melanocytes (NHM) were assayed for p53-dependent G1 checkpoint response to ionizing radiation (IR)-induced DNA damage. Sixty-six percent of melanoma cell lines displayed a defective G1 checkpoint. Checkpoint function was correlated with sensitivity to IR with checkpoint-defective lines being radio-resistant. Microarray analysis identified 316 probes whose expression was correlated with G1 checkpoint function in melanoma lines (P≤0.007) including p53 transactivation targets CDKN1A, DDB2, and RRM2B. The 316 probe list predicted G1 checkpoint function of the melanoma lines with 86% accuracy using a binary analysis and 91% accuracy using a continuous analysis. When applied to microarray data from primary melanomas, the 316 probe list was prognostic of 4-yr distant metastasis-free survival. Thus, p53 function, radio-sensitivity, and metastatic spread may be estimated in melanomas from a signature of gene expression.
Research Interests: Medicine, Humans, Infection Control, New, Regression Analysis, Michigan, Cohort Study, Incidence, Oak, Intensive Care Unit, Adult, New England, Bacteremia, Intensive Care Units, Poisson Distribution, New England Journal of Medicine, Confidence Interval, Cohort Studies, Nosocomial infection, and Inservice training
This paper deals with the problem of estimating the Pearson correlation coefficient when one variable is subject to left or right censoring. In parallel to the classical results on the Pearson correlation coefficient, we derive a workable formula, through tedious computation and intensive simplification, for the asymptotic variances of the maximum likelihood estimators in two cases: (1) known means and variances, and (2) unknown means and variances. We illustrate the usefulness of the asymptotic results in experimental designs.
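Case (1), known means and variances, can be sketched directly. For a standard bivariate normal pair where y is observed only when it exceeds a detection limit c, each censored pair contributes the marginal density of x times P(y < c | x), and the only free parameter is the correlation rho. The code below is our illustrative likelihood, not the paper's derivation, and it estimates rho numerically rather than reproducing the asymptotic variance formulas the paper derives.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize_scalar

def negloglik_rho(rho, x, y, c):
    """Negative log-likelihood for the correlation rho of a standard bivariate
    normal pair (known zero means, unit variances) when y is left-censored
    at c, i.e. censored observations are known only to satisfy y < c."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    obs = y >= c
    s = np.sqrt(1 - rho ** 2)
    # fully observed pairs: marginal of x times conditional of y given x
    ll = norm.logpdf(x[obs]).sum()
    ll += norm.logpdf(y[obs], loc=rho * x[obs], scale=s).sum()
    # censored pairs: marginal of x times P(y < c | x)
    ll += norm.logpdf(x[~obs]).sum()
    ll += norm.logcdf((c - rho * x[~obs]) / s).sum()
    return -ll

def fit_rho(x, y, c):
    """Maximize the censored likelihood over rho in (-1, 1)."""
    res = minimize_scalar(negloglik_rho, bounds=(-0.99, 0.99), args=(x, y, c),
                          method="bounded")
    return res.x
```

The censored pairs still carry information about rho through P(y < c | x), which is why the MLE outperforms naive approaches that simply discard them.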
A marginal approach and a variance-component mixed effect model approach (here called a conditional approach) are commonly used to analyze variables that are subject to a limit of detection. We examine the theoretical relationship between these two approaches, investigate their numerical performance, and make recommendations based on our results. The marginal approach is recommended for bivariate normal variables, while the variance-component mixed effect model is preferable for other multivariate analyses in most circumstances. The two approaches are illustrated through a case study from a preclinical experiment.
The traditional fixed margin approach to evaluating an experimental treatment through an active-controlled noninferiority trial is simple and straightforward. However, its utility relies heavily on the constancy assumption for the experimental data. The recently developed covariate-adjustment method permits more flexibility and improved discriminatory capacity compared with the fixed margin approach. However, one major limitation of the covariate-adjustment methodology is its reliance on patient-level data, which may not be accessible to investigators in practice. In this article, under some assumptions, we examine the feasibility of a partial covariate-adjustment approach based on data typically available from journal publications or other public sources when patient-level data are unavailable. We illustrate the usefulness of this approach through two real examples. We also provide design considerations regarding the efficiency of the partial covariate-adjustment approach.
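The fixed margin approach the paper starts from reduces to a one-line rule: declare noninferiority when the lower confidence bound for the experimental-minus-active difference stays above the negated margin. A minimal sketch, with the function name and inputs (point estimate, standard error, margin) chosen by us:

```python
def noninferior(diff, se, margin, alpha=0.05):
    """Fixed-margin rule sketch: declare noninferiority when the lower
    bound of the two-sided (1 - alpha) Wald confidence interval for the
    experimental-minus-active difference lies above -margin."""
    z = 1.959963984540054  # Phi^{-1}(0.975), hard-coded to stay stdlib-only
    lower = diff - z * se
    return lower > -margin
```

An estimated difference of 0 with standard error 0.02 clears a margin of 0.1, while an estimated difference of -0.08 does not, since its lower bound falls below -0.1. The constancy assumption enters through the choice of margin, which is where the covariate-adjustment ideas discussed above come in.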
Research Interests: Humans, Hospital costs, United States, Female, Male, Follow-up studies, Cost effectiveness, Myocardial Revascularization, Aged, Middle Aged, Quality adjusted life years, Myocardial Infarction, Cost Benefit Analysis, Health Care Costs, JAMA, Unstable Angina, Tyrosine, and Conservation strategies
To examine urban and rural variation in walking patterns and pedestrian crashes, the rates of pedestrians being struck by motor vehicles were estimated according to miles walked and resident years, using data on 35 732 pedestrians struck by vehicles in New York State, USA, during 2001 through 2002. The outcome measures were the adjusted rate ratios (aRR) of pedestrian-vehicle crash and pedestrian injury based on resident years and miles walked, according to urban and rural areas. Compared with rural areas, the aRR for a pedestrian-vehicle collision, based on resident years, was 2.0 (95% CI 1.7 to 2.3) in small urban areas, 1.8 (95% CI 1.5 to 2.3) in mid-size urban areas, and 4.2 (95% CI 3.6 to 4.8) in the large urban area. The aRR based on miles walked was 2.3 (95% CI 1.6 to 3.2) in small urban areas, 2.0 (95% CI 1.4 to 2.9) in mid-size urban areas, and 1.9 (95% CI 1.4 to 2.7) in the large urban area. The aRR for a fatal pedestrian injury, based on miles walked, was 2.1 (95% CI 1.3 to 3.6) in small urban areas, 1.9 (95% CI 1.3 to 2.9) in mid-size urban areas, and 0.9 (95% CI 0.6 to 1.3) in the large urban area. The rate of pedestrian crashes and injuries in small and mid-size urban areas was twice that in rural areas, whether based on resident years or miles walked. The high rate of pedestrian crashes in the large urban area based on resident years could be partly explained by the fact that residents there walk about twice as much as residents in rural areas. The rate of fatal pedestrian injury based on miles walked was similar in the large urban area and rural areas.
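The crude version of the rate ratios above is simple to reproduce: divide the two event rates and attach a log-scale Wald interval. This sketch is the unadjusted building block only; the adjusted rate ratios (aRR) in the study would require a regression model (e.g. Poisson) over covariates, and the function name and interface here are ours.

```python
import math

def rate_ratio_ci(events1, persontime1, events0, persontime0, z=1.96):
    """Crude rate ratio comparing group 1 to group 0, with an approximate
    95% CI built on the log scale, where the standard error of log(RR)
    is sqrt(1/events1 + 1/events0)."""
    rr = (events1 / persontime1) / (events0 / persontime0)
    se = math.sqrt(1.0 / events1 + 1.0 / events0)
    lo = rr * math.exp(-z * se)
    hi = rr * math.exp(z * se)
    return rr, lo, hi
```

For 100 events over 1000 person-years versus 50 events over 1000 person-years, the rate ratio is 2.0 with a 95% CI of roughly 1.42 to 2.80. Swapping person-years for miles walked as the denominator is what moves between the two exposure bases reported above.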