Abstract
Data containing many zeros are common in statistical applications such as survey data. A confidence interval based on the traditional normal approximation may have poor coverage probability, especially when the nonzero values are highly skewed and the sample size is small or only moderately large. The empirical likelihood (EL), a powerful nonparametric method, has been proposed for constructing confidence intervals in this setting. However, the traditional EL suffers from under-coverage: the coverage probability of EL-based confidence intervals falls below the nominal level, especially for small samples. In this paper, we investigate the numerical performance of three modified versions of the EL, namely the adjusted empirical likelihood, the transformed empirical likelihood, and the transformed adjusted empirical likelihood, for data with various sample sizes and various proportions of zero values. The asymptotic distributions of the likelihood-type statistics are established as the standard chi-square distribution. Simulations are conducted to compare the coverage probabilities of these methods with those of other existing methods under different distributions. A real data example is given to illustrate the procedure for constructing confidence intervals.
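
As a rough illustration of the kind of interval discussed above, the following Python sketch computes a plain EL confidence interval for a mean and an adjusted EL variant, both calibrated by the chi-square distribution with one degree of freedom. It is not the authors' procedure: the function names `el_log_ratio` and `el_interval`, the grid search, and the sample values are hypothetical, and the adjustment level max(1, log(n)/2) is assumed here as a common choice in the adjusted EL literature rather than taken from the abstract. The transformed variants are not shown.

```python
import math

# 0.95 quantile of the chi-square distribution with 1 degree of freedom.
CHI2_1_95 = 3.841458820694124


def el_log_ratio(values, mu, adjusted=False):
    """Return -2 log empirical likelihood ratio for the mean at mu."""
    g = [x - mu for x in values]
    n = len(g)
    if adjusted:
        # Assumed adjustment: add one pseudo-observation -a_n * mean(g),
        # with a_n = max(1, log(n) / 2).
        a_n = max(1.0, math.log(n) / 2.0)
        g.append(-a_n * sum(g) / n)
    if all(v == 0.0 for v in g):
        return 0.0
    lo_g, hi_g = min(g), max(g)
    if lo_g >= 0.0 or hi_g <= 0.0:
        # mu lies outside the convex hull of the data: EL is undefined.
        return math.inf
    # The multiplier must keep every weight positive: 1 + lam * g_i > 0.
    lam_lo = (-1.0 / hi_g) * (1.0 - 1e-12)
    lam_hi = (-1.0 / lo_g) * (1.0 - 1e-12)
    # The score is strictly decreasing in lam, so bisection finds its root.
    for _ in range(200):
        lam = 0.5 * (lam_lo + lam_hi)
        score = sum(v / (1.0 + lam * v) for v in g)
        if score > 0.0:
            lam_lo = lam
        else:
            lam_hi = lam
    lam = 0.5 * (lam_lo + lam_hi)
    return 2.0 * sum(math.log1p(lam * v) for v in g)


def el_interval(values, adjusted=False, crit=CHI2_1_95, grid=2000):
    """Scan a grid inside the data range; keep mu with statistic <= crit."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / grid
    inside = [
        lo + k * step
        for k in range(1, grid)
        if el_log_ratio(values, lo + k * step, adjusted) <= crit
    ]
    return inside[0], inside[-1]


# Made-up sample: 14 zeros and 6 skewed positive values.
sample = [0.0] * 14 + [1.2, 3.5, 0.7, 8.1, 2.4, 15.3]
print("EL interval: ", el_interval(sample))
print("AEL interval:", el_interval(sample, adjusted=True))
```

The grid search is restricted to the range of the data for simplicity. The adjusted statistic is also defined outside that range, so a fuller implementation would search beyond it.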







Acknowledgements
The authors would like to thank three anonymous referees and the Associate Editor for their constructive comments and suggestions which helped to improve this manuscript significantly.
Cite this article
Stewart, P., Ning, W. Modified empirical likelihood-based confidence intervals for data containing many zero observations. Comput Stat 35, 2019–2042 (2020). https://doi.org/10.1007/s00180-020-00993-1
DOI: https://doi.org/10.1007/s00180-020-00993-1