Daniel Jeske

    Using general results available in the literature, we derive the likelihood ratio test for a particular partial ordering of means that naturally arises in a biological context. We then show that the conceptual and computational complexity of the derivation can be substantially reduced by equivalently deriving the test using the intersection-union principle for decomposing a complex null hypothesis into elemental forms. A Monte Carlo algorithm for obtaining the p-value of the test is proposed. The test procedure is illustrated with a data set of the competitive ability of several cowpea genotypes, where previous experiments have indicated the proposed partial order of the means. A simulation study is used to examine the power of the test.
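    A minimal sketch of the intersection-union idea described above, under an assumed partial order in which one group mean exceeds all the others: each elemental one-sided comparison is tested separately, and the composite p-value is the largest elemental p-value, so the composite null is rejected only if every elemental test rejects. The paper's actual partial order, likelihood ratio statistic, and Monte Carlo p-value algorithm are more involved; the function names and Welch t-tests below are illustrative choices, not the authors' procedure.

        # Intersection-union test (IUT) sketch: H1 says group 0 dominates every
        # other group; H0 is the union of the elemental complements mu_0 <= mu_j.
        import numpy as np
        from scipy import stats

        def iut_pvalue(groups):
            """groups: list of 1-d arrays; groups[0] is hypothesized to dominate."""
            ref = groups[0]
            pvals = []
            for g in groups[1:]:
                # one-sided Welch test of H0: mu_0 <= mu_j vs H1: mu_0 > mu_j
                _, p = stats.ttest_ind(ref, g, equal_var=False, alternative="greater")
                pvals.append(p)
            return max(pvals)  # reject the composite H0 only if all elemental tests reject

        rng = np.random.default_rng(1)
        data = [rng.normal(loc=m, scale=1.0, size=20) for m in (1.0, 0.2, 0.4, 0.1)]
        print("IUT p-value:", iut_pvalue(data))
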
    ABSTRACT This paper outlines a systematic data mining procedure for exploring large free-style text datasets to discover useful features and develop tracking statistics, generally referred to as performance measures or risk indicators. The procedure includes text mining, risk analysis, classification for error measurements, and nonparametric multivariate analysis. Two aviation safety report repositories, PTRS from the FAA and AAS from the NTSB, will be used to illustrate applications of our research to aviation risk management and general decision-support systems. Some specific text analysis methodologies and tracking statistics will be discussed. Approaches to incorporating misclassified data or error measurements into tracking statistics will be discussed as well.
    ... Many change-point algorithms have been proposed in the literature and used in practice. (Basseville and Nikiforov, 1993), (Lai, 1995) and (Chen and Gupta, 2001) are useful references that describe in some detail a number of alternative change-point algorithms. ...
    ... The work in this paper improves the development of a recent proposal (Jeske et al. ...). This section reviews the original motivation, origin, and usefulness of the iterative proportional fitting (IPF) algorithm (Deming and Stephan 1940). ...
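    Since the snippet above centers on the iterative proportional fitting algorithm of Deming and Stephan (1940), a small illustrative sketch of the classical two-way version may help: a seed table is alternately rescaled so that its row and column margins match prescribed totals. This is the textbook algorithm, not the specific refinement developed in the paper.

        # Iterative proportional fitting (IPF) for a two-way table: alternately
        # rescale rows and columns of a seed table until its margins match the
        # prescribed row and column totals (which must share the same grand total).
        import numpy as np

        def ipf(seed, row_targets, col_targets, tol=1e-10, max_iter=1000):
            table = seed.astype(float).copy()
            for _ in range(max_iter):
                table *= (row_targets / table.sum(axis=1))[:, None]   # fit row margins
                table *= (col_targets / table.sum(axis=0))[None, :]   # fit column margins
                if np.allclose(table.sum(axis=1), row_targets, atol=tol):
                    break
            return table

        seed = np.ones((2, 3))
        print(ipf(seed, row_targets=np.array([40.0, 60.0]),
                  col_targets=np.array([20.0, 30.0, 50.0])))
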
    Reference samples are frequently used to estimate in‐control parameters, which are then used as the true in‐control parameters during the monitoring phase of Statistical Process Control (SPC) applications. The SPC literature has recognized that even small errors in parameter estimates determined from reference samples can have a large impact on the conditional (given the values of the estimated parameters) in‐control average run length. However, there is little quantitative guidance on how large the reference sample should be to minimize this impact. In this paper, under the context of a recently developed Cumulative Sum (CUSUM) designed to detect translations in exponential distributions, a reference sample size formula for controlling relative error of the conditional in‐control average run length is derived. The result in this paper is a stepping stone for reference sample size formulas in more general settings. Copyright © 2016 John Wiley & Sons, Ltd.
    We introduce the concept of proxy failure times for situations where system test data only consists of the fraction of test cases that fail for a set of execution scenarios. We show how proxy failure times can be simulated if external information about the user frequency of the test cases is available. We develop statistical inference procedures for fitting the
    ABSTRACT When teaching regression classes, real-life examples help emphasize the importance of understanding the theoretical concepts behind the methodologies. This can be appreciated after a little reflection on the difficulty of constructing novel questions in regression that test on concepts rather than mere calculations. Interdisciplinary collaborations can be fertile contexts for questions of this type. In this article, we offer a case study that students will find: (1) practical with respect to the question being addressed, (2) compelling in the way it shows how a solid understanding of theory helps answer the question, and (3) enlightening in the way it shows how statisticians contribute to problem solving in interdisciplinary environments. Supplementary materials for this article are available online.
    ABSTRACT We develop truncated sequential probability ratio test (SPRT) procedures for multivariate normal data. The framework includes a general cost structure and arbitrary mean and covariance structures. The truncated SPRT solutions have a practical and easy-to-use decision boundary representation. In the homogeneous case, a very fast recursive algorithm is presented for calculating the decision boundaries. Misclassification rates and expected sample size are investigated and the results are compared with a nonsequential procedure. A real-life data set on kidney dysfunction following heart surgery is used to illustrate the truncated SPRT procedure.
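    As a rough illustration of the truncated SPRT setting described above, the sketch below accumulates the multivariate normal log likelihood ratio for two candidate mean vectors with a known common covariance and compares it to Wald-style boundaries, forcing a decision at a maximum sample size. The paper's cost-based formulation and recursive boundary algorithm are not reproduced here; the error rates, boundary choice, and truncation rule below are simplified assumptions.

        # Truncated SPRT sketch for i.i.d. multivariate normal data with known
        # common covariance and candidate means mu0 (H0) and mu1 (H1).
        import numpy as np

        def truncated_sprt(xs, mu0, mu1, cov, alpha=0.05, beta=0.05, n_max=25):
            cov_inv = np.linalg.inv(cov)
            upper = np.log((1 - beta) / alpha)    # accept H1 above this boundary
            lower = np.log(beta / (1 - alpha))    # accept H0 below this boundary
            llr = 0.0
            for n, x in enumerate(xs, start=1):
                # log f1(x)/f0(x) for normal densities sharing the covariance matrix
                llr += (x - 0.5 * (mu0 + mu1)) @ cov_inv @ (mu1 - mu0)
                if llr >= upper:
                    return "accept H1", n
                if llr <= lower:
                    return "accept H0", n
                if n == n_max:                    # truncation: force a decision
                    return ("accept H1" if llr > 0 else "accept H0"), n

        rng = np.random.default_rng(3)
        mu0, mu1, cov = np.zeros(2), np.array([0.8, 0.5]), np.eye(2)
        stream = (rng.multivariate_normal(mu1, cov) for _ in range(25))
        print(truncated_sprt(stream, mu0, mu1, cov))
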
    This paper proposes a statistical method for identifying high‐density regions of pests, so‐called hot spots, within an orchard. Our method uses scanning windows to search for clusters of high counts within the sampled data. The proposed method enables a localized alternative for treatment that could be faster, less costly, and more environmentally friendly. R code that implements the hot spot identification method is provided as online supplementary material. The method is illustrated through simulated examples and real data on counts of cottony cushion scales from an orchard.
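    The paper's own R implementation is referenced above; purely to illustrate the scanning-window idea, the Python sketch below scores every w-by-w window of a count grid by its total and flags windows whose totals exceed a Monte Carlo threshold computed under a simple homogeneity null (counts scattered uniformly over the grid). The window size, null model, and threshold rule here are assumptions, not the method of the paper.

        # Scanning-window hot spot sketch on a grid of pest counts.
        import numpy as np

        def hot_spots(counts, w=3, n_sim=999, alpha=0.05, rng=None):
            rng = rng or np.random.default_rng()
            nr, nc = counts.shape

            def window_sums(grid):
                return np.array([grid[i:i + w, j:j + w].sum()
                                 for i in range(nr - w + 1)
                                 for j in range(nc - w + 1)])

            # null distribution of the maximum window total: counts redistributed
            # uniformly (multinomially) over the grid cells
            total = int(counts.sum())
            cells = nr * nc
            null_max = np.array([window_sums(rng.multinomial(total,
                                 np.full(cells, 1.0 / cells)).reshape(nr, nc)).max()
                                 for _ in range(n_sim)])
            threshold = np.quantile(null_max, 1 - alpha)

            observed = window_sums(counts)
            windows = [(i, j) for i in range(nr - w + 1) for j in range(nc - w + 1)]
            return [(i, j, s) for (i, j), s in zip(windows, observed) if s > threshold]

        rng = np.random.default_rng(6)
        grid = rng.poisson(1.0, size=(12, 12))
        grid[4:7, 4:7] += rng.poisson(4.0, size=(3, 3))   # implant a hypothetical hot spot
        print(hot_spots(grid, rng=rng))
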
    Capitalization, sharing positive personal information in a relationship, has gained considerable attention for benefitting the discloser of positive information. However, this study is the first to examine the effects of capitalization on the listener who celebrates the news, the celebrator. Thirty-nine college students participated in a daily diary capitalization intervention for 4 weeks in which every other week they celebrated capitalization. In diaries, participants reported experiencing more positive emotions when celebrating than when not celebrating. The discloser’s positive reaction to the celebration mediated the relationship between the number of celebrations and the celebrator’s higher positive emotions. In addition, more celebrations per day and feeling more authentic during celebrations predicted higher positive emotions. However, the perceived closeness of the relationship between the discloser and the celebrator was not associated with the effect of celebrating. Impli...
    Three methods to construct prediction intervals in a generalized linear mixed model (GLMM) are the methods based on pseudo-likelihood, Laplace, and Quadrature approximations. All three of these methods are available in the SAS procedure GLIMMIX. The pseudo-likelihood method involves approximate linearization of the GLMM into a linear mixed model (LMM) framework, and the other two methods utilize approximate conditional mean squared error (MSE) formulas for the empirical best predictor (eBP). We propose a new method based on the unconditional MSE of the eBP that works entirely within the GLMM context, and we confront the inherent computational challenges by proposing a Monte Carlo algorithm to evaluate plug-in estimators of the unconditional MSE. For three illustrative examples, the negative binomial, Poisson, and Bernoulli GLMMs, numerical results show that our prediction interval methodology improves the coverage probability over the three methods available in GLIMMIX.
    A mixture of a distribution of responses from untreated patients and a shift of that distribution is a useful model for the responses from a group of treated patients. The mixture model accounts for the fact that not all the patients in the treated group will respond to the treatment and consequently their responses follow the same distribution as the responses from untreated patients. The treatment effect in this context consists of both the fraction of the treated patients that are responders and the magnitude of the shift in the distribution for the responders. In this paper, we investigate properties of the method of moment estimators for the treatment effect and demonstrate their usefulness for obtaining approximate confidence intervals without any parametric assumptions about the distribution of responses.
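    To make the moment idea concrete: if treated responses follow the mixture (1 - p)F + pF(· - Δ), the treated-minus-untreated mean difference estimates pΔ and the variance difference estimates p(1 - p)Δ², and matching those two moments yields closed-form estimators. The sketch below implements that simple version; the paper's exact estimators and nonparametric confidence intervals may differ in detail, and the variable names and simulated data are illustrative.

        # Method-of-moments sketch for the shifted-mixture treatment-effect model.
        import numpy as np

        def moment_estimates(untreated, treated):
            d = np.mean(treated) - np.mean(untreated)                   # estimates p * delta
            v = np.var(treated, ddof=1) - np.var(untreated, ddof=1)     # estimates p*(1-p)*delta^2
            p_hat = d**2 / (d**2 + v)          # from (1 - p)/p = v/d^2 (assumes v > 0)
            delta_hat = d / p_hat
            return p_hat, delta_hat

        rng = np.random.default_rng(4)
        untreated = rng.normal(0.0, 1.0, 200)
        responders = rng.random(200) < 0.6                  # true responder fraction p = 0.6
        treated = rng.normal(0.0, 1.0, 200) + np.where(responders, 2.0, 0.0)  # true shift = 2
        print(moment_estimates(untreated, treated))
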
    e16593 Background: A number of methods have been proposed for the prediction of early biochemical recurrence, since recurrence within 2.5-3 yrs after surgery is associated with higher risk of prostate...
    Continuous sampling plans are used to ensure a high level of quality for items produced in long-run contexts. The basic idea of these plans is to alternate between 100% inspection and a reduced rate of inspection frequency. Any inspected item that is found to be defective is replaced with a non-defective item. Because not all items are inspected, some defective items will escape to the customer. Analytical formulas have been developed that measure both the customer perceived quality and also the level of inspection effort. The analysis of continuous sampling plans does not apply to short-run contexts, where only a finite-size batch of items is to be produced. In this paper, a simulation algorithm is designed and implemented to analyze the customer perceived quality and the level of inspection effort for short-run contexts. A parameter representing the effectiveness of the test used during inspection is introduced to the analysis, and an analytical approximation is discussed. An appl...
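    The simulation algorithm itself is not reproduced above, but the short sketch below conveys the flavor of simulating a CSP-1-style plan over a finite batch: inspect every item until i consecutive conforming items are cleared, then sample at rate f, replace any detected defective, and record both the inspection effort and the fraction of defectives that escape. The clearance number i, sampling rate f, and test-effectiveness parameter e are hypothetical stand-ins, not the paper's notation.

        # Short-run continuous sampling plan (CSP-1 style) simulation sketch.
        import numpy as np

        def short_run_csp(batch_size, p_defect, i=20, f=0.1, e=1.0, rng=None):
            rng = rng or np.random.default_rng()
            inspected = escaped = clear_count = 0
            full_inspection = True
            for _ in range(batch_size):
                defective = rng.random() < p_defect
                inspect = full_inspection or (rng.random() < f)
                if inspect:
                    inspected += 1
                    caught = defective and (rng.random() < e)   # imperfect test misses some
                    if caught:
                        defective = False                       # replaced with a good item
                        full_inspection, clear_count = True, 0
                    elif full_inspection:
                        clear_count += 1
                        if clear_count >= i:
                            full_inspection = False             # switch to sampling inspection
                if defective:
                    escaped += 1                                # defective reaches the customer
            return inspected / batch_size, escaped / batch_size

        print(short_run_csp(500, p_defect=0.02, rng=np.random.default_rng(5)))
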
    When the potential for making accurate classifications with a statistical classifier is limited, a neutral zone classifier can be constructed by adding a no-decision option as a classification outcome. We show how a neutral zone classifier can be constructed from a receiver operating characteristic (ROC) curve. We extend the ROC curve graphic to highlight important performance characteristics of a neutral zone classifier. Additional utility of neutral zone classifiers is illustrated by showing how they can be incorporated into the first stage of a two-stage classification process. At the first stage, a classification is attempted from easily collected or inexpensive features. If the classification falls into the neutral zone, additional, relatively more expensive features can be obtained and used to make a definitive classification at the second stage. The methods discussed in the paper are illustrated with an application pertaining to prostate cancer.
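    The core mechanism described above is simply a pair of thresholds on a classification score: below the lower threshold classify as negative, above the upper threshold classify as positive, and otherwise return no decision (which, in the two-stage setting, triggers collection of the more expensive features). The sketch below shows that mechanism with placeholder thresholds; in the paper they would come from the ROC-based analysis.

        # Minimal neutral-zone classifier: two score thresholds and a no-decision band.
        import numpy as np

        def neutral_zone_classify(scores, t_low, t_high):
            scores = np.asarray(scores, dtype=float)
            labels = np.full(scores.shape, "no decision", dtype=object)
            labels[scores <= t_low] = "negative"
            labels[scores >= t_high] = "positive"
            return labels

        print(neutral_zone_classify([0.10, 0.45, 0.55, 0.90], t_low=0.3, t_high=0.7))
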
    Studies of spatiotemporal dynamics are central to efforts to characterize the epidemiology of infectious disease, such as mechanism of pathogen spread and pathogen or vector sources in the landscape, and are critical to the development of effective disease management programs. To that end, we conducted a multi-year study of 20 vineyard blocks in coastal northern California to relate the dynamics of a mealybug vector, Pseudococcus maritimus (Ehrhorn) (Hemiptera: Pseudococcidae), to incidence of grapevine leafroll disease (GLD). In each vineyard block, a subset of vines were scored visually for relative mealybug abundance, disease was quantified by visual assessment, and virus presence was verified using standard laboratory molecular assays. GLD incidence was analyzed with a classification and regression tree, and with a hierarchical model that also captured variability among blocks and heterogeneity within blocks. Both analyses found strong interannual variability in incidence, with ...
    Early biochemical recurrence after prostate cancer surgery is associated with higher risk of aggressive disease and cancer specific death. Many new tests are being developed that will predict the presence of indicators of aggressive disease like early biochemical recurrence. Since recurrence occurs in less than 10% of patients treated for prostate cancer, validation of such tests will require expensive testing on large patient groups. Moreover, clinical application of the validated test requires that each new patient be tested. In this report we introduce a two-stage classifier system that minimizes the number of patients that must be tested in both the validation and clinical application of any new test for recurrence. Expressed prostatic secretion specimens were prospectively collected from 450 patients prior to robot-assisted radical prostatectomy for prostate cancer. Patients were followed for 2.5 years for evidence of biochemical recurrence. Standard clinical parameters, the le...
    The question considered in this paper is how large an in-control reference sample needs to be in order to control the effects of using estimated parameters in a normal-theory cumulative sum (CUSUM) tracking statistic. Previous research has demonstrated the effect of estimation errors on the conditional in-control average run length of the CUSUM. The contributions of this paper are simple analytical tools that determine the required reference sample size needed to ensure probabilistic control of the relative error of the conditional in-control average run length. The availability of these tools rounds out the design phase of the CUSUM by enabling a practical procedure for determining the needed size of the reference sample. Copyright © 2016 John Wiley & Sons, Ltd.
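    The quantity being controlled above, the conditional in-control average run length given the estimated parameters, can be illustrated with a small Monte Carlo sketch: estimate the in-control mean and standard deviation from a reference sample of size m, run a one-sided normal-theory CUSUM on in-control data as if those estimates were the truth, and average the run lengths. The reference value k, decision limit h, and sample size m below are illustrative; the paper's contribution is an analytical formula for choosing m, which is not reproduced here.

        # Monte Carlo sketch of the conditional in-control ARL of a one-sided
        # normal-theory CUSUM when the in-control parameters are estimated from
        # a reference sample.
        import numpy as np

        def run_length(mu_hat, sigma_hat, k=0.5, h=5.0, rng=None, max_n=100000):
            rng = rng or np.random.default_rng()
            c = 0.0
            for n in range(1, max_n + 1):
                x = rng.normal(0.0, 1.0)                        # true in-control N(0, 1) data
                c = max(0.0, c + (x - mu_hat) / sigma_hat - k)  # CUSUM with plug-in estimates
                if c > h:
                    return n
            return max_n

        rng = np.random.default_rng(2)
        m = 200                                                 # hypothetical reference sample size
        reference = rng.normal(0.0, 1.0, size=m)
        mu_hat, sigma_hat = reference.mean(), reference.std(ddof=1)
        arl = np.mean([run_length(mu_hat, sigma_hat, rng=rng) for _ in range(500)])
        print("conditional in-control ARL estimate:", arl)
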
