Daniel Fernández

SUNY: University at Albany, Epidemiology and Biostatistics, Faculty Member

Phone: +1 9175074618

Papers by Daniel Fernández

A goodness-of-fit test for the ordered stereotype model

This paper presents a new goodness-of-fit test for an ordered stereotype model used for an ordinal response variable. The proposed test is based on the well-known Hosmer–Lemeshow test and its version for the proportional odds regression model. The latter test statistic is calculated from a grouping scheme which assumes that the levels of the ordinal response are equally spaced, which might not be true. One of the main advantages of the ordered stereotype model is that it allows us to determine a new uneven spacing of the ordinal response categories, dictated by the data. The proposed test makes use of this new adjusted spacing to partition the data. A simulation study shows good performance of the proposed test under a variety of scenarios. Finally, the results of the application in two examples are presented.
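
For reference, the ordered stereotype model is commonly written in the form below (Anderson's formulation). The notation is an assumption made here for illustration and is not taken from the abstract: q is the number of ordinal categories, the first category is the baseline, and the score parameters \phi_k are what determine the data-driven spacing between categories.

```latex
\[
\log\frac{P(Y = k \mid \mathbf{x})}{P(Y = 1 \mid \mathbf{x})}
  = \mu_k + \phi_k\,\boldsymbol{\beta}^{\top}\mathbf{x},
  \qquad k = 2, \dots, q,
\]
\[
\mu_1 = 0, \qquad 0 = \phi_1 \le \phi_2 \le \dots \le \phi_q = 1 .
\]
```

Equally spaced categories would correspond to equal gaps between consecutive \phi_k; fitted scores that are close together suggest two categories that the data barely distinguish.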

Joint modeling of survival and longitudinal ordered data using a semiparametric approach

Medical research frequently focuses on the relationship between quality of life and survival time of subjects. Quality of life may be one of the most important factors that could be used to predict survival, making it worth identifying factors that jointly affect survival and quality of life. We propose a semiparametric joint model that consists of item response and survival components, where these two components are linked through latent variables. Several popular ordinal models are considered and compared in the item response component, while the Cox proportional hazards model is used in the survival component. We estimate the baseline hazard function and model parameters simultaneously, through a profile likelihood approach. We illustrate the method using an example from a clinical study.

Biclustering Models for Two-Mode Ordinal Data

by Daniel Fernández and Eleni Matechou

The work in this paper introduces finite mixture models that can be used to simultaneously cluster the rows and columns of two-mode ordinal categorical response data, such as those resulting from Likert scale responses. We use the popular proportional odds parameterisation and propose models which provide insights into major patterns in the data. Model-fitting is performed using the EM algorithm, and a fuzzy allocation of rows and columns to corresponding clusters is obtained. The clustering ability of the models is evaluated in a simulation study and demonstrated using two real data sets.
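
As a rough illustration of the proportional odds parameterisation mentioned above, the sketch below turns a set of cutpoints and a linear predictor into category probabilities. It assumes the common convention logit P(Y <= k) = cutpoint_k - eta; the paper may use a different sign convention, and `proportional_odds_probs`, `cutpoints`, and `eta` are hypothetical names. In a biclustering model, `eta` would stand in for whatever row-cluster and column-cluster effects the model specifies.

```python
import math


def proportional_odds_probs(cutpoints, eta):
    """Category probabilities under logit P(Y <= k) = cutpoints[k] - eta.

    cutpoints: non-decreasing list of q - 1 thresholds (assumed convention).
    eta: linear predictor, e.g. a sum of cluster effects (illustrative).
    Returns a list of q probabilities, one per ordinal category.
    """
    if any(b < a for a, b in zip(cutpoints, cutpoints[1:])):
        raise ValueError("cutpoints must be non-decreasing")
    # Cumulative probabilities P(Y <= k); the last category closes at 1.
    cumulative = [1.0 / (1.0 + math.exp(-(c - eta))) for c in cutpoints]
    cumulative.append(1.0)
    # Differences of consecutive cumulative probabilities give P(Y = k).
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs
```

A larger `eta` shifts probability mass towards the higher categories while the cutpoints stay fixed, which is the defining feature of this parameterisation.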

Categorising Count Data into Ordinal Responses with Application to Ecological Communities

Count data sets may involve overdispersion from one set of species and underdispersion from another, which would require fitting different models (e.g. a negative binomial model for the overdispersed set and a binomial model for the underdispersed one). Additionally, many count data sets have very high counts and very low counts. Categorising these counts into ordinal categories makes the actual counts less influential in the model fitting, giving broad categories which enable us to detect major broadly based patterns of turnover or nestedness shown by groups of species. In this paper, we carry out a strategy of categorising count data into ordinal data and implement measures to compare different cluster structures. The application of this categorising strategy and a comparison of clustering results between count and categorised ordinal data in two ecological community data sets are shown. A major advantage of using our ordinal approach is that it allows for the inclusion of all different levels of dispersion in the data in one methodology, without treating the data differently. This reduction in the parameters needed to model different levels of dispersion does not substantially change the results in clustering structure. In the two data sets used in this paper, we observed ordinal clustering structures up to 93.1% similar to those from the count data approaches. This has the important implication of supporting simpler, faster data collection using ordinal scales only. Supplementary materials accompanying this paper appear online.
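
The categorising step can be pictured with a minimal sketch such as the one below. The abstract does not state which category boundaries were used, so the `upper_bounds` values in the usage note are purely hypothetical, as is the function name `categorise_counts`.

```python
from bisect import bisect_left


def categorise_counts(counts, upper_bounds):
    """Map each count to an ordinal category 0, 1, ..., len(upper_bounds).

    upper_bounds: increasing list of inclusive upper limits (hypothetical).
    A count c falls in category k when it is at most upper_bounds[k] and
    greater than upper_bounds[k - 1]; counts above the last bound fall in
    the top category.
    """
    if any(b <= a for a, b in zip(upper_bounds, upper_bounds[1:])):
        raise ValueError("upper_bounds must be strictly increasing")
    # bisect_left finds the first bound that is >= the count.
    return [bisect_left(upper_bounds, c) for c in counts]
```

For example, with the illustrative bounds `[0, 5, 20]`, a count of 0 is "absent", 1 to 5 is "low", 6 to 20 is "medium", and anything larger is "high", so a count of 500 carries no more weight than a count of 21.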

Estimation of spatial sampling effort based on presence-only data and accessibility

Sampling bias is very common in biological survey data. Bias is clearly a function of roads, cities, rivers, or other physical features that determine accessibility for collectors, and many species data sets are presence-only. We set out to estimate spatial sampling bias in a region, based on presence-only data, by explicitly incorporating information on these accessibility factors, and by considering a target group of species that may share a common search pattern. In order to indirectly estimate the number of individuals, we also resort to the concept of species richness. A probabilistic (multinomial) model is proposed, enabling standard likelihood inference procedures to be implemented. Simulation scenarios for exploration of the model and experimentation with the estimation procedure are included. Illustrative examples over a region of Mexico with mammals and butterflies are also reported with insightful results. Our model is able to estimate the sampling bias in a region and enhance the inferences regarding presence-only data.

Mixture-based clustering for the ordered stereotype model

• New methodology for clustering rows and columns from a matrix of ordinal data.
• Establishes likelihood-based methods via finite mixtures with the stereotype model.
• Tests the reliability of this methodology through a simulation study.
• Illustrates this new approach with two examples.
• Reviews and compares the performance of several model choice measures.

Many of the methods which deal with the reduction of dimensionality in matrices of data are based on mathematical techniques such as distance-based algorithms or matrix decomposition and eigenvalues. Recently a group of likelihood-based finite mixture models for a data matrix with binary or count data, using basic Bernoulli or Poisson building blocks, has been developed. This paper extends that work and establishes likelihood-based multivariate methods for a data matrix with ordinal data, applying fuzzy clustering via finite mixtures to the ordered stereotype model. Model-fitting is performed using the expectation–maximization (EM) algorithm, and a fuzzy allocation of rows, columns, and rows and columns simultaneously to corresponding clusters is obtained. A simulation study is presented which includes a variety of scenarios in order to test the reliability of the proposed model. Finally, the results of the application of the model in two real data sets are shown.
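
The fuzzy allocation produced by EM in a finite mixture comes from the E-step, which can be sketched generically as follows for row clustering. This is a textbook E-step under stated assumptions, not the paper's implementation: `log_lik[i][r]` is assumed to hold the log-likelihood of row i under row cluster r (computed from whatever ordinal model is in use), and `mix_props` the current mixing proportions, all strictly positive. Both names are hypothetical.

```python
import math


def e_step(log_lik, mix_props):
    """Posterior cluster membership probabilities for each row.

    log_lik: list of rows, log_lik[i][r] = log-likelihood of row i under
             cluster r (assumed to be supplied by the fitted model).
    mix_props: list of mixing proportions, positive and summing to 1.
    Returns a list of rows of membership probabilities, each summing to 1.
    """
    memberships = []
    for row in log_lik:
        # log(pi_r) + log-likelihood, kept on the log scale for stability.
        weighted = [math.log(p) + ll for p, ll in zip(mix_props, row)]
        largest = max(weighted)
        # Subtract the maximum before exponentiating to avoid underflow.
        unnormalised = [math.exp(w - largest) for w in weighted]
        total = sum(unnormalised)
        memberships.append([u / total for u in unnormalised])
    return memberships
```

Each returned row is a soft assignment: a row with memberships such as 0.9 and 0.1 sits mostly in the first cluster, and a hard clustering can be read off afterwards by taking the largest entry.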

Introducing Spaced Mosaic Plots

by Daniel Fernández, Richard Arnold, and Shirley Pledger

Recent research has developed a group of likelihood-based finite mixture models for a data matrix with ordinal data, establishing likelihood-based multivariate methods which apply fuzzy clustering via finite mixtures to the ordered stereotype model. There are many visualisation tools which depict reduction of dimensionality in matrices of ordinal data. This technical report introduces the spaced mosaic plot, a new graphical tool for ordinal data when the ordered stereotype model is used. It takes advantage of the fitted score parameters to determine the spacing between two adjacent ordinal categories. We develop a function in R and present its documentation. Finally, a description of a spaced mosaic plot is given.
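
To make the spacing idea concrete, the sketch below computes the gap between each pair of adjacent categories from a vector of fitted scores. It is not the R function described in the report, whose interface is not given here; `adjacent_gaps` and the example scores are hypothetical, and the scores are assumed to be non-decreasing and scaled from 0 to 1.

```python
def adjacent_gaps(scores):
    """Return the gap between each pair of adjacent ordinal categories.

    scores: non-decreasing fitted score parameters, one per category
            (assumed to run from 0 for the first category to 1 for the last).
    """
    if any(b < a for a, b in zip(scores, scores[1:])):
        raise ValueError("scores must be non-decreasing")
    # Each gap is the difference between consecutive scores.
    return [b - a for a, b in zip(scores, scores[1:])]
```

With illustrative scores `[0.0, 0.25, 1.0]`, the gaps are 0.25 and 0.75, so a plot spaced by these values would draw the second and third categories three times further apart than the first and second.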
