Abstract
We propose a family of multivariate heavy-tailed distributions that allow variable marginal amounts of tailweight. The originality comes from introducing multidimensional instead of univariate scale variables for the scale mixture of Gaussians family of distributions. In contrast to most existing approaches, the derived distributions can account for a variety of shapes and have a simple tractable form with a closed-form probability density function in any dimension. We examine a number of properties of these distributions and illustrate them in the particular case of Pearson type VII and t tails. For these latter cases, we provide maximum likelihood estimation of the parameters and illustrate their modelling flexibility on simulated and real data clustering examples.
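The idea of one scale variable per dimension can be illustrated with a small sketch. The abstract does not spell out the construction, so everything below is an assumption made for illustration: the scale matrix is taken to be decomposed as D diag(A) D^T with D orthogonal, each rotated coordinate m receives its own independent weight W_m ~ Gamma(nu_m/2, rate nu_m/2), and the resulting density is the product of univariate scaled t densities in the rotated coordinates. The function and parameter names (`sample_multiple_scaled_t`, `logpdf_multiple_scaled_t`, `mu`, `D`, `A`, `nu`) are hypothetical.

```python
import math
import numpy as np


def sample_multiple_scaled_t(n, mu, D, A, nu, rng):
    """Draw n samples, assuming one independent gamma weight per dimension."""
    mu, A, nu = (np.asarray(v, dtype=float) for v in (mu, A, nu))
    D = np.asarray(D, dtype=float)
    M = mu.shape[0]
    # Assumed weights: W_m ~ Gamma(shape=nu_m/2, rate=nu_m/2), one per axis
    W = rng.gamma(shape=nu / 2.0, scale=2.0 / nu, size=(n, M))
    Z = rng.standard_normal((n, M))
    # Scale each rotated coordinate separately, then rotate back with D
    return mu + (np.sqrt(A / W) * Z) @ D.T


def logpdf_multiple_scaled_t(x, mu, D, A, nu):
    """Closed-form log density under the same assumptions (product of t terms)."""
    mu, A, nu = (np.asarray(v, dtype=float) for v in (mu, A, nu))
    D = np.asarray(D, dtype=float)
    x = np.atleast_2d(np.asarray(x, dtype=float))
    u = (x - mu) @ D  # each row holds D^T (x - mu)
    lgam = np.array(
        [math.lgamma((v + 1.0) / 2.0) - math.lgamma(v / 2.0) for v in nu]
    )
    log_norm = lgam - 0.5 * np.log(A * nu * np.pi)
    return np.sum(log_norm - 0.5 * (nu + 1.0) * np.log1p(u**2 / (A * nu)), axis=1)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    theta = np.pi / 6
    D = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta), np.cos(theta)]])
    # Heavy tails along the first rotated axis, near-Gaussian along the second
    X = sample_multiple_scaled_t(1000, [0.0, 0.0], D, [1.0, 1.0], [2.0, 50.0], rng)
    print(X.shape, logpdf_multiple_scaled_t(X[:3], [0.0, 0.0], D, [1.0, 1.0], [2.0, 50.0]))
```

Setting all entries of `nu` to the same value in this sketch does not recover the standard multivariate t, because the weights remain independent across dimensions; that distinction is exactly what lets each direction have its own tailweight.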
Cite this article
Forbes, F., Wraith, D. A new family of multivariate heavy-tailed distributions with variable marginal amounts of tailweight: application to robust clustering. Stat Comput 24, 971–984 (2014). https://doi.org/10.1007/s11222-013-9414-4
DOI: https://doi.org/10.1007/s11222-013-9414-4