Abstract
Motivated by real-world applications, heterogeneous learning has emerged as an important research area that aims to model the coexistence of multiple types of heterogeneity. In this paper, we propose a heterogeneous representation learning model with structured sparsity regularization (HERES) to learn from multiple types of heterogeneity. HERES leverages the rich correlations (e.g., task relatedness, view consistency, and label correlation) and the prior knowledge (e.g., the soft clustering of tasks) of heterogeneous data to improve learning performance. To this end, it integrates multi-task, multi-view, and multi-label learning into a principled representation learning framework to model the complex correlations, and employs structured sparsity to encode the prior knowledge of the data. The objective is to simultaneously minimize the reconstruction loss of using the factor matrices to recover the heterogeneous data and the structured sparsity penalty imposed on the model. The resulting optimization problem is challenging due to the non-smoothness and non-separability of the structured sparsity. We reformulate the problem using an auxiliary function and prove that the reformulation is separable, which leads to an efficient family of algorithms for solving structured-sparsity-penalized problems. Furthermore, we propose several HERES variants based on different loss functions and subsume them into a weighted HERES, which is able to handle missing data. Experimental results in comparison with state-of-the-art methods demonstrate the effectiveness of the proposed approach.
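To make the kind of objective described above concrete, the following minimal sketch minimizes a reconstruction loss plus an ℓ2,1 group-sparsity penalty by alternating between a closed-form update of one factor and a proximal gradient step on the other. This is an illustrative single-view simplification, not the paper's exact HERES formulation: the function `heres_sketch`, the penalty weight `lam`, and the choice of penalizing the rows of `V` are all assumptions made here for demonstration.

```python
import numpy as np

def l21_norm(M):
    # sum of row-wise Euclidean norms: a standard structured-sparsity penalty
    return np.sum(np.sqrt(np.sum(M ** 2, axis=1)))

def prox_l21(M, t):
    # proximal operator of t * ||M||_{2,1}: row-wise soft thresholding,
    # which shrinks (and can zero out) entire rows at once
    norms = np.sqrt(np.sum(M ** 2, axis=1, keepdims=True))
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return scale * M

def heres_sketch(X, k, lam=0.1, n_iter=200, seed=0):
    """Alternating scheme for min_{U,V} 0.5*||X - U V||_F^2 + lam*||V||_{2,1}."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    U = rng.standard_normal((n, k))
    V = rng.standard_normal((k, d))
    for _ in range(n_iter):
        # exact least-squares update for U (small jitter for numerical stability)
        U = X @ V.T @ np.linalg.inv(V @ V.T + 1e-8 * np.eye(k))
        # proximal gradient step on V for the l2,1-penalized subproblem;
        # the step size 1/L uses the Lipschitz constant of the smooth part
        L = np.linalg.norm(U.T @ U, 2) + 1e-8
        grad = U.T @ (U @ V - X)
        V = prox_l21(V - grad / L, lam / L)
    return U, V
```

Because the ℓ2,1 proximal operator decomposes over rows, each update is separable and cheap, mirroring (in a much simpler setting) the role that separability plays in the algorithm family discussed in the paper.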
Acknowledgements
This work is supported by National Natural Science Foundation of China under Grant No. 61473123, National Science Foundation under Grant No. IIS-1552654, ONR under Grant No. N00014-15-1-2821, NASA under Grant No. NNX17AJ86A, and an IBM Faculty Award. The views and conclusions are those of the authors and should not be interpreted as representing the official policies of the funding agencies or the government.
Cite this article
Yang, P., Tan, Q., Zhu, Y. et al. Heterogeneous representation learning with separable structured sparsity regularization. Knowl Inf Syst 55, 671–694 (2018). https://doi.org/10.1007/s10115-017-1094-5