DOI: 10.1145/1557019.1557129

Information theoretic regularization for semi-supervised boosting

Published: 28 June 2009

Abstract

We present novel semi-supervised boosting algorithms that incrementally build linear combinations of weak classifiers through generic functional gradient descent, using both labeled and unlabeled training data. Our approach extends the information regularization framework to boosting, yielding loss functions that combine log loss on labeled data with information-theoretic measures that encode the unlabeled data. Although the information-theoretic regularization terms make the optimization non-convex, we propose simple sequential gradient descent algorithms and obtain substantially improved results on synthetic, benchmark, and real-world tasks over both supervised boosting algorithms that use the labeled data alone and a state-of-the-art semi-supervised boosting algorithm.
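
To make the approach concrete, here is a rough sketch of the kind of objective the abstract describes. Assuming a LogitBoost-style model p_F(y|x) = 1 / (1 + exp(-2 y F(x))), a semi-supervised boosting loss might take the form L(F) = -sum over labeled (x_i, y_i) of log p_F(y_i | x_i) + lambda * sum over unlabeled x_j of H(p_F(. | x_j)), where H is the Shannon entropy and the ensemble F is grown by fitting each weak learner to the negative functional gradient of L. The Python sketch below illustrates that generic recipe; the function names, the entropy form of the regularizer, and the shallow-tree weak learner are assumptions for illustration, not the paper's exact formulation.

# Hypothetical sketch: semi-supervised boosting with log loss on labeled data
# and an entropy-style regularizer on unlabeled data, optimized by generic
# functional gradient descent. Illustrative only, not the paper's algorithm.
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def semi_supervised_boost(X_lab, y_lab, X_unl, rounds=50, lam=0.1, step=0.5):
    """y_lab takes values in {-1, +1}; returns a list of (weak_learner, step) pairs."""
    X_all = np.vstack([X_lab, X_unl])
    n_lab = len(X_lab)
    F = np.zeros(len(X_all))            # current ensemble scores F(x)
    ensemble = []
    for _ in range(rounds):
        grad = np.zeros_like(F)
        # Gradient of the log loss -log p_F(y|x) on labeled points,
        # with p_F(y|x) = sigmoid(2 y F(x)).
        grad[:n_lab] = -2.0 * y_lab * sigmoid(-2.0 * y_lab * F[:n_lab])
        # Gradient of the entropy H(p_F(.|x)) on unlabeled points; minimizing
        # it pushes unlabeled predictions toward confident (low-entropy) values.
        p_unl = np.clip(sigmoid(2.0 * F[n_lab:]), 1e-6, 1 - 1e-6)
        grad[n_lab:] = lam * np.log((1 - p_unl) / p_unl) * 2.0 * p_unl * (1 - p_unl)
        # Fit a weak regressor (a shallow tree) to the negative functional
        # gradient and take a small step along it.
        h = DecisionTreeRegressor(max_depth=2).fit(X_all, -grad)
        F += step * h.predict(X_all)
        ensemble.append((h, step))
    return ensemble

def predict(ensemble, X):
    F = sum(step * h.predict(np.asarray(X)) for h, step in ensemble)
    return np.sign(F)

Because the unlabeled-data term is non-convex in F, this greedy round-by-round descent only reaches a local optimum, which is consistent with the simple sequential optimization strategy the abstract describes.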

Supplementary Material

JPG File (p1017-zheng.jpg)
MP4 File (p1017-zheng.mp4)





    Published In

    KDD '09: Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining
    June 2009
    1426 pages
    ISBN:9781605584959
    DOI:10.1145/1557019


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. ensemble
    2. semi-supervised learning

    Qualifiers

    • Research-article

    Conference

    KDD09

    Acceptance Rates

    Overall Acceptance Rate 1,133 of 8,635 submissions, 13%


    Cited By

    • (2018) Large-scale predictive modeling and analytics through regression queries in data management systems. International Journal of Data Science and Analytics. DOI: 10.1007/s41060-018-0163-5. Online publication date: 27-Dec-2018.
    • (2017) A Cluster-Based Semisupervised Ensemble for Multiclass Classification. IEEE Transactions on Emerging Topics in Computational Intelligence, 1(6):408-420. DOI: 10.1109/TETCI.2017.2743219. Online publication date: Dec-2017.
    • (2015) A direct boosting approach for semi-supervised classification. Proceedings of the 24th International Conference on Artificial Intelligence, 4025-4032. DOI: 10.5555/2832747.2832810. Online publication date: 25-Jul-2015.
    • (2014) Structured Sparse Boosting for Graph Classification. ACM Transactions on Knowledge Discovery from Data, 9(1):1-22. DOI: 10.1145/2629328. Online publication date: 25-Aug-2014.
    • (2014) Named Entity Extraction via Automatic Labeling and Tri-training: Comparison of Selection Methods. Information Retrieval Technology, 244-255. DOI: 10.1007/978-3-319-12844-3_21. Online publication date: 2014.
    • (2013) A robust semi-supervised boosting method using linear programming. 2013 IEEE Global Conference on Signal and Information Processing, 1101-1104. DOI: 10.1109/GlobalSIP.2013.6737086. Online publication date: Dec-2013.
    • (2012) Semisupervised Classification With Cluster Regularization. IEEE Transactions on Neural Networks and Learning Systems, 23(11):1779-1792. DOI: 10.1109/TNNLS.2012.2214488. Online publication date: Nov-2012.
    • (2012) Semi-Supervised Logistic Discrimination Via Graph-Based Regularization. Neural Processing Letters, 36(3):203-216. DOI: 10.1007/s11063-012-9231-3. Online publication date: 29-May-2012.
    • (2012) Boosting Algorithms: A Review of Methods, Theory, and Applications. Ensemble Machine Learning, 35-85. DOI: 10.1007/978-1-4419-9326-7_2. Online publication date: 19-Jan-2012.
    • (2010) Boosting with structure information in the functional space. Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining, 643-652. DOI: 10.1145/1835804.1835886. Online publication date: 25-Jul-2010.
