The Use of k-Means Algorithm to Improve Kernel Method via Instance Selection

Lulu Wang

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9403)

Included in the following conference series:

International Conference on Knowledge Science, Engineering and Management

Abstract

The kernel method is well known for its success in overcoming the curse of dimensionality in linearly inseparable problems. However, as an instance-based learning algorithm, it suffers from high memory requirements and low efficiency because it must store all of the training instances. In addition, classification accuracy can suffer when the training data contain noisy instances. In this paper we present an approach that alleviates both problems by using the k-means algorithm to select only k representative instances from the training data. We treat the selected k instances as the new data set, where the choice of k is influenced by the size and character of the data set. It turns out that, with a carefully selected k, we can still obtain good performance while the number of stored instances is greatly reduced.
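The idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how the k representatives are labelled or which kernel classifier is used, so everything below is an assumption made for illustration: plain Lloyd's k-means, keeping the real training instance nearest to each centroid (hypothetical helper `select_instances`), and a simple RBF-kernel-weighted vote over the stored instances (hypothetical helper `predict`) standing in for the kernel method. The toy data and the values of `k` and `gamma` are likewise made up.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the k centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        new_centroids = [
            tuple(sum(col) / len(cl) for col in zip(*cl)) if cl else centroids[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids


def select_instances(X, y, k, seed=0):
    """Assumption: keep the real training instance nearest to each centroid."""
    selected = {}
    for c in kmeans(X, k, seed=seed):
        i = min(range(len(X)), key=lambda n: sq_dist(X[n], c))
        selected[i] = (X[i], y[i])  # keyed by index, so duplicates collapse
    return list(selected.values())


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel."""
    return math.exp(-gamma * sq_dist(a, b))


def predict(stored, x, gamma=1.0):
    """Stand-in kernel classifier: kernel-weighted vote over stored instances."""
    scores = {}
    for xi, yi in stored:
        scores[yi] = scores.get(yi, 0.0) + rbf(xi, x, gamma)
    return max(scores, key=scores.get)


# Toy data: two well-separated 2-D blobs, 50 instances each.
rng = random.Random(1)
X = [(rng.gauss(0, 0.3), rng.gauss(0, 0.3)) for _ in range(50)] + \
    [(rng.gauss(5, 0.3), rng.gauss(5, 0.3)) for _ in range(50)]
y = [0] * 50 + [1] * 50

# Store at most k = 4 instances instead of all 100.
stored = select_instances(X, y, k=4)
```

With this setup the classifier consults at most four stored instances per query instead of one hundred. How much accuracy survives on real data depends on the choice of k, which is the trade-off the abstract describes.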

Author information

Authors and Affiliations

Department of Computer Science and Technology, JiLin University, Changchun, 130012, China
Lulu Wang

Corresponding author

Correspondence to Lulu Wang.

Editor information

Editors and Affiliations

Chinese Academy of Sciences, Beijing, China
Songmao Zhang
Ludwig-Maximilians-Universität München, Munich, Germany
Martin Wirsing
Southwest University, Chongqing, China
Zili Zhang

About this paper

Cite this paper

Wang, L. (2015). The Use of k-Means Algorithm to Improve Kernel Method via Instance Selection. In: Zhang, S., Wirsing, M., Zhang, Z. (eds) Knowledge Science, Engineering and Management. KSEM 2015. Lecture Notes in Computer Science, vol 9403. Springer, Cham. https://doi.org/10.1007/978-3-319-25159-2_50

DOI: https://doi.org/10.1007/978-3-319-25159-2_50
Published: 03 November 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-25158-5
Online ISBN: 978-3-319-25159-2
eBook Packages: Computer Science, Computer Science (R0)
