Abstract
Data reduction aims to make data sets smaller while preserving the classification structures of interest. This paper proposes a novel approach to data reduction based on spatial partitioning. The algorithm projects conventional database relations into a multidimensional data space. Its advantage is that it turns data reduction into two spatial processes in that space: merging data that belong to the same class, and partitioning data that belong to different classes. The result is a series of partitioned regions that can readily be used for data classification. The proposed method was evaluated on 7 real-world data sets, and the results compare well with those obtained by C4.5 and DR. The proposed algorithm was more efficient than DR, with no loss of test accuracy or reduction ratio.
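The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of merging same-class data and keeping different classes apart in a multidimensional space. It assumes numeric attributes, axis-aligned hyper-rectangles as regions, and a simple greedy merge order. The names `reduce_by_merging` and `classify` are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]
Box = Tuple[Point, Point]          # (lower corner, upper corner)
Region = Tuple[Box, str]           # a box together with its class label


def bounding_box(a: Box, b: Box) -> Box:
    """Smallest axis-aligned box that contains both a and b."""
    lo = tuple(min(x, y) for x, y in zip(a[0], b[0]))
    hi = tuple(max(x, y) for x, y in zip(a[1], b[1]))
    return lo, hi


def overlaps(a: Box, b: Box) -> bool:
    """True if the two boxes intersect in every dimension."""
    return all(al <= bh and bl <= ah
               for al, ah, bl, bh in zip(a[0], a[1], b[0], b[1]))


def reduce_by_merging(points: Sequence[Point],
                      labels: Sequence[str]) -> List[Region]:
    """Greedy merge (an assumption, not the paper's procedure):
    two same-class regions are merged only if their bounding box
    does not overlap any region of a different class."""
    # Each tuple starts as a degenerate box containing just itself.
    regions: List[Region] = [((tuple(p), tuple(p)), c)
                             for p, c in zip(points, labels)]
    merged = True
    while merged:
        merged = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                (box_i, cls_i), (box_j, cls_j) = regions[i], regions[j]
                if cls_i != cls_j:
                    continue
                candidate = bounding_box(box_i, box_j)
                if any(overlaps(candidate, box_k)
                       for box_k, cls_k in regions if cls_k != cls_i):
                    continue  # merging would swallow another class
                regions[i] = (candidate, cls_i)
                del regions[j]
                merged = True
                break
            if merged:
                break
    return regions


def classify(regions: Sequence[Region], point: Point) -> Optional[str]:
    """Return the class of the first region containing the point, if any."""
    for (lo, hi), cls in regions:
        if all(l <= x <= h for l, x, h in zip(lo, point, hi)):
            return cls
    return None


points = [(1, 1), (2, 2), (1, 2), (8, 8), (9, 9), (8, 9)]
labels = ["a", "a", "a", "b", "b", "b"]
regions = reduce_by_merging(points, labels)
```

In this toy example the six tuples collapse into two regions, one per class, and `classify(regions, (1.5, 1.5))` falls inside the region labelled `"a"`. A point outside every region is left unclassified here; how the paper handles such points is not stated in the abstract.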
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Guo, G., Wang, H., Bell, D., Wu, Q. (2001). Data Reduction Based on Spatial Partitioning. In: Alexandrov, V.N., Dongarra, J.J., Juliano, B.A., Renner, R.S., Tan, C.J.K. (eds) Computational Science - ICCS 2001. ICCS 2001. Lecture Notes in Computer Science, vol 2074. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45718-6_28
DOI: https://doi.org/10.1007/3-540-45718-6_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42233-4
Online ISBN: 978-3-540-45718-3
eBook Packages: Springer Book Archive