Abstract
In fuzzy clustering algorithms, each object has a fuzzy membership associated with each cluster, indicating the degree of association of the object with the cluster. Here we present a fuzzy subspace clustering algorithm, FSC, in which each dimension has a weight associated with each cluster, indicating the degree of importance of the dimension to the cluster. By using fuzzy techniques for subspace clustering, our algorithm avoids the difficulty of choosing appropriate cluster dimensions for each cluster during the iterations. Our analysis and simulations show that FSC is efficient and that the clustering results it produces are highly accurate.
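The idea of attaching a fuzzy weight to every (cluster, dimension) pair can be illustrated with a minimal sketch. This is not the FSC algorithm as published: the abstract does not give the objective function or update rules, so the function name `fuzzy_subspace_cluster`, the fuzzifier `alpha`, the smoothing constant `eps`, and the inverse-dispersion weight update below are all assumptions chosen for illustration.

```python
import random


def fuzzy_subspace_cluster(points, k, alpha=2.0, eps=1e-6, iters=20, seed=0):
    """Illustrative k-means-style loop with one fuzzy weight per (cluster, dimension).

    Assumed, not taken from the paper: the weight of dimension h in cluster j
    is proportional to (dispersion_jh + eps) ** (-1 / (alpha - 1)), normalised
    so that the weights of each cluster sum to 1. Requires alpha > 1.
    """
    rng = random.Random(seed)
    d = len(points[0])
    centers = [list(p) for p in rng.sample(points, k)]
    weights = [[1.0 / d] * d for _ in range(k)]  # start with uniform weights
    labels = [0] * len(points)

    for _ in range(iters):
        # Assign each point to the cluster with the smallest weighted distance.
        for i, p in enumerate(points):
            labels[i] = min(
                range(k),
                key=lambda j: sum(
                    weights[j][h] ** alpha * (p[h] - centers[j][h]) ** 2
                    for h in range(d)
                ),
            )
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if not members:
                continue  # keep the previous center and weights for an empty cluster
            # Move the center to the mean of its members.
            centers[j] = [sum(p[h] for p in members) / len(members) for h in range(d)]
            # Dimensions along which the cluster is tight receive larger weights.
            disp = [
                sum((p[h] - centers[j][h]) ** 2 for p in members) + eps
                for h in range(d)
            ]
            inv = [v ** (-1.0 / (alpha - 1.0)) for v in disp]
            total = sum(inv)
            weights[j] = [v / total for v in inv]
    return labels, centers, weights
```

Because every dimension keeps a nonzero weight in every cluster, the loop never has to make a hard choice of which dimensions belong to a cluster; a dimension that is irrelevant to a cluster simply ends up with a weight close to zero.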
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gan, G., Wu, J., Yang, Z. (2006). A Fuzzy Subspace Algorithm for Clustering High Dimensional Data. In: Li, X., Zaïane, O.R., Li, Z. (eds) Advanced Data Mining and Applications. ADMA 2006. Lecture Notes in Computer Science, vol 4093. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11811305_30
DOI: https://doi.org/10.1007/11811305_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-37025-3
Online ISBN: 978-3-540-37026-0
eBook Packages: Computer Science, Computer Science (R0)