Abstract
Clustering is recognized as one of the important tasks in data mining, and distance-based methods form one important class of clustering techniques. To reduce the computational and storage burden of the classical clustering methods, many distance-based hybrid clustering methods have been proposed. However, these methods are not suitable for cluster analysis in dynamic environments, where the underlying data distribution, and consequently the clustering structure, changes over time. In this paper, we propose a distance-based incremental clustering method that can find arbitrarily shaped clusters in fast-changing dynamic scenarios. Our method is based on the recently proposed al-SL method, which can be applied successfully to large static datasets. In the incremental version of al-SL (termed IncrementalSL), we exploit important characteristics of the al-SL method to handle frequent updates of patterns in the given dataset. The IncrementalSL method produces exactly the same clustering results as the al-SL method. To show the effectiveness of IncrementalSL on a dynamically changing database, we experimented with one synthetic and one real-world dataset.
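The abstract does not give the details of IncrementalSL, so the following is only a rough illustration of the general idea of distance-based incremental clustering, not the authors' algorithm. It is a minimal sketch under stated assumptions: clusters are taken to be the connected components obtained by linking any two points whose Euclidean distance is at most a cut-off `h` (a single-link clustering cut at `h`), only insertions are handled, and the class name `IncrementalThresholdClusterer` and its methods are hypothetical.

```python
import math


class IncrementalThresholdClusterer:
    """Insertion-only single-link clustering at a fixed distance cut h.

    Hypothetical illustration; this is NOT the IncrementalSL method.
    """

    def __init__(self, h):
        self.h = h          # distance cut-off (assumed parameter)
        self.points = []    # all points seen so far
        self.parent = []    # union-find forest over point indices

    def _find(self, i):
        # Find the component representative, halving paths as we go.
        while self.parent[i] != i:
            self.parent[i] = self.parent[self.parent[i]]
            i = self.parent[i]
        return i

    def insert(self, p):
        """Add point p and merge every cluster that lies within h of it."""
        idx = len(self.points)
        self.points.append(p)
        self.parent.append(idx)
        for j in range(idx):
            if math.dist(p, self.points[j]) <= self.h:
                root_new, root_old = self._find(idx), self._find(j)
                if root_new != root_old:
                    self.parent[root_old] = root_new
        return idx

    def clusters(self):
        """Return the current clusters as sorted lists of point indices."""
        groups = {}
        for i in range(len(self.points)):
            groups.setdefault(self._find(i), []).append(i)
        return sorted(groups.values())


clusterer = IncrementalThresholdClusterer(h=2.0)
for point in [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]:
    clusterer.insert(point)
before = clusterer.clusters()   # two clusters: {0, 1} and {2}

clusterer.insert((3.0, 0.0))    # bridges the two clusters
after = clusterer.clusters()    # a single chain-shaped cluster
```

Because each insertion only adds links, the result after any sequence of insertions matches what a batch run over the same points would give at the same cut-off, which mirrors the exactness property claimed in the abstract. This sketch scans every stored point on each insertion and does not support deletions; a practical method would need a summary structure to avoid that cost.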
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Patra, B.K., Ville, O., Launonen, R., Nandi, S., Babu, K.S. (2013). Distance based Incremental Clustering for Mining Clusters of Arbitrary Shapes. In: Maji, P., Ghosh, A., Murty, M.N., Ghosh, K., Pal, S.K. (eds) Pattern Recognition and Machine Intelligence. PReMI 2013. Lecture Notes in Computer Science, vol 8251. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45062-4_31
DOI: https://doi.org/10.1007/978-3-642-45062-4_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45061-7
Online ISBN: 978-3-642-45062-4
eBook Packages: Computer Science, Computer Science (R0)