Abstract
In line with technological developments, current data tends to be multidimensional and high dimensional, which makes it more complex than conventional data and calls for dimension reduction. Dimension reduction is important in cluster analysis: it creates a new representation of the data that is smaller in volume and has the same analytical results as the original representation. A clustering process needs such reduction to achieve efficient processing time and to mitigate the curse of dimensionality. This paper proposes an alternative model for extracting multidimensional data clustering based on comparative dimension reduction. We implemented five dimension reduction techniques: ISOMAP (Isometric Feature Mapping), Kernel PCA, LLE (Locally Linear Embedding), MVU (Maximum Variance Unfolding), and PCA (Principal Component Analysis). The results show that dimension reduction significantly shortens processing time and increases cluster performance. DBSCAN within Kernel PCA and Super Vector within Kernel PCA have the highest cluster performance compared with clustering without dimension reduction.
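The reduce-then-cluster pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it covers only PCA (one of the five techniques), approximates the principal components by power iteration, and runs a small hand-written DBSCAN on a synthetic two-cluster dataset. The helper names (`pca_project`, `dbscan`, `timed_dbscan`), the dataset, and the parameters `eps=3.5` and `min_pts=4` are all assumptions chosen for illustration.

```python
import math
import random
import time


def pca_project(rows, k, iters=200, seed=0):
    """Approximate projection onto the top-k principal components.

    Uses power iteration with deflation on the sample covariance matrix.
    """
    rng = random.Random(seed)
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _component in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _step in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        cv = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        eigval = sum(v[a] * cv[a] for a in range(d))
        components.append(v)
        # Deflate: remove the component just found before the next pass.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
            for c in centered]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN; returns one label per point, with -1 meaning noise."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(seeds)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # core point: keep growing
    return labels


def timed_dbscan(points):
    """Run DBSCAN with the illustrative parameters and time it."""
    start = time.perf_counter()
    labels = dbscan(points, eps=3.5, min_pts=4)
    return labels, time.perf_counter() - start


# Synthetic data (an assumption): two Gaussian clusters in 10 dimensions.
rng = random.Random(0)
dim, per_cluster = 10, 30
centers = [[0.0] * dim, [6.0, 6.0] + [0.0] * (dim - 2)]
data = [[rng.gauss(c[j], 0.5) for j in range(dim)]
        for c in centers for _ in range(per_cluster)]

reduced = pca_project(data, k=2)
labels_full, seconds_full = timed_dbscan(data)
labels_reduced, seconds_reduced = timed_dbscan(reduced)

print("clusters in 10-D:", len(set(labels_full) - {-1}), f"({seconds_full:.4f} s)")
print("clusters in 2-D: ", len(set(labels_reduced) - {-1}), f"({seconds_reduced:.4f} s)")
```

On a dataset this small the timing difference is only indicative. Swapping `pca_project` for another reduction technique while keeping the same clustering step is the kind of comparison the abstract describes.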
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Sembiring, R.W., Mohamad Zain, J., Embong, A. (2011). Alternative Model for Extracting Multidimensional Data Based-On Comparative Dimension Reduction. In: Zain, J.M., Wan Mohd, W.M.b., El-Qawasmeh, E. (eds) Software Engineering and Computer Systems. ICSECS 2011. Communications in Computer and Information Science, vol 180. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22191-0_3
DOI: https://doi.org/10.1007/978-3-642-22191-0_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22190-3
Online ISBN: 978-3-642-22191-0
eBook Packages: Computer Science, Computer Science (R0)