Abstract
Hierarchical clustering is an unsupervised learning task that seeks to build a set of clusters ordered by the inclusion relation. It is usually assumed that the result is a tree-like structure with no overlapping clusters, i.e., one in which any two clusters are either disjoint or nested. In Hierarchical Conceptual Clustering (HCC), each cluster is provided with a conceptual description that belongs to a predefined set called the pattern language. Depending on the application domain, the elements of the pattern language can be of different kinds: logical formulas, graphs, tests on the attributes, etc. In this paper, we tackle the issue of overlapping concepts in the agglomerative approach to HCC. We provide a formal characterization of the pattern languages that ensure a result without overlaps. Unfortunately, this characterization excludes many pattern languages that may be relevant for agglomerative HCC. We therefore propose two variants of the basic agglomerative HCC approach. Both guarantee a result without overlaps; the second one also refines the given pattern language so that any two disjoint clusters have mutually exclusive descriptions.
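The overlap problem mentioned above can be illustrated with a minimal Python sketch. It assumes a very simple pattern language in which a description is a set of attributes and the description of a cluster is the set of attributes shared by all its objects. The toy data and the names `OBJECTS`, `describe`, `extent`, and `overlap` are hypothetical and chosen for illustration only; this is not the authors' implementation, nor one of the two variants proposed in the paper.

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical toy data: each object is described by a set of attributes.
OBJECTS: Dict[str, FrozenSet[str]] = {
    "a": frozenset({"p", "q"}),
    "b": frozenset({"p", "r"}),
    "c": frozenset({"p", "s", "t"}),
    "d": frozenset({"s", "t"}),
}


def describe(cluster: Iterable[str]) -> FrozenSet[str]:
    """Most specific attribute-set pattern shared by every object of a non-empty cluster."""
    return frozenset.intersection(*(OBJECTS[o] for o in cluster))


def extent(pattern: FrozenSet[str]) -> FrozenSet[str]:
    """All objects that match the pattern, i.e. that carry every attribute in it."""
    return frozenset(o for o, attrs in OBJECTS.items() if pattern <= attrs)


def overlap(x: FrozenSet[str], y: FrozenSet[str]) -> bool:
    """True when x and y share objects but neither is included in the other."""
    return bool(x & y) and not (x <= y or y <= x)


# Two agglomerative merges on disjoint pairs of objects.
concept_cd = extent(describe({"c", "d"}))  # description {s, t}
concept_ab = extent(describe({"a", "b"}))  # description {p}, which also matches "c"

print(sorted(concept_cd), sorted(concept_ab), overlap(concept_cd, concept_ab))
```

In this sketch, merging `a` and `b` yields the description `{p}`, which also matches `c`, so the resulting concept shares `c` with the concept built from `c` and `d` without either containing the other. This is the kind of overlap that the characterization and the two variants in the paper are meant to rule out.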
Notes
- 1. Available at UC Irvine Machine Learning Database: http://archive.ics.uci.edu/ml/.
Acknowledgments
This research was supported by the QAPE company under the research project Kover.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Brabant, Q., Mouakher, A., Bertaux, A. (2020). Preventing Overlaps in Agglomerative Hierarchical Conceptual Clustering. In: Alam, M., Braun, T., Yun, B. (eds) Ontologies and Concepts in Mind and Machine. ICCS 2020. Lecture Notes in Computer Science, vol 12277. Springer, Cham. https://doi.org/10.1007/978-3-030-57855-8_6
DOI: https://doi.org/10.1007/978-3-030-57855-8_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-57854-1
Online ISBN: 978-3-030-57855-8
eBook Packages: Computer Science, Computer Science (R0)