Abstract
Topic modeling with community detection can be used to explore the latent semantic structure of documents by using a network (i.e., a graph) to depict the semantic relations between words. In network-based topic models, the similarity between words (vertices) is essential for obtaining a network with a clear community structure. Word embeddings trained on a large corpus empirically perform well as rich semantic representations, so this research constructs a novel similarity measure for a network-based topic model (NAM). First, we propose an intuitive similarity measure based on the shifted cosine similarity between word embeddings, which replaces the typical similarity based on pointwise mutual information (PMI). Second, we use NAM to induce the topics of the corpus over the global period under the different similarity measures. Finally, we use NAM to capture the dynamic changes of political topics in China and interpret these dynamics in light of their historical background. Although our similarity measure introduces semantic differences caused by the differences between data sets and requires one additional parameter, the experimental results demonstrate the effectiveness of the proposed measure.
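The abstract does not define the shifted cosine similarity, so the following is only a minimal sketch under stated assumptions: the "shift" is taken to be a single threshold parameter subtracted from the ordinary cosine similarity, and word pairs whose shifted value is not positive get no edge. The function names, the `shift` default, and the toy two-dimensional embeddings are all hypothetical and are not taken from the paper.

```python
import math


def cosine(u, v):
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_u * norm_v)


def shifted_cosine_edges(embeddings, shift=0.5):
    """Build weighted edges between words from shifted cosine similarity.

    ASSUMPTION: weight = cosine - shift, and an edge exists only when
    the weight is positive. The paper may define the shift differently.
    """
    words = sorted(embeddings)
    edges = {}
    for i, w1 in enumerate(words):
        for w2 in words[i + 1:]:
            weight = cosine(embeddings[w1], embeddings[w2]) - shift
            if weight > 0.0:
                edges[(w1, w2)] = weight
    return edges


# Hypothetical toy embeddings: two loose semantic groups.
toy_embeddings = {
    "reform": [1.0, 0.0],
    "economy": [0.9, 0.1],
    "army": [0.0, 1.0],
    "defense": [0.1, 0.9],
}

edges = shifted_cosine_edges(toy_embeddings, shift=0.5)
for pair, weight in sorted(edges.items()):
    print(pair, round(weight, 3))
```

Under these assumptions, only word pairs within the same group keep an edge, which is the kind of sparse network a community detection step could then partition into topics. A PMI-based weighting would fill the same role from co-occurrence counts rather than from embeddings.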
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Luo, Y., Wan, T., Qin, Z. (2022). Topic Modeling of Political Dynamics with Shifted Cosine Similarity. In: Honda, K., Entani, T., Ubukata, S., Huynh, VN., Inuiguchi, M. (eds) Integrated Uncertainty in Knowledge Modelling and Decision Making. IUKM 2022. Lecture Notes in Computer Science, vol 13199. Springer, Cham. https://doi.org/10.1007/978-3-030-98018-4_22
DOI: https://doi.org/10.1007/978-3-030-98018-4_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98017-7
Online ISBN: 978-3-030-98018-4
eBook Packages: Computer Science (R0)