Abstract
Topical units (TUs) play an important role both in human-computer interaction and in interaction among humans, which motivated our investigation of topical unit recognition. To lay the foundations for this, we first create a classifier for topical units using Deep Neural Nets with rectifier units (DRNs) and the probabilistic sampling method. Evaluating the resulting models on the HuComTech corpus using the Unweighted Average Recall (UAR) measure, we find that this method produces significantly higher classification scores than either Support Vector Machines or DRNs trained without probabilistic sampling. We also examine experimentally the number of topical unit labels to be used, and demonstrate that not having to discriminate between variations of topic change leads to better classification scores. However, some applications may require this distinction; for these we introduce a hierarchical classification method. Results show that this method increases the UAR scores by more than 7%.
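The two technical ingredients named in the abstract, the UAR measure and probabilistic sampling, can be illustrated with a minimal sketch. It is not the chapter's implementation. It assumes the commonly described formulation of probabilistic sampling, in which a class is first drawn with probability λ/K + (1 − λ)·prior and a training example is then drawn uniformly from that class. The function names, the `lam` parameter and the toy labels are hypothetical.

```python
import random
from collections import Counter, defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, so every class counts equally."""
    total = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


def class_sampling_probs(labels, lam):
    """Assumed formulation: lam=0 keeps the class priors, lam=1 is uniform."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: lam / k + (1.0 - lam) * counts[c] / n for c in counts}


def sample_indices(labels, lam, size, seed=0):
    """Draw training indices: pick a class first, then an example of it."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, c in enumerate(labels):
        by_class[c].append(i)
    probs = class_sampling_probs(labels, lam)
    classes = sorted(probs)
    weights = [probs[c] for c in classes]
    picked = rng.choices(classes, weights=weights, k=size)
    return [rng.choice(by_class[c]) for c in picked]


# Toy, made-up labels: 8 frames of one class and 2 of another.
labels = ["elaboration"] * 8 + ["change"] * 2
majority_guess = ["elaboration"] * 10
uar = unweighted_average_recall(labels, majority_guess)
batch = sample_indices(labels, lam=1.0, size=6)
```

On these toy labels, always predicting the majority class is correct for 8 of 10 frames, yet its UAR is only 0.5, which is why UAR is a natural measure for imbalanced label distributions. With `lam=1.0` both classes are drawn with equal probability regardless of their frequency.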
Notes
1. The current study is an extended version of this conference paper.
2. The difference between the two is due to the different lengths of the occurrences of the various topical units in the annotation. A 3.2 s long topic elaboration and a 32 s long topic elaboration both count as one occurrence in the annotation, but the number of frames associated with the former will be 10, while the number of frames associated with the latter will be 100.
3. It should be mentioned here that before applying our machine learning methods, we normalized all non-binary features to have zero mean and unit variance.
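The normalization described in the last note can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name is hypothetical, a column is treated as binary when it contains only the values 0 and 1, and the population standard deviation is used. The note does not say which variance estimator was applied or on which data split the statistics were computed.

```python
from statistics import fmean, pstdev


def standardize_non_binary(rows):
    """Z-score every column that is not purely 0/1; keep binary columns as-is."""
    columns = list(zip(*rows))
    out_cols = []
    for col in columns:
        if set(col) <= {0, 1}:
            # Binary feature: left untouched.
            out_cols.append(list(col))
            continue
        mu, sigma = fmean(col), pstdev(col)
        if sigma == 0:
            # Constant column: centring is all that can be done.
            out_cols.append([0.0 for _ in col])
        else:
            out_cols.append([(x - mu) / sigma for x in col])
    return [list(r) for r in zip(*out_cols)]


# Made-up feature matrix: one real-valued column and one binary column.
features = [[1.0, 0], [2.0, 1], [3.0, 1]]
scaled = standardize_non_binary(features)
```

After the call, the first column of `scaled` has zero mean and unit variance, while the binary second column is unchanged.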
Acknowledgements
The research reported in the paper was conducted with the support of the Hungarian Scientific Research Fund (OTKA) grant # K116938. Tamás Grósz was supported by the ÚNKP-16-3 new national excellence programme of the Ministry of Human Capacities.
Copyright information
© 2019 Springer International Publishing AG, part of Springer Nature
About this chapter
Cite this chapter
Kovács, G., Grósz, T., Váradi, T. (2019). Using Deep Rectifier Neural Nets and Probabilistic Sampling for Topical Unit Classification. In: Klempous, R., Nikodem, J., Baranyi, P. (eds) Cognitive Infocommunications, Theory and Applications. Topics in Intelligent Engineering and Informatics, vol 13. Springer, Cham. https://doi.org/10.1007/978-3-319-95996-2_1
DOI: https://doi.org/10.1007/978-3-319-95996-2_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-95995-5
Online ISBN: 978-3-319-95996-2
eBook Packages: Intelligent Technologies and Robotics (R0)