Abstract
Learning activities, especially face-to-face conversational coaching, can lead students to experience a set of learning-centered mental states, including concentration, confusion, frustration, and boredom. These states are widely used as proxies for inferring students' learning processes and are closely linked to learning outcomes. Recognizing students' learning-centered mental states, and in particular detecting negative states such as confusion and frustration in teacher-student conversation, could help teachers monitor students' learning and direct personalized, adaptive coaching resources toward maximizing learning outcomes. Most prior research has analyzed students' mental states from a single modality while they complete pre-designed tasks in computer-based environments; how to effectively measure students' multiple mental states from various aspects while they interact with a human teacher in coach-led conversations remains an open question. To this end, we developed a multi-sensor system to record multimodal data from student-teacher conversations held in a real university laboratory. We then derived a series of interpretable features from multiple perspectives, including facial cues and physiological signals (heart rate), to characterize students' mental states. A set of supervised classifiers was built on these features using different modality fusion methods to recognize students' multiple mental states. Our results provide experimental evidence of the strong predictive ability of the proposed features and of the feasibility of using multimodal data to recognize students' multiple mental states in 'in-the-wild' student-teacher conversation.
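As a concrete illustration of the modality-fusion step described above, the sketch below shows feature-level (early) fusion of facial and heart-rate features feeding a supervised classifier. It is a minimal sketch under stated assumptions: the feature dimensions, the synthetic data, the four-state label set, and the choice of a random-forest classifier with 5-fold cross-validation are illustrative stand-ins, not the paper's exact pipeline.

```python
# Minimal sketch of feature-level (early) fusion for mental-state
# classification. Assumes per-window facial and heart-rate feature
# matrices have already been extracted; all data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical stand-ins: 200 conversation windows, 20 facial features
# (e.g. statistics over facial-expression parameters) and 6 heart-rate
# features (e.g. mean HR, HR variability) per window.
X_face = rng.normal(size=(200, 20))
X_hr = rng.normal(size=(200, 6))

# Labels over four learning-centered states:
# 0=concentration, 1=confusion, 2=frustration, 3=boredom.
y = rng.integers(0, 4, size=200)

# Early fusion: concatenate modality features before classification.
X_fused = np.hstack([X_face, X_hr])

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X_fused, y, cv=5)
print(f"5-fold accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```

A decision-level (late) fusion variant would instead train one classifier per modality and combine their predicted probabilities; the abstract mentions multiple fusion methods without specifying them, so either variant here is an assumption.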