Abstract
Cognitive models of memory retrieval aim to describe human learning and forgetting over time. Such models have been successfully applied in digital systems that aid in memorizing information by adapting to the needs of individual learners. The memory models used in these systems typically measure the accuracy and latency of typed retrieval attempts. However, recent advances in speech technology have led to the development of learning systems that allow for spoken inputs. Here, we explore the possibility of improving a cognitive model of memory retrieval by using information present in speech signals during spoken retrieval attempts. We asked 44 participants to study vocabulary items by spoken rehearsal, and automatically extracted high-level prosodic speech features (patterns of stress and intonation), such as pitch dynamics, speaking speed, and intensity, from over 7,000 utterances. We demonstrate that some prosodic speech features are associated with the accuracy and response latency of retrieval attempts, and that speech-feature-informed memory models predict future performance better than models that use only accuracy and response latency. Our results have theoretical relevance, as they show how memory strength is reflected in a specific speech signature. They also have important practical implications, as they contribute to the development of memory models for spoken retrieval that have numerous real-world applications.
Notes
1. The logistic regression coefficients in Tables 1.1 and 1.2 can be converted to probabilities using an inverse logit transform. For example, in Table 1.1, a one standard deviation increase in pitch slope was associated with a decrease in accuracy from \(e^{1.885}/(1 + e^{1.885}) = 0.868\) to \(e^{1.885 - 0.260}/(1 + e^{1.885 - 0.260}) = 0.835\).
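The conversion in this note can be reproduced with a few lines of Python. This is a minimal sketch using only the two values quoted above; the function and variable names are hypothetical, and treating 1.885 as the baseline log-odds and -0.260 as the pitch slope coefficient is an assumption read off the worked example, not a statement about how the authors computed it.

```python
import math


def inverse_logit(log_odds: float) -> float:
    """Map a log-odds value to a probability: e^x / (1 + e^x)."""
    return math.exp(log_odds) / (1.0 + math.exp(log_odds))


# Values quoted in the note for Table 1.1 (names are illustrative assumptions).
baseline_log_odds = 1.885   # log-odds of a correct response before the change
pitch_slope_coef = -0.260   # change in log-odds per +1 SD of pitch slope

p_baseline = inverse_logit(baseline_log_odds)
p_plus_one_sd = inverse_logit(baseline_log_odds + pitch_slope_coef)

print(f"baseline accuracy:        {p_baseline:.3f}")
print(f"accuracy at +1 SD slope:  {p_plus_one_sd:.3f}")
print(f"difference:               {p_plus_one_sd - p_baseline:.3f}")
```

Because the inverse logit is nonlinear, the same coefficient corresponds to a different change in probability at a different baseline, so the drop of roughly three percentage points applies only at the baseline quoted here.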
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Wilschut, T., Sense, F., Scharenborg, O., van Rijn, H. (2023). Improving Adaptive Learning Models Using Prosodic Speech Features. In: Wang, N., Rebolledo-Mendez, G., Matsuda, N., Santos, O.C., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2023. Lecture Notes in Computer Science, vol 13916. Springer, Cham. https://doi.org/10.1007/978-3-031-36272-9_21
DOI: https://doi.org/10.1007/978-3-031-36272-9_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36271-2
Online ISBN: 978-3-031-36272-9
eBook Packages: Computer Science, Computer Science (R0)