Improving Adaptive Learning Models Using Prosodic Speech Features

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13916)

Included in the following conference series:

International Conference on Artificial Intelligence in Education

Abstract

Cognitive models of memory retrieval aim to describe human learning and forgetting over time. Such models have been successfully applied in digital systems that aid in memorizing information by adapting to the needs of individual learners. The memory models used in these systems typically measure the accuracy and latency of typed retrieval attempts. However, recent advances in speech technology have led to the development of learning systems that allow for spoken inputs. Here, we explore the possibility of improving a cognitive model of memory retrieval by using information present in speech signals during spoken retrieval attempts. We asked 44 participants to study vocabulary items by spoken rehearsal, and automatically extracted high-level prosodic speech features (patterns of stress and intonation), such as pitch dynamics, speaking speed, and intensity, from over 7,000 utterances. We demonstrate that some prosodic speech features are associated with the accuracy and response latency of retrieval attempts, and that memory models informed by speech features predict future performance better than models that use only accuracy and response latency. Our results have theoretical relevance, as they show how memory strength is reflected in a specific speech signature. They also have important practical implications, as they contribute to the development of memory models for spoken retrieval that have numerous real-world applications.

Notes

1. The logistic regression coefficients in Tables 1.1 and 1.2 are expressed in log-odds and can be converted to probabilities using an inverse logit transform. For example, in Table 1.1, a one standard deviation increase in pitch slope was associated with a decrease in accuracy from $e^{1.885}/(1 + e^{1.885}) = 0.868$ to $e^{1.885 - 0.260}/(1 + e^{1.885 - 0.260}) = 0.835$.
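The conversion in this note can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' analysis code: the function and variable names are hypothetical, and the only values taken from the note are the baseline log-odds (1.885) and the pitch slope coefficient (-0.260).

```python
import math


def inverse_logit(log_odds: float) -> float:
    """Map a log-odds value to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Values quoted in the note for Table 1.1; the names are illustrative only.
BASELINE_LOG_ODDS = 1.885   # log-odds of a correct response at the baseline
PITCH_SLOPE_COEF = -0.260   # change in log-odds per +1 SD of pitch slope

p_baseline = inverse_logit(BASELINE_LOG_ODDS)
p_plus_one_sd = inverse_logit(BASELINE_LOG_ODDS + PITCH_SLOPE_COEF)

print(f"accuracy at baseline:         {p_baseline:.3f}")
print(f"accuracy at +1 SD pitch slope: {p_plus_one_sd:.3f}")
```

Writing the transform as 1 / (1 + e^(-x)) is algebraically the same as the e^x / (1 + e^x) form used in the note.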

Author information

Authors and Affiliations

Department of Experimental Psychology, University of Groningen, Groningen, The Netherlands
Thomas Wilschut & Hedderik van Rijn
InfiniteTactics, LLC, Beavercreek, USA
Florian Sense
Department of Multimedia and Computing, Delft University of Technology, Delft, The Netherlands
Odette Scharenborg

Corresponding author

Correspondence to Thomas Wilschut.

Editor information

Editors and Affiliations

University of Southern California, Los Angeles, CA, USA
Ning Wang
University of British Columbia, Vancouver, BC, Canada
Genaro Rebolledo-Mendez
North Carolina State University, Raleigh, NC, USA
Noboru Matsuda
Despacho 3.01, UNED-Grupo de Investigación aDeNu, Madrid, Spain
Olga C. Santos
University of Leeds, Leeds, UK
Vania Dimitrova

About this paper

Cite this paper

Wilschut, T., Sense, F., Scharenborg, O., van Rijn, H. (2023). Improving Adaptive Learning Models Using Prosodic Speech Features. In: Wang, N., Rebolledo-Mendez, G., Matsuda, N., Santos, O.C., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2023. Lecture Notes in Computer Science, vol 13916. Springer, Cham. https://doi.org/10.1007/978-3-031-36272-9_21

DOI: https://doi.org/10.1007/978-3-031-36272-9_21
Published: 26 June 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-36271-2
Online ISBN: 978-3-031-36272-9
eBook Packages: Computer Science, Computer Science (R0)
