Improving Adaptive Learning Models Using Prosodic Speech Features

  • Conference paper
Artificial Intelligence in Education (AIED 2023)

Abstract

Cognitive models of memory retrieval aim to describe human learning and forgetting over time. Such models have been successfully applied in digital systems that aid in memorizing information by adapting to the needs of individual learners. The memory models used in these systems typically measure the accuracy and latency of typed retrieval attempts. However, recent advances in speech technology have led to the development of learning systems that allow for spoken inputs. Here, we explore the possibility of improving a cognitive model of memory retrieval by using information present in speech signals during spoken retrieval attempts. We asked 44 participants to study vocabulary items by spoken rehearsal, and automatically extracted high-level prosodic speech features (patterns of stress and intonation), such as pitch dynamics, speaking speed, and intensity, from over 7,000 utterances. We demonstrate that some prosodic speech features are associated with accuracy and response latency for retrieval attempts, and that speech-feature-informed memory models make better predictions of future performance than models that use only accuracy and response latency. Our results have theoretical relevance, as they show how memory strength is reflected in a specific speech signature. They also have important practical implications, as they contribute to the development of memory models for spoken retrieval that have numerous real-world applications.

Notes

  1. The logistic regression coefficients in Tables 1.1 and 1.2 can be converted to probabilities using an inverse logit transform. For example, in Table 1.1, a one standard deviation increase in pitch slope was associated with a decrease in accuracy from \(e^{1.885}/(1 + e^{1.885}) = 0.868\) to \(e^{1.885 - 0.260}/(1 + e^{1.885 - 0.260}) = 0.835\).
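
The conversion described in this note can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' analysis code: the function and variable names are illustrative assumptions, and the only values taken from the text are the intercept (1.885) and the pitch slope coefficient (-0.260) quoted for Table 1.1.

```python
import math


def inverse_logit(log_odds: float) -> float:
    """Map a log-odds value to a probability in (0, 1)."""
    # Algebraically identical to e^x / (1 + e^x).
    return 1.0 / (1.0 + math.exp(-log_odds))


# Values quoted in the note for Table 1.1 (names are hypothetical).
intercept = 1.885          # log-odds of a correct response at baseline
pitch_slope_coef = -0.260  # change in log-odds per 1 SD increase in pitch slope

baseline = inverse_logit(intercept)
shifted = inverse_logit(intercept + pitch_slope_coef)

print(f"{baseline:.3f} -> {shifted:.3f}")  # 0.868 -> 0.835
```

Because the transform is nonlinear, the same coefficient corresponds to a different change in probability at a different baseline, so the conversion is best applied to the specific intercept of interest.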


Author information

Corresponding author

Correspondence to Thomas Wilschut.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Wilschut, T., Sense, F., Scharenborg, O., van Rijn, H. (2023). Improving Adaptive Learning Models Using Prosodic Speech Features. In: Wang, N., Rebolledo-Mendez, G., Matsuda, N., Santos, O.C., Dimitrova, V. (eds) Artificial Intelligence in Education. AIED 2023. Lecture Notes in Computer Science, vol 13916. Springer, Cham. https://doi.org/10.1007/978-3-031-36272-9_21

  • DOI: https://doi.org/10.1007/978-3-031-36272-9_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-36271-2

  • Online ISBN: 978-3-031-36272-9

  • eBook Packages: Computer Science, Computer Science (R0)
