Abstract
This chapter describes the Vidiam project, which covered the development of a dialogue management system for multimodal question answering (QA) dialogues, carried out within the IMIX project. The approach followed was data-driven, i.e., corpus-based. Since research on QA dialogue for multimodal information retrieval is still new, no suitable corpora were available to base a system on. The chapter therefore reports on the collection and analysis of three QA dialogue corpora, covering textual follow-up utterances, multimodal follow-up questions, and speech dialogues. Based on these data, a dialogue act typology was created that helps translate user utterances into practical interactive QA strategies. The chapter goes on to explain how the dialogue manager and its components (dialogue act recognition, interactive QA strategy handling, reference resolution, and multimodal fusion) were built and evaluated through off-line analysis of the corpus data.
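To give a feel for how such a typology can drive an interactive QA strategy, the sketch below shows a minimal, hypothetical mapping from a recognised follow-up dialogue act to a system strategy. The act labels, strategy names, and data structure are illustrative assumptions only; they are not the typology or component interfaces defined in the chapter.

```python
# Hypothetical sketch: routing a recognised follow-up dialogue act to a QA strategy.
# The act labels and strategy names are illustrative assumptions, not the chapter's
# actual typology or dialogue manager API.

from dataclasses import dataclass


@dataclass
class FollowUp:
    utterance: str
    dialogue_act: str  # output of a dialogue act recogniser (not shown here)


# Illustrative mapping from dialogue act to an interactive QA strategy.
STRATEGIES = {
    "repeat_question": "rerun_retrieval",                 # ask the QA engine again
    "clarification_request": "ask_clarifying_question",   # system asks the user back
    "anaphoric_followup": "resolve_references_then_retrieve",
    "deictic_followup": "fuse_gesture_then_retrieve",     # multimodal fusion step
}


def select_strategy(follow_up: FollowUp) -> str:
    """Return the QA strategy for a recognised dialogue act (default: re-retrieve)."""
    return STRATEGIES.get(follow_up.dialogue_act, "rerun_retrieval")


if __name__ == "__main__":
    example = FollowUp("And what about this part?", "deictic_followup")
    print(select_strategy(example))  # -> fuse_gesture_then_retrieve
```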
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
van Schooten, B., op den Akker, R. (2011). Vidiam: Corpus-based Development of a Dialogue Manager for Multimodal Question Answering. In: van den Bosch, A., Bouma, G. (eds) Interactive Multi-modal Question-Answering. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17525-1_3
DOI: https://doi.org/10.1007/978-3-642-17525-1_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-17524-4
Online ISBN: 978-3-642-17525-1