State-of-the-Art in Open-Domain Conversational AI: A Survey
<p>Method for both investigations in this study.</p> "> Figure 2
<p>Forgetfulness example of BlenderBot Generative BST 2.7B model (blue bar) [<a href="#B15-information-13-00298" class="html-bibr">15</a>].</p> "> Figure 3
<p>Forgetfulness example of Meena (blue bar) [<a href="#B5-information-13-00298" class="html-bibr">5</a>].</p> "> Figure 4
<p>Self-contradiction from multi-turn self-chat (blue bar) of <span style="color:SteelBlue">DialoGPT</span> with user prompt [<a href="#B21-information-13-00298" class="html-bibr">21</a>].</p> "> Figure 5
<p>Bar chart of the gender of conversational <span style="color:SteelBlue">AI</span> for the Overall, Journal, Loebner prize, Facebook Messenger and Popular cases.</p> ">
Abstract
:1. Introduction
2. Background
2.1. Retrieval and Generation Approaches
2.2. Evaluation
Subjective Evaluation of Conversational AI
2.3. Characteristics of Human Conversations
- Usually, one speaker talks at a time.
- The turn order varies.
- The turn size varies.
- The length of a conversation is not known in advance.
- The number of speakers/parties may vary.
- Techniques for allocating turns may be used.
- Content of the conversation is not known in advance.
- The relative distribution of turns is unknown in advance.
- Different turn-constructional unit may be used, e.g., words or sentences.
- Repair mechanisms for correcting turn-taking errors exist.
2.4. Ethics
3. Benefits of Conversational AI
- Provide support for users with disabilities, such as blindness [20]. Speech-to-text (STT) and text-to-speech (TTS) technologies combined with conversational AI can make life easier for people with disabilities.
- A channel for providing domain/world knowledge [20]. The IR approach discussed earlier can make it possible to have up-to-date information on specific domains or topics through conversational AI.
- The provision of educational content or information in a concise fashion [46]. As mentioned earlier, the content and length of a conversation are not known in advance, so it is possible to construct utterances that are relatively concise and to the point.
- Automated machine–machine generation of quality data for low-resource languages [14]. The challenge of data scarcity for low-resource languages may be mitigated through quality data generated from autonomous machine–machine conversations on various topics and about different entities.
- The possibility of modelling human psychiatric/psychological treatment [2] on the basis of favorable behavior determined from experiments which are designed to modify input–output behaviour.
4. Methods
5. Results of Survey: Models
5.1. BlenderBot 1 & 2
5.2. Meena
5.3. DLGNet
5.4. DialoGPT 1 and 2 (RetGen)
5.5. GPT-3 and GPT-2
5.6. T5
6. Results and Discussion of Survey: Ethics of Gender
Discussion
7. Existing Challenges of Open-Domain Conversational AI
- Lack of utterance diversity [17].
- Lack of empathetic responses from conversational systems [12].
- Lack of memory to personalise user experiences.
- Multiple initiative coordination [2].
- Poor inference and implicature during conversation [15].
- Lack of world knowledge.
- Hallucination of facts when generating responses [74].
- Obsolete facts, which are frozen in the models’ weights during training.
- Training requires a large amount of data [74].
- Lack of common-sense reasoning [74].
- Large models use so many parameters that make them complex and may impede transparency [74].
8. Open-Domain Conversational AI for Low-Resource Languages
9. Related Work
10. Limitation
11. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- States, U. Preparing for the Future of Artificial Intelligence; 2016. Available online: https://obamawhitehouse.archives.gov/sites/default/files/whitehouse_files/microsites/ostp/NSTC/preparing_for_the_future_of_ai.pdf (accessed on 25 May 2022).
- Jurafsky, D.; Martin, J. Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition; Dorling Kindersley Pvt, Ltd.: London, UK, 2020. [Google Scholar]
- Weizenbaum, J. A Computer Program for the Study of Natural Language. Fonte: Stanford. 1969. Available online: Http://web.stanford.edu/class/linguist238/p36 (accessed on 25 May 2022).
- Turing, A.M. Computing machinery and intelligence. Mind 1950, 59, 433–460. [Google Scholar] [CrossRef]
- Adiwardana, D.; Luong, M.T.; So, D.R.; Hall, J.; Fiedel, N.; Thoppilan, R.; Yang, Z.; Kulshreshtha, A.; Nemade, G.; Lu, Y.; et al. Towards a human-like open-domain chatbot. arXiv 2020, arXiv:2001.09977. [Google Scholar]
- Chowdhary, K. Natural Language Processing for Word Sense Disambiguation and Information Extraction. 2020, pp. 603–649. Available online: https://arxiv.org/ftp/arxiv/papers/2004/2004.02256.pdf (accessed on 25 May 2022).
- Gabriel, R.; Liu, Y.; Gottardi, A.; Eric, M.; Khatri, A.; Chadha, A.; Chen, Q.; Hedayatnia, B.; Rajan, P.; Binici, A.; et al. Further advances in open domain dialog systems in the third alexa prize socialbot grand challenge. Alexa Prize Proc. 2020, 3. Available online: https://assets.amazon.science/0e/e6/2cff166647bfb951b3ccc67c1d06/further-advances-in-open-domain-dialog-systems-in-the-third-alexa-prize-socialbot-grand-challenge.pdf (accessed on 25 May 2022).
- Gunasekara, C.; Kim, S.; D’Haro, L.F.; Rastogi, A.; Chen, Y.N.; Eric, M.; Hedayatnia, B.; Gopalakrishnan, K.; Liu, Y.; Huang, C.W.; et al. Overview of the ninth dialog system technology challenge: Dstc9. arXiv 2020, arXiv:2011.06486. [Google Scholar]
- Schegloff, E.A. Sequencing in conversational openings 1. Am. Anthropol. 1968, 70, 1075–1095. [Google Scholar] [CrossRef]
- Eric, M.; Goel, R.; Paul, S.; Sethi, A.; Agarwal, S.; Gao, S.; Kumar, A.; Goyal, A.; Ku, P.; Hakkani-Tur, D. MultiWOZ 2.1: A Consolidated Multi-Domain Dialogue Dataset with State Corrections and State Tracking Baselines. In Proceedings of the 12th Language Resources and Evaluation Conference, Marseille, France, 20–25 June 2020; pp. 422–428. [Google Scholar]
- Allwood, J.; Grönqvist, L.; Ahlsén, E.; Gunnarsson, M. Annotations and tools for an activity based spoken language corpus. In Current and New Directions in Discourse and Dialogue; Springer: Berlin/Heidelberg, Germany, 2003; pp. 1–18. [Google Scholar] [CrossRef]
- Rashkin, H.; Smith, E.M.; Li, M.; Boureau, Y.L. Towards Empathetic Open-domain Conversation Models: A New Benchmark and Dataset. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 28 July–2 August 2019; pp. 5370–5381. [Google Scholar] [CrossRef]
- Adewumi, T.; Brännvall, R.; Abid, N.; Pahlavan, M.; Sabry, S.S.; Liwicki, F.; Liwicki, M. Småprat: DialoGPT for Natural Language Generation of Swedish Dialogue by Transfer Learning. In Proceedings of the 5th Northern Lights Deep Learning Workshop, Tromsø, Norway, 10–12 January 2022; Volume 3. [Google Scholar] [CrossRef]
- Adewumi, T.; Adeyemi, M.; Anuoluwapo, A.; Peters, B.; Buzaaba, H.; Samuel, O.; Rufai, A.M.; Ajibade, B.; Gwadabe, T.; Traore, M.M.K.; et al. Ìtàkúròso: Exploiting Cross-Lingual Transferability for Natural Language Generation of Dialogues in Low-Resource, African Languages. arXiv 2022, arXiv:2204.08083. [Google Scholar]
- Roller, S.; Dinan, E.; Goyal, N.; Ju, D.; Williamson, M.; Liu, Y.; Xu, J.; Ott, M.; Smith, E.M.; Boureau, Y.L.; et al. Recipes for Building an Open-Domain Chatbot. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, Online, 19–23 April 2021; pp. 300–325. [Google Scholar] [CrossRef]
- Chen, H.; Liu, X.; Yin, D.; Tang, J. A survey on dialogue systems: Recent advances and new frontiers. ACM SIGKDD Explor. Newsl. 2017, 19, 25–35. [Google Scholar] [CrossRef]
- Holtzman, A.; Buys, J.; Du, L.; Forbes, M.; Choi, Y. The curious case of neural text degeneration. In Proceedings of the International Conference on Learning Representations, ICLR 2020, Addis Ababa, Ethiopia, 26–30 April 2020; Available online: https://arxiv.org/pdf/1904.09751.pdf (accessed on 25 May 2022).
- Aggarwal, C.C.; Zhai, C. A survey of text classification algorithms. In Mining Text Data; Springer: Berlin/Heidelberg, Germany, 2012; pp. 163–222. [Google Scholar]
- Gehrmann, S.; Adewumi, T.; Aggarwal, K.; Ammanamanchi, P.S.; Aremu, A.; Bosselut, A.; Chandu, K.R.; Clinciu, M.A.; Das, D.; Dhole, K.; et al. The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics. In Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021), Online, 5–6 August 2021; pp. 96–120. [Google Scholar] [CrossRef]
- Reiter, E. 20 Natural Language Generation. In The Handbook of Computational Linguistics and Natural Language Processing; 2010; p. 574. Available online: https://onlinelibrary.wiley.com/doi/10.1002/9781444324044.ch20 (accessed on 25 May 2022).
- Zhang, Y.; Sun, S.; Galley, M.; Chen, Y.C.; Brockett, C.; Gao, X.; Gao, J.; Liu, J.; Dolan, B. DialoGPT: Large-Scale Generative Pre-training for Conversational Response Generation. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, Online, 5–10 July 2020; pp. 270–278. [Google Scholar] [CrossRef]
- Gangal, V.; Jhamtani, H.; Hovy, E.; Berg-Kirkpatrick, T. Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation. In Proceedings of the Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, Online, 1–6 August 2021; pp. 4079–4090. [Google Scholar] [CrossRef]
- Jhamtani, H.; Gangal, V.; Hovy, E.; Berg-Kirkpatrick, T. Investigating Robustness of Dialog Models to Popular Figurative Language Constructs. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 7476–7485. [Google Scholar] [CrossRef]
- Liu, C.W.; Lowe, R.; Serban, I.V.; Noseworthy, M.; Charlin, L.; Pineau, J. How not to evaluate your dialogue system: An empirical study of unsupervised evaluation metrics for dialogue response generation. arXiv 2016, arXiv:1603.08023. [Google Scholar]
- Ji, T.; Graham, Y.; Jones, G.J.; Lyu, C.; Liu, Q. Achieving Reliable Human Assessment of Open-Domain Dialogue Systems. arXiv 2022, arXiv:2203.05899. [Google Scholar]
- Tsuta, Y.; Yoshinaga, N.; Toyoda, M. uBLEU: Uncertainty-Aware Automatic Evaluation Method for Open-Domain Dialogue Systems. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, Online, 5–10 July 2020; pp. 199–206. [Google Scholar] [CrossRef]
- Venkatesh, A.; Khatri, C.; Ram, A.; Guo, F.; Gabriel, R.; Nagar, A.; Prasad, R.; Cheng, M.; Hedayatnia, B.; Metallinou, A.; et al. On evaluating and comparing conversational agents. arXiv 2018, arXiv:1801.03625. [Google Scholar]
- Guo, F.; Metallinou, A.; Khatri, C.; Raju, A.; Venkatesh, A.; Ram, A. Topic-based evaluation for conversational bots. arXiv 2018, arXiv:1801.03622. [Google Scholar]
- Deriu, J.; Tuggener, D.; von Däniken, P.; Campos, J.A.; Rodrigo, A.; Belkacem, T.; Soroa, A.; Agirre, E.; Cieliebak, M. Spot The Bot: A Robust and Efficient Framework for the Evaluation of Conversational Dialogue Systems. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 3971–3984. [Google Scholar] [CrossRef]
- Li, M.; Weston, J.; Roller, S. Acute-eval: Improved dialogue evaluation with optimized questions and multi-turn comparisons. arXiv 2019, arXiv:1909.03087. [Google Scholar]
- Mauldin, M.L. Chatterbots, tinymuds, and the turing test: Entering the loebner prize competition. In Proceedings of the AAAI, Seattle, WA, USA, 31 July–4 August 1994; Volume 94, pp. 16–21. Available online: https://www.aaai.org/Papers/AAAI/1994/AAAI94-003.pdf (accessed on 25 May 2022).
- Bradeško, L.; Mladenić, D. A survey of chatbot systems through a loebner prize competition. In Proceedings of the Slovenian Language Technologies Society Eighth Conference of Language Technologies, Ljubljana, Slovenia, 8–9 October 2012; pp. 34–37. Available online: http://nl.ijs.si/isjt12/proceedings/isjt2012_06.pdf (accessed on 25 May 2022).
- Shieber, S.M. Lessons from a restricted Turing test. arXiv 1994, arXiv:cmp-lg/9404002. [Google Scholar] [CrossRef]
- Sacks, H.; Schegloff, E.A.; Jefferson, G. A simplest systematics for the organization of turn taking for conversation. In Studies in the Organization of Conversational Interaction; Elsevier: Amsterdam, The Netherlands, 1978; pp. 7–55. [Google Scholar]
- Adewumi, T.P.; Liwicki, F.; Liwicki, M. Conversational Systems in Machine Learning from the Point of View of the Philosophy of Science—Using Alime Chat and Related Studies. Philosophies 2019, 4, 41. [Google Scholar] [CrossRef] [Green Version]
- Javed, S.; Adewumi, T.P.; Liwicki, F.S.; Liwicki, M. Understanding the Role of Objectivity in Machine Learning and Research Evaluation. Philosophies 2021, 6, 22. [Google Scholar] [CrossRef]
- White, M.D. Immanuel kant. In Handbook of Economics and Ethics; Edward Elgar Publishing: Cheltenham, UK, 2009. [Google Scholar]
- Alexander, L.; Moore, M. Deontological Ethics. 2007. Available online: https://plato.stanford.edu/entries/ethics-deontological/ (accessed on 25 May 2022).
- Paquette, M.; Sommerfeldt, E.J.; Kent, M.L. Do the ends justify the means? Dialogue, development communication, and deontological ethics. Public Relat. Rev. 2015, 41, 30–39. [Google Scholar] [CrossRef]
- Voigt, P.; Von dem Bussche, A. The EU General Data Protection Regulation (GDPR): A Practical Guide; Springer: Berlin/Heidelberg, Germany, 2017; Volume 10, pp. 10–5555. [Google Scholar]
- Neff, G.; Nagy, P. Automation, algorithms, and politics| talking to Bots: Symbiotic agency and the case of Tay. Int. J. Commun. 2016, 10, 17. [Google Scholar]
- Dinan, E.; Fan, A.; Williams, A.; Urbanek, J.; Kiela, D.; Weston, J. Queens are Powerful too: Mitigating Gender Bias in Dialogue Generation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), Online, 16–20 November 2020; pp. 8173–8188. [Google Scholar] [CrossRef]
- Henderson, P.; Sinha, K.; Angelard-Gontier, N.; Ke, N.R.; Fried, G.; Lowe, R.; Pineau, J. Ethical challenges in data-driven dialogue systems. In Proceedings of the 2018 AAAI/ACM Conference on AI, Ethics, and Society, New Orleans, LA, USA, 2–3 February 2018; pp. 123–129. Available online: https://arxiv.org/pdf/1711.09050.pdf (accessed on 25 May 2022).
- Maedche, A. Gender Bias in Chatbot Design. In Chatbot Research and Design; Springer: Heidelberg, Germany, 2020; p. 79. [Google Scholar]
- Ruane, E.; Birhane, A.; Ventresque, A. Conversational AI: Social and Ethical Considerations. In Proceedings of the AICS—27th AIAI Irish Conference on Artificial Intelligence and Cognitive Science, Galway, Ireland, 5–6 December 2019; pp. 104–115. [Google Scholar]
- Kerry, A.; Ellis, R.; Bull, S. Conversational agents in E-Learning. In Proceedings of the International Conference on Innovative Techniques and Applications of Artificial Intelligence, Cambridge, UK, 9–11 December 2008; pp. 169–182. Available online: https://link.springer.com/chapter/10.1007/978-1-84882-215-3_13 (accessed on 25 May 2022).
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. J. Mach. Learn. Res. 2020, 21, 1–67. [Google Scholar]
- Zou, Y.; Liu, Z.; Hu, X.; Zhang, Q. Thinking Clearly, Talking Fast: Concept-Guided Non-Autoregressive Generation for Open-Domain Dialogue Systems. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Online, Punta Cana, Dominican Republic, 7–11 November 2021; pp. 2215–2226. [Google Scholar] [CrossRef]
- Smith, E.M.; Williamson, M.; Shuster, K.; Weston, J.; Boureau, Y.L. Can You Put it All Together: Evaluating Conversational Agents’ Ability to Blend Skills. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, Online, 5–10 July 2020; pp. 2021–2030. [Google Scholar] [CrossRef]
- Komeili, M.; Shuster, K.; Weston, J. Internet-augmented dialogue generation. arXiv 2021, arXiv:2107.07566. [Google Scholar]
- Xu, J.; Szlam, A.; Weston, J. Beyond goldfish memory: Long-term open-domain conversation. arXiv 2021, arXiv:2107.07567. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, L.u.; Polosukhin, I. Attention is All you Need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. Available online: https://arxiv.org/pdf/1706.03762v5.pdf (accessed on 25 May 2022).
- Hochreiter, S.; Schmidhuber, J. Long short-term memory. Neural Comput. 1997, 9, 1735–1780. [Google Scholar] [CrossRef] [PubMed]
- Bahdanau, D.; Cho, K.; Bengio, Y. Neural machine translation by jointly learning to align and translate. In Proceedings of the International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, 7–9 May 2015. [Google Scholar] [CrossRef]
- Olabiyi, O.; Mueller, E.T. Multiturn dialogue response generation with autoregressive transformer models. arXiv 2019, arXiv:1908.01841. [Google Scholar]
- Zhang, Y.; Sun, S.; Gao, X.; Fang, Y.; Brockett, C.; Galley, M.; Gao, J.; Dolan, B. Joint Retrieval and Generation Training for Grounded Text Generation. arXiv 2021, arXiv:2105.06597. [Google Scholar]
- Brown, T.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.D.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. Adv. Neural Inf. Process. Syst. 2020, 33, 1877–1901. [Google Scholar]
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language models are unsupervised multitask learners. OpenAI Blog 2019, 1, 9. [Google Scholar]
- Xue, L.; Constant, N.; Roberts, A.; Kale, M.; Al-Rfou, R.; Siddhant, A.; Barua, A.; Raffel, C. mT5: A Massively Multilingual Pre-trained Text-to-Text Transformer. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Online, 6–11 June 2021; pp. 483–498. [Google Scholar] [CrossRef]
- Adewumi, T.; Alkhaled, L.; Alkhaled, H.; Liwicki, F.; Liwicki, M. ML_LTU at SemEval-2022 Task 4: T5 Towards Identifying Patronizing and Condescending Language. arXiv 2022, arXiv:2204.07432. [Google Scholar]
- Sabry, S.S.; Adewumi, T.; Abid, N.; Kovacs, G.; Liwicki, F.; Liwicki, M. HaT5: Hate Language Identification using Text-to-Text Transfer Transformer. arXiv 2022, arXiv:2202.05690. [Google Scholar]
- Abercrombie, G.; Cercas Curry, A.; Pandya, M.; Rieser, V. Alexa, Google, Siri: What are Your Pronouns? Gender and Anthropomorphism in the Design and Perception of Conversational Assistants. In Proceedings of the 3rd Workshop on Gender Bias in Natural Language Processing, Online, 5 August 2021; pp. 24–33. [Google Scholar] [CrossRef]
- West, M.; Kraut, R.; Ei Chew, H. I’d Blush If I Could: Closing Gender Divides in Digital Skills through Education. 2019. Available online: https://unesdoc.unesco.org/ark:/48223/pf0000367416 (accessed on 25 May 2022).
- Silvervarg, A.; Raukola, K.; Haake, M.; Gulz, A. The effect of visual gender on abuse in conversation with ECAs. In Proceedings of the International Conference on Intelligent Virtual Agents, Santa Cruz, CA, USA, 12–14 September 2012; pp. 153–160. Available online: https://link.springer.com/chapter/10.1007/978-3-642-33197-8_16 (accessed on 25 May 2022).
- Forlizzi, J.; Zimmerman, J.; Mancuso, V.; Kwak, S. How interface agents affect interaction between humans and computers. In Proceedings of the 2007 Conference on Designing Pleasurable Products and Interfaces, Helsinki, Finland, 22–25 August 2007; pp. 209–221. Available online: https://dl.acm.org/doi/pdf/10.1145/1314161.1314180 (accessed on 25 May 2022).
- Louwerse, M.M.; Graesser, A.C.; Lu, S.; Mitchell, H.H. Social cues in animated conversational agents. Appl. Cogn. Psychol. 2005, 19, 693–704. [Google Scholar] [CrossRef]
- Muir, B.M. Trust between humans and machines, and the design of decision aids. Int. J. Man-Mach. Stud. 1987, 27, 527–539. [Google Scholar] [CrossRef]
- Nass, C.I.; Brave, S. Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship; MIT Press: Cambridge, MA, USA, 2005. [Google Scholar]
- Lee, M.; Ackermans, S.; Van As, N.; Chang, H.; Lucas, E.; IJsselsteijn, W. Caring for Vincent: A chatbot for self-compassion. In Proceedings of the 2019 CHI Conference on Human Factors in Computing Systems, Glasgow, UK, 4–9 May 2019; pp. 1–13. Available online: https://dl.acm.org/doi/pdf/10.1145/3290605.3300932 (accessed on 25 May 2022).
- Li, J.; Galley, M.; Brockett, C.; Gao, J.; Dolan, B. A Diversity-Promoting Objective Function for Neural Conversation Models. In Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, San Diego, CA, USA, 12–17 June 2016; pp. 110–119. [Google Scholar] [CrossRef] [Green Version]
- Bahl, L.; Brown, P.; De Souza, P.; Mercer, R. Maximum mutual information estimation of hidden Markov model parameters for speech recognition. In Proceedings of the ICASSP’86, IEEE International Conference on Acoustics, Speech, and Signal Processing, Tokyo, Japan, 7–11 April 1986; Volume 11, pp. 49–52. Available online: http://mirlab.org/users/davidson.chen/relatedPapers/others/1986%20ICASSP%20Maximum%20Mutual%20Information%20Estimation%20of%20Hidden%20Markov%20Model%20Parameters%20for%20Speech%20Recognition.pdf (accessed on 25 May 2022).
- Welleck, S.; Kulikov, I.; Roller, S.; Dinan, E.; Cho, K.; Weston, J. Neural text generation with unlikelihood training. arXiv 2019, arXiv:1908.04319. [Google Scholar]
- Adewumi, T.; Liwicki, F.; Liwicki, M. Vector Representations of Idioms in Conversational Systems. arXiv 2022, arXiv:2205.03666. [Google Scholar]
- Marcus, G. Deep learning: A critical appraisal. arXiv 2018, arXiv:1801.00631. [Google Scholar]
- Adewumi, T.P.; Liwicki, F.; Liwicki, M. The Challenge of Diacritics in Yoruba Embeddings. arXiv 2020, arXiv:2011.07605. [Google Scholar]
- Nekoto, W.; Marivate, V.; Matsila, T.; Fasubaa, T.; Fagbohungbe, T.; Akinola, S.O.; Muhammad, S.; Kabongo Kabenamualu, S.; Osei, S.; Sackey, F.; et al. Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages. In Proceedings of the Findings of the Association for Computational Linguistics: EMNLP 2020, Online, 16–20 November 2020; pp. 2144–2160. [Google Scholar] [CrossRef]
- Pfeiffer, J.; Rücklé, A.; Poth, C.; Kamath, A.; Vulić, I.; Ruder, S.; Cho, K.; Gurevych, I. AdapterHub: A Framework for Adapting Transformers. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020): Systems Demonstrations, Online, 16–20 November 2020; pp. 46–54. Available online: https://arxiv.org/pdf/2007.07779.pdf (accessed on 25 May 2022).
- Virtanen, A.; Kanerva, J.; Ilo, R.; Luoma, J.; Luotolahti, J.; Salakoski, T.; Ginter, F.; Pyysalo, S. Multilingual is not enough: BERT for Finnish. arXiv 2019, arXiv:1912.07076. [Google Scholar]
- Rönnqvist, S.; Kanerva, J.; Salakoski, T.; Ginter, F. Is multilingual BERT fluent in language generation? arXiv 2019, arXiv:1910.03806. [Google Scholar]
- Caldarini, G.; Jaf, S.; McGarry, K. A Literature Survey of Recent Advances in Chatbots. Information 2022, 13, 41. [Google Scholar] [CrossRef]
- Fu, T.; Gao, S.; Zhao, X.; Wen, J.r.; Yan, R. Learning towards conversational AI: A survey. AI Open 2022, 3, 14–28. [Google Scholar] [CrossRef]
- Ni, J.; Young, T.; Pandelea, V.; Xue, F.; Adiga, V.; Cambria, E. Recent advances in deep learning based dialogue systems: A systematic survey. arXiv 2021, arXiv:2105.04387. [Google Scholar]
- Khatri, C.; Hedayatnia, B.; Venkatesh, A.; Nunn, J.; Pan, Y.; Liu, Q.; Song, H.; Gottardi, A.; Kwatra, S.; Pancholi, S.; et al. Advancing the state of the art in open domain dialog systems through the alexa prize. arXiv 2018, arXiv:1812.10757. [Google Scholar]
Retrieval | Generation | Hybrid |
---|---|---|
Pros | ||
Possibility to incorporate domain/world knowledge [15] | Relatively unique tokens may be produced [16] | Combines the pros of both the retrieval and generation approaches |
Up-to-date information in response from online sources | The use of decoding algorithms, in addition to other hyperparameters such as temperature, can deliver relatively diverse outputs [17] | The retrieval component may be used to provide additional context for the generation [15] |
Possibility of more fluent or precise responses [16] | Possibility for up-to-date responses and world/domain knowledge | |
Cons | ||
Possible low diversity in the outputs | Low diversity in the overall generated outputs | Harder to implement efficiently than any of the single approaches |
Limitation based on the size of repository | High probability of repetitive generated output [15] | Some of the cons of the combined approaches may still exist, such as limitation of repository size |
Lack of memory to recall certain facts | Lack of memory to recall certain facts [5,15] | |
Poor output distribution compared to human conversation | Poor output distribution compared to human conversation [17] | |
Lack of domain/world knowledge |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations. |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Adewumi, T.; Liwicki, F.; Liwicki, M. State-of-the-Art in Open-Domain Conversational AI: A Survey. Information 2022, 13, 298. https://doi.org/10.3390/info13060298
Adewumi T, Liwicki F, Liwicki M. State-of-the-Art in Open-Domain Conversational AI: A Survey. Information. 2022; 13(6):298. https://doi.org/10.3390/info13060298
Chicago/Turabian StyleAdewumi, Tosin, Foteini Liwicki, and Marcus Liwicki. 2022. "State-of-the-Art in Open-Domain Conversational AI: A Survey" Information 13, no. 6: 298. https://doi.org/10.3390/info13060298