An autonomous debating system


Artificial intelligence (AI) is defined as the ability of machines to perform tasks that are usually associated with intelligent beings. Argument and debate are fundamental capabilities of human intelligence, essential for a wide range of human activities, and common to all human societies. The development of computational argumentation technologies is therefore an important emerging discipline in AI research [1]. Here we present Project Debater, an autonomous debating system that can engage in a competitive debate with humans. We provide a complete description of the system’s architecture, a thorough and systematic evaluation of its operation across a wide range of debate topics, and a detailed account of the system’s performance in its public debut against three expert human debaters. We also highlight the fundamental differences between debating with humans as opposed to challenging humans in game competitions, the latter being the focus of classical ‘grand challenges’ pursued by the AI research community over the past few decades. We suggest that such challenges lie in the ‘comfort zone’ of AI, whereas debating with humans lies in a different territory, in which humans still prevail, and for which novel paradigms are required to make substantial progress.

Fig. 1: Debate flow.
Fig. 2: System architecture.
Fig. 3: Evaluation of Project Debater.
Fig. 4: Content type analysis.

Data availability

The full transcripts of the three public debates in which Project Debater participated are available in Supplementary Information section 11, including information that elucidates the system’s operation throughout, and the results of the audience votes. In addition, multiple datasets that were constructed and used while developing Project Debater are available at https://www.research.ibm.com/haifa/dept/vst/debating_data.shtml. Source data for Fig. 3 are provided with this paper.

Most of the underlying capabilities of Project Debater, including the argument mining components, are freely available for academic research upon request as cloud services via https://early-access-program.debater.res.ibm.com/academic_use (in which the terminology differs: what we call here ‘motion’ and ‘topic’ is denoted as ‘topic’ and ‘concept’, respectively).


We thank E. Aharoni, D. Carmel, S. Fine, M. Levinger, and L. Haas for invaluable help during the early stages of this work. We thank A. Aaron and R. Fernandez for help in developing the Project Debater voice; P. Levin-Slesarev for work on the figures; G. Feigenblat and J. Daxenberger for help in generating baseline results; Y. Katsis for comments on the draft; N. Ovadia, D. Zafrir and H. Natarajan for their sportsmanship; and I. Dagan, I. Gurevych, C. Reed, B. Stein, H. Wachsmuth and U. Zakai for many discussions. We are indebted to the in-house annotators and in-house debaters, and especially to A. Polnarov and H. Goldlist-Eichler, who worked on this project over the years.

N.S. conceived the idea of Project Debater. N.S., Y.B., C.A., R.B.-H., B.B., F.B., L.C., E.C.-K., L.D., L.E., L.E.-D., R.F.-M., A. Gavron, A. Gera, M.G., S.G., D.G., A.H., D.H., R.H., Y.H., S.H., M.J., C.J., Y. Kantor, Y. Katz, D. Konopnicki, Z.K., L.K., D. Krieger, D.L., T.L., R.L., N.L., Y.M., A.M., S.M., G.M., M.O., E.R., R.R., S.S., D.S., E.S., I.S., A. Spector, B.S., A.T., O.T.-R., E.V. and R.A. designed and built Project Debater, with guidance from S.O.-K. and A. Soffer. N.S., Y.B., R.F.-M., and R.A. designed the evaluation framework. N.S., Y.B., and R.A. wrote the paper, with contribution from A. Gera to the In Depth Analysis section. N.S., Y.B., R.B.-H., L.C., L.D., L.E.-D., A. Gera, R.F.-M., S.G., C.J., Y. Kantor, D.L., G.M., M.O., E.S., A.T., E.V. and R.A. wrote the Supplementary Information. Y. Katz led the software engineering of the project. N.S. and R.A. led the team, with D.G. co-leading during the early stages of the project.

Correspondence to Noam Slonim.

Supplementary Information

This file contains Supplementary Information Sections 1-11, including Supplementary Tables 1-3, Supplementary Figures 1-6 and Supplementary References – see contents pages for details.

This file contains additional information, including: query_sentiment_lexicon - a lexicon of sentiment words, used as a building block to create queries for sentence retrieval in the claim detection and evidence detection components; action_verb_expansions - a mapping between common action verbs and their syntactic and semantic expansions; claim_verb_phrases - a list of verb phrases commonly found in sentences containing claims; contrastive_expressions - a lexicon of expressions indicating contrast; and study_conclusions - a list of phrases (unigrams to 5-grams) that frequently appear in reports of study results and conclusions.

Slonim, N., Bilu, Y., Alzate, C. et al. An autonomous debating system. Nature 591, 379–384 (2021). https://doi.org/10.1038/s41586-021-03215-w

