Duper: A Proof-Producing Superposition Theorem Prover for Dependent Type Theory

Abstract

We present Duper, a proof-producing theorem prover for Lean based on the superposition calculus. Duper can be called directly as a terminal tactic in interactive Lean proofs, but is also designed with proof reconstruction for a future Lean hammer in mind. In this paper, we describe Duper’s underlying approach to proof search and proof reconstruction with a particular emphasis on the challenges of working in a dependent type theory. We also compare Duper’s performance to Metis' on pre-existing benchmarks to give evidence that Duper is performant enough to be useful for proof reconstruction in a hammer.

Jürgen Avenhaus, Jörg Denzinger, and Matthias Fuchs. DISCOUNT: A system for distributed equational deduction. In Jieh Hsiang, editor, Rewriting Techniques and Applications, 6th International Conference, RTA-95, Kaiserslautern, Germany, April 5-7, 1995, Proceedings, volume 914 of Lecture Notes in Computer Science, pages 397-402. Springer, 1995. URL: https://doi.org/10.1007/3-540-59200-8_72.
Jeremy Avigad, Leonardo de Moura, Soonho Kong, and Sebastian Ullrich. Theorem proving in Lean 4. URL: https://leanprover.github.io/theorem_proving_in_lean4/.
Leo Bachmair and Harald Ganzinger. Rewrite-based equational theorem proving with selection and simplification. J. Log. Comput., 4(3):217-247, 1994. URL: https://doi.org/10.1093/LOGCOM/4.3.217.
Alexander Bentkamp, Jasmin Blanchette, Visa Nummelin, Sophie Tourret, Petar Vukmirović, and Uwe Waldmann. Mechanical mathematicians. Communications of the ACM, 66(4):80-90, March 2023. URL: https://doi.org/10.1145/3557998.
Alexander Bentkamp, Jasmin Blanchette, Sophie Tourret, and Petar Vukmirovic. Superposition for higher-order logic. J. Autom. Reason., 67(1):10, 2023. URL: https://doi.org/10.1007/S10817-022-09649-9.
Alexander Bentkamp, Jasmin Blanchette, Sophie Tourret, Petar Vukmirovic, and Uwe Waldmann. Superposition with lambdas. J. Autom. Reason., 65(7):893-940, 2021. URL: https://doi.org/10.1007/S10817-021-09595-Y.
Yves Bertot and Pierre Castéran. Interactive Theorem Proving and Program Development - Coq'Art: The Calculus of Inductive Constructions. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2004. URL: https://doi.org/10.1007/978-3-662-07964-5.
Jasmin Blanchette, Qi Qiu, and Sophie Tourret. Verified given clause procedures. In Brigitte Pientka and Cesare Tinelli, editors, Automated Deduction - CADE 29 - 29th International Conference on Automated Deduction, Rome, Italy, July 1-4, 2023, Proceedings, volume 14132 of Lecture Notes in Computer Science, pages 61-77. Springer, 2023. URL: https://doi.org/10.1007/978-3-031-38499-8_4.
Jasmin C. Blanchette, Cezary Kaliszyk, Lawrence C. Paulson, and Josef Urban. Hammering towards QED. J. Formaliz. Reason., 9(1):101-148, 2016. URL: https://doi.org/10.6092/ISSN.1972-5787/4593.
Ana Bove, Peter Dybjer, and Ulf Norell. A brief overview of Agda - A functional language with dependent types. In Stefan Berghofer, Tobias Nipkow, Christian Urban, and Makarius Wenzel, editors, Theorem Proving in Higher Order Logics, 22nd International Conference, TPHOLs 2009, Munich, Germany, August 17-20, 2009. Proceedings, volume 5674 of Lecture Notes in Computer Science, pages 73-78. Springer, 2009. URL: https://doi.org/10.1007/978-3-642-03359-9_6.
Chad E. Brown, Thibault Gauthier, Cezary Kaliszyk, Geoff Sutcliffe, and Josef Urban. GRUNGE: A grand unified ATP challenge. In Pascal Fontaine, editor, Automated Deduction - CADE 27 - 27th International Conference on Automated Deduction, Natal, Brazil, August 27-30, 2019, Proceedings, volume 11716 of Lecture Notes in Computer Science, pages 123-141. Springer, 2019. URL: https://doi.org/10.1007/978-3-030-29436-6_8.
Mario Carneiro, Chad E. Brown, and Josef Urban. Automated theorem proving for metamath. In Adam Naumowicz and René Thiemann, editors, 14th International Conference on Interactive Theorem Proving, ITP 2023, July 31 to August 4, 2023, Białystok, Poland, volume 268 of LIPIcs, pages 9:1-9:19. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2023. URL: https://doi.org/10.4230/LIPICS.ITP.2023.9.
Lukasz Czajka. Practical proof search for Coq by type inhabitation. In Nicolas Peltier and Viorica Sofronie-Stokkermans, editors, Automated Reasoning - 10th International Joint Conference, IJCAR 2020, Paris, France, July 1-4, 2020, Proceedings, Part II, volume 12167 of Lecture Notes in Computer Science, pages 28-57. Springer, 2020. URL: https://doi.org/10.1007/978-3-030-51054-1_3.
Lukasz Czajka and Cezary Kaliszyk. Hammer for coq: Automation for dependent type theory. J. Autom. Reason., 61(1-4):423-453, 2018. URL: https://doi.org/10.1007/S10817-018-9458-4.
Leonardo de Moura and Sebastian Ullrich. The lean 4 theorem prover and programming language. In André Platzer and Geoff Sutcliffe, editors, Automated Deduction - CADE 28 - 28th International Conference on Automated Deduction, Virtual Event, July 12-15, 2021, Proceedings, volume 12699 of Lecture Notes in Computer Science, pages 625-635. Springer, 2021. URL: https://doi.org/10.1007/978-3-030-79876-5_37.
Martin Desharnais, Petar Vukmirovic, Jasmin Blanchette, and Makarius Wenzel. Seventeen provers under the hammer. In June Andronick and Leonardo de Moura, editors, 13th International Conference on Interactive Theorem Proving, ITP 2022, August 7-10, 2022, Haifa, Israel, volume 237 of LIPIcs, pages 8:1-8:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2022. URL: https://doi.org/10.4230/LIPICS.ITP.2022.8.
Jeanne Ferrante and Charles Rackoff. A decision procedure for the first order theory of real addition with order. SIAM J. Comput., 4(1):69-76, 1975. URL: https://doi.org/10.1137/0204006.
Simon Foster and Georg Struth. Integrating an automated theorem prover into Agda. In Mihaela Gheorghiu Bobaru, Klaus Havelund, Gerard J. Holzmann, and Rajeev Joshi, editors, NASA Formal Methods - Third International Symposium, NFM 2011, Pasadena, CA, USA, April 18-20, 2011. Proceedings, volume 6617 of Lecture Notes in Computer Science, pages 116-130. Springer, 2011. URL: https://doi.org/10.1007/978-3-642-20398-5_10.
Benjamin Grégoire and Assia Mahboubi. Proving equalities in a commutative ring done right in Coq. In Joe Hurd and Thomas F. Melham, editors, Theorem Proving in Higher Order Logics, 18th International Conference, TPHOLs 2005, Oxford, UK, August 22-25, 2005, Proceedings, volume 3603 of Lecture Notes in Computer Science, pages 98-113. Springer, 2005. URL: https://doi.org/10.1007/11541868_7.
John Harrison. Optimizing proof search in model elimination. In Michael A. McRobbie and John K. Slaney, editors, Automated Deduction - CADE-13, 13th International Conference on Automated Deduction, New Brunswick, NJ, USA, July 30 - August 3, 1996, Proceedings, volume 1104 of Lecture Notes in Computer Science, pages 313-327. Springer, 1996. URL: https://doi.org/10.1007/3-540-61511-3_97.
Joe Hurd. First-order proof tactics in higher-order logic theorem provers. In Design and Application of Strategies/Tactics in Higher Order Logics (STRATA 2003), pages 56-68, 2003. URL: http://www.gilith.com/papers.
Cezary Kaliszyk and Josef Urban. Hol(y)hammer: Online ATP service for HOL light. Math. Comput. Sci., 9(1):5-22, 2015. URL: https://doi.org/10.1007/S11786-014-0182-0.
Cezary Kaliszyk and Josef Urban. Mizar 40 for mizar 40. J. Autom. Reason., 55(3):245-256, 2015. URL: https://doi.org/10.1007/S10817-015-9330-8.
Donald E. Knuth and Peter B. Bendix. Simple word problems in universal algebras. In John Leech, editor, Computational Problems in Abstract Algebra, pages 263-297. Pergamon, 1970. URL: https://doi.org/10.1016/b978-0-08-012975-4.50028-x.
Laura Kovács and Andrei Voronkov. First-order theorem proving and vampire. In Natasha Sharygina and Helmut Veith, editors, Computer Aided Verification - 25th International Conference, CAV 2013, Saint Petersburg, Russia, July 13-19, 2013. Proceedings, volume 8044 of Lecture Notes in Computer Science, pages 1-35. Springer, 2013. URL: https://doi.org/10.1007/978-3-642-39799-8_1.
Jannis Limperg and Asta Halkjær From. Aesop: White-box best-first proof search for lean. In Robbert Krebbers, Dmitriy Traytel, Brigitte Pientka, and Steve Zdancewic, editors, Proceedings of the 12th ACM SIGPLAN International Conference on Certified Programs and Proofs, CPP 2023, Boston, MA, USA, January 16-17, 2023, pages 253-266. ACM, 2023. URL: https://doi.org/10.1145/3573105.3575671.
Bernd Löchner. Things to know when implementing KBO. J. Autom. Reason., 36(4):289-310, 2006. URL: https://doi.org/10.1007/S10817-006-9031-4.
Ewing L. Lusk. Controlling redundancy in large search spaces: Argonne-style theorem proving through the years. In Andrei Voronkov, editor, Logic Programming and Automated Reasoning,International Conference LPAR'92, St. Petersburg, Russia, July 15-20, 1992, Proceedings, volume 624 of Lecture Notes in Computer Science, pages 96-106. Springer, 1992. URL: https://doi.org/10.1007/BFB0013052.
The mathlib Community. The Lean mathematical library. In Jasmin Blanchette and Catalin Hritcu, editors, Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs, CPP 2020, New Orleans, LA, USA, January 20-21, 2020, pages 367-381. ACM, 2020. URL: https://doi.org/10.1145/3372885.3373824.
William McCune and Larry Wos. Otter–the CADE-13 competition incarnations. Journal of Automated Reasoning, 18(2):211-220, 1997. URL: https://doi.org/10.1023/a:1005843632307.
Jia Meng and Lawrence C. Paulson. Translating higher-order clauses to first-order clauses. J. Autom. Reason., 40(1):35-60, 2008. URL: https://doi.org/10.1007/S10817-007-9085-Y.
Jia Meng and Lawrence C. Paulson. Lightweight relevance filtering for machine-generated resolution problems. J. Appl. Log., 7(1):41-57, 2009. URL: https://doi.org/10.1016/J.JAL.2007.07.004.
Tobias Nipkow. Term rewriting and beyond - theorem proving in Isabelle. Formal Aspects Comput., 1(4):320-338, 1989. URL: https://doi.org/10.1007/BF01887212.
Lawrence C. Paulson. A generic tableau prover and its integration with Isabelle. J. Univers. Comput. Sci., 5(3):73-87, 1999. URL: https://doi.org/10.3217/JUCS-005-03-0073.
Lawrence C. Paulson and Jasmin Christian Blanchette. Three years of experience with sledgehammer, a practical link between automatic and interactive theorem provers. In Geoff Sutcliffe, Stephan Schulz, and Eugenia Ternovska, editors, The 8th International Workshop on the Implementation of Logics, IWIL 2010, Yogyakarta, Indonesia, October 9, 2011, volume 2 of EPiC Series in Computing, pages 1-11. EasyChair, 2010. URL: https://doi.org/10.29007/36DT.
Lawrence C. Paulson and Kong Woei Susanto. Source-level proof reconstruction for interactive theorem proving. In Klaus Schneider and Jens Brandt, editors, Theorem Proving in Higher Order Logics, 20th International Conference, TPHOLs 2007, Kaiserslautern, Germany, September 10-13, 2007, Proceedings, volume 4732 of Lecture Notes in Computer Science, pages 232-245. Springer, 2007. URL: https://doi.org/10.1007/978-3-540-74591-4_18.
William W. Pugh. The test: a fast and practical integer programming algorithm for dependence analysis. In Joanne L. Martin, editor, Proceedings Supercomputing '91, Albuquerque, NM, USA, November 18-22, 1991, pages 4-13. ACM, 1991. URL: https://doi.org/10.1145/125826.125848.
Konrad Slind and Michael Norrish. A brief overview of HOL4. In Otmane Aït Mohamed, César A. Muñoz, and Sofiène Tahar, editors, Theorem Proving in Higher Order Logics, 21st International Conference, TPHOLs 2008, Montreal, Canada, August 18-21, 2008. Proceedings, volume 5170 of Lecture Notes in Computer Science, pages 28-32. Springer, 2008. URL: https://doi.org/10.1007/978-3-540-71067-7_6.
Robert Solovay, R. D. Arthan, and John Harrison. Some new results on decidability for elementary algebra and geometry. Ann. Pure Appl. Log., 163(12):1765-1802, 2012. URL: https://doi.org/10.1016/J.APAL.2012.04.003.
G. Sutcliffe. The CADE-27 Automated Theorem Proving System Competition - CASC-27. AI Communications, 32(5-6):373-389, 2020.
G. Sutcliffe. The 10th IJCAR Automated Theorem Proving System Competition - CASC-J10. AI Communications, 34(2):163-177, 2021.
G. Sutcliffe and M. Desharnais. The CADE-28 Automated Theorem Proving System Competition - CASC-28. AI Communications, 34(4):259-276, 2022.
Geoff Sutcliffe. The logic languages of the TPTP world. Log. J. IGPL, 31(6):1153-1169, 2023. URL: https://doi.org/10.1093/JIGPAL/JZAC068.
Geoff Sutcliffe and Martin Desharnais. The 11th IJCAR automated theorem proving system competition - CASC-J11. AI Commun., 36(2):73-91, 2023. URL: https://doi.org/10.3233/AIC-220244.
Petar Vukmirović, Alexander Bentkamp, Jasmin Blanchette, Simon Cruanes, Visa Nummelin, and Sophie Tourret. Making Higher-Order Superposition Work, pages 415-432. Springer International Publishing, 2021. URL: https://doi.org/10.1007/978-3-030-79876-5_24.
Petar Vukmirović, Alexander Bentkamp, and Visa Nummelin. Efficient full higher-order unification. Logical Methods in Computer Science, Volume 17, Issue 4, December 2021. URL: https://doi.org/10.46298/lmcs-17(4:18)2021.

Duper: A Proof-Producing Superposition Theorem Prover for Dependent Type Theory

Authors Joshua Clune , Yicheng Qian, Alexander Bentkamp , Jeremy Avigad

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

References

Thanks for your feedback!

Could not send message

Duper: A Proof-Producing Superposition Theorem Prover for Dependent Type Theory

Authors Joshua Clune , Yicheng Qian, Alexander Bentkamp , Jeremy Avigad

File

Document Identifiers

Author Details

Acknowledgements

Cite AsGet BibTex

Abstract

Subject Classification

ACM Subject Classification

Keywords

Metrics

Supplementary Materials

References

Thanks for your feedback!

Could not send message