Skip to content

An evaluation benchmark for undergraduate competition math in Lean4, Isabelle, Coq, and natural language.

License

Notifications You must be signed in to change notification settings

Adarsh321123/PUTNAM-Adarsh

 
 

Repository files navigation

PutnamBench

PutnamBench is a benchmark for evaluation of theorem proving algorithms on competition mathematics problems sourced from the William Lowell Putnam Mathematical Competition years 1965 - 2023. Our formalizations currently support three formal languages : Lean 4 $\land$ Isabelle $\land$ Coq. PutnamBench comprises of over 1400 manual formalizations, aggregated over all languages.

PutnamBench aims to support research in automated mathematical reasoning by providing a multilingual benchmark for evaluating theorem-proving algorithms. It is released under permissive licenses (Apache 2.0 for Lean 4 and Isabelle, MIT for Coq), and we encourage community contributions (TODO: After initial release)

Citation

The associated paper for PutnamBench is {TODO}. Please consider including the following citation if you find PutnamBench useful. {TODO}

Statistics

Language Count
Lean 4 586
Isabelle 550
Coq 348

We also report the number of problems in a certain category. Note that some problems fall under multiple categories.

Category Total Quantity
Algebra 219
Analysis 176
Number Theory 97
Linear Algebra 43
Abstract Algebra 25
Geometry 22
Combinatorics 12
Set Theory 4
Probability 2

Versioning

  • Version: v-1
  • Not yet officially released. Please refrain from opening issues or making PRs until the initial release.

About

An evaluation benchmark for undergraduate competition math in Lean4, Isabelle, Coq, and natural language.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • Lean 95.1%
  • Python 4.9%