Model description

The following models are available for download:

Release 0.7.0

For release 0.7.0, a model pre-initialized on human games was further trained with reinforcement learning.

The 45th model update was obtained after generating ~1.05 million self-play games. This model was also used for the 100 evaluation games against Multi-Variant-Stockfish.

The reinforcement learning loop was then continued until a total of ~2.37 million self-play games had been generated, resulting in the 96th model update.
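
As a rough illustration of this loop (alternating self-play game generation and network updates), here is a minimal Python sketch. The helper functions and the games-per-update figure are assumptions for illustration only, not the actual CrazyAra training code.

```python
# Illustrative only: the self-play reinforcement learning loop alternates
# between generating games with the current network and updating the network
# on that data. The helpers below are placeholders, not CrazyAra's real code.

def generate_selfplay_games(model, n_games):
    """Placeholder for MCTS self-play game generation."""
    return [None] * n_games

def update_network(model, games):
    """Placeholder for a gradient-based update on the self-play data."""
    return model

model = "supervised-checkpoint"      # pre-initialized on human games
total_games = 0
for update_idx in range(1, 97):      # 96 model updates, as described above
    games = generate_selfplay_games(model, n_games=24_700)  # ~2.37M / 96 games (assumed even split)
    total_games += len(games)
    model = update_network(model, games)

print(f"{update_idx} model updates after ~{total_games:,} self-play games")
```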

Release 0.6.0

For release 0.6.0 and all previous releases, the models were trained purely with supervised learning on the lichess.org data set, as described in our paper.

The following table summarizes the metric evaluation for different model architectures:

| Model | Policy Loss | Value Loss | Policy Accuracy | Trained on data set | Best suited for |
| --- | --- | --- | --- | --- | --- |
| 4-value-8-policy | 1.2184 | 0.7596 | 0.5986 | lichess.org | GPU |
| 8-value-16-policy | 1.2212 | 0.7601 | 0.5965 | lichess.org | GPU |
| 8-value-policy-map | 1.2008 | 0.7577 | 0.6023 | lichess.org | GPU |
| 8-value-policy-map-mobile / RISEv2 | 1.1968 | 0.7619 | 0.6032 | lichess.org | CPU & GPU (fastest) |
| 8-value-policy-map-preAct-relu+bn | 1.1938 | 0.7663 | 0.6042 | lichess.org | GPU |
| RISEv1, Info | 1.3358 | 0.4407 | 0.5658 | lichess.org, Stockfish | GPU (strongest) |
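
The policy loss and value loss in this table presumably follow the usual AlphaZero-style objective: cross-entropy between the predicted move distribution and the target policy, and mean squared error between the predicted value and the game outcome. The snippet below is a minimal PyTorch sketch of how such metrics can be computed; the tensor shapes and policy size are hypothetical, and this is not the project's actual evaluation code.

```python
import torch
import torch.nn.functional as F

def policy_value_metrics(policy_logits, target_policy, value_pred, target_value):
    # Policy loss: cross-entropy between the predicted move distribution
    # and the target distribution (one-hot moves or MCTS visit counts).
    log_probs = F.log_softmax(policy_logits, dim=1)
    policy_loss = -(target_policy * log_probs).sum(dim=1).mean()

    # Value loss: mean squared error against the game outcome in [-1, 1].
    value_loss = F.mse_loss(value_pred.squeeze(-1), target_value)

    # Policy accuracy: fraction of positions where the network's top move
    # matches the target's top move.
    policy_acc = (policy_logits.argmax(dim=1) == target_policy.argmax(dim=1)).float().mean()

    return policy_loss.item(), value_loss.item(), policy_acc.item()

# Usage with random tensors (batch of 8 positions, 500 policy entries chosen arbitrarily):
logits = torch.randn(8, 500)
target = F.one_hot(torch.randint(0, 500, (8,)), num_classes=500).float()
value = torch.tanh(torch.randn(8, 1))
outcome = torch.randint(-1, 2, (8,)).float()
print(policy_value_metrics(logits, target, value, outcome))
```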