# Model description
The following models are available for download:
For release 0.7.0, a model pre-initialized on human games was further improved by applying reinforcement learning.
The 45th model update was obtained after generating ~1.05 million self-play games; this model was also used for the 100 evaluation games against Multi-Variant-Stockfish.
The reinforcement learning loop was then continued until a total of ~2.37 million self-play games had been generated, resulting in the 96th model update.
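Schematically, this loop alternates between self-play data generation and network training, with each iteration producing one numbered model update. Below is a minimal sketch, assuming a fixed number of games per update; every helper name is a hypothetical placeholder, not CrazyAra's actual training code:

```python
# Schematic sketch of the reinforcement learning loop described above.
# All helpers are hypothetical stand-ins; the games-per-update batch
# size is an assumption, not a figure from this page.

def generate_selfplay_games(model, n_games):
    """Placeholder: play n_games of self-play and collect training samples."""
    return [("board_planes", "target_move", "game_result")] * n_games

def train_on_samples(model, samples):
    """Placeholder: update the network weights on the fresh samples."""
    return model

model = "model pre-initialized on human games"  # starting point for 0.7.0
games_total = 0
for update in range(1, 97):  # the page reports model updates up to the 96th
    samples = generate_selfplay_games(model, n_games=25_000)
    games_total += len(samples)
    model = train_on_samples(model, samples)
    # update 45 corresponds to ~1.05M accumulated games, update 96 to ~2.37M
```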
For release 0.6.0 and all previous releases, the models were trained solely with supervised learning on the lichess.org data set, as described in our paper:
- 4-value-8-policy
- 8-value-16-policy
- 8-value-policy-map
- 8-value-policy-map-mobile / RISEv2-mobile
- 8-value-policy-map-preAct-relu+bn
- RISEv1 (CrazyAraFish weights)
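The released weights are MXNet checkpoints, so loading one for inference might look like the sketch below. The file names, the 34-plane 8x8 input encoding, and the output order are assumptions for illustration, not details confirmed by this page:

```python
import mxnet as mx
from mxnet.gluon import nn

# Load an exported symbol/params pair (file names are hypothetical).
net = nn.SymbolBlock.imports(
    "model-symbol.json",   # network definition
    ["data"],              # name of the input layer
    "model-0000.params",   # trained weights
    ctx=mx.cpu(),
)

# One batch holding a single board state; the 34x8x8 plane encoding
# is an assumed crazyhouse input shape, not taken from this page.
x = mx.nd.zeros((1, 34, 8, 8))
outputs = net(x)  # two heads, value and policy (order depends on the export)
```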
The following table summarizes the metric evaluation for different model architectures:
| Model | Policy Loss | Value Loss | Policy Accuracy | Training data set | Best suited for |
| --- | --- | --- | --- | --- | --- |
| 4-value-8-policy | 1.2184 | 0.7596 | 0.5986 | lichess.org | GPU |
| 8-value-16-policy | 1.2212 | 0.7601 | 0.5965 | lichess.org | GPU |
| 8-value-policy-map | 1.2008 | 0.7577 | 0.6023 | lichess.org | GPU |
| 8-value-policy-map-mobile / RISEv2 | 1.1968 | 0.7619 | 0.6032 | lichess.org | CPU & GPU (fastest) |
| 8-value-policy-map-preAct-relu+bn | 1.1938 | 0.7663 | 0.6042 | lichess.org | GPU |
| RISEv1 (CrazyAraFish weights) | 1.3358 | 0.4407 | 0.5658 | lichess.org, Stockfish | GPU (strongest) |
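For orientation, the three metrics presumably follow the standard AlphaZero-style definitions: cross-entropy between the predicted move distribution and the played move (policy loss), mean squared error against the game outcome (value loss), and top-1 move accuracy (policy accuracy). A minimal NumPy sketch of these definitions; the authoritative formulation is in the paper:

```python
import numpy as np

def policy_loss(policy_logits, target_moves):
    """Cross-entropy between the predicted move distribution and the played move."""
    shifted = policy_logits - policy_logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs[np.arange(len(target_moves)), target_moves].mean())

def value_loss(value_pred, game_result):
    """Mean squared error against the game outcome in [-1, 1]."""
    return float(np.mean((value_pred - game_result) ** 2))

def policy_accuracy(policy_logits, target_moves):
    """Share of positions where the top-rated move equals the target move."""
    return float(np.mean(policy_logits.argmax(axis=1) == target_moves))

# Toy check with 2 positions and a 4-move policy space:
logits = np.array([[2.0, 0.1, 0.0, -1.0], [0.0, 1.5, 0.2, 0.0]])
moves = np.array([0, 1])
values, results = np.array([0.3, -0.2]), np.array([1.0, -1.0])
print(policy_loss(logits, moves), value_loss(values, results),
      policy_accuracy(logits, moves))  # accuracy = 1.0
```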