Pinned
Optimal-Policies-Tend-To-Seek-Power (Public)
Code for the paper "Optimal Policies Tend To Seek Power"
Mathematica 1
alignment-research-dataset (Public, forked from moirage/alignment-research-dataset)
A dataset of alignment research and code to reproduce it
Python
white-box (Public, forked from AlignmentResearch/tuned-lens)
Tools for understanding how transformer predictions are built layer-by-layer
Jupyter Notebook