Sandbox for semi-structured projects
This work explores the use of a transformer model for solving a toy symbolic reasoning task, specifically pathfinding in binary trees.
This work is mostly a replication of "Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task" (arxiv, github).
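To make the task concrete, here is a minimal sketch of what a single pathfinding example could look like. It is not taken from this repository's code: it assumes the model is shown a shuffled edge list of a binary tree plus a goal node and has to produce the path from the root to that goal. The function names `random_binary_tree` and `path_to_goal`, the choice of node 0 as the root, and the tree size are all hypothetical.

```python
import random


def random_binary_tree(num_nodes, rng):
    """Return (parent, child) edges of a random binary tree rooted at node 0.

    Hypothetical generator: each new node is attached to a random existing
    node that still has fewer than two children.
    """
    children = {0: []}
    edges = []
    for node in range(1, num_nodes):
        open_parents = [p for p, kids in children.items() if len(kids) < 2]
        parent = rng.choice(open_parents)
        children[parent].append(node)
        children[node] = []
        edges.append((parent, node))
    return edges


def path_to_goal(edges, root, goal):
    """Follow parent pointers from the goal up to the root, then reverse."""
    parent_of = {child: parent for parent, child in edges}
    path = [goal]
    while path[-1] != root:
        path.append(parent_of[path[-1]])
    return path[::-1]


rng = random.Random(0)
num_nodes = 8
edges = random_binary_tree(num_nodes, rng)
rng.shuffle(edges)  # assumed: edge order carries no information
goal = rng.choice(range(1, num_nodes))
path = path_to_goal(edges, 0, goal)

print("edges:", edges)
print("goal:", goal)
print("target path:", path)
```

In this framing, the edge list and goal would form the prompt and the path would be the target sequence. The actual tokenization and labelling scheme used in `src/` may differ.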
src/
contains the core source code: training loop, data generation, etc.
notebooks/
contains experiment entry points and plots.
conf/
contains YAML files with hyperparameters.
environment.yml
is a (too verbose) dump of my environment, included to improve reproducibility.
Open wandb project with experiment logs