On the Properties of Neural Circuits Derived from Optimally Thresholded Edge Attribution Scores

Circuit hypothesis tests from Shi et al. 2024, implemented for Auto-Circuit, along with code for reproducing the experiments in (TODO add paper link after review period)

Getting Started

Clone the repository (using --recursive to pull the proper branch of the auto-circuit submodule)

git clone --recursive git@github.com:reml-lab/auto-circuit-tests.git

Install project and dependencies

pip install -e .

Open find_and_test_circuit.ipynb, experiment with the settings in Config, and explore the workflow:

class Config: 
    task: str = "Docstring Component Circuit"
    use_abs: bool = True
    ablation_type: Union[AblationType, str] = AblationType.TOKENWISE_MEAN_CORRUPT
    grad_func: Optional[Union[GradFunc, str]] = GradFunc.LOGIT
    answer_func: Optional[Union[AnswerFunc, str]] = AnswerFunc.MAX_DIFF
    ig_samples: int = 10
    alpha: float = 0.05
    epsilon: Optional[float] = 0.0
    q_star: float = 0.9 
    grad_func_mask: Optional[Union[GradFunc, str]] = None
    answer_func_mask: Optional[Union[AnswerFunc, str]] = None
    # clean_corrupt: Optional[str] = None #TODO: make enum
    sample_type: Union[SampleType, str] = SampleType.RANDOM_WALK
    side: Optional[Union[Side, str]] = None
    max_edges_to_test_in_order: int = 100 #TODO: change to 125
    max_edges_to_test_without_fail: int = 500 #TODO: change to 125
    max_edges_to_sample: int = 100 # TODO: change to 125
    save_cache: bool = True
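
Several Config fields are typed as a union of an enum and a string, which suggests that string values are converted to enum members somewhere in the notebook. The following is a minimal sketch of one way such a conversion could work, not the repository's actual code. The enums here are hypothetical stand-ins (the real AblationType and SampleType have their own members and values), and `coerce_enum` and `SketchConfig` are names invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Union


class AblationType(Enum):
    # Hypothetical stand-in for the real enum; member values are made up.
    TOKENWISE_MEAN_CORRUPT = 1
    RESAMPLE = 2


class SampleType(Enum):
    # Hypothetical stand-in for the real enum; member values are made up.
    RANDOM_WALK = 1
    UNIFORM = 2


def coerce_enum(value, enum_cls):
    """Return `value` as a member of `enum_cls`, accepting a member or its name."""
    if isinstance(value, enum_cls):
        return value
    try:
        # Look the member up by name, ignoring case.
        return enum_cls[str(value).upper()]
    except KeyError:
        valid = ", ".join(member.name for member in enum_cls)
        raise ValueError(
            f"{value!r} is not a valid {enum_cls.__name__}; expected one of: {valid}"
        ) from None


@dataclass
class SketchConfig:
    # A reduced, illustrative subset of the fields shown in Config above.
    task: str = "Docstring Component Circuit"
    ablation_type: Union[AblationType, str] = AblationType.TOKENWISE_MEAN_CORRUPT
    sample_type: Union[SampleType, str] = SampleType.RANDOM_WALK

    def __post_init__(self):
        # Normalize string inputs so downstream code only ever sees enum members.
        self.ablation_type = coerce_enum(self.ablation_type, AblationType)
        self.sample_type = coerce_enum(self.sample_type, SampleType)


cfg = SketchConfig(ablation_type="tokenwise_mean_corrupt", sample_type="RANDOM_WALK")
```

Normalizing in one place means a misspelled option fails immediately with a readable error instead of surfacing later inside an experiment run.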

Reproducing Experiments

We use submitit to run experiments from a Jupyter notebook.

Open run_experiments.ipynb and edit the submitit code to work with your cluster and available resources:

# setup the executor
out_dir = repo_path_to_abs_path(OUTPUT_DIR / "hypo_test_out_logs" / datetime.now().strftime("%Y-%m-%d_%H-%M-%S"))
out_dir.mkdir(exist_ok=True, parents=True)
executor = submitit.AutoExecutor(folder=out_dir)
num_jobs_parallel = 8
executor.update_parameters(
    timeout_min=60*24,
    mem_gb=40,
    gres="gpu:1",
    cpus_per_task=8,
    nodes=1,
    slurm_qos="high", 
    slurm_array_parallelism=num_jobs_parallel
)

Run all cells under Setup Executor and Run

After all jobs have completed, run the remaining cells under Analyze Results
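
For orientation, here is a rough sketch of how a sweep might be handed to an executor like the one configured above. It is an assumption about the general shape of the workflow, not a copy of run_experiments.ipynb. `build_overrides`, `run_experiment`, and `submit_all` are hypothetical helpers, and "RESAMPLE" is only an illustrative option value. The one submitit call used, `executor.map_array`, submits one job per argument as a single job array, which is what `slurm_array_parallelism` throttles.

```python
from itertools import product


def build_overrides(tasks, ablation_types):
    """Return one dict of Config overrides per (task, ablation_type) pair."""
    return [
        {"task": task, "ablation_type": ablation}
        for task, ablation in product(tasks, ablation_types)
    ]


def run_experiment(overrides):
    # Hypothetical placeholder: a real job would build a Config from
    # `overrides` and run the circuit discovery and testing pipeline.
    return {"overrides": overrides, "status": "done"}


def submit_all(executor, overrides_list):
    """Submit one job per override dict as a single job array; return the jobs."""
    return executor.map_array(run_experiment, overrides_list)


overrides = build_overrides(
    ["Docstring Component Circuit"],
    ["TOKENWISE_MEAN_CORRUPT", "RESAMPLE"],
)
# With the executor from the setup cell:
# jobs = submit_all(executor, overrides)
# results = [job.result() for job in jobs]  # blocks until each job finishes
```

Passing the executor in as an argument keeps the sweep definition independent of the cluster settings, so the same grid can be tried with a local executor before it is submitted to Slurm.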
