THU-KEG/Linguistic-SAE

SAELing: Sparse Auto-Encoder for Linguistic Mechanism Analysis

This repository implements SAELing, a system that uses sparse autoencoders to interpret linguistic mechanisms. SAELing aims to reveal and control the linguistic knowledge encoded inside large language models. We use it to extract a large number of causal features from large language models. For details, see the paper "Sparse Auto-Encoder Interprets Linguistic Features in Large Language Models".
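For readers unfamiliar with the technique, the following is a minimal, generic sketch of a sparse autoencoder over a model activation vector, plus a simple feature intervention. It is not the SAELing implementation. The class `SparseAutoencoder`, the function `intervene`, the dimensions, the L1 penalty, and the random weights are all assumptions chosen for illustration; the repository's actual architecture and training objective may differ.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


class SparseAutoencoder:
    """Generic one-layer sparse autoencoder (illustrative only, hypothetical names)."""

    def __init__(self, d_model, d_features, seed=0):
        rng = random.Random(seed)
        # Randomly initialised weights; a real SAE would be trained on LLM activations.
        self.W_enc = [[rng.gauss(0, 0.1) for _ in range(d_model)] for _ in range(d_features)]
        self.b_enc = [0.0] * d_features
        self.W_dec = [[rng.gauss(0, 0.1) for _ in range(d_features)] for _ in range(d_model)]
        self.b_dec = [0.0] * d_model

    def encode(self, x):
        # ReLU keeps feature activations non-negative, which encourages sparsity.
        pre = matvec(self.W_enc, x)
        return [max(0.0, p + b) for p, b in zip(pre, self.b_enc)]

    def decode(self, f):
        out = matvec(self.W_dec, f)
        return [o + b for o, b in zip(out, self.b_dec)]

    def loss(self, x, l1_coeff=1e-3):
        # Squared reconstruction error plus an L1 sparsity penalty (assumed objective).
        f = self.encode(x)
        x_hat = self.decode(f)
        recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
        return recon + l1_coeff * sum(abs(v) for v in f)


def intervene(sae, x, feature_index, value):
    """Overwrite one feature activation and decode back to activation space."""
    f = sae.encode(x)
    f[feature_index] = value
    return sae.decode(f)


# Usage with a toy 4-dimensional "activation" vector.
sae = SparseAutoencoder(d_model=4, d_features=8)
x = [0.5, -1.0, 0.25, 2.0]
features = sae.encode(x)
steered = intervene(sae, x, feature_index=3, value=5.0)
```

In this sketch, `encode` corresponds to "revealing" features in an activation, while `intervene` illustrates one common way to "control" a model: edit a feature activation and substitute the decoded vector back into the forward pass.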
