
ZehaoJin/causalbh


Causalbh

1. Installation

1.1. Clone this repository to your machine

git clone git@github.com:ZehaoJin/causalbh.git

or

git clone https://github.com/ZehaoJin/causalbh.git

1.2. Install dependencies

  • We highly recommend installing the dependencies in a virtual Python environment via conda:

    conda create --name causalbh
    conda activate causalbh
    
  • Install the GPU version of jax.

  • To visualize and analyze causal graphs, install networkx, pygraphviz, and causallearn.

  • Install the basic dependencies: numpy, scipy, pandas, matplotlib, seaborn, and tqdm.

  • This repository has been tested on:

    python 3.12.2
    jax 0.4.24
    networkx 3.1
    pygraphviz 1.12
    causallearn 0.1.3.8
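
As a quick sanity check after installation, the following sketch prints the installed versions next to the tested ones listed above. It assumes that the distribution providing causallearn is named `causal-learn`; `TESTED` and `installed_versions` are illustrative names and are not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions this repository was tested on (taken from the list above).
# Keys are distribution names; "causal-learn" is assumed to be the
# distribution that provides the causallearn module.
TESTED = {
    "jax": "0.4.24",
    "networkx": "3.1",
    "pygraphviz": "1.12",
    "causal-learn": "0.1.3.8",
}


def installed_versions(names):
    """Return {name: installed version, or None if the package is missing}."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, have in installed_versions(TESTED).items():
        print(f"{name}: installed={have}, tested={TESTED[name]}")
```

A mismatch does not necessarily mean the code will fail; the listed versions are only the ones the repository was tested on.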
    

2. Causal discovery with BGe exact posterior calculation

The exact posterior is not recommended for more than 7 nodes ($n>7$); for those cases, see 3. Extensions: PC, FCI and DAG-GFN.
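
To see why the exact approach stops scaling around $n=7$, it helps to count how many DAGs exist for a given number of labeled nodes. The sketch below uses Robinson's recurrence for that count; `count_dags` is an illustrative helper and is not part of this repository.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n):
    """Number of DAGs on n labeled nodes, via Robinson's recurrence."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in range(1, 9):
        print(n, count_dags(n))
```

The count grows from 29281 DAGs at $n=5$ to 1138779265 at $n=7$ and 783702329343 at $n=8$, so every DAG must be generated and scored for more than a billion graphs already at $n=7$.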

2.1. Generate all possible DAGs for n nodes

  • Run generate_all_dags.py. Specify the output location and the number of nodes $n$ in the script. For $n=7$, expect a run time of hours to days.
  • Or use the multiprocessing version, generate_all_dags_mp.py. If you generate the DAGs with the multiprocessing version, it is also recommended to verify that they are valid using Verify_DAGs.ipynb.
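
For intuition, the idea of enumerating every DAG can be sketched with a brute-force search over adjacency matrices. This is a minimal illustration that is only practical for very small $n$; it is not the algorithm used in generate_all_dags.py, and the validity check at the end is one possible check, not necessarily what Verify_DAGs.ipynb does. All function names are hypothetical.

```python
from itertools import product


def is_acyclic(adj):
    """Check acyclicity by repeatedly removing nodes with no incoming edges."""
    remaining = set(range(len(adj)))
    while remaining:
        sources = [j for j in remaining
                   if not any(adj[i][j] for i in remaining)]
        if not sources:  # every remaining node has a parent: there is a cycle
            return False
        remaining.difference_update(sources)
    return True


def enumerate_dags(n):
    """Yield every DAG on n labeled nodes as an adjacency matrix (list of lists)."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for bits in product((0, 1), repeat=len(pairs)):
        adj = [[0] * n for _ in range(n)]
        for (i, j), bit in zip(pairs, bits):
            adj[i][j] = bit  # adj[i][j] == 1 means an edge i -> j
        if is_acyclic(adj):
            yield adj


def verify_dags(dags):
    """One possible validity check: all graphs are acyclic and none is repeated."""
    keys = [tuple(map(tuple, adj)) for adj in dags]
    return all(is_acyclic(adj) for adj in dags) and len(set(keys)) == len(keys)


if __name__ == "__main__":
    dags = list(enumerate_dags(3))
    print(len(dags), verify_dags(dags))
```

Because the search visits all $2^{n(n-1)}$ directed graphs, it becomes impractical well before $n=7$, which is why a dedicated generation script is used.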

2.2. Compute BGe exact posteriors and edge/path marginals on a GPU

Follow marginals.ipynb to calculate the exact posteriors and plot the edge/path marginals. For $n=7$, expect a run time of minutes to hours.
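
Conceptually, once every DAG has a log score, the exact posterior is obtained by normalizing over all DAGs, and an edge (or path) marginal is the total posterior mass of the DAGs that contain that edge (or directed path). The sketch below illustrates this on the three DAGs of a two-node system. The log scores are made-up placeholders rather than BGe scores, and the function names are hypothetical; the notebook's actual GPU implementation may differ.

```python
from math import exp, log


def normalize_log_scores(log_scores):
    """Convert unnormalized log scores into probabilities (log-sum-exp trick)."""
    top = max(log_scores)
    log_z = top + log(sum(exp(s - top) for s in log_scores))
    return [exp(s - log_z) for s in log_scores]


def has_path(adj, src, dst):
    """True if a directed path of length >= 1 leads from src to dst."""
    stack, seen = [src], set()
    while stack:
        node = stack.pop()
        for nxt, edge in enumerate(adj[node]):
            if edge and nxt not in seen:
                if nxt == dst:
                    return True
                seen.add(nxt)
                stack.append(nxt)
    return False


def marginals(dags, posterior, n):
    """Sum posterior mass over the DAGs containing each edge / directed path."""
    edge = [[0.0] * n for _ in range(n)]
    path = [[0.0] * n for _ in range(n)]
    for adj, prob in zip(dags, posterior):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                if adj[i][j]:
                    edge[i][j] += prob
                if has_path(adj, i, j):
                    path[i][j] += prob
    return edge, path


# All DAGs on two nodes: no edge, 0 -> 1, and 1 -> 0.
DAGS = [[[0, 0], [0, 0]], [[0, 1], [0, 0]], [[0, 0], [1, 0]]]
LOG_SCORES = [-3.0, -1.0, -2.0]  # hypothetical placeholders, not BGe scores

if __name__ == "__main__":
    posterior = normalize_log_scores(LOG_SCORES)
    edge, path = marginals(DAGS, posterior, 2)
    print(posterior, edge, path)
```

With only two nodes the edge and path marginals coincide; they differ from three nodes onward, where an indirect path such as 0 → 2 → 1 counts toward the path marginal of (0, 1) but not toward its edge marginal.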

2.3. A CPU workaround

We also offer a CPU version for calculating the BGe scores when no GPU is available. After generating all possible DAGs, use cal_bge_cpu.py. This CPU approach is fairly fast for $n\leq5$, but is not practical for $n>7$.

3. Extensions: PC, FCI and DAG-GFN

3.1. Constraint-based methods such as PC and FCI

We recommend using the causallearn implementation of the PC and FCI algorithms. A code example can be found here.

3.2. DAG-GFlowNet

When the exact posterior approach is computationally infeasible (usually $n>7$), DAG-GFN can be used to approximate the exact posterior. See DAG-GFN for the implementation.

4. Data: Black hole mass - galaxy property catalog

5. Reproduce paper plots

6. Cite this work

If you use this repository or would like to refer to the paper, please use the following BibTeX entry:

@article{Jin_2025,
        doi = {10.3847/1538-4357/ad9ded},
        url = {https://dx.doi.org/10.3847/1538-4357/ad9ded},
        year = {2025},
        month = {jan},
        publisher = {The American Astronomical Society},
        volume = {979},
        number = {2},
        pages = {212},
        author = {Jin, Zehao and Pasquato, Mario and Davis, Benjamin L. and Deleu, Tristan and Luo, Yu and Cho, Changhyun and Lemos, Pablo and Perreault-Levasseur, Laurence and Bengio, Yoshua and Kang, Xi and Macciò, Andrea Valerio and Hezaveh, Yashar},
        title = {Causal Discovery in Astrophysics: Unraveling Supermassive Black Hole and Galaxy Coevolution},
        journal = {The Astrophysical Journal},
        abstract = {Correlation does not imply causation, but patterns of statistical association between variables can be exploited to infer a causal structure (even with purely observational data) with the burgeoning field of causal discovery. As a purely observational science, astrophysics has much to gain by exploiting these new methods. The supermassive black hole (SMBH)–galaxy interaction has long been constrained by observed scaling relations, which is low-scatter correlations between variables such as SMBH mass and the central velocity dispersion of stars in a host galaxy's bulge. This study, using advanced causal discovery techniques and an up-to-date data set, reveals a causal link between galaxy properties and dynamically measured SMBH masses. We apply a score-based Bayesian framework to compute the exact conditional probabilities of every causal structure that could possibly describe our galaxy sample. With the exact posterior distribution, we determine the most likely causal structures and notice a probable causal reversal when separating galaxies by morphology. In elliptical galaxies, bulge properties (built from major mergers) tend to influence SMBH growth, while, in spiral galaxies, SMBHs are seen to affect host galaxy properties, potentially through feedback in gas-rich environments. For spiral galaxies, SMBHs progressively quench star formation, whereas, in elliptical galaxies, quenching is complete, and the causal connection has reversed. Our findings support theoretical models of hierarchical assembly of galaxies and active galactic nuclei feedback regulating galaxy evolution. Our study suggests the potentiality for further exploration of causal links in astrophysical and cosmological scaling relations, as well as any other observational science.}
        }
