Causal-Copilot: An Autonomous Causal Analysis Agent

[Demo][Code]


Introduction

Understanding causal relationships is fundamental to scientific discovery, enabling researchers to move beyond mere correlation and establish the underlying mechanisms that drive natural and social phenomena. Recent years have witnessed significant theoretical advancements in causal discovery, yielding a diverse array of sophisticated methodologies. However, the complexity of these methods—each with its distinct assumptions, applicability conditions, and technical nuances—has created substantial barriers for scientists outside the field of causal analysis, often deterring them from adopting these powerful analytical tools in their research.

Causal-Copilot is an LLM-oriented toolkit for automatic causal analysis that integrates domain knowledge from large language models with established expertise from causal discovery researchers. Designed for scientific researchers and data scientists, it facilitates the identification, analysis, and interpretation of causal relationships within real-world datasets through natural dialogue. The system autonomously orchestrates the entire analytical pipeline (analyzing statistics, selecting suitable causal analysis algorithms, configuring appropriate hyperparameters, synthesizing executable code, conducting uncertainty quantification, and generating comprehensive PDF reports) while requiring minimal expertise in causal methods. This integration of conversational interaction and rigorous methodology enables researchers across disciplines to focus on domain-specific insights rather than technical implementation details.

🔍 Try out our interactive demo: Causal-Copilot Live Demo


Demo

Video Demo

Demo Video

Report Examples

Below are examples of reports that our system generated automatically for open-source datasets:


Table of Contents


Features

  • Automated Causal Analysis: Harnesses large language models combined with domain expertise to select suitable causal analysis algorithms and hyperparameters. Incorporates proven methodological insights from causal discovery researchers to ensure analytical reliability, without requiring expertise in causality or extensive parameter tuning.
  • Statistical-LLM Hybrid Post Processing: Examines edge uncertainty via bootstrap, and prunes the graph and revises edge directions using the LLM's prior knowledge.
  • Chat-based User-friendly Interface: Navigate complex causal analysis through natural dialogue, and visualize data statistics and causal graphs through clear, intuitive figures, without wrestling with technical details.
  • Comprehensive Analysis Report: Provides a well-structured scientific report covering the whole causal analysis process, with detailed explanations of each step, intuitive visualizations and in-depth interpretation of the findings.
  • Extensibility: Maintains open interfaces for integrating new causal analysis algorithms and supports seamless incorporation of emerging causality-related libraries and methodologies.
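
The bootstrap edge-uncertainty check mentioned in the features above can be illustrated with a minimal sketch. This is not Causal-Copilot's implementation: `bootstrap_edge_confidence`, `toy_discover` and `make_toy_data` are hypothetical names introduced here, and `toy_discover` is a deliberately naive correlation-threshold placeholder standing in for a real causal discovery algorithm. The idea shown is only the resampling loop: rerun discovery on rows resampled with replacement and report how often each edge reappears.

```python
import random
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Set, Tuple

Edge = Tuple[str, str]   # (source, target)
Row = Dict[str, float]   # one observation keyed by variable name


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Plain Pearson correlation; returns 0.0 if either side is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def toy_discover(sample: List[Row]) -> Set[Edge]:
    """Hypothetical placeholder, NOT a real causal discovery algorithm:
    links two variables (in alphabetical order) when |correlation| > 0.5."""
    names = sorted(sample[0])
    edges: Set[Edge] = set()
    for a, b in combinations(names, 2):
        r = pearson([row[a] for row in sample], [row[b] for row in sample])
        if abs(r) > 0.5:
            edges.add((a, b))
    return edges


def bootstrap_edge_confidence(
    rows: Sequence[Row],
    discover: Callable[[List[Row]], Set[Edge]],
    n_boot: int = 100,
    seed: int = 0,
) -> Dict[Edge, float]:
    """Fraction of bootstrap resamples in which each edge is recovered."""
    rng = random.Random(seed)
    counts: Counter = Counter()
    for _ in range(n_boot):
        # Resample rows with replacement, keeping the original sample size.
        sample = [rng.choice(rows) for _ in range(len(rows))]
        counts.update(discover(sample))
    return {edge: c / n_boot for edge, c in counts.items()}


def make_toy_data(n: int = 200, seed: int = 1) -> List[Row]:
    """Synthetic data: y depends strongly on x, z is independent noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0, 1)
        rows.append({"x": x, "y": 2 * x + rng.gauss(0, 0.1), "z": rng.gauss(0, 1)})
    return rows


if __name__ == "__main__":
    conf = bootstrap_edge_confidence(make_toy_data(), toy_discover)
    for edge, freq in sorted(conf.items()):
        print(edge, f"{freq:.2f}")
```

Edges that reappear in nearly every resample can be treated as stable, while edges with a low frequency are natural candidates for pruning or closer review.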

Architecture Details

  • Causal-Copilot consists of four components: preprocessing, decision making, post-processing and interpretation, all of which are supported by SOTA LLMs (e.g., GPT-4o, GPT-4o-mini).


Evaluation on Simulated Data

  • We evaluate the automatic causal discovery ability of Causal-Copilot on a total of 180 simulated datasets covering different functional forms, graph sparsity levels, noise types and heterogeneity, and compare it with a robust baseline: the PC algorithm with its default settings.
  • The results show that Causal-Copilot outperforms the baseline by roughly 3 percentage points on precision, recall and F1-score, indicating the effectiveness of its autonomous algorithm selection and hyperparameter setting strategy.

| Metric    | Baseline | Causal-Copilot |
|-----------|----------|----------------|
| Precision | 78.6%    | 81.6%          |
| Recall    | 78.2%    | 81.0%          |
| F1-score  | 76.1%    | 79.3%          |
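
For readers unfamiliar with how precision, recall and F1-score apply to causal graphs, here is a minimal sketch. It assumes the metrics are computed over sets of directed edges, so a reversed edge counts as both a false positive and a false negative. That is an assumption for illustration, and the evaluation protocol behind the numbers above may differ. The function name `edge_metrics` and the example graphs are hypothetical.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (cause, effect)


def edge_metrics(true_edges: Set[Edge], pred_edges: Set[Edge]) -> Dict[str, float]:
    """Precision, recall and F1 over directed edges (assumed protocol)."""
    tp = len(true_edges & pred_edges)  # edges recovered with correct direction
    precision = tp / len(pred_edges) if pred_edges else 0.0
    recall = tp / len(true_edges) if true_edges else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical ground truth and prediction: B -> C is predicted reversed,
    # and D -> C is a spurious extra edge.
    truth = {("A", "B"), ("B", "C"), ("A", "D")}
    pred = {("A", "B"), ("C", "B"), ("A", "D"), ("D", "C")}
    print(edge_metrics(truth, pred))
    # precision = 2/4, recall = 2/3, f1 = 4/7
```

If scores are averaged over many datasets (again an assumption here), the mean F1-score need not lie between the mean precision and the mean recall, because F1 is computed per dataset before averaging.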

Getting Started

Online Demo

🔍 Try out our interactive demo: Causal-Copilot Live Demo

Local Deployment

  • Python 3.8+
  • Required Python libraries (specified in requirements.txt)

Ensure you have the necessary dependencies installed by running:

pip install -r requirements.txt

Usage

python main.py --data_file your_data --apikey your_openai_apikey --initial_query your_user_query
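
As a convenience, the command above can also be assembled from Python, which avoids shell-quoting problems when the query contains spaces. This is only a sketch: the three flags come from the usage line above, while the `build_command` helper, the `data.csv` path and the `OPENAI_API_KEY` environment variable name are assumptions made for illustration.

```python
import os
import subprocess
import sys
from typing import List


def build_command(data_file: str, api_key: str, query: str) -> List[str]:
    """Build the argument list for main.py using the flags from the usage line."""
    return [
        sys.executable, "main.py",
        "--data_file", data_file,
        "--apikey", api_key,
        "--initial_query", query,  # passed as one argument, spaces included
    ]


if __name__ == "__main__":
    # Assumption: the key is kept in an environment variable rather than typed inline.
    key = os.environ["OPENAI_API_KEY"]
    # Assumption: data.csv is a placeholder path; the query text is an example.
    cmd = build_command("data.csv", key, "What causes the change in my target variable?")
    subprocess.run(cmd, check=True)  # run from the repository root
```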

License

Distributed under the MIT License. See LICENSE for more information.

Resource


Contributor

Contact

For additional information, questions, or feedback, please contact us at [email protected], [email protected], [email protected], [email protected] and [email protected]. We welcome contributions! Come and join us now!

If you use Causal-Copilot in your research, please cite it as follows:

@inproceedings{causalcopilot,
  title={Causal-Copilot: An Autonomous Causal Analysis Agent},
  author={Wang, Xinyue and Zhou, Kun and Wu, Wenyi and Nan, Fang and Huang, Biwei},
  year={2024}
}