Commit da90220

init
0 parents  commit da90220

94 files changed

+804420
-0
lines changed

Diff for: .gitignore

+33
@@ -0,0 +1,33 @@
+HELP.md
+target/
+!.mvn/wrapper/maven-wrapper.jar
+!**/src/main/**/target/
+!**/src/test/**/target/
+
+### STS ###
+.apt_generated
+.classpath
+.factorypath
+.project
+.settings
+.springBeans
+.sts4-cache
+
+### IntelliJ IDEA ###
+.idea
+*.iws
+*.iml
+*.ipr
+
+### NetBeans ###
+/nbproject/private/
+/nbbuild/
+/dist/
+/nbdist/
+/.nb-gradle/
+build/
+!**/src/main/**/build/
+!**/src/test/**/build/
+
+### VS Code ###
+.vscode/

Diff for: .gitmodules

+5
@@ -0,0 +1,5 @@
+[submodule "contrib/ClickHouse"]
+path = contrib/ClickHouse
+url = https://github.com/ClickHouse/ClickHouse.git
+# branch = v23.3.7.5-lts
+branch = v22.8.17.17-lts

Diff for: CONTRIBUTING.md

+9
@@ -0,0 +1,9 @@
+## Contributing
+
+Issues and pull requests are welcome.
+
+When you contribute code, you affirm that the contribution is your original work and that you
+license the work to the project under the project's open source license. Whether or not you
+state this explicitly, by submitting any copyrighted material via pull request, email, or
+other means you agree to license the material under the project's open source license and
+warrant that you have the legal authority to do so.

Diff for: LICENSE

+18
@@ -0,0 +1,18 @@
+Tencent is pleased to support the open source community by making fast-causal-inference available.
+
+Copyright (C) 2023 THL A29 Limited, a Tencent company. All rights reserved. The below software in this distribution may have been modified by THL A29 Limited ("Tencent Modifications"). All Tencent Modifications are Copyright (C) THL A29 Limited.
+
+fast-causal-inference is licensed under the BSD 3-Clause License:.
+
+
+Terms of the BSD 3-Clause License:
+--------------------------------------------------------------------
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
+
+3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Diff for: NOTICE

+5
@@ -0,0 +1,5 @@
+Tencent is pleased to support the open source community by making Fast-Causal-Inference available.
+
+Copyright (C) 2016 THL A29 Limited, a Tencent company. All rights reserved.
+
+If you have downloaded a copy of the Fast-Causal-Inference binary from Tencent, please note that the Mars binary is licensed under the BSD 3-Clause License.

Diff for: README.md

+33
@@ -0,0 +1,33 @@
+## Fast-Causal-Inference
+
+[![license](https://img.shields.io/badge/license-BSD-brightgreen.svg?style=flat)](https://github.com/Tencent/fast-causal-inference/blob/master/LICENSE)
+[![Release Version](https://img.shields.io/badge/release-0.1.0-red.svg)](https://github.com/Tencent/fast-causal-inference/releases)
+[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg)](https://github.com/Tencent/fast-causal-inference/pulls)
+### Introduction
+It is a high-performance causal inference (statistical model) computing
+library based on OLAP, which solves the performance bottleneck of the existing
+statistical model library (R/Python) under big data. At the same time,
+the threshold for using statistical models is lowered through the SQL language,
+making it easy to use in production environments.
+
+![topology](images/algorithm_architecture_diagram.png)
+![topology](images/engineering_structure_diagram.png)
+### Feature
+
+### Documentation
+
+### Getting started
+
+Compile From Source
+One-Click Deployment:
+> sh bin/build.sh
+
+If the following log is displayed, fast-causal-inference is successfully deployed.
+> build success
+
+For other environments, see: https://clickhouse.com/docs/en/install#from-sources
+
+### Change log
+
+### Support
+

Diff for: bin/build.sh

+16
@@ -0,0 +1,16 @@
+#!/bin/bash
+set -e
+history_path=`pwd`
+base_path=$(cd $(dirname $0)/..; pwd)
+cd $base_path
+if [ ! -f "$base_path/contrib/ClickHouse/LICENSE" ];then
+    echo "fetch contrib submodule"
+    git submodule update --init --recursive
+fi
+cp $base_path/src/udf/ClickHouse/src/AggregateFunctions/* $base_path/contrib/ClickHouse/src/AggregateFunctions/
+cd $base_path/contrib/ClickHouse/; mkdir -p build
+cmake -S . -B build
+cd build; ninja clickhouse
+mv ./programs/clickhouse $base_path/clickhouse
+cd ${history_path}
+echo "build success"
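
bin/build.sh calls `git`, `cmake`, and `ninja` without first checking that they are installed. A minimal preflight sketch is shown below; the helper name `missing_tools` is hypothetical and not part of this commit, and the tool list is only what the script visibly invokes (ClickHouse itself needs a full compiler toolchain on top of this).

```shell
#!/bin/sh
# Hypothetical preflight helper; not part of the commit.
# Prints each named tool that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Check the three commands that bin/build.sh runs directly.
missing=$(missing_tools git cmake ninja)
if [ -n "$missing" ]; then
    echo "missing build tools:" $missing >&2
else
    echo "all build tools found"
fi
```

Running a check like this before `sh bin/build.sh` would report absent tools up front instead of failing partway through the submodule fetch or the CMake configure step.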

Diff for: bin/install.sh

+93
@@ -0,0 +1,93 @@
+#!/bin/sh -e
+
+OS=$(uname -s)
+ARCH=$(uname -m)
+
+DIR=
+
+if [ "${OS}" = "Linux" ]
+then
+    if [ "${ARCH}" = "x86_64" -o "${ARCH}" = "amd64" ]
+    then
+        # Require at least x86-64 + SSE4.2 (introduced in 2006). On older hardware fall back to plain x86-64 (introduced in 1999) which
+        # guarantees at least SSE2. The caveat is that plain x86-64 builds are much less tested than SSE 4.2 builds.
+        HAS_SSE42=$(grep sse4_2 /proc/cpuinfo)
+        if [ "${HAS_SSE42}" ]
+        then
+            DIR="amd64"
+        else
+            DIR="amd64compat"
+        fi
+    elif [ "${ARCH}" = "aarch64" -o "${ARCH}" = "arm64" ]
+    then
+        # If the system has >=ARMv8.2 (https://en.wikipedia.org/wiki/AArch64), choose the corresponding build, else fall back to a v8.0
+        # compat build. Unfortunately, the ARM ISA level cannot be read directly, we need to guess from the "features" in /proc/cpuinfo.
+        # Also, the flags in /proc/cpuinfo are named differently than the flags passed to the compiler (cmake/cpu_features.cmake).
+        HAS_ARMV82=$(grep -m 1 'Features' /proc/cpuinfo | awk '/asimd/ && /sha1/ && /aes/ && /atomics/ && /lrcpc/')
+        if [ "${HAS_ARMV82}" ]
+        then
+            DIR="aarch64"
+        else
+            DIR="aarch64v80compat"
+        fi
+    elif [ "${ARCH}" = "powerpc64le" -o "${ARCH}" = "ppc64le" ]
+    then
+        DIR="powerpc64le"
+    fi
+elif [ "${OS}" = "FreeBSD" ]
+then
+    if [ "${ARCH}" = "x86_64" -o "${ARCH}" = "amd64" ]
+    then
+        DIR="freebsd"
+    fi
+elif [ "${OS}" = "Darwin" ]
+then
+    if [ "${ARCH}" = "x86_64" -o "${ARCH}" = "amd64" ]
+    then
+        DIR="macos"
+    elif [ "${ARCH}" = "aarch64" -o "${ARCH}" = "arm64" ]
+    then
+        DIR="macos-aarch64"
+    fi
+fi
+
+if [ -z "${DIR}" ]
+then
+    echo "Operating system '${OS}' / architecture '${ARCH}' is unsupported."
+    exit 1
+fi
+
+clickhouse_download_filename_prefix="clickhouse"
+clickhouse="$clickhouse_download_filename_prefix"
+
+if [ -f "$clickhouse" ]
+then
+    read -p "ClickHouse binary ${clickhouse} already exists. Overwrite? [y/N] " answer
+    if [ "$answer" = "y" -o "$answer" = "Y" ]
+    then
+        rm -f "$clickhouse"
+    else
+        i=0
+        while [ -f "$clickhouse" ]
+        do
+            clickhouse="${clickhouse_download_filename_prefix}.${i}"
+            i=$(($i+1))
+        done
+    fi
+fi
+
+URL="https://builds.clickhouse.com/master/${DIR}/clickhouse"
+echo
+echo "Will download ${URL} into ${clickhouse}"
+echo
+curl "${URL}" -o "${clickhouse}" && chmod a+x "${clickhouse}" || exit 1
+echo
+echo "Successfully downloaded the ClickHouse binary, you can run it as:
+    ./${clickhouse}"
+
+if [ "${OS}" = "Linux" ]
+then
+    echo
+    echo "You can also install it:
+    sudo ./${clickhouse} install"
+fi
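
The platform detection in bin/install.sh maps an OS, an architecture, and a set of CPU feature flags to a download directory. The sketch below restates that mapping as a pure function so it can be exercised without reading `/proc/cpuinfo`. The function name `resolve_build_dir` and its third argument (a space-separated flag list) are assumptions made for illustration. Unlike the script, which matches substrings with `grep` and `awk`, this version matches whole flag names.

```shell
#!/bin/sh
# Hypothetical restatement of the OS/arch mapping in bin/install.sh.
# Usage: resolve_build_dir OS ARCH "space separated cpu flags"
# Prints the build directory name, or returns 1 if the platform is unsupported.
resolve_build_dir() {
    os=$1; arch=$2; flags=$3
    case "$os/$arch" in
        Linux/x86_64|Linux/amd64)
            # SSE4.2 selects the regular build; otherwise fall back to the compat build.
            case " $flags " in
                *" sse4_2 "*) echo "amd64" ;;
                *) echo "amd64compat" ;;
            esac ;;
        Linux/aarch64|Linux/arm64)
            # All five features must be present to pick the ARMv8.2 build.
            dir="aarch64"
            for f in asimd sha1 aes atomics lrcpc; do
                case " $flags " in
                    *" $f "*) ;;
                    *) dir="aarch64v80compat" ;;
                esac
            done
            echo "$dir" ;;
        Linux/powerpc64le|Linux/ppc64le) echo "powerpc64le" ;;
        FreeBSD/x86_64|FreeBSD/amd64) echo "freebsd" ;;
        Darwin/x86_64|Darwin/amd64) echo "macos" ;;
        Darwin/aarch64|Darwin/arm64) echo "macos-aarch64" ;;
        *) return 1 ;;
    esac
}

# Example: an x86-64 Linux host that advertises SSE4.2.
resolve_build_dir Linux x86_64 "fpu sse2 sse4_2"   # prints amd64
```

Separating the decision from the probing makes the fallback rules easy to check for each platform, which the inline `if`/`elif` chain in the script does not allow.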

Diff for: contrib/ClickHouse

Submodule ClickHouse added at e897207

Diff for: docker/.gitkeep

Whitespace-only changes.

Diff for: docs/.gitkeep

+1
@@ -0,0 +1 @@
+

Diff for: images/algorithm_architecture_diagram.png

330 KB

Diff for: images/engineering_structure_diagram.png

40 KB

Diff for: src/package_util/python/causal_inference/.gitignore

+166
@@ -0,0 +1,166 @@
+# From https://github.com/github/gitignore/blob/main/Python.gitignore
+# Byte-compiled / optimized / DLL files
+__pycache__/
+*.py[cod]
+*$py.class
+
+# C extensions
+*.so
+
+# Distribution / packaging
+.Python
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib64/
+parts/
+sdist/
+var/
+wheels/
+share/python-wheels/
+*.egg-info/
+.installed.cfg
+*.egg
+MANIFEST
+
+# PyInstaller
+# Usually these files are written by a python script from a template
+# before PyInstaller builds the exe, so as to inject date/other infos into it.
+*.manifest
+*.spec
+
+# Installer logs
+pip-log.txt
+pip-delete-this-directory.txt
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.nox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*.cover
+*.py,cover
+.hypothesis/
+.pytest_cache/
+cover/
+
+# Translations
+*.mo
+*.pot
+
+# Django stuff:
+*.log
+local_settings.py
+db.sqlite3
+db.sqlite3-journal
+
+# Flask stuff:
+instance/
+.webassets-cache
+
+# Scrapy stuff:
+.scrapy
+
+# Sphinx documentation
+docs/_build/
+
+# PyBuilder
+.pybuilder/
+target/
+
+# Jupyter Notebook
+.ipynb_checkpoints
+
+# IPython
+profile_default/
+ipython_config.py
+
+# pyenv
+# For a library or package, you might want to ignore these files since the code is
+# intended to run in multiple environments; otherwise, check them in:
+# .python-version
+
+# pipenv
+# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
+# However, in case of collaboration, if having platform-specific dependencies or dependencies
+# having no cross-platform support, pipenv may install dependencies that don't work, or not
+# install all needed dependencies.
+#Pipfile.lock
+
+# poetry
+# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
+# This is especially recommended for binary packages to ensure reproducibility, and is more
+# commonly ignored for libraries.
+# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
+#poetry.lock
+
+# pdm
+# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
+#pdm.lock
+# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
+# in version control.
+# https://pdm.fming.dev/#use-with-ide
+.pdm.toml
+
+# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
+__pypackages__/
+
+# Celery stuff
+celerybeat-schedule
+celerybeat.pid
+
+# SageMath parsed files
+*.sage.py
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+env.bak/
+venv.bak/
+
+# Spyder project settings
+.spyderproject
+.spyproject
+
+# Rope project settings
+.ropeproject
+
+# mkdocs documentation
+/site
+
+# mypy
+.mypy_cache/
+.dmypy.json
+dmypy.json
+
+# Pyre type checker
+.pyre/
+
+# pytype static type analyzer
+.pytype/
+
+# Cython debug symbols
+cython_debug/
+
+# PyCharm
+# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
+# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
+# and can be added to the global gitignore or merged into this file. For a more nuclear
+# option (not recommended) you can uncomment the following to ignore the entire idea folder.
+.idea/
+
+# test dir
+test/
+tmp/
+local_conf/
+fast_causal_inference.zip
