These instructions are for FAIR (Meta AI) members who want to run experiments.
Run `echo $CC` to verify that the compiler-related environment variables are set; it should print the path of a clang compiler.
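If it prints nothing, you can point `CC`/`CXX` at a clang installation yourself; a minimal sketch, assuming clang is already on your `PATH` (the exact location depends on your machine):

```sh
# Assumes clang/clang++ are discoverable; adjust paths for your cluster.
export CC=$(which clang)
export CXX=$(which clang++)
echo $CC   # e.g. /usr/bin/clang
```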
Follow these steps to set up a development environment on devfair:
- Set up the conda environment:

```sh
conda create -n rlcompopt python=3.8 cmake pandoc patchelf
conda activate rlcompopt
```
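To double-check that the environment is active (optional sanity check):

```sh
python --version   # expect Python 3.8.x
which cmake        # expect a path inside the rlcompopt conda env
```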
- Install bazel: Bazel is used to compile the C++/Python package. Here we use bazelisk to manage our bazel installation, downloading it to `~/.local/bin`:

```sh
mkdir -p ~/.local/bin
wget https://github.com/bazelbuild/bazelisk/releases/download/v1.7.5/bazelisk-linux-amd64 -O bazel
chmod +x bazel && mv -v bazel ~/.local/bin
export PATH=~/.local/bin:$PATH
```
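You can confirm bazelisk is picked up on your `PATH` (the first invocation downloads the actual Bazel binary):

```sh
which bazel       # expect ~/.local/bin/bazel
bazel --version   # bazelisk fetches and runs the pinned Bazel release
```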
- Install PyTorch: The codebase requires 2.0 > PyTorch >= 1.12.1, which can be installed following the official PyTorch instructions. We recommend using conda to install PyTorch to avoid possible dependency conflicts. You need to find the correct command for the CUDA version your GPU driver supports (check `nvidia-smi`). For example, if your driver supports CUDA 11.6, run:

```sh
conda install pytorch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 pytorch-cuda=11.6 -c pytorch -c nvidia
```

After the installation, verify PyTorch is usable on GPU by running:

```sh
python -c "import torch; print(torch.matmul(torch.rand(2, 8).to(0), torch.rand(8, 4).to(0)).shape)"
```

If it outputs `torch.Size([2, 4])`, go to the next step; otherwise, try to fix the issues by reinstalling PyTorch.
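For a slightly more detailed check (optional; just a sanity check, not required by the setup), you can also confirm the installed version falls in the supported range and that CUDA is visible:

```sh
python - <<'PY'
import torch
print(torch.__version__)          # expect >= 1.12.1 and < 2.0
print(torch.cuda.is_available())  # expect True on a GPU node
print(torch.version.cuda)         # expect the CUDA version you installed for, e.g. 11.6
PY
```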
- Install `torch-geometric`, `pyzmq`, and logging tools: We recommend using conda to install `torch-geometric` and `pyzmq` to avoid possible dependency conflicts.

```sh
conda install -c pyg pyg=2.1.0
conda install -c anaconda pyzmq=23.2.0
conda install -c dglteam dgl=1.1.0
cd ..
git clone https://github.com/yuandong-tian/tools2.git
cd tools2
python -m pip install .
```
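A quick way to confirm these packages import cleanly (optional sanity check):

```sh
python -c "import torch_geometric, zmq, dgl; print(torch_geometric.__version__, zmq.pyzmq_version(), dgl.__version__)"
```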
- Clone CompilerGym and this repo: We will check out both this repo and CompilerGym and install all development dependencies by running the following commands. Note that we clone a specific fork of CompilerGym that includes the type graph patch. Change to a directory of your choice before cloning:

```sh
cd /path/of/your/choice
git clone --depth 1 --branch rlcompopt https://github.com/youweiliang/CompilerGym.git
cd CompilerGym
make init
cd ..
git clone https://github.com/facebookresearch/RLCompOpt.git
cd RLCompOpt
make init
```
- Build and install CompilerGym from source: Run `pip install setuptools==65.5.0` first so that the `gym` library can be installed properly, then:

```sh
cd ../CompilerGym
make install
```

If you want to modify the CompilerGym codebase, make your desired changes and then re-run `make install`.
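To confirm the build produced an importable package (optional sanity check):

```sh
python -c "import compiler_gym; print(compiler_gym.__version__)"
```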
- Install this repo:

```sh
cd ../RLCompOpt
make install
```

If you modify this repo, you will need to reinstall it for the changes to take effect.
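Optionally, check that the package imports (this assumes it installs under the module name `rlcompopt`; adjust if your install differs):

```sh
# Module name `rlcompopt` is an assumption based on the repo name.
python -c "import rlcompopt; print('ok')"
```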
- Use RAM rather than NFS for faster environments: CompilerGym does quite a lot of disk operations, which can be slow on the cluster NFS. Force CompilerGym to instead keep everything in memory using:

```sh
export COMPILER_GYM_SITE_DATA=/dev/shm/compiler_gym_site_data
```
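Note that `/dev/shm` is a RAM-backed tmpfs, so its contents are cleared on reboot and it shares space with system memory; a quick check that it has room:

```sh
df -h /dev/shm   # make sure a few GB are free before running many environments
```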
- (Optional) Automate the environment setup: Create a script that sets up these environment variables so that you don't have to redo it the next time you spawn a shell:

```sh
cat <<'EOF' > ~/.rlcompopt_env
# Variables are expanded when this file is sourced, not when it is created.
conda activate rlcompopt
export PATH=$HOME/.local/bin:$PATH
export COMPILER_GYM_SITE_DATA=/dev/shm/compiler_gym_site_data
EOF
```

Now you can run `source ~/.rlcompopt_env` to restore the environment.
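If you prefer this to happen automatically in every new shell, one option (a matter of taste) is to source it from your `~/.bashrc`:

```sh
echo 'source ~/.rlcompopt_env' >> ~/.bashrc
```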
The data files can be downloaded from this Google Drive folder. You can install `gdown` to download them:

```sh
conda install -c conda-forge gdown
gdown --folder https://drive.google.com/drive/folders/1lATNWBKmsubw8bGeFyDlBHXlYbcRrw7S?usp=sharing
```

The commands should save the files under a folder named `data`. Alternatively, you can download the data from the website and place the `data` folder under the repo, which results in the following file structure:

```
data
├── all_ssl_vocab.db
...
```
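A quick check that the download landed where expected (optional):

```sh
ls -lh data   # the .db files, e.g. all_ssl_vocab.db, should be listed
```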
Note that you may need to load CUDA/CUDNN modules.
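On clusters that use Environment Modules, this looks something like the following (the module names below are hypothetical; check `module avail` on your cluster):

```sh
module load cuda/11.6 cudnn/v8.4.1-cuda.11.6   # hypothetical module names
```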
Same as the instructions in the README. If you submit the training jobs to Slurm, the testing script will be automatically submitted to Slurm once training is done, so you don't need to start testing manually.
Same as the instructions in the README.
See the CONTRIBUTING file for how to help out.
RLCompOpt is MIT licensed, as found in the LICENSE file.
```bibtex
@InProceedings{liang2023rlcompopt,
  title={Learning Compiler Pass Orders using Coreset and Normalized Value Prediction},
  author={Liang, Youwei and Stone, Kevin and Shameli, Ali and Cummins, Chris and Elhoushi, Mostafa and Guo, Jiadong and Steiner, Benoit and Yang, Xiaomeng and Xie, Pengtao and Leather, Hugh and Tian, Yuandong},
  year={2023},
  booktitle={Proceedings of the 40th International Conference on Machine Learning}
}
```