MPI3SNP is a parallel software tool dedicated to genome-wide association studies, performing a third-order exhaustive search. It is targeted to cluster architectures, and mitigates the cubic time complexity inherent to third-order searches by exploiting the several layers of parallelism present in a supercomputer. CPU and GPU implementations are offered.
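To get a feel for the cubic growth mentioned above, the size of the search space can be estimated with plain shell arithmetic. This is only a rough sketch: it assumes the exhaustive search visits every unordered combination of three SNPs, and the `triples` helper is hypothetical, not part of MPI3SNP.

```shell
# Hypothetical helper: number of distinct SNP triples for n SNPs,
# i.e. C(n, 3) = n * (n - 1) * (n - 2) / 6.
triples() {
    n="$1"
    echo $(( n * (n - 1) * (n - 2) / 6 ))
}

triples 1000     # prints 166167000
triples 10000    # prints 166616670000
```

Multiplying the number of SNPs by 10 multiplies the number of combinations by roughly 1000, which is why the work has to be spread over many cores, nodes or GPUs.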
Support is currently limited to Linux distributions.
- CMake (version > 3.0)
- A C++14 compatible compiler
- MPI library
Optional:
- CUDA
CMake is the project build manager. CMake should be able to detect the installed compilers and libraries; if this is
not the case, please refer to the documentation for your CMake version. By default, CMake checks for a CUDA
installation and sets the target architecture accordingly. This behaviour can be controlled manually by setting the
TARGET_ARCH CMake variable to CPU or GPU.
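As an illustration, the variable can be passed on the command line with CMake's standard `-D` option. The `configure_mpi3snp` wrapper below is a hypothetical helper (not part of the project) that simply rejects values other than the two named above; it assumes it is called from a build directory located directly under the project root.

```shell
# Hypothetical helper: configure the build for an explicit target architecture.
# Accepts only the two values described above, CPU or GPU.
configure_mpi3snp() {
    arch="$1"
    case "$arch" in
        CPU|GPU)
            # -D<name>=<value> sets a CMake variable at configure time.
            cmake -DTARGET_ARCH="$arch" ..
            ;;
        *)
            echo "TARGET_ARCH must be CPU or GPU, got: $arch" >&2
            return 1
            ;;
    esac
}
```

For example, `configure_mpi3snp CPU` would request the CPU implementation even on a machine where CUDA is installed. The same effect is obtained by typing `cmake -DTARGET_ARCH=CPU ..` directly.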
Building the sources looks like this:
cd MPI3SNP/project/path
mkdir build
cd build/
cmake ..
make -j4
MPI3SNP takes two input files in the PLINK TPED/TFAM format and writes the results to a third file. All file paths are passed to the program as positional arguments, as follows:
./MPI3SNP <path/to/tped> <path/to/tfam> <path/to/output>
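For orientation, the following sketch writes a tiny TPED/TFAM pair following the general PLINK conventions and checks that the two files agree with each other. The file names and genotype values are made up for illustration, and whether MPI3SNP imposes further constraints on the input is not stated here, so treat the sample files from the wiki as the authoritative reference.

```shell
# Toy dataset: 3 SNPs genotyped in 2 individuals (hypothetical file names).
# TPED: one line per SNP -> chromosome, SNP id, genetic distance, position,
# followed by two alleles per individual.
cat > toy.tped <<'EOF'
1 rs0001 0 1000 A A A G
1 rs0002 0 2000 C C C T
1 rs0003 0 3000 G G G G
EOF

# TFAM: one line per individual -> family id, individual id, father id,
# mother id, sex (1 = male, 2 = female), phenotype (1 = control, 2 = case).
cat > toy.tfam <<'EOF'
FAM1 IND1 0 0 1 1
FAM2 IND2 0 0 2 2
EOF

# Consistency check: every TPED line must hold 4 + 2 * (individuals) fields.
individuals=$(wc -l < toy.tfam)
expected=$((4 + 2 * individuals))
awk -v n="$expected" 'NF != n { bad = 1 } END { exit bad }' toy.tped \
    && echo "toy.tped is consistent with toy.tfam"

# Assuming the binary was built in the current directory, a run could then be:
#   ./MPI3SNP toy.tped toy.tfam toy_output.txt
# or, through the MPI launcher (name and options depend on the MPI installation):
#   mpiexec -n 4 ./MPI3SNP toy.tped toy.tfam toy_output.txt
```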
Additional configuration options (specific to either the CPU or the GPU implementation) are available and can be
listed with the -h flag.
Sample files can be found on MPI3SNP's wiki. They form a synthetic dataset used for performance evaluation; they illustrate the input file format and can be used for verification or evaluation purposes.
If you are having trouble building or using the application, please submit a new issue to get help.
This software is licensed under the GNU GPLv3 license. Check the LICENSE file for details.