protein conformational spaces meet machine learning
molearn is a Python package streamlining the implementation of machine learning models dedicated to the generation of protein conformations from example data obtained via experiment or molecular simulation.
Included in this repository are the following:
- Source code in the
molearnfolder - Software documentation (API and FAQ) in the
docsfolder, also accessible at molearn.readthedocs.io. - Example training and analysis scripts, along with example data, in the
examplesfolder
The current version of molearn only supports Linux, and has verified to support Python >=3.9.
- numpy
- PyTorch (1.7+)
- Biobox
- MDAnalysis
To prepare a raw trajectory for training:
To run energy evaluations with OpenMM:
To evaluate Sinkhorn distances during training:
To calculate DOPE and Ramachandran scores during analysis:
To run the GUI:
The most recent release can be obtained through Anaconda:
conda install molearn -c conda-forge or the much faster mamba install -c conda-forge molearn
We advise the installation is carried out in a new environment.
Manual installation requires the following three steps:
- Clone the repository
git clone https://github.com/Degiacomi-Lab/molearn.git - Install all required packages (see section Dependencies > Required Packages, above). The easiest way is by calling
mamba install -c conda-forge --only-deps molearn, where the option--only-depswill install the molearn required dependencies but not molearn itself. Optionally, packages enabling additional molearn functionalities can also be installed. This has to be done manually (see links in Dependencies > Optional Packages). - Use pip to install molearn from within the molearn directory
python -m pip install .
Molearn can used without installation by making the sure the requirements above are met, and adding the src directory to your path at the beginning of every script. For instance, to install all requirements in a new environment molearn_env:
conda env create --file environment.yml -n molearn_env
Then, within this environment, run scripts starting with:
import sys
sys.path.insert(0, 'path/to/molearn/src')
import molearn
Note in case of any installation issue, please consult our FAQ
- See example scripts in the
examplesfolder. - Jupyter notebook tutorials describing the usage of a trained neural network are available here.
- software API and a FAQ page are available at molearn.readthedocs.io.
If you use molearn in your work, please cite: S.C. Musson and M.T. Degiacomi (2023). Molearn: a Python package streamlining the design of generative models of biomolecular dynamics. Journal of Open Source Software, 8(89), 5523
Theory and benchmarks of a neural network training against protein conformational spaces are presented here: V.K. Ramaswamy, S.C. Musson, C.G. Willcocks, M.T. Degiacomi (2021). Learning protein conformational space with convolutions and latent interpolations, Physical Review X 11
For information on how to report bugs, request new features, or contribute to the code, please see CONTRIBUTING.md. For any other question please contact [email protected].