diff --git a/docs.it4i/software/machine-learning/deepdock.md b/docs.it4i/software/machine-learning/deepdock.md
new file mode 100644
index 0000000000000000000000000000000000000000..d912becb7361b36b0da3e7a7a2be38373d9ca24b
--- /dev/null
+++ b/docs.it4i/software/machine-learning/deepdock.md
@@ -0,0 +1,190 @@
+# DeepDock
+
+Adapted from [https://github.com/OptiMaL-PSE-Lab/DeepDock](https://github.com/OptiMaL-PSE-Lab/DeepDock)
+
+Code related to: [O. Mendez-Lucio, M. Ahmad, E.A. del Rio-Chanona, J.K. Wegner, A Geometric Deep Learning Approach to Predict Binding Conformations of Bioactive Molecules, Nature Machine Intelligence volume 3, pages 1033–1039 (2021)](https://rdcu.be/cDy5f)
+
+Open access preprint [available here](https://doi.org/10.26434/chemrxiv.14453106.v1)
+
+## Getting Started
+
+### Main Requirements
+
+* PyTorch = 1.10.0
+* CUDA Toolkit = 11.3
+* Python = 3.6.9
+* RDKit = 2019.09.1
+
+### Prerequisites
+
+* create a Conda environment
+
+    ```sh
+    conda create -n "current" python=3.6.9
+    ```
+
+* activate the newly created Conda environment
+
+    ```sh
+    conda activate current
+    ```
+
+* install PyTorch 1.10.0 and other dependencies (more info at [https://pytorch.org/get-started/previous-versions/](https://pytorch.org/get-started/previous-versions/))
+
+    ```sh
+    conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
+    ```
+
+* if not yet available in your system, load CUDA 11.3
+
+    ```sh
+    ml load CUDA/11.3.1
+    ```
+
+* install PyTorch scatter, sparse and geometric
+
+    ```sh
+    pip install torch-scatter -f https://data.pyg.org/whl/torch-1.10.0+cu113.html
+    pip install torch-sparse -f https://data.pyg.org/whl/torch-1.10.0+cu113.html
+    pip install torch-geometric
+    ```
+
+* uninstall PyTorch spline
+
+    ```sh
+    pip uninstall torch-spline-conv
+    ```
+
+* install PyMesh (necessary to generate `.ply` files)
+
+    ```sh
+    wget --no-check-certificate https://github.com/PyMesh/PyMesh/releases/download/v0.2.0/pymesh2-0.2.0-cp36-cp36m-linux_x86_64.whl
+    pip install pymesh2-0.2.0-cp36-cp36m-linux_x86_64.whl
+    git clone https://github.com/shenwanxiang/ChemBench.git
+    cd ChemBench
+    pip install -e .
+    ```
+
+* install Trimesh
+
+    ```sh
+    conda install -c conda-forge trimesh
+    ```
+
+* if not yet available in your system, load GCC 9.3.0 to fix issues with the version of `libstdc++`
+
+    ```sh
+    module load GCC/9.3.0
+    ```
+
+* install other dependencies
+
+    ```sh
+    conda install Biopython
+    conda install cmake
+    conda install automake
+    conda install bison
+    conda install flex
+    conda install -c anaconda swig
+    conda install -c conda-forge apbs
+    conda install -c conda-forge pdb2pqr
+    ```
+
+* install AmberTools to replace reduce 3.34 (used for protonation); the standalone reduce is deprecated and no longer available, and it is now included in AmberTools
+
+    ```sh
+    conda install -c conda-forge ambertools
+    ```
+
+* install the packages listed in `requirements.txt`
+
+    ```sh
+    pip install -r requirements.txt
+    ```
+
+* install RDKit
+
+    ```sh
+    conda install -c conda-forge rdkit=2019.09.1
+    ```
+
+### Installation
+
+1. Clone the repo
+
+    ```sh
+    git clone https://github.com/paulo308/deepdock
+    ```
+
+2. Move into the project folder and update submodules
+
+    ```sh
+    cd deepdock
+    git submodule update --init --recursive
+    ```
+
+3. Install prerequisite packages
+
+    ```sh
+    pip install -r requirements.txt
+    ```
+
+4. Install the DeepDock package
+
+    ```sh
+    pip install -e .
+    ```
+
+### Configuration
+
+* navigate to the location of your `apbs-pdb2pqr/pdb2pqr` installation and run the Python (2.7) script to link it with your current Conda environment. For more information, refer to the Dockerfile (lines 60 to 72).
+
+    ```sh
+    cd [your conda environment work directory]/install/apbs-pdb2pqr/pdb2pqr
+    python2.7 scons/scons.py install PREFIX="[your conda environment path]/bin/pdb2pqr"
+    ```
+
+* copy the `multivalue` file to your Conda environment path
+
+    ```sh
+    cp multivalue [your conda environment path]/share/apbs/tools/mesh/multivalue
+    ```
+
+* set up the necessary environment variables with the tools and their respective paths
+
+    ```sh
+    export MSMS_BIN=[your conda environment path]/bin/msms
+    export APBS_BIN=[your conda environment path]/bin/apbs
+    export PDB2PQR_BIN=[your conda environment path]/bin/pdb2pqr/pdb2pqr.py
+    export MULTIVALUE_BIN=[your conda environment path]/share/apbs/tools/mesh/multivalue
+    ```
+
+## Data
+
+To get the training and testing data, follow these steps.
+
+1. Move into the project data folder
+
+    ```sh
+    cd deepdock/data
+    ```
+
+2. Use the following line to download the preprocessed data used to train and test the model. This will download two files, one containing PDBbind (2.3 GB) used for training and another containing CASF-2016 (32 MB) used for testing. These two files are enough to run all [examples](https://github.com/OptiMaL-PSE-Lab/DeepDock/blob/main/examples).
+
+    ```sh
+    source get_deepdock_data.sh
+    ```
+
+3. If you want to reproduce all results of the paper, you will need to download the complete CASF-2016 set (~1.5 GB). You can do so with this command from the data folder.
+
+    ```sh
+    source get_CASF_2016.sh
+    ```
+
+## Example Usage
+
+Usage examples can be seen directly in the Jupyter Notebooks included in the repo. We added examples for:
+
+* [Training the model](https://github.com/OptiMaL-PSE-Lab/DeepDock/blob/main/examples/Train_DeepDock.ipynb)
+* [Scoring molecules](https://github.com/OptiMaL-PSE-Lab/DeepDock/blob/main/examples/Score_example.ipynb)
+* [Predicting binding conformations (docking)](https://github.com/OptiMaL-PSE-Lab/DeepDock/blob/main/examples/Docking_example.ipynb)
diff --git a/mkdocs.yml b/mkdocs.yml
index 9a59f54637ef3e980673ba6e572f340fec6c6613..d11b5302138f6ccc9e46bcb8554bcbe36cc8e662 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -207,6 +207,8 @@ nav:
       - Introduction: software/machine-learning/introduction.md
       - NetKet: software/machine-learning/netket.md
       - TensorFlow: software/machine-learning/tensorflow.md
+    - Deep Learning:
+      - DeepDock: software/machine-learning/deepdock.md
     - MPI:
       - Introduction: software/mpi/mpi.md
       - OpenMPI Examples: software/mpi/ompi-examples.md