Chemprop

Chemprop is used for "training and evaluating message-passing neural networks (MPNNs) for molecular property prediction."

Availability

Cluster | Module/Version
BOSE | chemprop/2.2.1
BGSC | Not Available

Note: You can simply use module load chemprop to activate the most recently installed version of this software.

Arguments / Options

The table below highlights a few subcommands of the chemprop command. Run chemprop --help or see the official documentation for the complete list.

Option | Description
chemprop train | Train a model
chemprop predict | Use a trained model to make predictions
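
When a training run is launched from Python rather than typed by hand, one option is to assemble the argument list and pass it to subprocess. The sketch below is only an illustration and is not part of Chemprop: build_train_command is a hypothetical helper, the data path is a placeholder, and the flags --data-path, --task-type, and --output-dir are assumed to match your installed version (confirm with chemprop train --help).

```python
import shlex
import shutil
import subprocess
from pathlib import Path


def build_train_command(data_path, output_dir, task_type="regression"):
    """Return the argument list for a `chemprop train` run (hypothetical helper)."""
    return [
        "chemprop", "train",
        "--data-path", str(data_path),
        "--task-type", task_type,
        "--output-dir", str(output_dir),
    ]


data_path = "path/to/regression.csv"  # placeholder; replace with your own file
cmd = build_train_command(data_path, "training_data")
print(shlex.join(cmd))  # show the exact command line that would be run

# Launch only if chemprop is on PATH (i.e. after `module load chemprop`)
# and the placeholder path has been replaced with a real file.
if shutil.which("chemprop") and Path(data_path).is_file():
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.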

Python Support

Chemprop can be used within a Python script. However, the module's Python environment currently contains only the packages required to run Chemprop. Please contact the HPC team if you have suggestions for other packages we should include.

Chemprop Python Tutorials

Python Usage
module load chemprop
python script.py
script.py
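import numpy as np
from rdkit import Chem
from chemprop.data.datapoints import LazyMoleculeDatapoint, MoleculeDatapoint, ReactionDatapoint

# Ethane, as an RDKit molecule (built from InChI) and as a SMILES string
mol = Chem.MolFromInchi("InChI=1S/C2H6/c1-2/h1-2H3")
smi = "CC"

# Random placeholder target values, one per prediction target
n_targets = 1
y = np.random.rand(n_targets)

# Build a datapoint from the SMILES string and its target values
datapoint = MoleculeDatapoint.from_smi(smi, y)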

(Taken from the official guide for Datapoints)

Jupyter Kernel

Chemprop can be made available as a kernel within Jupyter, but it is not installed by default. You can install your own Chemprop kernel at any time by running the following commands in the shell:

module load chemprop
python -m ipykernel install --user --name="chemprop-2.2.1" --display-name="Chemprop (2.2.1)"

Once the kernel is installed, it will appear as an option in Jupyter.
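
If the kernel does not show up, it can help to check that its kernel.json file was actually written. The sketch below makes an assumption worth verifying on your system: that kernels installed with --user live under ~/.local/share/jupyter/kernels on Linux, unless JUPYTER_DATA_DIR or XDG_DATA_HOME points elsewhere. The helper names user_kernels_dir and read_kernel_spec are hypothetical and not part of Jupyter.

```python
import json
import os
from pathlib import Path
from typing import Optional


def user_kernels_dir() -> Path:
    """Assumed Linux location of kernels installed with --user."""
    data_dir = os.environ.get("JUPYTER_DATA_DIR")
    if data_dir:
        return Path(data_dir) / "kernels"
    xdg_home = os.environ.get("XDG_DATA_HOME") or str(Path.home() / ".local" / "share")
    return Path(xdg_home) / "jupyter" / "kernels"


def read_kernel_spec(name: str, kernels_dir: Path) -> Optional[dict]:
    """Return the parsed kernel.json for `name`, or None if it is not installed."""
    spec_file = kernels_dir / name / "kernel.json"
    if not spec_file.is_file():
        return None
    return json.loads(spec_file.read_text())


spec = read_kernel_spec("chemprop-2.2.1", user_kernels_dir())
if spec is None:
    print("Kernel not found; run the ipykernel install command first.")
else:
    print("Found kernel:", spec.get("display_name"))
```

Running jupyter kernelspec list in the shell reports the same information, if the jupyter command is available in your environment.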

Sample Slurm Script

submit.sh
#!/bin/bash
# -- SLURM SETTINGS -- #
# [..] other settings here [..]

# The following settings are for the overall request to Slurm
#SBATCH --ntasks-per-node=32     # How many CPU cores do you want to request
#SBATCH --nodes=1                # How many nodes do you want to request

# -- SCRIPT COMMANDS -- #

# Load the needed modules
module load chemprop    # Load Chemprop

# Train a regression model; results are written to training_data/
chemprop train --data-path path/to/regression.csv --task-type regression --output-dir training_data
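
The training command above expects a CSV file as input. The sketch below shows one way to write such a file with Python's standard library. It assumes Chemprop's default column handling, in which the first column holds SMILES strings and the remaining columns hold target values (see the official documentation to confirm or override this). The helper write_regression_csv, the column names, and the target values are all made-up placeholders for illustration; a real training set needs far more rows.

```python
import csv
from pathlib import Path

# Placeholder rows: (SMILES, target). The numbers are NOT real measurements.
EXAMPLE_ROWS = [("CC", 0.0), ("CCO", 1.0), ("CCC", 2.0)]


def write_regression_csv(path, rows, target_name="target"):
    """Write a header row plus one (SMILES, target) row per molecule."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["smiles", target_name])  # SMILES first, targets after
        writer.writerows(rows)
    return path


if __name__ == "__main__":
    out = write_regression_csv("regression.csv", EXAMPLE_ROWS)
    print(f"Wrote {len(EXAMPLE_ROWS)} rows to {out}")
```

The resulting file can then be passed to chemprop train through --data-path.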

Real Example

Has your research group used Chemprop in a project? Contact the HPC Team and we'd be glad to feature your work.

Citation

Please include the following citations in your papers to support continued development of Chemprop.

Kevin Yang, Kyle Swanson, Wengong Jin, Connor Coley, Philipp Eiden, Hua Gao, Angel Guzman-Perez, Timothy Hopper, Brian Kelley, Miriam Mathea, Andrew Palmer, Volker Settels, Tommi Jaakkola, Klavs Jensen, and Regina Barzilay. Analyzing learned molecular representations for property prediction. Journal of Chemical Information and Modeling, 59(8):3370–3388, 2019. PMID: 31361484. doi:10.1021/acs.jcim.9b00237. URL: https://doi.org/10.1021/acs.jcim.9b00237.

Esther Heid, Kevin P. Greenman, Yunsie Chung, Shih-Cheng Li, David E. Graff, Florence H. Vermeire, Haoyang Wu, William H. Green, and Charles J. McGill. Chemprop: a machine learning package for chemical property prediction. Journal of Chemical Information and Modeling, 64(1):9–17, 2024. PMID: 38147829. doi:10.1021/acs.jcim.3c01250. URL: https://doi.org/10.1021/acs.jcim.3c01250.

Resources