This repository contains Python scripts used to obtain comprehensive monomer-level properties for synthetically accessible Open Macromolecular Genome (OMG) polymers, as described in the paper. Trained ML models, conformer geometries, and ML-based monomer-level properties with prediction uncertainties for 12M OMG polymers are available at Zenodo.
conda env create -f environment.yml
Before running a script, modify the file paths in the script to match your local directory layout.
This directory contains scripts to perform an active learning campaign based on evidential learning.
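As a rough illustration of how evidential learning yields prediction uncertainties, the sketch below computes the standard uncertainty decomposition from deep evidential regression, where a network outputs Normal-Inverse-Gamma parameters (gamma, nu, alpha, beta). The function name and the example values are hypothetical, and whether the scripts in this directory use exactly this decomposition is an assumption.

```python
def evidential_uncertainty(gamma, nu, alpha, beta):
    """Return (prediction, aleatoric variance, epistemic variance).

    Assumes Normal-Inverse-Gamma outputs with nu > 0, alpha > 1, beta > 0.
    """
    if nu <= 0 or alpha <= 1 or beta <= 0:
        raise ValueError("require nu > 0, alpha > 1, beta > 0")
    aleatoric = beta / (alpha - 1.0)         # E[sigma^2]: noise in the data
    epistemic = beta / (nu * (alpha - 1.0))  # Var[mu]: uncertainty in the model
    return gamma, aleatoric, epistemic


# Hypothetical parameter values for a single predicted property.
prediction, aleatoric, epistemic = evidential_uncertainty(0.5, 2.0, 3.0, 4.0)
```

In an active learning loop, candidates with large epistemic variance are natural choices for the next round of calculations, since that term shrinks as more evidence is collected.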
To use the trained models, download pareto_greedy_check_point.tar.gz from Zenodo and extract it with the following command:
tar -xvzf pareto_greedy_check_point.tar.gz
The file paths currently in the scripts assume that the pareto_greedy_check_point directory is located at ./active_learning
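For users who prefer to unpack the archive from Python, the following is a minimal sketch using the standard-library tarfile module. The helper name extract_checkpoint is hypothetical, and the sketch assumes that the archive unpacks to a top-level pareto_greedy_check_point directory.

```python
import tarfile
from pathlib import Path


def extract_checkpoint(archive, dest="./active_learning"):
    """Extract a .tar.gz archive into dest and return the destination path."""
    dest_path = Path(dest)
    dest_path.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(dest_path)
    return dest_path
```

Calling extract_checkpoint("pareto_greedy_check_point.tar.gz") from the repository root would then place the contents under ./active_learning, which is the location the scripts expect.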
This directory contains quantum chemistry calculation results obtained during the active learning campaign.
For example, ./active_learning/pareto_greedy/OMG_train_batch_0_chemprop_with_reaction_id.csv contains quantum chemistry calculation results from the initial training data.
Similarly, ./active_learning/pareto_greedy/OMG_train_batch_1_chemprop_with_reaction_id.csv contains quantum chemistry calculation results from Round 1.
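The file naming above can be captured in a small helper. This is only a sketch: the function names are hypothetical, the pattern for rounds beyond 1 is assumed to follow the two files named above, and no column names are assumed, so rows are returned as dictionaries keyed by whatever header the file has.

```python
import csv
from pathlib import Path

BATCH_DIR = Path("./active_learning/pareto_greedy")


def batch_csv_path(round_index, base_dir=BATCH_DIR):
    """Build the CSV path for a given round (0 = initial training data)."""
    name = f"OMG_train_batch_{round_index}_chemprop_with_reaction_id.csv"
    return Path(base_dir) / name


def load_batch(round_index, base_dir=BATCH_DIR):
    """Read one batch file into a list of dicts keyed by the CSV header."""
    with open(batch_csv_path(round_index, base_dir), newline="") as handle:
        return list(csv.DictReader(handle))
```

For example, load_batch(0) would read the initial training data when run from the repository root.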
This directory also contains 200 RDKit features (./rdkit_features) used in training ML models to overcome the local nature of message passing in graph neural networks.
The examples directory contains quantum chemistry calculations for one example reaction, covering different conformers.
The full data can be downloaded at Zenodo.
This directory contains scripts to estimate Flory-Huggins interaction parameters from COSMO-SAC calculations.
This directory contains external packages including D-MPNN networks and a Pareto front search algorithm from GitHub.
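To make the idea of a Pareto front concrete, here is a minimal, generic sketch of non-dominated filtering. It is not the algorithm from the external package: the function name and the candidate scores are hypothetical, and it assumes that every objective is to be maximized.

```python
def pareto_front(points):
    """Return the non-dominated points, assuming every objective is maximized."""
    def dominates(a, b):
        # a dominates b if it is at least as good everywhere and better somewhere.
        return (all(x >= y for x, y in zip(a, b))
                and any(x > y for x, y in zip(a, b)))

    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical two-objective scores for three candidates.
candidates = [(1.0, 2.0), (2.0, 1.0), (0.5, 0.5)]
front = pareto_front(candidates)
```

This brute-force version compares every pair of points, so it scales quadratically with the number of candidates. That is fine for illustration, but a dedicated search algorithm is preferable for large candidate pools.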
This directory contains scripts to draw figures in the paper.
This directory contains example scripts to perform active learning on QM9 for the figures in Supporting Information.
This directory contains scripts to run quantum chemistry calculations.
- batch_conformer_FF.py (conformer search and geometry optimization with UFF)
- batch_conformer_xtb.py (geometry optimization with XTB2)
- batch_rdkit_dft_property_calculation.py (DFT and TD-DFT calculations)
- batch_save_dft_chi_results.py (save DFT results and calculate Flory-Huggins interaction parameters)
- batch_save_total_results.py (save results)
This directory contains functions needed to run scripts.
Seonghwan Kim, Charles M. Schroeder, and Nicholas E. Jackson
This work was supported by the IBM-Illinois Discovery Accelerator Institute. N.E.J. thanks the 3M Nontenured Faculty Award for support of this research.