Code to reproduce the experiments of the paper *Evaluation of uncertainty estimations for Gaussian process regression based machine learning interatomic potentials*. All experiments can be run from the command line via Hydra. The repository also contains the results of the experiments, as well as notebooks and code to plot them.
Clone the repository and run `pip install -e .` inside it.
Before running experiments, the data has to be stored appropriately. To run the experiments with the datasets used in the paper, create a directory named datasets in the same directory as this repository. Inside the datasets directory, include the following:
- For the rMD17 dataset:
  - Download and extract the `.zip` file from rMD17 on Figshare.
  - Place the extracted `rmd17` directory directly inside the `datasets` directory, i.e., `datasets/rmd17`.
- For the WS22 dataset:
  - Download the `.npz` files for the individual molecules from WS22 on Zenodo.
  - Create a directory `WS22` inside the `datasets` directory and place all `.npz` files there, i.e., `datasets/WS22/*.npz`.
- For the Porphyrin dataset:
  - Download the `.npz` file from porphyrin on Zenodo.
  - Create a directory `dftb` inside the `datasets` directory and place the `.npz` file there, i.e., `datasets/dftb/porphyrin.npz`.
If your datasets are stored in different locations, or if you want to use other datasets, you can specify the dataset paths by overriding the corresponding Hydra configurations.
All experiments from the paper can be reproduced using Hydra commands. To run an experiment, navigate to the `src/GPR_MLIP` directory and execute:

```bash
python cli.py
```

You can choose between three experiments by overriding the `experiment` value on the command line. By default, the results are stored in a directory named `experiments` in the same directory as the repository. The results are organized by experiment, dataset, GPR method used, and a specified `result_name`. If no dataset is specified via Hydra overrides, the experiment runs on the benzene molecule from the rMD17 dataset. If no method configuration or representation is specified, the default model is GPR with Coulomb representations. All Hydra configurations are defined in `src/GPR_MLIP/config`.
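The result layout described above can be sketched as follows. This is only an illustration of the organization by experiment, dataset, method, and `result_name`; the helper and the concrete directory names are assumptions, not taken from the repository:

```python
from pathlib import Path

def result_dir(base, experiment, dataset, method, result_name):
    # Hypothetical helper mirroring the described layout:
    #   experiments/<experiment>/<dataset>/<method>/<result_name>
    return Path(base) / experiment / dataset / method / result_name

# Example: a default cross-validation run (names are illustrative)
path = result_dir("experiments", "cross_validation", "rmd17_benzene", "gpr_coulomb", "default")
print(path)  # experiments/cross_validation/rmd17_benzene/gpr_coulomb/default
```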
```bash
python cli.py experiment=cross_validation result_name=default
```

This command runs cross-validation and maximum marginal likelihood hyperparameter optimization. Cross-validation is performed with different initial values for the maximum marginal likelihood optimization.
```bash
python cli.py experiment=uncertainty_error_calculation result_name=default
```

This command calculates errors and uncertainties for a test set, using the hyperparameters optimized during cross-validation. It automatically loads the hyperparameters from the directory corresponding to the dataset and GPR method settings with `result_name=default`.
```bash
python cli.py --multirun hydra/launcher=submitit_slurm experiment=uncertainty_sampling experiment.uncertainty=absolute_error,bootstrap_aggregation,random,std_dev,two_sets hydra=gpu_pleiades result_name=default
```

This command runs the uncertainty sampling experiment. Again, the hyperparameters corresponding to the dataset and GPR method settings with `result_name=default` are used. Extensive uncertainty sampling runs should be executed on a GPU. To submit jobs via SLURM, specify the job script in a `.yaml` file under `config/hydra`, as done in `config/hydra/gpu_pleiades.yaml`. The command shown performs a multirun over five uncertainty measures, launching separate jobs (e.g., on five GPUs). Results are stored in subdirectories `0`, `1`, ..., following the order of the uncertainties listed in the command.
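The core idea behind uncertainty sampling can be sketched as follows: each active-learning step moves the pool sample with the largest predictive uncertainty (e.g., the GPR standard deviation) into the training set. This is a simplified illustration, not the repository's implementation, and the function name is made up:

```python
def uncertainty_sampling_step(train_idx, pool_idx, uncertainty):
    """One active-learning step: move the pool sample with the largest
    predictive uncertainty (e.g., the GPR std dev) into the training set."""
    best = max(pool_idx, key=lambda i: uncertainty[i])
    return train_idx + [best], [i for i in pool_idx if i != best]

# Toy example with made-up per-sample uncertainties
sigma = {2: 0.1, 3: 0.5, 4: 0.2, 5: 0.9, 6: 0.3}
train, pool = uncertainty_sampling_step([0, 1], [2, 3, 4, 5, 6], sigma)
print(train, pool)  # [0, 1, 5] [2, 3, 4, 6]
```

With the `random` setting, the selection would instead pick a pool sample uniformly at random; the other uncertainty measures listed above only change how `uncertainty` is computed.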
Before each experiment, the dataset is randomly shuffled and split into training, active learning, and test sets. The number of training samples is set via `experiment.n_train`, and the number of test samples via `prepare_data.function.n_test` (default: 2000). The remaining samples are used for active learning.

For hyperparameter optimization and uncertainty error calculation, the default number of training samples is 1000. If the seed `prepare_data.function.seed` is not overridden, both experiments will use the same split and therefore the same training data and models.

In the uncertainty error calculation experiment, predictions are made for the active learning set by default. In the uncertainty sampling experiment, the initial model is trained on 200 samples.
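The splitting scheme described above can be sketched as follows. This is a simplified illustration of a seeded shuffle-and-split, not the repository's `prepare_data` code:

```python
import random

def split_dataset(n_samples, n_train, n_test, seed):
    """Shuffle sample indices and split them into training, active-learning,
    and test sets; the same seed always yields the same split."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    active = idx[n_train + n_test:]  # remainder becomes the active-learning pool
    return train, active, test

# Same seed -> identical split, so two experiments share the same training data
a = split_dataset(10000, 1000, 2000, seed=0)
b = split_dataset(10000, 1000, 2000, seed=0)
assert a == b
```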
The paper includes a calibration analysis and uncertainty sampling runs for benzene and aspirin from rMD17, SMA and O-HBDI from WS22, as well as porphyrin data calculated with DFTB. Running the commands from the previous section will reproduce the results for benzene using GPR with the Coulomb representation.
To reproduce the results for GPR with Coulomb on the other datasets, run:
```bash
python cli.py experiment=... dataset=rmd17 dataset.molecule_name=aspirin result_name=default
python cli.py experiment=... dataset=ws22 dataset.molecule_name=sma result_name=default
python cli.py experiment=... dataset=ws22 dataset.molecule_name=o-hbdi result_name=default
python cli.py experiment=... dataset=dftb dataset.molecule_name=porphyrin result_name=default
```

To do the same for GPR with SOAP, run:

```bash
python cli.py experiment=... dataset=... dataset.molecule_name=... method/kernel=atomistic_sum1 representation=soap_atomistic result_name=default
```

For `uncertainty_sampling`, add the respective additional settings as described in the previous section.