This repository contains the scripts and data needed to reproduce the results of the work by Vinod et al. titled "Optimized Multi-Fidelity Machine Learning for Quantum Chemistry" (available at [https://arxiv.org/abs/2312.05661]). The raw molecular data for the QM7b dataset can be downloaded from [https://achs-prod.acs.org/doi/10.1021/acs.jctc.8b00832#article_content-right]. The raw data for the excitation energies can be downloaded from [https://github.com/SM4DA/MultiFidelityMachineLearning-for-MolecularExcitationEnergies]; it is explained in Vinod et al. (2023), available at [https://pubs.acs.org/doi/10.1021/acs.jctc.3c00882].
The scripts in this repository and the plots they reproduce are listed below:
- QM7b/GenerateSLATM.py generates the global SLATM representation for the 7211 molecules of the QM7b dataset.
- QM7b/LearningCurves_QM7b.py generates data to reproduce Figures 3-5 of the main manuscript and Figure 1 of the supplementary text.
- QM7b/pople_MFML_outs.py generates the single-fidelity learning curve shown in these figures.
- QM7b/Coeff_analysis_removed_fidelity.py compares the full o-MFML model and the reduced o-MFML model, following the analysis of the coefficients.
- ExcitedState/LearningCurves_ExcitedState.py generates data for Figures 6 and 7 of the main text and Figures 2 and 3 of the supplementary text.
- ExcitedState/CompareMFMLtypes.py generates data for Table 1 of the supplementary text.
All plotting routines for the QM7b segment are in QM7b/QM7bPlots.ipynb, and those for the excited-state segment are in ExcitedState/ExcitedStatePlots.ipynb.
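
As a convenience, the scripts and notebooks listed above can be checked against a local checkout before any of them is run. The following is a minimal sketch, not part of the repository: the helper name `missing_files` and the assumption that it is run from the repository root are illustrative only, and the paths are taken verbatim from the list above.

```python
from pathlib import Path

# Script paths and what they reproduce, as listed in this README.
SCRIPTS = {
    "QM7b/GenerateSLATM.py": "global SLATM representation for QM7b",
    "QM7b/LearningCurves_QM7b.py": "Figures 3-5 (main), Figure 1 (supplementary)",
    "QM7b/pople_MFML_outs.py": "single-fidelity learning curve",
    "QM7b/Coeff_analysis_removed_fidelity.py": "full vs. reduced o-MFML comparison",
    "ExcitedState/LearningCurves_ExcitedState.py": "Figures 6, 7 (main), Figures 2, 3 (supplementary)",
    "ExcitedState/CompareMFMLtypes.py": "Table 1 (supplementary)",
}

# Plotting notebooks, as listed in this README.
NOTEBOOKS = [
    "QM7b/QM7bPlots.ipynb",
    "ExcitedState/ExcitedStatePlots.ipynb",
]


def missing_files(repo_root):
    """Return the listed paths that are not present under repo_root.

    Hypothetical helper for illustration; it only checks that files exist
    and does not run any of them.
    """
    root = Path(repo_root)
    expected = list(SCRIPTS) + NOTEBOOKS
    return [p for p in expected if not (root / p).is_file()]


if __name__ == "__main__":
    # Assumes the current working directory is the repository root.
    missing = missing_files(".")
    if missing:
        print("Missing files:")
        for path in missing:
            print("  " + path)
    else:
        for path, target in SCRIPTS.items():
            print(path + " -> " + target)
```

If nothing is reported missing, the script prints each script path next to the figure or table it reproduces.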