Interpretable Spectral Learning for Self-Driving Labs
InSpecLearn4SDL is the official implementation of the methods described in
📄 "Interpretable Spectral Features Predict Conductivity in Self-Driving Doped Conjugated Polymer Labs" (arXiv:2509.21330).
The repository provides an interpretable QSPR pipeline that predicts the electrical conductivity of doped conjugated polymers using optical spectra and processing parameters.
It is designed for integration into Self-Driving Labs (SDLs), enabling data-efficient, automated, and interpretable property prediction workflows.
- 🔍 Spectral Featurization: A genetic algorithm adaptively selects informative spectral regions, and the area under the curve (AUC) of each region is used as a feature.
- 🧩 Interpretability: Domain-knowledge-driven feature expansion and SHAP-based feature selection retain physically meaningful descriptors.
- 📈 Performance: The hybrid model (expert + data-driven features) achieves high predictive accuracy while reducing experimental effort by ~33%.
- 🔬 Generalizable: Extensible to other spectroscopy–property relationships (e.g., Raman, FTIR, XANES).
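To make the featurization idea concrete, here is a minimal sketch of turning one spectrum into per-region AUC features. It is an illustration only, not the repository's code: the function names `trapezoid_area` and `auc_features`, the toy spectrum, and the fixed `boundaries` list are all assumptions. In the actual pipeline the boundaries come from the genetic algorithm; here they are hard-coded and assumed to coincide with sampled wavelengths.

```python
from bisect import bisect_left, bisect_right


def trapezoid_area(x, y):
    """Integrate y over x with the trapezoidal rule (x must be ascending)."""
    return sum(
        (x[i + 1] - x[i]) * (y[i + 1] + y[i]) / 2.0
        for i in range(len(x) - 1)
    )


def auc_features(wavelengths, absorbance, boundaries):
    """Return one AUC value per region between consecutive boundaries.

    Assumption: each boundary coincides with a sampled wavelength, so no
    interpolation is done at the region edges.
    """
    features = []
    for lo, hi in zip(boundaries[:-1], boundaries[1:]):
        start = bisect_left(wavelengths, lo)   # first sample >= lo
        stop = bisect_right(wavelengths, hi)   # one past the last sample <= hi
        features.append(
            trapezoid_area(wavelengths[start:stop], absorbance[start:stop])
        )
    return features


# Toy spectrum (hypothetical values, wavelengths in nm).
wavelengths = [400.0, 500.0, 600.0, 700.0, 800.0]
absorbance = [0.0, 1.0, 1.0, 0.5, 0.0]

# Hypothetical region boundaries; a genetic algorithm would search over these.
boundaries = [400.0, 600.0, 800.0]

features = auc_features(wavelengths, absorbance, boundaries)
print(features)  # [150.0, 100.0]
```

Each feature maps back to a named wavelength window, which is what keeps the resulting descriptors physically interpretable.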
- 🧫 Processing Conditions: Solvent concentration and annealing temperature are varied across experiments.
- 🌈 Spectroscopic Measurements: Three spectral types are included: pre-anneal UV–Vis, post-anneal UV–Vis, and post-dope UV–Vis–NIR spectra.
- 📊 Dataset Size: A total of 128 doped conjugated polymer samples, each with paired spectral and conductivity measurements.
- Code/
  - main_final_QSPR_models.ipynb: 💡 Main Jupyter notebook containing the complete QSPR modeling workflow
  - helper.py: 🧰 Utility functions for data preprocessing and analysis
  - generate_adaptive_boundaries_optimization.py: 🧬 Genetic Algorithm for adaptive spectral boundary optimization
  - find_correct_clusters_and_do_ks_test.py: 📊 Train/test data clustering, KS-tests, and statistical validation
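As a rough illustration of the statistical validation step, the sketch below computes the two-sample Kolmogorov–Smirnov statistic, the largest gap between the empirical CDFs of two samples. It is not taken from `find_correct_clusters_and_do_ks_test.py`: the function name `ks_statistic` and the toy values are assumptions, and only the statistic is computed, not a p-value. In practice `scipy.stats.ks_2samp` returns both.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: max |ECDF_a(v) - ECDF_b(v)| over all observed v."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    largest_gap = 0.0
    for v in sorted(set(a) | set(b)):
        cdf_a = bisect_right(a, v) / len(a)  # fraction of a that is <= v
        cdf_b = bisect_right(b, v) / len(b)  # fraction of b that is <= v
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical values of one feature in a train split and a test split.
train_values = [1.0, 2.0, 3.0, 4.0]
test_values = [3.0, 4.0, 5.0, 6.0]

d_shifted = ks_statistic(train_values, test_values)
d_same = ks_statistic(train_values, train_values)
print(d_shifted, d_same)  # 0.5 0.0
```

A statistic near zero suggests the train and test splits cover a similar distribution for that feature; a large value signals a shift worth inspecting before trusting test-set metrics.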
- Data/
  - spring24_solvtemp_jpm.csv: 📁 Main experimental dataset (solvent concentration and annealing temperature)
  - [experiment folders]/: 🧪 Individual experiment spectral data files