Spin–Orbit Coupling and Quantum Chemistry Tools
A hybrid Python/C++ package for advanced spin–orbit coupling (SOC) calculations, leveraging Libint2, Eigen, pybind11, and the scientific Python stack.
- High‑performance C++ backend (Libint2, Eigen, OpenMP)
- Seamless Python interface via pybind11
- SOC driver to reconstruct KS/Fock in the AO basis and compute spinors
- Optional export of spinors to .npz for downstream analysis (e.g., fuzzy bands)
- Easy integration with Python workflows (NumPy, SciPy, Matplotlib, Pymatgen)
- Modern build system — install everything with one command
- Linux, macOS, or WSL (Windows Subsystem for Linux)
- Python ≥ 3.9
- C++17 compiler (provided by conda/mamba)
- Libint2 ≥ 2.6
- Eigen3 ≥ 3.4
- pybind11 ≥ 2.10
(All dependencies are handled via environment.yml.)
mamba env create -f environment.yml
mamba activate soc
pip install .
This compiles the C++ extension using your environment's compilers and libraries. Afterwards you can import the Python modules and the libint_cpp extension directly.
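A quick way to confirm that the build succeeded is to check that the extension can be located from Python. The sketch below only assumes that the compiled module is importable under the top-level name libint_cpp, as stated above; the helper name extension_available is illustrative and not part of the package.

```python
import importlib.util


def extension_available(module_name: str = "libint_cpp") -> bool:
    """Return True if the named module can be located in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    status = "found" if extension_available() else "missing"
    print(f"libint_cpp: {status}")
```

If the module is reported as missing, the usual suspects are an inactive soc environment or a failed pip install step.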
To reconstruct the KS/Fock matrix in the AO basis and compute SOC spinors, you must print all relevant MOs (occupied + enough virtuals to cover the unoccupied spectrum you care about). The most robust approach is to print all MOs (no MO_INDEX_RANGE).
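The reason the full MO set matters can be seen from the generalized eigenvalue problem. The relation below is the textbook identity for real MO coefficients C (columns c_p) that are orthonormal with respect to the AO overlap S, with orbital energies ε on the diagonal; whether the SOC driver uses exactly this expression internally is an assumption.

```latex
\begin{align}
F C &= S C \varepsilon, \qquad C^{\mathsf T} S C = \mathbf{1}, \\
F &= S C \,\varepsilon\, C^{\mathsf T} S
   = \sum_{p} \varepsilon_p \,(S c_p)(S c_p)^{\mathsf T}.
\end{align}
```

The second line follows from C C^T S = 1, which holds only when C is square, that is, when every MO is present. With a truncated set the sum yields F projected onto the printed orbitals, so any MO that was not printed is simply absent from the reconstructed matrix.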
After geometry optimization, perform a single‑point calculation with this SCF block:
&SCF
MAX_SCF 1
EPS_SCF 1.0E-3
ADDED_MOS 100000
SCF_GUESS RESTART
# &OT
# MINIMIZER DIIS
# N_DIIS 7
# PRECONDITIONER FULL_SINGLE_INVERSE
# &END OT
&END SCF
- Do not use the OT optimizer here (keep it commented).
- ADDED_MOS must be large enough so that all unoccupied MOs of interest are printed. When in doubt, overshoot: missing LUMOs cannot be reconstructed later.
Add this to your input (note the absence of MO_INDEX_RANGE so all MOs are printed):
&PRINT
&MO
&EACH
QS_SCF 100
&END
COEFFICIENTS
NDIGITS 16
ADD_LAST NUMERIC
FILENAME MOs
&END
&END PRINT
- Omitting MO_INDEX_RANGE ensures the full MO set is printed, which is required to reliably reconstruct the KS/Fock matrix in the AO basis.
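Before submitting the single-point job, the three rules above (no active &OT section, ADDED_MOS present, no MO_INDEX_RANGE) can be checked automatically. The following is a minimal sketch, not part of SOC: the function name check_cp2k_input is hypothetical, and it assumes that # and ! start comments in the CP2K input.

```python
def check_cp2k_input(text: str) -> list[str]:
    """Return warnings for settings that conflict with the MO-printing guidance."""
    seen = set()
    for raw in text.splitlines():
        # Drop comments (assumed to start with '#' or '!') and normalise case.
        line = raw.split("#", 1)[0].split("!", 1)[0].strip().upper()
        if line:
            seen.add(line.split()[0])  # first token: section or keyword name

    warnings = []
    if "&OT" in seen:
        warnings.append("active &OT section found; comment it out")
    if "MO_INDEX_RANGE" in seen:
        warnings.append("MO_INDEX_RANGE found; remove it so all MOs are printed")
    if "ADDED_MOS" not in seen:
        warnings.append("ADDED_MOS not set; virtual MOs will not be printed")
    return warnings


if __name__ == "__main__":
    sample = "&SCF\n  ADDED_MOS 100000\n  # &OT\n  # &END OT\n&END SCF\n"
    print(check_cp2k_input(sample))
```

The check is purely textual and does not validate section nesting, so it complements rather than replaces CP2K's own input parser.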
Remove headers and non‑coefficient lines before ingesting into SOC.
RKS (closed‑shell):
#!/bin/bash
# clean_mos_rks.sh
INPUT="MOs.txt"
OUTPUT="MOs_cleaned.txt"
awk '
BEGIN { skip=0; count=0 }
/EIGENVALUES/ {
count++
if (count == 1) { skip=1; next }
else if (count == 2) { skip=0; next }
}
skip == 0 { print }
' "$INPUT" | \
sed 's/MO|/ /g' | \
grep -v -E '^[[:space:]]*$' | \
grep -v 'E(Fermi)' | \
grep -v 'Band gap' > "$OUTPUT"
echo "✅ Cleaned file written to: $OUTPUT"
UKS (open‑shell):
#!/bin/bash
# clean_mos_uks.sh
set -euo pipefail
INPUT="${1:-MOs.txt}"
ALPHA_RAW="MOs_alpha_raw.txt"
BETA_RAW="MOs_beta_raw.txt"
ALPHA_OUT="MOs_alpha.txt"
BETA_OUT="MOs_beta.txt"
awk '
BEGIN { section=0 }
/^[[:space:]]*MO\|[[:space:]]*[Aa][Ll][Pp][Hh][Aa]/ { section=1; print > a; next }
/^[[:space:]]*MO\|[[:space:]]*[Bb][Ee][Tt][Aa]/ { section=2; print > b; next }
section==1 { print > a }
section==2 { print > b }
' a="$ALPHA_RAW" b="$BETA_RAW" "$INPUT"
clean_mos() {
local IN="$1"
local OUT="$2"
sed 's/MO|/ /g' "$IN" \
| grep -viE "alpha|beta" \
| grep -v -E "^[[:space:]]*$" \
| grep -v "E(Fermi)" \
| grep -v "Band gap" > "$OUT"
}
if [ -s "$ALPHA_RAW" ]; then
clean_mos "$ALPHA_RAW" "$ALPHA_OUT"
else
echo "Warning: no ALPHA section found in $INPUT" >&2
: > "$ALPHA_OUT"
fi
if [ -s "$BETA_RAW" ]; then
clean_mos "$BETA_RAW" "$BETA_OUT"
else
echo "Note: no BETA section found in $INPUT" >&2
: > "$BETA_OUT"
fi
rm -f "$ALPHA_RAW" "$BETA_RAW"
echo "Alpha MOs written to: $ALPHA_OUT"
echo "Beta MOs written to: $BETA_OUT"
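After running either script, it can be worth confirming that no header residue survived. The sketch below scans a cleaned file for the patterns the scripts above are meant to remove (the MO| prefix, E(Fermi), Band gap, and blank lines). The helper name find_residue is illustrative and the check is deliberately conservative.

```python
import re
import sys

# Patterns that the cleaning scripts are expected to have removed.
RESIDUE = re.compile(r"MO\||E\(Fermi\)|Band gap|^\s*$")


def find_residue(lines):
    """Return (line_number, text) pairs for lines that still look like headers."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if RESIDUE.search(line)
    ]


if __name__ == "__main__":
    # Usage: python check_cleaned.py MOs_cleaned.txt [MOs_alpha.txt MOs_beta.txt]
    for path in sys.argv[1:]:
        with open(path) as handle:
            leftovers = find_residue(handle)
        print(f"{path}: {len(leftovers)} suspicious line(s)")
```

A count of zero does not prove the file is well-formed, but a non-zero count is a clear sign that the CP2K output layout differs from what the scripts expect.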
Parsing large MO text files can be slow. On first run, SOC will create a cached .npz (CSR) file for fast reuse in subsequent calculations.
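When scripting repeated runs, a small helper can decide whether to pass the text file or the cache to --mo. The sketch below assumes the cache sits next to the text file and is named by appending _csr.npz to its stem (MOs.txt becomes MOs_csr.npz); that naming is inferred from the example in this README, and pick_mo_input is a hypothetical helper rather than part of SOC.

```python
from pathlib import Path


def pick_mo_input(mo_txt: str) -> Path:
    """Prefer an up-to-date '<stem>_csr.npz' cache next to the MO text file."""
    txt = Path(mo_txt)
    cache = txt.with_name(f"{txt.stem}_csr.npz")  # assumed naming convention
    if cache.exists():
        # Use the cache only if it is at least as new as the text file.
        if not txt.exists() or cache.stat().st_mtime >= txt.stat().st_mtime:
            return cache
    return txt


if __name__ == "__main__":
    print(pick_mo_input("MOs.txt"))
```

Comparing modification times guards against silently reusing a cache from an earlier geometry after the MO file has been regenerated.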
Minimal periodic example:
# --mo also accepts the cached MOs_csr.npz (faster on re-runs)
soc \
--xyz geom.xyz \
--mo MOs.txt \
--basis-txt BASIS_MOLOPT \
--basis-name DZVP-MOLOPT-SR-GTH \
--periodic \
--A1 2.5128664117413759E+001 0.0000000000000000E+000 0.0000000000000000E+000 \
--A2 -1.2564332058706880E+001 2.1762061488846783E+001 0.0000000000000000E+000 \
--A3 0.0000000000000000E+000 0.0000000000000000E+000 3.0619090536956048E+001 \
--write-spinors \
--mo-window 100 100
- --xyz: Geometry used in the DFT run (Cartesian, matches the printed MOs).
- --mo: Cleaned MO coefficient file, or its cached *_csr.npz.
- --basis-txt: Basis set text file (e.g. BASIS_MOLOPT) available in the run folder.
- --basis-name: Basis name used in CP2K (e.g. DZVP-MOLOPT-SR-GTH).
- --periodic: Enable for periodic systems.
- --A1/--A2/--A3: Relaxed lattice vectors (only if --periodic is set).
- --write-spinors: Store SOC spinors in an .npz for downstream use (e.g. fuzzy bands).
- --mo-window NOCC NVIRT: Truncate the MO space to a subspace around the frontier orbitals. See guidance below.
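For scripted workflows it can be convenient to assemble the command line from Python. The sketch below uses only the flags listed above. The function build_soc_command and its parameter names are illustrative, and the 16-digit formatting of the lattice vectors is a choice made here rather than a requirement of the CLI.

```python
import shlex


def build_soc_command(xyz, mo, basis_txt, basis_name,
                      lattice=None, mo_window=None, write_spinors=False):
    """Assemble the argv list for the soc CLI from the documented flags."""
    cmd = ["soc", "--xyz", xyz, "--mo", mo,
           "--basis-txt", basis_txt, "--basis-name", basis_name]
    if lattice is not None:
        if len(lattice) != 3:
            raise ValueError("lattice must contain exactly three vectors")
        cmd.append("--periodic")
        for flag, vector in zip(("--A1", "--A2", "--A3"), lattice):
            cmd.append(flag)
            cmd.extend(f"{component:.16E}" for component in vector)
    if write_spinors:
        cmd.append("--write-spinors")
    if mo_window is not None:
        nocc, nvirt = mo_window
        cmd += ["--mo-window", str(nocc), str(nvirt)]
    return cmd


if __name__ == "__main__":
    command = build_soc_command(
        "geom.xyz", "MOs.txt", "BASIS_MOLOPT", "DZVP-MOLOPT-SR-GTH",
        lattice=[(25.128664117413759, 0.0, 0.0),
                 (-12.564332058706880, 21.762061488846783, 0.0),
                 (0.0, 0.0, 30.619090536956048)],
        mo_window=(100, 100), write_spinors=True,
    )
    print(shlex.join(command))  # pass `command` to subprocess.run to execute
```

Building a list instead of a shell string avoids quoting problems and makes it easy to vary --mo-window in a loop.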
--mo-window nocc nvirt selects the subspace of the full MO space in which the SOC Hamiltonian is diagonalized:
- nocc = number of occupied orbitals kept below the HOMO
- nvirt = number of virtual orbitals kept above the LUMO
This is a standard subspace (perturbative) treatment of SOC. Because SOC is local and short‑range in typical semiconductors/insulators and molecular systems, a truncated subspace around the frontier often suffices for accurate gaps and near‑edge states. Practical guidance:
- Target property:
  - HOMO/LUMO gap or near‑edge band splittings → start with --mo-window 100 100.
  - Deeper states or core–valence effects → increase both (e.g. 200 200 or more).
- Convergence check: Increase nocc/nvirt until observables (gap, SOC splitting, key level ordering) change by < 1–2 meV (or your tolerance).
- Where truncation can fail:
  - Metals/very small gaps, strong near‑degeneracies, heavy elements with very strong SOC coupling far‑from‑edge states → use a larger window or consider the full space if feasible.
- Performance: Dense diagonalization scales ~O(N³) with subspace size. On a laptop, sizes up to ~1000×1000 are typically fine; for many QD use‑cases, 100–300 per side is already adequate.
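The convergence check described above can be automated with a simple loop. The sketch below grows a symmetric window until the observable changes by less than a tolerance (2 meV by default). The callable compute_gap is a placeholder you would supply yourself, for example a function that runs soc with the given window and extracts the gap in eV from its output; nothing here parses SOC output.

```python
def converge_window(compute_gap, start=100, step=50, tol_ev=0.002, max_size=1000):
    """Grow nocc = nvirt until the observable changes by less than tol_ev."""
    size = start
    previous = compute_gap(size, size)
    while size + step <= max_size:
        size += step
        current = compute_gap(size, size)
        if abs(current - previous) < tol_ev:
            return size, current  # converged window size and observable
        previous = current
    raise RuntimeError("not converged within max_size; consider the full space")


if __name__ == "__main__":
    # Stand-in model for illustration only: a gap that settles as 1/size.
    def toy_gap(nocc, nvirt):
        return 1.0 + 1.0 / nocc

    print(converge_window(toy_gap))
```

The default max_size follows the laptop-scale guidance above. A single observable is tracked here for brevity, so in practice you may want to monitor SOC splittings and level ordering as well.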
- soc_spinors.npz (when --write-spinors): spinor coefficients and relevant metadata for post‑processing.
- Console log with timing, convergence, and subspace details.
- Optional intermediate caches (CSR) for faster MO reads on subsequent runs.
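The array names stored in soc_spinors.npz are not listed here, so it is safest to inspect the file before relying on specific keys. An .npz file is a zip archive containing one .npy file per array, which means the names can be listed with the standard library alone. The helper list_npz_arrays below is illustrative, and numpy.load(path).files returns the same names if NumPy is at hand.

```python
import sys
import zipfile


def list_npz_arrays(path):
    """Return the array names stored in an .npz archive (a zip of .npy files)."""
    with zipfile.ZipFile(path) as archive:
        return sorted(
            name[:-len(".npy")]
            for name in archive.namelist()
            if name.endswith(".npy")
        )


if __name__ == "__main__":
    # Usage: python list_npz.py soc_spinors.npz
    for npz_path in sys.argv[1:]:
        print(npz_path, list_npz_arrays(npz_path))
```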
SOC/
├── libint/ # C++ extension sources (pybind11, Libint2)
│ ├── CMakeLists.txt
│ ├── bindings.cpp
│ ├── integrals_core.cpp
│ └── integrals_core.hpp
├── src/ # Python modules
│ ├── analysis.py
│ ├── config.py
│ ├── hamiltonian.py
│ ├── parsers.py
│ ├── soc_driver.py
│ └── utils.py
├── environment.yml
├── pyproject.toml
├── README.md
├── LICENSE
└── .gitignore
- Source code: https://github.com/nlesc-nano/SOC
- To contribute, fork the repo and submit a pull request.
- Please open an issue for bug reports, feature requests, or questions.
Advanced/dev build (normally unnecessary):
cd libint
mkdir -p build && cd build
cmake ..
make
MIT License — see LICENSE.
If you use SOC in your research, please cite:
Ivan Infante, SOC: Spin–Orbit Coupling Python/C++ Package (2025)
GitHub: https://github.com/nlesc-nano/SOC
- For help, open a GitHub Issue: https://github.com/nlesc-nano/SOC/issues
- For scientific consulting or collaboration, contact: [email protected]
Quick recap: CP2K SP (print all MOs, no OT) → clean MOs (RKS/UKS scripts) → soc with --mo-window subspace (converge by increasing NOCC/NVIRT) → optional --write-spinors for fuzzy‑band post‑processing.
Happy computing!