This repository contains the official implementation of the paper Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation by François Rozet, Ruben Ohana, Michael McCabe, Gilles Louppe, François Lanusse, and Shirley Ho.
The steep computational cost of diffusion models at inference hinders their use as fast physics emulators. In the context of image and video generation, this computational drawback has been addressed by generating in the latent space of an autoencoder instead of the pixel space. In this work, we investigate whether a similar strategy can be effectively applied to the emulation of dynamical systems and at what cost. We find that the accuracy of latent-space emulation is surprisingly robust to a wide range of compression rates (up to 1000x). We also show that diffusion-based emulators are consistently more accurate than non-generative counterparts and compensate for uncertainty in their predictions with greater diversity. Finally, we cover practical design choices, spanning from architectures to optimizers, that we found critical to train latent-space emulators.
The majority of the code is written in Python. Neural networks are implemented and trained using the PyTorch automatic differentiation framework. To run the experiments, you need access to a Slurm cluster, a Weights & Biases account, and the lola module installed as a package.
First, create a new Python environment, for example with venv.
python -m venv ~/.venvs/lola
source ~/.venvs/lola/bin/activate
Then, install the lola module as an editable package with its dependencies.
pip install --editable .[all] --extra-index-url https://download.pytorch.org/whl/cu121
Optionally, we provide pre-commit hooks to automatically detect code issues.
pre-commit install --config pre-commit.yaml
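If you are not already logged in to Weights & Biases, the wandb command line client, which we assume is installed alongside the dependencies above, lets you authenticate once per machine.

wandb login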
The lola directory contains the implementations of the neural networks, the autoencoders, the diffusion models, the emulation routines, and other components. The experiments directory contains the training scripts, the evaluation scripts, and their configurations. The euler, rayleigh_benard and gravity_cooling directories contain the notebooks that produced the figures in the paper.
We rely on a Ceph File System partition to store the data. If your cluster uses a different file system, we recommend creating a symbolic link in your home folder.
ln -s /mnt/filesystem/users/you ~/ceph
The datasets (Euler, Rayleigh-Bénard and Turbulence Gravity Cooling) are downloaded from The Well.
the-well-download --base-path ~/ceph/the_well --dataset euler_multi_quadrants_openBC
the-well-download --base-path ~/ceph/the_well --dataset euler_multi_quadrants_periodicBC
the-well-download --base-path ~/ceph/the_well --dataset rayleigh_benard
the-well-download --base-path ~/ceph/the_well --dataset turbulence_gravity_cooling
This could take a while!
We start by training the autoencoders. For clarity, we provide the commands for a single compression rate; to replicate the other experiments, modify the number of latent channels, as illustrated after the commands below.
python train_ae.py dataset=euler_all optim.learning_rate=1e-5 ae.lat_channels=64
python train_ae.py dataset=rayleigh_benard optim.learning_rate=1e-5 ae.lat_channels=64
python train_ae.py dataset=gravity_cooling optim.learning_rate=1e-5 ae=dcae_3d_f8c64_large ae.lat_channels=64
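For example, fewer latent channels correspond to a higher compression rate; the value below is illustrative.

python train_ae.py dataset=euler_all optim.learning_rate=1e-5 ae.lat_channels=16  # hypothetical higher-compression variant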
Once the above jobs have completed (1-4 days), we encode the entire dataset with each trained autoencoder and cache the resulting latent trajectories permanently on disk. For instance, for the autoencoder run named di2j3rpb_rayleigh_benard_dcae_f32c64_large:
python cache_latents.py dataset=rayleigh_benard split=train repeat=4 run=~/ceph/lola/runs/ae/di2j3rpb_rayleigh_benard_dcae_f32c64_large
python cache_latents.py dataset=rayleigh_benard split=valid run=~/ceph/lola/runs/ae/di2j3rpb_rayleigh_benard_dcae_f32c64_large
The stored latent trajectories are then used to train latent-space emulators (deterministic and diffusion-based), without needing to load and encode high-dimensional samples on the fly.
python train_surrogate.py dataset=rayleigh_benard ae_run=~/ceph/lola/runs/ae/di2j3rpb_rayleigh_benard_dcae_f32c64_large # neural solver
python train_diffusion.py dataset=rayleigh_benard ae_run=~/ceph/lola/runs/ae/di2j3rpb_rayleigh_benard_dcae_f32c64_large # diffusion model
We also train pixel-space deterministic emulators, which require more compute resources.
python train_surrogate.py dataset=euler_all surrogate=vit_pixel compute.nodes=2
python train_surrogate.py dataset=rayleigh_benard surrogate=vit_pixel compute.nodes=2
Finally, we evaluate each trained emulator on the test set.
python eval.py start=16 seed=0 run=~/ceph/lola/runs/sm/lrg1qgi2_rayleigh_benard_f32c64_vit_large # neural solver
python eval.py start=16 seed=0 run=~/ceph/lola/runs/dm/0fqjt3js_rayleigh_benard_f32c64_vit_large # diffusion model
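Since eval.py exposes a seed argument, you can repeat an evaluation with several seeds, for instance to inspect the variability of the diffusion emulator's predictions. The loop below is only a sketch and assumes that seed controls the sampling randomness.

for seed in 0 1 2; do  # illustrative seeds
    python eval.py start=16 seed=$seed run=~/ceph/lola/runs/dm/0fqjt3js_rayleigh_benard_f32c64_vit_large
done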
Each train_*.py script schedules a Slurm job to train a model, log the training statistics with wandb, and store the weights in the ~/ceph/lola/runs directory. You will likely have to adapt the requested resources, either in the config files or on the command line.
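For instance, assuming the compute.nodes key used above is available for your configuration, the number of requested nodes can be overridden directly on the command line (the value below is illustrative).

python train_surrogate.py dataset=rayleigh_benard surrogate=vit_pixel compute.nodes=1  # request a single node instead of two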
If you find this project useful for your research, please consider citing
@unpublished{rozet2025lost,
title = {Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation},
author = {François Rozet and Ruben Ohana and Michael McCabe and Gilles Louppe and François Lanusse and Shirley Ho},
year = {2025},
url = {https://arxiv.org/abs/2507.02608}
}
The authors would like to thank Géraud Krawezik and the Scientific Computing Core at the Flatiron Institute, a division of the Simons Foundation, for the compute facilities and support. We gratefully acknowledge use of the research computing resources of the Empire AI Consortium, Inc., with support from the State of New York, the Simons Foundation, and the Secunda Family Foundation. Polymathic AI acknowledges funding from the Simons Foundation and Schmidt Sciences, LLC.