FlatVI

This is the official repository for the ICML 2025 spotlight paper Enforcing Latent Euclidean Geometry in Single-Cell VAEs for Manifold Interpolation. Many tools for single-cell RNA-seq (scRNA-seq) operate under the assumption that the latent space exhibits approximate Euclidean geometry, using straight lines to estimate cell state transitions and distances. To support and enhance this assumption, we introduce FlatVI, a representation learning model for scRNA-seq data that promotes locally flat geometry in the latent space, making it a natural complement to existing single-cell analysis pipelines.

FlatVI is a Variational Autoencoder (VAE) trained with a negative binomial likelihood tailored to single-cell data, and augmented with geometric regularisation. The VAE's decoder maps latent representations to parameters of a statistical manifold defined by negative binomial distributions. The local geometry of the latent space is governed by the pullback metric, which we regularise toward a scaled identity matrix. This encourages the latent space to adopt a local Euclidean structure.
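
As a rough illustration of the regularisation described above, the sketch below computes a pullback metric for a toy decoder and penalises its distance from a scaled identity. It is not the code in this repository. The exp-linear decoder, the function names, the fixed scale `alpha`, and the use of the negative binomial Fisher information with respect to the mean (inverse dispersion `theta` held fixed) are all assumptions made for the example.

```python
import math


def decoder_mean(z, W, b):
    """Hypothetical toy decoder: mu_g = exp(sum_k W[g][k] * z[k] + b[g])."""
    return [
        math.exp(sum(w * zk for w, zk in zip(row, z)) + bg)
        for row, bg in zip(W, b)
    ]


def pullback_metric(z, W, b, theta):
    """Return M(z) = J^T diag(F) J as a nested list.

    J is the Jacobian of the decoder mean with respect to z and F is the
    Fisher information of NB(mu, theta) with respect to mu (theta fixed).
    """
    mu = decoder_mean(z, W, b)
    d = len(z)
    # For the exp-linear toy decoder the Jacobian is J[g][k] = mu[g] * W[g][k].
    J = [[m * w for w in row] for m, row in zip(mu, W)]
    # NB variance is mu + mu^2 / theta, so the Fisher information in mu
    # is 1 / variance = theta / (mu * (mu + theta)).
    F = [t / (m * (m + t)) for m, t in zip(mu, theta)]
    return [
        [sum(F[g] * J[g][i] * J[g][j] for g in range(len(mu))) for j in range(d)]
        for i in range(d)
    ]


def flatness_penalty(M, alpha):
    """Squared Frobenius distance between M and alpha * identity."""
    d = len(M)
    return sum(
        (M[i][j] - (alpha if i == j else 0.0)) ** 2
        for i in range(d)
        for j in range(d)
    )


if __name__ == "__main__":
    W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 genes, 2 latent dimensions
    M = pullback_metric([0.0, 0.0], W, [0.0, 0.0, 0.0], [1.0, 1.0, 1.0])
    print(M, flatness_penalty(M, alpha=1.0))
```

In a real model the Jacobian would come from automatic differentiation of the decoder network, and the penalty would be added to the VAE objective with a weighting term.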

Find our work on OpenReview: https://openreview.net/forum?id=DoDXFkF10S

Implementation notes

The repository is currently undergoing significant restructuring and simplification, as the software is being adapted to the scvi-tools framework structure (see the official repo). The current folder supporting the ICML 2025 publication will be preserved in a dedicated branch.

Data availability

All datasets and checkpoints used in the paper will be made publicly available on Zenodo by the time of the conference. In the meantime, the datasets are public and can be obtained from their original publications.

Installation

  1. Clone our repository:
git clone https://github.com/theislab/FlatVI.git
  2. Create the conda environment:
conda env create -f environment.yml
  3. Activate the environment:
conda activate flatvi
  4. Install the FlatVI package in development mode:
cd directory_where_you_have_your_git_repos/FlatVI
pip install -e .
  5. Create a symlink to the storage folder for experiments:
cd directory_where_you_have_your_git_repos/FlatVI
ln -s folder_for_experiment_storage project_folder
  6. Create the dataset and experiment folders:
cd project_folder
mkdir datasets
mkdir experiments
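
Steps 5 and 6 can also be scripted. The helper below is a minimal sketch using only the Python standard library. The function name `setup_project_folder` is hypothetical and not part of the FlatVI package, and the sketch assumes a platform where unprivileged symlinks are allowed (for example Linux or macOS).

```python
from pathlib import Path


def setup_project_folder(repo_root, storage_dir, link_name="project_folder"):
    """Symlink repo_root/link_name to storage_dir and create the two subfolders."""
    repo_root = Path(repo_root)
    storage_dir = Path(storage_dir)
    storage_dir.mkdir(parents=True, exist_ok=True)

    link = repo_root / link_name
    # Only create the link if nothing (not even a dangling symlink) is there yet.
    if not (link.is_symlink() or link.exists()):
        link.symlink_to(storage_dir.resolve(), target_is_directory=True)

    # Equivalent of `mkdir datasets` and `mkdir experiments` inside project_folder.
    for sub in ("datasets", "experiments"):
        (link / sub).mkdir(exist_ok=True)
    return link
```

For example, `setup_project_folder(".", "/path/to/storage")` run from the repository root would mirror the shell commands above, with `/path/to/storage` standing in for `folder_for_experiment_storage`.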

Repository structure

Requirements

See environment.yml for the required packages.

Hydra

Our implementation uses Hydra to configure and manage experiments. The configuration hierarchy is in the configs_hydra folder.

FlatVI

The model source code is in the flatvi folder.

Training scripts

Training scripts are in flatvi/train_hydra:

  • train_cfm.py trains conditional flow matching.
  • train_vae.py trains the negative binomial variational autoencoder (either with or without regularization).
  • train_geodesic_vae.py trains the geodesic autoencoder baseline.

Models

Model scripts are in the flatvi/models folder:

  • In flatvi/models/base we have standard modules for the variational autoencoder, both with and without regularization, and the geodesic autoencoder baseline.
  • In flatvi/models/cfm we have modules for implementing Conditional Flow Matching, inspired by the torchCFM repo.
  • In flatvi/models/manifold we have the modules to deal with operations on manifolds, such as geodesic distance approximations or metric computations.
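
To give a feel for what Conditional Flow Matching regresses on, the sketch below builds the training target for a straight-line path between two latent points. It shows the noise-free linear-interpolation variant of the technique and is not taken from flatvi/models/cfm. The function names and the plain-list representation of vectors are assumptions made for the example.

```python
import random


def sample_cfm_pair(z0, z1, t=None, rng=random):
    """Return (t, z_t, u_t) for the straight-line path from z0 to z1.

    z_t = (1 - t) * z0 + t * z1 is the point on the path at time t, and
    u_t = z1 - z0 is the constant target velocity along that path.
    """
    if t is None:
        t = rng.random()  # t drawn uniformly from [0, 1)
    z_t = [(1.0 - t) * a + t * b for a, b in zip(z0, z1)]
    u_t = [b - a for a, b in zip(z0, z1)]
    return t, z_t, u_t


def cfm_loss(v_pred, u_t):
    """Mean squared error between a predicted velocity and the target."""
    return sum((v - u) ** 2 for v, u in zip(v_pred, u_t)) / len(u_t)


if __name__ == "__main__":
    t, z_t, u_t = sample_cfm_pair([0.0, 0.0], [2.0, 4.0], t=0.25)
    print(t, z_t, u_t, cfm_loss([2.0, 4.0], u_t))
```

Straight-line targets like this are most meaningful when latent distances are approximately Euclidean, which is the property the flatness regularisation aims to encourage.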

Training

Bash scripts to launch training are in the scripts folder. To retrain the models, first create a logs folder inside the scripts subfolder you want to run; it stores the Slurm error and output files. By default the scripts assume the Slurm scheduler, but they can be adapted to run as plain bash commands.

Notebook

We provide example notebooks in the notebook folder.

Reference

@inproceedings{
  palma2025enforcing,
  title={Enforcing Latent Euclidean Geometry in Single-Cell {VAE}s for Manifold Interpolation},
  author={Alessandro Palma and Sergei Rybakov and Leon Hetzel and Stephan G{\"u}nnemann and Fabian J Theis},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025},
  url={https://openreview.net/forum?id=DoDXFkF10S}
}
