hms-dbmi/PICTURE

Uncertainty-Aware Ensemble of Foundation Models Differentiates Glioblastoma from its Mimics – A Multi-Center Study

Pathology Imaging Characterization with Uncertainty-aware Rapid Evaluation (PICTURE)

Abstract

Accurate pathological diagnosis is crucial in guiding personalized treatments for patients with central nervous system (CNS) cancers. Distinguishing glioblastoma and primary central nervous system lymphoma (PCNSL) is particularly challenging due to their overlapping pathology features, despite the distinct treatments required. To address this challenge, we established the Pathology Image Characterization Tool with Uncertainty-aware Rapid Evaluations (PICTURE) system using 2,141 pathology slides collected worldwide. PICTURE employed Bayesian inference, deep ensemble, and normalizing flow to account for the uncertainties in its predictions and training set labels. PICTURE accurately diagnosed glioblastoma and PCNSL with an area under the receiver operating characteristic curve (AUROC) of 0.989, with the results validated in five independent cohorts (AUROC = 0.924-0.996). In addition, PICTURE identified samples belonging to 67 types of rare CNS cancers that are neither gliomas nor lymphomas. Our approaches provide a generalizable framework for differentiating pathological mimics and enable rapid diagnoses for CNS cancer patients.

Installation

conda create -n PICTURE python=3.10 -y
conda env update -n PICTURE -f enviroment.yml
conda activate PICTURE
pip install --upgrade pip

Suggested System Requirements (Linux-based high performance computing (HPC) platform at Harvard Medical School)

OS: Ubuntu 20.04 LTS. CUDA: 12.1. GPU: NVIDIA. (All experiments were conducted on NVIDIA A100 GPUs; inference should run on any CUDA-capable GPU.)

Publicly Available Datasets

  1. TCGA provides publicly available tissue slides for PCNSL (TCGA-DLBC) and glioblastoma (TCGA-GBM). [Note: IDH-wildtype cases from TCGA-LGG could also be included, following the 2021 WHO guidelines.] https://portal.gdc.cancer.gov/projects/TCGA-DLBC https://portal.gdc.cancer.gov/projects/TCGA-GBM

  2. The Medical University of Vienna provides an online portal from which researchers can download slides of PCNSL, glioblastoma, and other CNS tumors (out-of-distribution): https://www.ebrains.eu/tools/human-brain-atlas

Trained Model Weights

Use our trained model to differentiate glioblastoma from other classes (e.g., PCNSL, OOD). The weights were trained on data from the Mayo Clinic; class 0 is Glioblastoma.

python main_exp.py
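As a minimal sketch of how the class convention above can be read, the snippet below averages per-model class probabilities for one slide and maps the result to a label, with index 0 meaning Glioblastoma as stated above. The function name `ensemble_predict`, the two-class layout, the name given to index 1, and the example probabilities are illustrative assumptions; this is not the output format of `main_exp.py`. Predictive entropy is shown as one common uncertainty summary, not necessarily the measure PICTURE uses.

```python
import math

# Index 0 is Glioblastoma (from the README); the name for index 1 is an assumption.
CLASS_NAMES = {0: "Glioblastoma", 1: "Other (e.g., PCNSL, OOD)"}

def ensemble_predict(member_probs):
    """Average class probabilities across ensemble members.

    member_probs: list of per-model probability lists, e.g. [[0.9, 0.1], [0.8, 0.2]].
    Returns (label, mean_probs, entropy), with entropy measured in nats.
    """
    n_models = len(member_probs)
    n_classes = len(member_probs[0])
    mean_probs = [sum(p[c] for p in member_probs) / n_models for c in range(n_classes)]
    # Entropy of the averaged distribution; higher means less certain.
    entropy = -sum(p * math.log(p) for p in mean_probs if p > 0)
    best = max(range(n_classes), key=lambda c: mean_probs[c])
    return CLASS_NAMES[best], mean_probs, entropy

# Hypothetical probabilities from three ensemble members for one slide.
label, probs, entropy = ensemble_predict([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3]])
print(label, [round(p, 3) for p in probs], round(entropy, 3))
```

A slide whose averaged distribution has high entropy could be flagged for manual review rather than assigned a label; the threshold for doing so would need to be chosen on validation data.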

Preprocessing (Tiling)

python preprocessing/WSI_tile_extraction.py

Cell Quantification

See the README in cell_quantification.

Heatmap Visualization

python Heatmap_Vis/generate.py --region $x_s $y_s $x_e $y_e --label "$label" --column "$col" --slide-path "$s_path" --model-path "$m_path"
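When the heatmap script is launched from Python rather than a shell, building the command as an argument list avoids quoting problems with paths that contain spaces. The sketch below uses only the flags shown in the command above; the helper name `build_heatmap_command` and every example value (coordinates, label, column name, paths) are hypothetical placeholders.

```python
import shlex

def build_heatmap_command(region, label, column, slide_path, model_path):
    """Return the argument list for Heatmap_Vis/generate.py.

    region is (x_start, y_start, x_end, y_end), matching $x_s $y_s $x_e $y_e.
    """
    x_s, y_s, x_e, y_e = region
    return [
        "python", "Heatmap_Vis/generate.py",
        "--region", str(x_s), str(y_s), str(x_e), str(y_e),
        "--label", str(label),
        "--column", str(column),
        "--slide-path", str(slide_path),
        "--model-path", str(model_path),
    ]

cmd = build_heatmap_command(
    region=(0, 0, 2048, 2048),         # hypothetical pixel coordinates
    label="0",                         # hypothetical label value
    column="prob",                     # hypothetical column name
    slide_path="slides/my slide.svs",  # hypothetical path containing a space
    model_path="weights/model.ckpt",   # hypothetical path
)
# Shell-safe rendering for logging; pass `cmd` to subprocess.run(cmd, check=True) to execute.
print(shlex.join(cmd))
```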

Uncertainty

Train the model with a chosen experiment configuration from configs/experiment/:

python uncertainty_quantification/OOD_UQ/src/train.py experiment=experiment_name.yaml

You can override any parameter from the command line, for example:

python uncertainty_quantification/OOD_UQ/src/train.py trainer.max_epochs=20 datamodule.batch_size=64
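To illustrate what dotted overrides such as `trainer.max_epochs=20` mean, the sketch below applies them to a nested dictionary. It is a simplified illustration using plain Python dicts, not Hydra's actual parser (which supports a richer override grammar); the function name `apply_overrides` and the default values in `base` are assumptions made for the example.

```python
import ast
import copy

def apply_overrides(config, overrides):
    """Return a copy of `config` with dotted `key=value` overrides applied."""
    result = copy.deepcopy(config)
    for item in overrides:
        dotted_key, raw_value = item.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = result
        for name in parents:
            # Walk down the nested config, creating groups that do not exist yet.
            node = node.setdefault(name, {})
        try:
            node[leaf] = ast.literal_eval(raw_value)  # numbers, lists, booleans
        except (ValueError, SyntaxError):
            node[leaf] = raw_value  # keep plain strings as-is
    return result

# Hypothetical defaults; the real values live in the repository's config files.
base = {"trainer": {"max_epochs": 10}, "datamodule": {"batch_size": 32}}
cfg = apply_overrides(base, ["trainer.max_epochs=20", "datamodule.batch_size=64"])
print(cfg)
```

Each dot-separated segment selects a config group, and the final segment names the field to replace, so the two overrides above touch only `max_epochs` and `batch_size` and leave every other setting at its default.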

Reproducibility of uncertainty results

The weights are stored in:

uncertainty_quantification/OOD_UQ/best_ckpts/

To run the hyperparameter sweep that we used to obtain the final model:

wandb sweep uncertainty_quantification/OOD_UQ/sweep_yamls/sweepCV_vienna_CTransFeature_fold[FOLD].yaml

This prints the command needed to launch the sweep agent, for example:

wandb agent uncertainty_quantification/OOD_UQ/sylin/uncertainty_vienna_CTransFeature_wMoreBenign_fold[FOLD]/nqabs50g

In order to directly train using the best hyperparameters we found:

python uncertainty_quantification/OOD_UQ/src/train.py experiment=best_uncertainty_vienna_fold[FOLD].yaml

Slide-level AUC using confident tiles can be estimated using:

python uncertainty_quantification/OOD_UQ/AUC_analysis.py --files "path/to/fold1_prediction.csv" "path/to/fold2_prediction.csv" ... "path/to/fold10_prediction.csv"
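The idea of a slide-level AUC computed from confident tiles can be sketched as follows: drop tiles whose uncertainty exceeds a threshold, average the remaining tile probabilities per slide, and compute the AUC over slides. This is an illustration under stated assumptions and not the logic of `AUC_analysis.py`: the field names (`slide_id`, `label`, `prob`, `uncertainty`), the mean aggregation, the choice of label 1 as the positive class, and the example rows are all hypothetical.

```python
from collections import defaultdict

def slide_level_auc(rows, max_uncertainty):
    """Compute slide-level AUC from tile rows, using only confident tiles.

    rows: dicts with keys slide_id, label (0 or 1), prob, uncertainty.
    """
    per_slide = defaultdict(list)
    labels = {}
    for r in rows:
        labels[r["slide_id"]] = r["label"]
        if r["uncertainty"] <= max_uncertainty:  # keep confident tiles only
            per_slide[r["slide_id"]].append(r["prob"])
    # Slide score = mean probability over confident tiles; slides with none are skipped.
    scored = [(sum(p) / len(p), labels[s]) for s, p in per_slide.items()]
    pos = [score for score, y in scored if y == 1]
    neg = [score for score, y in scored if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one scored slide of each class")
    # AUC = probability that a positive slide outranks a negative one (ties count 0.5).
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

rows = [  # hypothetical tile-level predictions
    {"slide_id": "A", "label": 1, "prob": 0.90, "uncertainty": 0.1},
    {"slide_id": "A", "label": 1, "prob": 0.20, "uncertainty": 0.9},
    {"slide_id": "B", "label": 0, "prob": 0.30, "uncertainty": 0.2},
    {"slide_id": "B", "label": 0, "prob": 0.95, "uncertainty": 0.8},
    {"slide_id": "C", "label": 1, "prob": 0.70, "uncertainty": 0.1},
    {"slide_id": "D", "label": 0, "prob": 0.10, "uncertainty": 0.1},
]
print(slide_level_auc(rows, max_uncertainty=0.5))  # confident tiles only
print(slide_level_auc(rows, max_uncertainty=1.0))  # all tiles
```

In this toy data the high-uncertainty tiles carry misleading probabilities, so filtering them raises the slide-level AUC; on real data the effect depends on how well the uncertainty estimate tracks tile-level errors.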

UMAP visualization can be obtained with:

python uncertainty_quantification/OOD_UQ/script_visualize.py --fold [FOLD]

In order to reproduce the results and validate the model, please run:

python uncertainty_quantification/OOD_UQ/script_CTrans_feature.py --checkpoint_path="path/to/checkpoint.ckpt"
