amirgroup-codes/LifeTracer

LifeTracer: comprehensive Python package for 2D gas chromatography analysis and signature discovery

Overview

LifeTracer is a comprehensive Python package for 2D gas chromatography analysis and molecular classification. This package provides tools to distinguish between abiotic and biotic organic compounds in meteorite and terrestrial samples using machine learning on mass spectrometry data.

Project Website: https://life-tracer.github.io/

Publication

Paper Title: Discriminating Abiotic and Biotic Organics in Meteorite and Terrestrial Samples Using Machine Learning on Mass Spectrometry Data

Abstract

With the upcoming sample return missions to the Solar System where traces of past, extinct, or present life may be found, there is an urgent need to develop unbiased methods that can distinguish molecular distributions of organic compounds synthesized abiotically from those produced biotically but were subsequently altered through diagenetic processes. We conducted untargeted analyses on a collection of meteorite and terrestrial geologic samples using two-dimensional gas chromatography coupled with high-resolution time-of-flight mass spectrometry (GC×GC-HRTOF-MS) and compared their soluble non-polar and semi-polar organic species. To deconvolute the resulting large dataset, we developed LifeTracer, a computational framework for processing and downstream machine learning analysis of mass spectrometry data. LifeTracer identified predictive molecular features that distinguish abiotic from biotic origins and enabled a robust classification of meteorites from terrestrial samples based on the composition of their non-polar soluble organics.

Tools

  • Data Processing: Comprehensive preprocessing pipeline for GC×GC-HRTOF-MS data
  • Feature Extraction: Automated extraction of molecular features from mass spectra
  • Machine Learning: Built-in classification algorithms optimized for distinguishing abiotic/biotic origins
  • Visualization: Tools for visualizing chromatographic data and classification results

Reproducing the Results

To replicate the findings from our research paper, we provide a detailed Jupyter notebook that walks through the full pipeline—from raw chromatographic data to the final trained classification models: 👉 Full Paper Results Notebook

For reproducing the logistic regression results along with their corresponding plots, please refer to this dedicated notebook: 👉 Logistic Regression Results Notebook

Alternatively, you can run LifeTracer directly in the free version of Google Colab: 👉 Run Logistic Regression on Colab

Installation

Requirements

  • Python 3.10.8
  • Anaconda or Miniconda

Linux/macOS Installation

Step 1: Create Environment

conda create -n LifeTracer python=3.10.8

Step 2: Activate Environment

conda activate LifeTracer

Step 3: Install Package

# Clone the repository
git clone https://github.com/amirgroup-codes/LifeTracer.git

# Navigate to project directory
cd LifeTracer

# Install in development mode
pip install -e .

Step 4: Verify Installation

python -c "import lifetracer; print('LifeTracer installed successfully!')"

Windows Installation

The installation process is identical to Linux/macOS. Use Anaconda Prompt or PowerShell:

# Create environment
conda create -n LifeTracer python=3.10.8

# Activate environment
conda activate LifeTracer

# Clone and navigate to repository
git clone https://github.com/amirgroup-codes/LifeTracer.git
cd LifeTracer

# Install package
pip install -e .

# Verify installation
python -c "import lifetracer; print('LifeTracer installed successfully!')"

See this link for step-by-step conda installation on Windows.

Troubleshooting

Issue | Solution
Permission Errors | Use pip install --user -e .
Environment Issues | Run conda clean --all and recreate the environment
Path Problems | Ensure you're in the correct project directory
Import Errors | Check that the Python version matches 3.10.8
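
The checks in the table above can be gathered into one small diagnostic script. This is a minimal sketch, not part of LifeTracer: the function name environment_report is hypothetical, and it assumes only the import name lifetracer and the Python version 3.10.8 given in this README. CONDA_DEFAULT_ENV is the variable conda sets when an environment is activated.

```python
import importlib.util
import os
import sys

# Version listed under Requirements in this README.
EXPECTED_PYTHON = (3, 10, 8)


def environment_report(package="lifetracer"):
    """Collect the facts the troubleshooting table asks you to check.

    Hypothetical helper for illustration; not a LifeTracer function.
    """
    version = tuple(sys.version_info[:3])
    return {
        # Import Errors: does the interpreter match the required version?
        "python_version": version,
        "python_matches": version == EXPECTED_PYTHON,
        # Environment Issues: name of the active conda environment, if any.
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),
        # Import Errors: can the package be located without importing it?
        "package_found": importlib.util.find_spec(package) is not None,
        # Path Problems: where the command is being run from.
        "working_directory": os.getcwd(),
    }
```

Calling environment_report() inside the activated environment and printing the result should show at a glance which row of the table applies, for example conda_env being None when the environment was never activated.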

Data

The package works with both raw and processed chromatography data. Data files can be downloaded from Hugging Face:

Raw Data (GC×GC-HRTOF-MS CSVs)

Group | Sample / Item | Link | Description
Meteorite | Murchison Pristine 2.0 (replicate -003) | click here to download | Raw instrument export; meteorite extract; 300 µL DCM; 100 °C for 24 h.
Meteorite | EET96029 | click here to download | Raw instrument export; meteorite extract; 300 µL DCM; 100 °C for 24 h.
Meteorite | Orgueil (replicate -001) | click here to download | Raw instrument export; meteorite extract; 300 µL DCM; 100 °C for 24 h.
Meteorite | ALH83100 | click here to download | Raw instrument export; meteorite extract; 300 µL DCM; 100 °C for 24 h.
Meteorite | LON94101 | click here to download | Raw instrument export; meteorite extract; 300 µL DCM; 100 °C for 24 h.
Meteorite | LEW85311 | click here to download | Raw instrument export; meteorite extract; 300 µL DCM; 100 °C for 24 h.
Meteorite | AZ | click here to download | Raw instrument export; meteorite extract; 400 µL DCM; 100 °C for 24 h.
Meteorite | Jbilet Winselwan | click here to download | Raw instrument export; meteorite extract; 300 µL DCM; 100 °C for 24 h.
Soil | Atacama Soil (replicate -001) | click here to download | Raw instrument export; soil extract; 300 µL DCM; 100 °C for 24 h.
Soil | Rio Tinto Soil | click here to download | Raw instrument export; soil extract; 300 µL DCM; 100 °C for 24 h.
Soil | Murchison Soil (replicate -001) | click here to download | Raw instrument export; soil extract; 300 µL DCM; 100 °C for 24 h.
Soil | Antarctica Soil (replicate -001) | click here to download | Raw instrument export; soil extract; 300 µL DCM; 100 °C for 24 h.
Soil | Jarosite Soil | click here to download | Raw instrument export; soil extract; 300 µL DCM; 100 °C for 24 h.
Soil | Green River Shale Soil | click here to download | Raw instrument export; soil extract; 500 µL DCM; 100 °C for 24 h.
Soil | GSFC Soil | click here to download | Raw instrument export; soil extract; 300 µL DCM; 100 °C for 24 h.
Soil | Lignite (replicate -001) | click here to download | Raw instrument export; soil/organic sediment extract; 300 µL DCM; 100 °C for 24 h.
Soil | Utah Soil | click here to download | Raw instrument export; soil extract; 300 µL DCM; 100 °C for 24 h.
Soil | Iceland Soil | click here to download | Raw instrument export; soil extract; 300 µL DCM; 100 °C for 24 h.
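
When preparing these files for classification, it can help to keep the sample-to-group assignment from the table above in one place. The sketch below is an illustration only: the names SAMPLE_GROUPS, group_counts, and binary_labels are hypothetical, and the 1/0 encoding is an assumption rather than the scheme LifeTracer uses internally.

```python
from collections import Counter

# Sample names and groups copied from the raw data table (replicate tags omitted).
SAMPLE_GROUPS = {
    "Murchison Pristine 2.0": "Meteorite",
    "EET96029": "Meteorite",
    "Orgueil": "Meteorite",
    "ALH83100": "Meteorite",
    "LON94101": "Meteorite",
    "LEW85311": "Meteorite",
    "AZ": "Meteorite",
    "Jbilet Winselwan": "Meteorite",
    "Atacama Soil": "Soil",
    "Rio Tinto Soil": "Soil",
    "Murchison Soil": "Soil",
    "Antarctica Soil": "Soil",
    "Jarosite Soil": "Soil",
    "Green River Shale Soil": "Soil",
    "GSFC Soil": "Soil",
    "Lignite": "Soil",
    "Utah Soil": "Soil",
    "Iceland Soil": "Soil",
}


def group_counts(groups=SAMPLE_GROUPS):
    """Count how many samples fall in each group."""
    return Counter(groups.values())


def binary_labels(groups=SAMPLE_GROUPS, positive="Meteorite"):
    """Map each sample to 1 if it belongs to `positive`, else 0.

    The encoding is an assumption made for this example.
    """
    return {name: int(group == positive) for name, group in groups.items()}
```

The table lists 8 meteorite and 10 soil samples, so any cross-validation scheme has to cope with a small and slightly unbalanced dataset.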

Processed Data

Set | File / Part | Link | Description
Unaligned TIIs (heatmaps) | heatmaps.tar.gz.part-aa | click here to download | Split archive part; unaligned Total Ion Image heatmaps. Concatenate parts, then extract.
Unaligned TIIs (heatmaps) | heatmaps.tar.gz.part-ab | click here to download | Split archive part; unaligned TII heatmaps.
Unaligned TIIs (heatmaps) | heatmaps.tar.gz.part-ac | click here to download | Split archive part; unaligned TII heatmaps.
Unaligned TIIs (heatmaps) | heatmaps.tar.gz.part-ad | click here to download | Split archive part; unaligned TII heatmaps.
Aligned TII | TII_aligned.tar.gz.part-aa | click here to download | Split archive part; aligned Total Ion Images. Concatenate parts, then extract.
Aligned TII | TII_aligned.tar.gz.part-ab | click here to download | Split archive part; aligned TIIs.
Aligned TII | TII_aligned.tar.gz.part-ac | click here to download | Split archive part; aligned TIIs.
Aligned TII | TII_aligned.tar.gz.part-ad | click here to download | Split archive part; aligned TIIs.
Aligned TII | TII_aligned.tar.gz.part-ae | click here to download | Split archive part; aligned TIIs.
Aligned TII | TII_aligned.tar.gz.part-af | click here to download | Split archive part; aligned TIIs.
Peaks | peaks.zip | click here to download | Detected peak tables per sample (e.g., retention times, intensities).
Features | features.zip | click here to download | Feature matrix/metadata derived from peaks/TII (aligned features for modeling).
Calibration Phase | calibration_phase.zip | click here to download | Intermediate files for automatic parameter selection.
Parameter Selection | parameters_selection.zip | click here to download | Parameter sweeps/selection results used for final model.
Final Paper Results | lr_l2_results.zip | click here to download | Final result bundle (e.g., logistic-regression L2 results reported in the paper).
Model Evaluations | eval.zip | click here to download | Evaluation outputs and metrics for trained models.

Note: For any .tar.gz.part-xx sets, concatenate parts in order (e.g., cat file.tar.gz.part-* > file.tar.gz) before extracting.
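
The cat command is not available in every shell (for example, the default Windows command prompt), so the same concatenate-then-extract step can be sketched in Python using only the standard library. The function names join_parts and extract are hypothetical helpers for this example, not part of LifeTracer, and the sketch assumes all parts have already been downloaded into one directory.

```python
import shutil
import tarfile
from pathlib import Path


def join_parts(directory, archive_name):
    """Concatenate <archive_name>.part-* files into <archive_name>.

    Parts are joined in lexicographic order of their suffix
    (part-aa, part-ab, ...), matching the order `cat` would use.
    """
    directory = Path(directory)
    parts = sorted(directory.glob(archive_name + ".part-*"))
    if not parts:
        raise FileNotFoundError(f"no parts of {archive_name} found in {directory}")
    target = directory / archive_name
    with open(target, "wb") as out:
        for part in parts:
            with open(part, "rb") as src:
                # Stream in chunks so large parts never sit fully in memory.
                shutil.copyfileobj(src, out)
    return target


def extract(archive, destination):
    """Extract a .tar.gz archive into `destination`, creating it if needed.

    Only use this on archives from a source you trust.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(destination)
    return destination
```

For instance, extract(join_parts("data", "heatmaps.tar.gz"), "data/heatmaps") would rebuild and unpack the unaligned heatmaps, assuming the four parts sit in a local folder named data (the folder name is an assumption).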

Contributing

We welcome contributions! Please submit pull requests, report issues, or request features.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Contact

For questions, issues, or collaborations, please:

Acknowledgments

We thank all contributors and collaborators who have helped develop LifeTracer and the institutions that supported this research. US Antarctic meteorite samples are recovered by the Antarctic Search for Meteorites (ANSMET) program, which has been funded by NSF and NASA and characterized and curated by the Department of Mineral Sciences of the Smithsonian Institution and Astromaterials Curation Office at NASA Johnson Space Center. The authors would like to thank T. McCoy, J. Hoskin, and the Smithsonian National Museum of Natural History - Division of Meteorites; and J.-C. Viennet and the curatorial team at the Muséum National d’Histoire Naturelle for providing the meteorite samples used in this study. This research was supported in part by the Parker H. Petit Institute for Bioengineering and Biosciences (IBB) interdisciplinary seed grant, the Institute of Matter and Systems (IMS) Exponential Electronics seed grant, and the Georgia Institute of Technology start-up funds, and by NASA’s Planetary Science Division Internal Scientist Funding Program through the Fundamental Laboratory Research (FLaRe) Work Package at NASA Goddard Space Flight Center.
