MCTED - Mars CTX Terrain-Elevation Dataset

Description

MCTED is a machine-learning-ready dataset of paired optical terrain image patches and corresponding DEM patches, intended for training and fine-tuning machine learning models that generate DEMs from single optical images.

MCTED was developed from CTX-derived orthoimages and DEMs generated with the NASA Ames Stereo Pipeline, taken from the Mars Human Exploration Zone DEM Archive 2023 by Day et al.

Access the dataset on HuggingFace!

Check out the paper on arXiv!

MCTED data

The MCTED dataset consists of 80,898 data samples. Each sample consists of four files, each 518x518 px in size:

  • Orthoimage patch - a patch from the original CTX-derived orthoimage showing a fragment of Martian terrain. Each image is monochromatic but saved in an RGB format, with all 3 channels identical,
  • DEM patch - the elevation model corresponding to the image patch. This is essentially a 2D array of 32-bit floating point values, each representing the elevation in meters relative to the Martian datum for a given point,
  • Invalid NaN mask - a binary mask that indicates which pixels in the original sample from Day et al. contained missing data, which has been filled in by our processing pipeline,
  • Deviation mask - a binary mask that indicates values that were considered elevation outliers during artifact removal.
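
As a rough illustration of how the four arrays of one sample fit together, the sketch below collapses the RGB orthoimage to a single channel and combines the two masks into one validity mask. The function name `prepare_sample`, the argument names, and the convention that a nonzero mask value marks a flagged pixel are assumptions made for illustration. The example uses synthetic arrays and does not cover how the files are stored on disk.

```python
import numpy as np

PATCH_SIZE = 518  # patch side length in pixels, as stated above


def prepare_sample(image_rgb, dem, nan_mask, deviation_mask):
    """Return (gray image, float32 DEM, validity mask) for one sample.

    Assumption: nonzero mask values mark flagged pixels.
    """
    # All three channels are identical, so one channel carries all the information.
    gray = image_rgb[..., 0]
    # A pixel is treated as trustworthy only if neither mask flags it.
    valid = ~(nan_mask.astype(bool) | deviation_mask.astype(bool))
    return gray, dem.astype(np.float32), valid


# Synthetic stand-in data with the documented shapes (not real MCTED content).
rng = np.random.default_rng(0)
channel = rng.integers(0, 256, size=(PATCH_SIZE, PATCH_SIZE), dtype=np.uint8)
image_rgb = np.stack([channel] * 3, axis=-1)
dem = rng.normal(-2500.0, 50.0, size=(PATCH_SIZE, PATCH_SIZE)).astype(np.float32)
nan_mask = np.zeros((PATCH_SIZE, PATCH_SIZE), dtype=np.uint8)
deviation_mask = np.zeros((PATCH_SIZE, PATCH_SIZE), dtype=np.uint8)
nan_mask[:10, :10] = 1
deviation_mask[5:15, 5:15] = 1

gray, dem32, valid = prepare_sample(image_rgb, dem, nan_mask, deviation_mask)
```

A validity mask of this kind could, for example, be used to exclude filled-in or outlier pixels when computing a loss or an error metric.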

Prerequisites

Environment

To ensure reproducibility, use:

  • Python 3.10
  • See requirements.txt for package dependencies

Day et al. repository

To fully reproduce the results of this work, the CTX portion of the Day et al. repository is necessary. To download the necessary components refer to the instructions found here or download the data directly from here.

Hardware

This project requires powerful hardware to reproduce. A GPU is strongly recommended for training and evaluation. The project was developed using an Nvidia L40S GPU.

Project setup

1. Clone the repository

git clone --recursive https://github.com/esa-datalabs/mcted
cd mcted

Important

The repository must be cloned with the --recursive flag so that the Mars_DEMs repository is cloned as well. It contains the original index file, which we convert to a .csv file and extend with the cluster information.

2. Set up the environment

python3 -m venv mcted

# Activate the environment

# Linux
source mcted/bin/activate
# Windows
.\mcted\Scripts\activate

pip install -r requirements.txt

Reproducing Results

The default parameters of all scripts are set to reproduce the results of the paper. You only need to adjust the data paths.

Clustering samples

To enable a safe division of the dataset into training and validation splits, we group the samples into clusters:

python cluster_samples.py --help

Tip

Refer to the help flag for full CLI options.

Important

This script creates the DEM_index.csv file, which is mandatory for running the training and evaluation scripts.
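
One common way to use such groups is a group-aware split, in which all samples sharing a cluster land in the same split so that related terrain never straddles training and validation. The sketch below illustrates that idea only. The function `split_by_cluster`, the validation fraction, and the sample names are hypothetical and do not reflect the actual logic or options of `cluster_samples.py`.

```python
import random
from collections import defaultdict


def split_by_cluster(sample_to_cluster, val_fraction=0.2, seed=0):
    """Assign whole clusters to either the training or the validation split."""
    clusters = defaultdict(list)
    for sample, cluster in sample_to_cluster.items():
        clusters[cluster].append(sample)

    # Shuffle cluster ids reproducibly so the split does not depend on dict order.
    cluster_ids = sorted(clusters)
    random.Random(seed).shuffle(cluster_ids)

    n_total = len(sample_to_cluster)
    train, val = [], []
    for cid in cluster_ids:
        # Fill validation until it holds roughly val_fraction of the samples.
        target = val if len(val) < val_fraction * n_total else train
        target.extend(clusters[cid])
    return sorted(train), sorted(val)


# Hypothetical sample names mapped to hypothetical cluster ids.
example = {
    "sample_a": 0, "sample_b": 0, "sample_c": 1,
    "sample_d": 2, "sample_e": 2, "sample_f": 3,
}
train, val = split_by_cluster(example)
```

Because clusters are assigned as a whole, the realized validation share only approximates `val_fraction`.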

Dataset patch generation

Generates training/evaluation patches from DEMs and corresponding CTX imagery. This script uses MPI for parallel processing.

mpiexec -n 2 python patch_generation/process_dataset.py

Warning

The complete set of processing parameters can be found in the patch_generation/config/default_config.py file. Please refer to it to set your data paths correctly. By default, the script expects the Day et al. repository to be under ctx_orthoimages and will try to save the generated dataset in an output directory.

The script generates various metadata files and two main directories, accepted_patches and rejected_patches. The accepted patches are what comprise the MCTED dataset.
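
To picture how work might be spread over MPI processes, the sketch below partitions a list of items round-robin by rank. It is a simplification under stated assumptions: `shard_for_rank` and the item names are hypothetical, and the actual work distribution in `process_dataset.py` may differ. With a library such as mpi4py, `rank` and `size` would come from `MPI.COMM_WORLD.Get_rank()` and `MPI.COMM_WORLD.Get_size()`. Here they are passed in explicitly so the example runs without MPI.

```python
def shard_for_rank(items, rank, size):
    """Return the items that the process with the given rank would handle."""
    if not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    # Round-robin: rank r takes items r, r + size, r + 2 * size, ...
    return items[rank::size]


# Hypothetical DEM identifiers, simulated for the two processes of `mpiexec -n 2`.
dems = [f"dem_{i:02d}" for i in range(5)]
shards = [shard_for_rank(dems, rank, size=2) for rank in range(2)]
```

Each item ends up in exactly one shard, so increasing the value passed to `-n` would spread the same work over more processes.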

Training the model

Trains the U-Net-based monocular depth estimation model on the generated dataset.

python train_unet.py --data_path path_to_mcted_accepted_patches

Tip

Refer to the help flag for full CLI options by running python train_unet.py --help.

Evaluating trained models

Performs evaluations on the validation split of MCTED for a given trained model and the chosen version of DepthAnythingV2.

python evaluate.py --help

Tip

Refer to the help flag for full CLI options.

Reproducing figures

Jupyter notebooks for recreating plots/figures from the paper are available in:

paper_plots/

Open and run notebooks interactively for visualizations and analysis.

Important

Please remember to adjust the paths to point to the Day et al. repository and the MCTED dataset, as neither is contained in this repository.


Browse results without reproducing

Some of the results contained in the paper have been placed inside the .artifacts directory. These include:

  • clustering/
    • dataset_samples_split.yaml - this file contains the names of the samples used in the training and validation splits, respectively. All patches from a given sample are used in the same split to ensure no data leakage between splits.
    • DEM_index.csv - the modified index file from Mars_DEMs that is generated as a result of the clustering script.
  • evaluation/ - this directory contains .csv files with the results of the evaluation.
  • training/ - this directory contains some of the artifacts generated during model training. The trained model checkpoints can be found there as well as the parameters used for training, loss curves and values.

License

This project is licensed under the European Space Agency Public License (ESA-PL). See LICENCE.txt for full details.
