This repository contains the transformer model and relevant training routines. It is a greatly distilled version of Harris Hardiman-Mostow's research repository, with optimizations and improvements tailored specifically to the DIST-S1 product, written by Diego Martinez. Additional notebooks are included to inspect the input dataset and visualize the model applied to existing OPERA RTC data.
- Install the environment using mamba:

  ```bash
  mamba env create -f environment_gpu.yml
  ```

- Activate the environment:

  ```bash
  conda activate dist-s1-model
  ```
- Training data (~53 GB): <url>
- Test data (~13 GB): <url>
Update the data paths in your configuration file (see Configuration section below).
Note:

We currently support two different datasets:

- a sequential time series to establish baselines, and
- one that uses windows around the anniversary dates of the target/post-image acquisition to establish a baseline.

The former is the original work done to prototype the algorithm; the latter is what the OPERA project aims to support, in line with the OPERA DIST suite. Currently, everything labeled *-redux or Redux refers to the latter, more recent dataset built on windows around anniversary dates (i.e., dataset 2). We will support both for provenance, though our current focus is the newer dataset, in keeping with the project's goal.
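To illustrate the anniversary-window idea, the sketch below computes candidate baseline windows around the anniversaries of a post-image acquisition date. The two-year lookback and the ±30-day half-window are illustrative assumptions, not values taken from the dataset code:

```python
from datetime import date, timedelta

def anniversary_windows(post_date: date, years_back: int = 2, half_window_days: int = 30):
    """For each prior year, return a (start, end) window centered on the
    anniversary of the post-image acquisition date. The lookback depth and
    half-window size here are illustrative assumptions."""
    windows = []
    for k in range(1, years_back + 1):
        try:
            anniversary = post_date.replace(year=post_date.year - k)
        except ValueError:
            # Feb 29 on a non-leap year: fall back to Feb 28.
            anniversary = post_date.replace(year=post_date.year - k, day=28)
        delta = timedelta(days=half_window_days)
        windows.append((anniversary - delta, anniversary + delta))
    return windows

# Baseline windows for a post-image acquired on 2024-06-15:
print(anniversary_windows(date(2024, 6, 15)))
```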
Create a configuration file (e.g., config.yml) with the following structure:

```yaml
# Data configuration
data:
  train_path: "/path/to/your/train_data.pt"
  test_path: "/path/to/your/test_data.pt"

# Model configuration
model_config:
  type: "SpatioTemporalTransformer"
  # Add your model-specific parameters here

# Training configuration
train_config:
  batch_size: 8
  learning_rate: 0.001
  num_epochs: 100
  seed: 42
  step_size: 30
  gamma: 0.1
  checkpoint_freq: 10
  input_size: 16  # Patch size for processing

# Save directories
save_dir:
  models: "./saved_models"
  checkpoints: "./checkpoints"
  visualizations: "./visualizations"

# Validation configuration (optional)
validation:
  enable_visual_validation: true
  enable_intermediate_validation: true
  intermediate_validation_freq: 10
  apply_smoothing: true
  smooth_sigma: 0.5
  blend_mode: "gaussian"

# Weights & Biases logging (optional)
use_wandb: true
wandb_project: "dist-s1-training"
wandb_entity: "your-entity"

# Resume training (optional)
# resume_checkpoint: "/path/to/checkpoint.pth"
```
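Before launching a long run, it can be useful to check that the config parses and has the expected top-level sections. This helper is a convenience sketch, not code from the repository; the section names match the example above, but trainer.py's own parsing and validation may differ:

```python
import yaml  # PyYAML

REQUIRED_SECTIONS = ["data", "model_config", "train_config", "save_dir"]

def load_config(path: str) -> dict:
    """Load the training config and check that the top-level sections from
    the example above are present. A sketch only; the trainer's own parsing
    may differ."""
    with open(path) as f:
        cfg = yaml.safe_load(f)
    missing = [k for k in REQUIRED_SECTIONS if k not in cfg]
    if missing:
        raise KeyError(f"config is missing sections: {missing}")
    return cfg
```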
Set up Accelerate configuration interactively:

```bash
accelerate config
```
Follow the prompts to configure:
- Compute environment (local machine or cluster)
- Machine type (multi-GPU, multi-node, etc.)
- Number of processes/GPUs
- Mixed precision settings
Create an Accelerate config file (accelerate_config.yml):

```yaml
compute_environment: LOCAL_MACHINE
debug: false
distributed_type: MULTI_GPU  # or NO for single GPU
gpu_ids: all  # or specify specific GPUs like "0,1"
machine_rank: 0
main_training_function: main
num_machines: 1
num_processes: 2  # Number of GPUs to use
rdzv_backend: static
same_network: true
tpu_env: []
tpu_use_cluster: false
tpu_use_sudo: false
use_cpu: false
```
Run training as a single process:

```bash
python trainer.py config.yml
```

or, for the Redux (anniversary-window) dataset:

```bash
python trainer_redux.py config_redux.yml
```

To launch with Accelerate (e.g., for multi-GPU training):

```bash
accelerate launch trainer.py config.yml
accelerate launch --config_file accelerate_config.yml train.py config.yml
accelerate launch --num_processes 2 train.py config.yml
```
If you encounter issues with PyTorch's dynamo compilation, you can disable it by setting an environment variable:

```bash
export TORCH_COMPILE_DISABLE=1
accelerate launch train.py config.yml
```
Add the checkpoint path to your config:

```yaml
resume_checkpoint: "/path/to/checkpoint_epoch_X.pth"
```
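If several checkpoints have accumulated, a small helper can pick the latest one to paste into `resume_checkpoint`. This is a convenience sketch, not part of the repository; it assumes the `checkpoint_epoch_X_MM-DD-YYYY_HH-MM.pth` naming listed in the checkpoints section below:

```python
import re
from pathlib import Path
from typing import Optional

def latest_checkpoint(checkpoint_dir: str) -> Optional[str]:
    """Return the checkpoint with the highest epoch number from filenames like
    checkpoint_epoch_12_06-15-2024_09-30.pth, or None if none are found.
    Adapt the regex if your filenames differ."""
    pattern = re.compile(r"checkpoint_epoch_(\d+)_")
    best_epoch, best_path = -1, None
    for p in Path(checkpoint_dir).glob("checkpoint_epoch_*.pth"):
        m = pattern.search(p.name)
        if m and int(m.group(1)) > best_epoch:
            best_epoch, best_path = int(m.group(1)), str(p)
    return best_path
```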
To capture training logs:

```bash
accelerate launch train.py config.yml > training.log 2> training.err
```
The training script supports Weights & Biases logging. Configure it in your YAML:

```yaml
use_wandb: true
wandb_project: "your-project-name"
wandb_entity: "your-entity"
```
Before using wandb for the first time, open a terminal session, activate the dist-s1-model environment, and run `wandb login`. The command line will prompt you for an API key, which can be found at https://wandb.ai/home.
Enable visual validation to monitor training progress:

```yaml
validation:
  enable_visual_validation: true
  enable_intermediate_validation: true
  intermediate_validation_freq: 10
```
Checkpoints are automatically saved based on the `checkpoint_freq` setting. The training script creates:

- Regular checkpoints: `checkpoint_epoch_X_MM-DD-YYYY_HH-MM.pth`
- Model weights: `ModelType_MM-DD-YYYY_HH-MM_epoch_X.pth`
- Final checkpoint: `final_checkpoint_MM-DD-YYYY_HH-MM.pth`
- Emergency checkpoints: saved automatically on interruption
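The `MM-DD-YYYY_HH-MM` timestamps in these filenames map directly onto Python's `strftime` codes. The helper below is a sketch for reproducing the naming (e.g., when scripting cleanup), not code from the repository:

```python
from datetime import datetime
from typing import Optional

def checkpoint_name(epoch: int, when: Optional[datetime] = None) -> str:
    """Build a filename matching the checkpoint pattern above
    (checkpoint_epoch_X_MM-DD-YYYY_HH-MM.pth); illustrative only."""
    when = when or datetime.now()
    return f"checkpoint_epoch_{epoch}_{when.strftime('%m-%d-%Y_%H-%M')}.pth"

# e.g. checkpoint_name(5) -> "checkpoint_epoch_5_<current timestamp>.pth"
```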
- CUDA Out of Memory: reduce `batch_size` in your configuration
- Compilation Errors: set the environment variable `TORCH_COMPILE_DISABLE=1`
- Multi-GPU Issues: ensure proper Accelerate configuration
- Data Loading Errors: verify the data paths in the configuration file
- Adjust `input_size` based on available GPU memory
- Enable gradient accumulation in the Accelerate config for larger effective batch sizes
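Gradient accumulation raises the effective batch size without increasing per-device memory; in Accelerate it is typically enabled via `Accelerator(gradient_accumulation_steps=...)` together with the `accelerator.accumulate(model)` context manager. The arithmetic behind "effective batch size" is simply:

```python
def effective_batch_size(per_device_batch: int, num_processes: int, accumulation_steps: int) -> int:
    """Effective global batch size when combining per-device batches,
    data-parallel processes, and gradient accumulation steps."""
    return per_device_batch * num_processes * accumulation_steps

# With batch_size 8 from the example config, 2 GPUs, and 4 accumulation steps:
print(effective_batch_size(8, 2, 4))  # 64
```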
The training script supports graceful interruption (Ctrl+C). It will:
- Save an emergency checkpoint
- Preserve training metrics
- Clean up resources properly
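The pattern behind this behavior is a try/except around the epoch loop: Ctrl+C raises `KeyboardInterrupt`, which is caught, the emergency checkpoint is written, and the exception is re-raised. The sketch below uses hypothetical stand-in callables; the actual trainer's hooks differ:

```python
def run_training(train_one_epoch, save_emergency_checkpoint, num_epochs: int):
    """Shape of the graceful-interruption handling described above.
    Both callables are hypothetical stand-ins for the trainer's real hooks."""
    try:
        for epoch in range(num_epochs):
            train_one_epoch(epoch)
    except KeyboardInterrupt:
        # Ctrl+C raises KeyboardInterrupt; save state before re-raising.
        save_emergency_checkpoint()
        raise
```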
See the included notebooks for model application examples. This section is currently under development.
A separate repository for SAR data curation is planned. This is currently a work in progress.
- OPERA Disturbance Suite: https://www.jpl.nasa.gov/go/opera/products/dist-product-suite/
- Hardiman-Mostow, Harris, Charles Marshak, and Alexander L. Handwerger. "Deep Self-Supervised Disturbance Mapping with the OPERA Sentinel-1 Radiometric Terrain Corrected SAR Backscatter Product." IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2025). arXiv
[Add your license information here]
[Add contributing guidelines here]
For issues and questions, please create an issue in this repository or contact the maintainers.