Commit b270d05

Merge branch 'main' of github.com:PolymathicAI/AstroCLIP
2 parents 1874376 + 37b14ca commit b270d05

File tree

1 file changed

+9
-15
lines changed

README.md

Lines changed: 9 additions & 15 deletions
Original file line number | Diff line number | Diff line change
@@ -1,39 +1,33 @@
11
# AstroCLIP
22

3-
The goal of this project is to demonstrate the ability of contrastive pre-training between two different kinds of astronomical data modalities (multi-band imaging, and optical spectra), to yield a meaningful embedding space which captures physical information about galaxies and is shared between both modalities.
3+
Official PyTorch implementation and pre-trained models for the paper **AstroCLIP: A Cross-Modal Foundation Model for Galaxies**.
44

55
![image](assets/im_embedding.png)
66

7-
8-
## Getting Started
9-
TODO: Link tutorial notebook.
7+
AstroCLIP is a novel, cross-modal, self-supervised foundation model that creates a shared embedding space for multi-band imaging and optical spectra of galaxies. These embeddings encode meaningful physical information shared between both modalities, and can be used as the basis for competitive zero- and few-shot learning on a variety of downstream tasks, including similarity search, redshift estimation, galaxy property prediction, and morphology classification.
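As a rough illustration of what a shared embedding space enables, the sketch below runs a cross-modal similarity search: an image embedding is used as a query against a small gallery of spectrum embeddings, ranked by cosine similarity. The vectors, the function names (`normalize`, `cosine_search`), and the 3-dimensional toy size are assumptions made for this example; they are not taken from the AstroCLIP codebase, and the choice of cosine similarity as the metric is itself an assumption.

```python
import math


def normalize(v):
    """Scale a non-zero vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine_search(query, gallery, k=2):
    """Return the indices of the k gallery embeddings most similar to the query."""
    q = normalize(query)
    scores = []
    for idx, emb in enumerate(gallery):
        g = normalize(emb)
        scores.append((sum(a * b for a, b in zip(q, g)), idx))
    scores.sort(reverse=True)  # highest cosine similarity first
    return [idx for _, idx in scores[:k]]


# Hypothetical toy embeddings; real embeddings would come from the trained encoders.
spectrum_embeddings = [
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.2],
    [0.8, 0.3, 0.1],
]
image_query = [1.0, 0.2, 0.0]

nearest = cosine_search(image_query, spectrum_embeddings, k=2)
print(nearest)
```

Because both modalities live in the same space, the same function could equally take a spectrum embedding as the query and search a gallery of image embeddings.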
108

119
## Installation
12-
The training and evaluation code requires PyTorch 2.0. Additionally, an up-to-date eventlet is required for wandb. Note that teh code has only been tested with the specified versions and also expects a Linux environment. To install the AstroCLIP package and its corresponding dependencies, please follow the code below.
13-
14-
The following packages are excluded from the project's dependencies to allow for a more flexible system configuration (i.e. allow the use of module subsystem).
10+
The training and evaluation code requires PyTorch 2.0. Additionally, an up-to-date eventlet is required for wandb. Note that the code has only been tested with the specified versions and also expects a Linux environment. To install the AstroCLIP package and its corresponding dependencies, please follow the code below.
1511

1612
```bash
1713
pip install --upgrade pip
1814
pip install --upgrade eventlet torch lightning[extra]
1915
pip install -e .
2016
```
21-
2217
It is possible to override the default storage path by changing the flag in `astroclip/env.py`.
2318

2419
## Training
2520

26-
AstroCLIP is trained using a two-step process.
21+
AstroCLIP is trained using a two-step process. First, we pre-train a single-modal galaxy image encoder and a single-modal galaxy spectrum encoder separately. Then, we CLIP-align these two encoders on a paired image-spectrum dataset.
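To make the alignment step more concrete, here is a minimal sketch of a symmetric contrastive (InfoNCE-style) loss of the kind commonly used in CLIP training. It assumes paired embeddings where row `i` of the image batch matches row `i` of the spectrum batch. The function names and the `temperature` value are hypothetical, and the sketch is written in plain Python for readability; the actual AstroCLIP loss may differ in its details.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_loss(image_emb, spectrum_emb, temperature=0.1):
    """Symmetric contrastive loss over a batch of paired embeddings."""
    imgs = [l2_normalize(v) for v in image_emb]
    specs = [l2_normalize(v) for v in spectrum_emb]
    # logits[i][j] is the scaled cosine similarity of image i and spectrum j.
    logits = [
        [sum(a * b for a, b in zip(i, s)) / temperature for s in specs]
        for i in imgs
    ]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the image-to-spectrum and spectrum-to-image directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))


# Toy batch: correctly paired embeddings versus deliberately swapped pairs.
aligned = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)
```

The loss is small when each image is most similar to its own spectrum and large when the pairs are mismatched, which is the signal that pulls the two pre-trained encoders into a common space.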
2722

28-
First, we pre-train a single-modal galaxy image encoder and a single-modal galaxy spectrum encoder separately.
23+
### DINOv2 ViT Image Pretraining:
24+
AstroCLIP uses a Vision Transformer (ViT) to encode galaxy images. Pretraining is performed using the [DINOv2](https://github.com/facebookresearch/dinov2/tree/2302b6bf46953431b969155307b9bed152754069) package, which combines self-distillation, masked-modeling, and contrastive objectives. Overall, we use largely the same training regime; however, we modify some of the contrastive augmentations to suit an astrophysics context.
2925

30-
### Image encoder:
31-
The AstroDINO model is based on the DINO_v2 model and can be run from the astrodino subdirectory.
32-
33-
Run with
26+
Model training can be launched with the following command:
3427
```
3528
image_trainer -c astroclip/astrodino/config.yaml
3629
```
30+
Ultimately, we run training using 20 A100 GPUs (on 5 nodes) for 250k steps using the config provided [here](https://github.com/PolymathicAI/AstroCLIP_v2/blob/master/astroclip/astrodino/config.yaml), which takes roughly 46 hours.
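For anyone budgeting a similar run, the figures quoted above can be turned into a back-of-the-envelope estimate. The snippet only restates the numbers from the text (20 GPUs, 5 nodes, 250k steps, roughly 46 hours). The derived values are approximate because the wall-clock time is itself approximate, and the even split of GPUs across nodes is an assumption.

```python
num_gpus = 20
num_nodes = 5
total_steps = 250_000
wall_clock_hours = 46  # "roughly 46 hours", so treat derived values as estimates

# Assumes the GPUs are spread evenly across the nodes.
gpus_per_node = num_gpus // num_nodes

# Total accelerator time consumed by the run.
gpu_hours = num_gpus * wall_clock_hours

# Average optimizer steps per second across the whole job.
steps_per_second = total_steps / (wall_clock_hours * 3600)

print(gpus_per_node, gpu_hours, round(steps_per_second, 2))
```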
3731

3832
### Spectrum encoder:
3933

@@ -43,7 +37,7 @@ spectrum_trainer fit -c config/specformer.yaml
4337
4438
```
4539

46-
## Training alignment model
40+
### CLIP alignment:
4741

4842
The AstroCLIP model can be run with:
4943
```
