Commit 228500e

Merge commit with 2 parents: eb2b9fe + b270d05

5 files changed (+22 -7 lines)


README.md (+6 -5)

@@ -1,10 +1,10 @@
 # AstroCLIP
 
-Official PyTorch implementation and pre-trained models for paper **AstroCLIP: A Cross-Modal Foundation Model for Galaxies**.
+Official PyTorch implementation and pre-trained models for the paper **AstroCLIP: A Cross-Modal Foundation Model for Galaxies**.
 
 ![image](assets/im_embedding.png)
 
-AstroCLIP is a novel, cross-modal, self-supervised foundation model that creates a shared embedding space for multi-band imaging and optical spectra of galaxies. These embeddings encode meaningful physical information shared between both modalities, and can be used as the basis for competitive zero- and few-shot learning on a variety of downstream tasks, including similarity search, redshift estimation, galaxy property prediction, and morphology classification.
+AstroCLIP is a novel, cross-modal, self-supervised foundation model that creates a shared embedding space for multi-band imaging and optical spectra of galaxies. These embeddings encode meaningful physical information shared between both modalities, and can be used as the basis for competitive zero- and few-shot learning on a variety of downstream tasks, including similarity search, redshift estimation, galaxy property prediction, and morphology classification.
 
 ## Installation
 The training and evaluation code requires PyTorch 2.0. Additionally, an up-to-date eventlet is required for wandb. Note that the code has only been tested with the specified versions and also expects a Linux environment. To install the AstroCLIP package and its corresponding dependencies, please follow the code below.
@@ -14,6 +14,7 @@ pip install --upgrade pip
 pip install --upgrade eventlet torch lightning[extra]
 pip install -e .
 ```
+It is possible to override the default storage path by changing the flag in `astroclip/env.py`.
 
 ## Pretrained Models
 
@@ -77,10 +78,10 @@ The directory is organized into south and north surveys, where each survey is sp
 
 ## Training
 
-AstroCLIP is trained using a two-step process. First, we pre-train a single-modal galaxy image encoder and a single-modal galaxy spectrum encoder separately. Then, we CLIP align these two encoders on a paired image-spectrum dataset.
+AstroCLIP is trained using a two-step process. First, we pre-train a single-modal galaxy image encoder and a single-modal galaxy spectrum encoder separately. Then, we CLIP align these two encoders on a paired image-spectrum dataset.
 
-### Image Pretraining - ViT with DINOv2:
-AstroCLIP uses a Vision Transformer (ViT) to encode galaxy images. Pretraining is performed using the [DINOv2](https://github.com/facebookresearch/dinov2/tree/2302b6bf46953431b969155307b9bed152754069) package, which combines self-distillation, masked-modeling, and contrastive objectives. Overall, we use largely the same training regime, however we modify some of the contrastive augmentations to better suit the astrophysical context.
+### DINOv2 ViT Image Pretraining:
+AstroCLIP uses a Vision Transformer (ViT) to encode galaxy images. Pretraining is performed using the [DINOv2](https://github.com/facebookresearch/dinov2/tree/2302b6bf46953431b969155307b9bed152754069) package, which combines self-distillation, masked-modeling, and contrastive objectives. Overall, we use largely the same training regime; however, we modify some of the contrastive augmentations to suit an astrophysics context.
 
 Model training can be launched with the following command:
 ```
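The storage-path flag mentioned in the README addition above lives in `astroclip/env.py` (next file in this commit). Below is a minimal sketch, not part of the commit, of overriding it through a local `.env` file rather than editing the source; it assumes the project reads configuration with python-dotenv, as the `default_dotenv_values` helper suggests, and the path and entity name are placeholders.

```python
# Sketch only: read an overriding .env instead of the defaults in astroclip/env.py.
# Assumes python-dotenv is installed; values below are placeholders.
from dotenv import dotenv_values

# Example .env a user might place at the project root:
#   ASTROCLIP_ROOT="/data/astroclip"
#   WANDB_ENTITY_NAME="my-wandb-entity"
config = dotenv_values(".env")
print(config.get("ASTROCLIP_ROOT", "ASTROCLIP_ROOT not set"))
```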

astroclip/env.py (+7 -2)

@@ -9,6 +9,11 @@
 WARN_ONCE = True
 
 
+# TODO: change the defaults here
+ASTROCLIP_ROOT = "/mnt/ceph/users/polymathic/astroclip"
+WANDB_ENTITY_NAME = "flatiron-scipt"
+
+
 def default_dotenv_values():
     """Use a default .env but tell the user how to create their own."""
 
@@ -22,8 +27,8 @@ def default_dotenv_values():
     global WARN_ONCE
 
     # TODO: these should be replaced with a folder in the project's root
-    f.write('ASTROCLIP_ROOT="/mnt/ceph/users/polymathic/astroclip"\n')
-    f.write('WANDB_ENTITY_NAME="flatiron-scipt"\n')
+    f.write(f'ASTROCLIP_ROOT="{ASTROCLIP_ROOT}"\n')
+    f.write(f'WANDB_ENTITY_NAME="{WANDB_ENTITY_NAME}"\n')
     f.flush()
 
     if WARN_ONCE:
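For readers skimming the diff, here is a self-contained sketch, under the assumption that the writes above happen on an open file handle `f`, of the pattern this change introduces: module-level defaults interpolated into a generated `.env`. The `f` string prefix matters; without it the braces would be written literally.

```python
# Hypothetical standalone illustration of the default-.env pattern; not project code.
ASTROCLIP_ROOT = "/mnt/ceph/users/polymathic/astroclip"
WANDB_ENTITY_NAME = "flatiron-scipt"

with open(".env", "w") as f:
    # f-strings interpolate the module-level defaults into the written file.
    f.write(f'ASTROCLIP_ROOT="{ASTROCLIP_ROOT}"\n')
    f.write(f'WANDB_ENTITY_NAME="{WANDB_ENTITY_NAME}"\n')
```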

astroclip/models/__init__.py (+1)

@@ -1,4 +1,5 @@
 from . import astroclip
 from .astroclip import AstroClipModel
+from .loader import load_model
 from .moco_v2 import Moco_v2
 from .specformer import SpecFormer

astroclip/models/loader.py (new file, +7)

@@ -0,0 +1,7 @@
+import joblib
+from huggingface_hub import hf_hub_download
+
+
+def load_model(repo_id, filename):
+    model = joblib.load(hf_hub_download(repo_id=repo_id, filename=filename))
+    return model
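A possible usage of the new helper, which this commit re-exports from `astroclip.models`; the `repo_id` and `filename` below are placeholders, not artifacts published by the project.

```python
# Hypothetical call to load_model; both arguments are placeholders.
from astroclip.models import load_model

# hf_hub_download fetches (and caches) the file from the Hugging Face Hub,
# and joblib.load deserializes it into a model object.
model = load_model(repo_id="some-org/astroclip-weights", filename="model.joblib")
```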

requirements.txt (+1)

@@ -1,6 +1,7 @@
 astropy
 datasets
 dinov2 @ git+https://github.com/facebookresearch/dinov2.git@2302b6bf46953431b969155307b9bed152754069
+huggingface_hub
 jaxtyping
 lightning[extra]
 plotly
