Commit 7b2e411

decals images
1 parent 692d905 commit 7b2e411

File tree

2 files changed: +16 -5 lines


README.md

Lines changed: 16 additions & 5 deletions
@@ -146,7 +146,9 @@ dset = load_dataset('astroclip/datasets/legacy_survey.py')

For reproducibility, we include the scripts to generate the cross-matched datasets [here]().

-### Image Pretraining
+### Image Pretraining Dataset
+
+![image](assets/decals.png)

While the AstroCLIP and Spectrum Encoder models are trained on the image-spectrum dataset, we pretrain the galaxy image model separately on the full Stein, et al. (2022) image dataset, which consists of 76M galaxy images. This dataset can be accessed using this globus endpoint:
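
For orientation, a minimal sketch of loading the cross-matched dataset with the builder script shown in the hunk header above; the available splits and column names are not stated in this diff, so the snippet only inspects them (assumes the HuggingFace `datasets` package is installed):

```python
from datasets import load_dataset

# Load the cross-matched image-spectrum dataset via the repo's builder script
# (the same call shown in the hunk header above).
dset = load_dataset('astroclip/datasets/legacy_survey.py')

# Splits and columns are not documented in this diff, so inspect them before use.
print(dset)
first_split = next(iter(dset.values()))
print(first_split[0].keys())
```
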

@@ -157,10 +159,10 @@ The directory is organized into south and north surveys, where each survey is sp

## Pretraining

-AstroCLIP is trained using a two-step process.First, we pre-train a single-modal galaxy image encoder and a single-modal galaxy spectrum encoder separately. Then, we CLIP align these two encoders on a paired image-spectrum dataset.
+AstroCLIP is trained using a two-step process. First, we pre-train a single-modal galaxy image encoder and a single-modal galaxy spectrum encoder separately. Then, we CLIP align these two encoders on a paired image-spectrum dataset.

-### DINOv2 ViT Image Pretraining:
-AstroCLIP uses a Vision Transformer (ViT) to encode galaxy images. Pretraining is performed using the [DINOv2](https://github.com/facebookresearch/dinov2/tree/2302b6bf46953431b969155307b9bed152754069) package, which combines self-distillation, masked-modeling, and contrastive objectives. Overall, we use largely the same training regime, however we modify some of the contrastive augmentations to suit an astrophysics context.
+### Image Pretraining - DINOv2 ViT:
+AstroCLIP uses a Vision Transformer (ViT) to encode galaxy images. Pretraining is performed using the [DINOv2](https://github.com/facebookresearch/dinov2/) package, which combines self-distillation, masked-modeling, and contrastive objectives. Overall, we use largely the same training regime; however, we modify some of the contrastive augmentations to suit an astrophysics context.

Model training can be launched with the following command:
```
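
The specific augmentation changes are not listed in this diff. As a hedged illustration only (not the repository's actual DINOv2 configuration), contrastive views for galaxy cutouts typically exploit the fact that galaxies have no preferred orientation; the crop size and noise level below are placeholders:

```python
import torch
from torchvision import transforms

# Illustrative galaxy-friendly views for a (C, H, W) float tensor image:
# rotations/flips are label-preserving because orientation is arbitrary,
# and natural-image color jitter is swapped for mild Gaussian pixel noise.
galaxy_views = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    transforms.RandomRotation(degrees=180),
    transforms.RandomResizedCrop(96, scale=(0.8, 1.0)),  # placeholder crop size
    transforms.Lambda(lambda x: x + 0.01 * torch.randn_like(x)),  # placeholder noise
])
```
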
@@ -177,7 +179,7 @@ spectrum_trainer fit -c config/specformer.yaml
```
We train the model using 4 A100 GPUs (on 1 node) for 30k steps, which takes roughly 12 hours.

-### CLIP alignment:
+### CLIP Alignment:

Once pretrained, we align the image and spectrum encoders using cross-attention projection heads. Model training can be launched with the following command:
```
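
The alignment code itself is not shown in this diff. For orientation, here is a generic sketch of a CLIP-style symmetric InfoNCE objective over paired image and spectrum embeddings; the function name and temperature are placeholders, and it assumes the cross-attention projection heads have already produced the two embedding batches:

```python
import torch
import torch.nn.functional as F

def clip_alignment_loss(image_emb: torch.Tensor,
                        spectrum_emb: torch.Tensor,
                        temperature: float = 0.07) -> torch.Tensor:
    """Symmetric InfoNCE loss over a batch of paired image/spectrum embeddings."""
    image_emb = F.normalize(image_emb, dim=-1)
    spectrum_emb = F.normalize(spectrum_emb, dim=-1)
    logits = image_emb @ spectrum_emb.T / temperature       # (B, B) similarities
    targets = torch.arange(logits.size(0), device=logits.device)
    # Matched pairs sit on the diagonal; both matching directions are penalized.
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))
```
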
@@ -188,3 +190,12 @@ We train the model using 4 A100 GPUs (on 1 node) for 15k steps which takes rough
## Downstream Tasks

TODO
+
+## Acknowledgements
+This repository uses datasets and contrastive augmentations from [Stein, et al. (2022)](https://github.com/georgestein/ssl-legacysurvey/tree/main). The image pretraining is built on top of the [DINOv2 pretraining framework](https://github.com/facebookresearch/dinov2/).
+
+## License
+AstroCLIP code and model weights are released under the MIT license. See [LICENSE](https://github.com/PolymathicAI/AstroCLIP/blob/main/LICENSE) for additional details.
+
+## Citations
+TODO

assets/decals.png

4.77 MB
