Introduction
Street views provide real-time observations but are significantly affected by occlusions from both static objects (blue regions occluded by the wall and door) and dynamic objects (yellow regions occluded by the vehicles). Additionally, perspective projections lead to sparse observations in distant regions. Integrating satellite imagery enhances perception, particularly in occluded areas and distant regions (orange boxes). However, a key challenge in fusing satellite and street views is the inconsistency of dynamic objects due to the temporal gap between observations (red boxes: absence of the dynamic vehicle in satellite view).
Abstract
Existing vision-based 3D occupancy prediction methods are inherently limited in accuracy due to their exclusive reliance on street-view imagery, neglecting the potential benefits of incorporating satellite views. We propose SA-Occ, the first Satellite-Assisted 3D occupancy prediction model, which leverages GPS & IMU to integrate historical yet readily available satellite imagery into real-time applications, effectively mitigating the limitations of ego-vehicle perception, including occlusions and degraded performance in distant regions. To address the core challenges of cross-view perception, we propose: 1) Dynamic-Decoupling Fusion, which resolves inconsistencies in dynamic regions caused by the temporal asynchrony between satellite and street views; 2) 3D-Proj Guidance, a module that enhances 3D feature extraction from inherently 2D satellite imagery; and 3) Uniform Sampling Alignment, which aligns the sampling density between street and satellite views. Evaluated on Occ3D-nuScenes, SA-Occ achieves state-of-the-art performance, especially among single-frame methods, with a 39.05% mIoU (a 6.97% improvement), while incurring only 6.93 ms of additional latency per frame. SA-Occ also exhibits enhanced robustness compared to the baseline, especially in nighttime conditions.
- 2025/06/26: The SA-Occ paper is accepted by ICCV 2025!
- 2025/03/25: The paper and the Occ3D_nuScenes_SatExt dataset are also available on Hugging Face (dataset / paper).
- 2025/03/21: The SA-Occ paper is available on arXiv.
- 2025/03/17: The code and the Occ3D_nuScenes_SatExt dataset of SA-Occ are released.
Config | Frame | Backbone | Backbone(Sat) | Input Size | mIoU | Model | Log |
---|---|---|---|---|---|---|---|
BEVDetOCC | 1 | R50 | - | 256x704 | 31.60 | gdrive | log |
M1: FlashOCC | 1 | R50 | - | 256x704 | 32.08 | gdrive | log |
V1: SA-OCC | 1 | R50 | R18 | 256x704 | 39.05 | gdrive | log |
BEVDetOCC-4D-Stereo | 2 | R50 | - | 256x704 | 36.1 | baidu | log |
M2: FlashOCC-4D-Stereo | 2 | R50 | - | 256x704 | 37.84 | gdrive | log |
V2: SA-OCC | 2 | R50 | R18 | 256x704 | 40.65 | gdrive | log |
V3: SA-OCC | 8 | R50 | R18 | 256x704 | 41.69 | gdrive | log |
M3: FlashOCC-4D-Stereo | 2 | Swin-B | - | 512x1408 | 43.52 | gdrive | log |
V4: SA-OCC | 2 | Swin-B | R18 | 512x1408 | 43.90 | gdrive | log |
V5: SA-OCC | 2 | Swin-B | R50 | 512x1408 | 44.29 | gdrive | log |
V5: SA-OCC* | 2 | Swin-B | R50 | 512x1408 | 44.64 | gdrive | log |
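The mIoU column reports mean intersection-over-union across semantic classes. As a rough illustration of the metric only (this is not the evaluation code of this repository), the sketch below computes mIoU from two integer label arrays with NumPy. The function name `mean_iou` and the optional `mask` argument are assumptions made for this example.

```python
import numpy as np

def mean_iou(pred, gt, num_classes, mask=None):
    """Mean IoU over the classes that occur in pred or gt (hypothetical helper)."""
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    if mask is not None:
        # Optionally restrict the evaluation to a subset of voxels.
        keep = np.asarray(mask, dtype=bool).ravel()
        pred, gt = pred[keep], gt[keep]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(gt * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    inter = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    valid = union > 0  # skip classes absent from both arrays
    return float((inter[valid] / union[valid]).mean())
```

Which classes and voxels the benchmark actually includes in the average is defined by the Occ3D evaluation protocol, not by this sketch.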
conda create --name SA-Occ python=3.8
conda activate SA-Occ
pip install torch==1.10.0+cu111 torchvision==0.11.0+cu111 torchaudio==0.10.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.5.3
pip install mmdet==2.25.1
pip install mmsegmentation==0.25.0
sudo apt-get install python3-dev
sudo apt-get install libevent-dev
sudo apt-get install build-essential
export PATH=/usr/local/cuda/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH
export CUDA_ROOT=/usr/local/cuda
pip install pycuda
pip install lyft_dataset_sdk
pip install networkx==2.2
pip install numba==0.53.0
pip install numpy==1.23.5
pip install nuscenes-devkit
pip install plyfile
pip install scikit-image
pip install tensorboard
pip install trimesh==2.35.39
pip install setuptools==59.5.0
pip install yapf==0.40.1
git clone git@github.com:chenchen235/SA-Occ.git
cd Path_to_SA-Occ
git clone https://github.com/open-mmlab/mmdetection3d.git
cd Path_to_SA-Occ/mmdetection3d
git checkout v1.0.0rc4
pip install -v -e .
cd Path_to_SA-Occ/projects
pip install -v -e .
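Because the install commands pin exact versions, it can help to confirm what actually ended up in the environment. The following is a minimal sketch that uses only the standard library (`importlib.metadata`, available from Python 3.8). The `PINNED` table is copied from the commands above, while the helper name `find_mismatches` and its `lookup` parameter are assumptions for this example.

```python
from importlib import metadata

# Pins copied from the install commands above.
PINNED = {
    "torch": "1.10.0+cu111",
    "torchvision": "0.11.0+cu111",
    "mmcv-full": "1.5.3",
    "mmdet": "2.25.1",
    "mmsegmentation": "0.25.0",
    "numpy": "1.23.5",
}

def find_mismatches(pinned, lookup=metadata.version):
    """Return {package: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = lookup(name)
        except metadata.PackageNotFoundError:
            found = None  # package is not installed
        if found != expected:
            mismatches[name] = (expected, found)
    return mismatches

if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found}")
```

Running the script inside the activated SA-Occ environment prints one line per package whose installed version differs from the pin, and nothing when all pins match.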
step 1. Download the nuScenes 3D detection data HERE and unzip all zip files.
As with general dataset preparation, it is recommended to symlink the dataset root to $MMDETECTION3D/data.
The folder structure should be organized as follows before our processing.
└── Path_to_SA-Occ/
    └── data
        └── nuscenes
            ├── maps
            ├── samples
            ├── sweeps
            └── v1.0-trainval
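If the dataset lives elsewhere on disk, the recommended symlink can be created from Python. This is a minimal sketch that assumes the `data/nuscenes` location shown in the tree above; the helper name `link_dataset` and both of its arguments are hypothetical.

```python
import os
from pathlib import Path

def link_dataset(dataset_root, project_root):
    """Symlink dataset_root to <project_root>/data/nuscenes if nothing is there yet."""
    target = Path(project_root) / "data" / "nuscenes"
    target.parent.mkdir(parents=True, exist_ok=True)
    if not target.exists() and not target.is_symlink():
        # Link to the resolved path so the link survives a change of working directory.
        os.symlink(Path(dataset_root).resolve(), target, target_is_directory=True)
    return target
```

Calling the helper a second time is a no-op, so it is safe to keep in a setup script.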
step 2. For the Occupancy Prediction task, download (only) the 'gts' from CVPR2023-3D-Occupancy-Prediction and arrange the folder as:
└── Path_to_SA-Occ/
    └── data
        └── nuscenes
            ├── v1.0-trainval (existing)
            ├── sweeps (existing)
            ├── samples (existing)
            └── gts (new)
The Occ3D-NuScenes-SatExt dataset is an extension of the Occ3D-nuScenes dataset, integrating satellite imagery with real-time ground-level sensor data to enhance 3D occupancy prediction tasks. This dataset is the first to systematically incorporate satellite data into real-time applications using GPS and IMU for alignment. It enables real-time access to historical satellite imagery, assisting autonomous driving systems in leveraging this data. Additionally, it provides support for ego-vehicle geolocation and other autonomous driving tasks.
step 3. Download Occ3D-NuScenes-SatExt from gdrive (see huggingface for more information), unzip all zip files, and arrange the folder as:
└── Path_to_SA-Occ/
    └── data
        └── nuscenes
            └── sat
Next, crop the original satellite map directionally, using the location and orientation from GPS & IMU, to obtain supplementary data consistent with Occ3D-nuScenes:
python tools/gen_sat.py
The folder will then be arranged as follows:
└── Path_to_SA-Occ/
    └── data
        └── nuscenes
            ├── v1.0-trainval (existing)
            ├── sweeps (existing)
            ├── samples (existing)
            ├── gts (existing)
            └── sat (new)
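To make the idea of directional cropping concrete, here is a minimal NumPy sketch of an oriented crop. It is not the implementation in tools/gen_sat.py: the function name `crop_oriented`, its parameters, the yaw convention, and the nearest-neighbour sampling are all assumptions for illustration, and converting a GPS position to map pixel coordinates is left out.

```python
import numpy as np

def crop_oriented(sat_map, center_xy, yaw, out_size=(400, 400), scale=1.0):
    """Nearest-neighbour crop of sat_map centred on center_xy and rotated by yaw.

    sat_map:   (H, W) or (H, W, C) array.
    center_xy: ego position in map pixel coordinates (x = column, y = row).
    yaw:       heading in radians, measured from the map +x axis towards +y.
    scale:     map pixels per output pixel.
    """
    out_h, out_w = out_size
    # Output grid in the ego frame, with the origin at the crop centre.
    v, u = np.meshgrid(np.arange(out_h) - (out_h - 1) / 2.0,
                       np.arange(out_w) - (out_w - 1) / 2.0, indexing="ij")
    cos_y, sin_y = np.cos(yaw), np.sin(yaw)
    # Rotate ego-frame offsets into the map frame, then translate to the ego position.
    x = center_xy[0] + scale * (cos_y * u - sin_y * v)
    y = center_xy[1] + scale * (sin_y * u + cos_y * v)
    cols = np.rint(x).astype(int)
    rows = np.rint(y).astype(int)
    inside = ((rows >= 0) & (rows < sat_map.shape[0])
              & (cols >= 0) & (cols < sat_map.shape[1]))
    # Pixels that fall outside the map stay zero.
    out = np.zeros((out_h, out_w) + sat_map.shape[2:], dtype=sat_map.dtype)
    out[inside] = sat_map[rows[inside], cols[inside]]
    return out
```

With `yaw=0` the result is an axis-aligned window around the ego position. A non-zero yaw rotates the sampling grid so that the crop follows the vehicle heading.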
step 4. Download nuScenes-lidarseg from the nuScenes official site and put it under data/nuscenes/. Create depth and semantic labels from the point cloud by running:
python tools/generate_point_label.py
The folder will then be arranged as follows:
└── Path_to_SA-Occ/
    └── data
        └── nuscenes
            ├── v1.0-trainval (existing)
            ├── sweeps (existing)
            ├── samples (existing)
            ├── gts (existing)
            ├── sat (existing)
            ├── lidarseg (new)
            └── samples_point_label (new)
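Depth labels of this kind are typically obtained by projecting LiDAR points into each camera image. The sketch below shows that projection under simplifying assumptions: the points are already in the camera frame, the camera is an ideal pinhole, and semantic labels are ignored. It is not the code in tools/generate_point_label.py, and the name `points_to_depth_map` and its arguments are hypothetical.

```python
import numpy as np

def points_to_depth_map(points_cam, intrinsics, height, width):
    """Project camera-frame points (N, 3) to a sparse depth map; 0 means no point."""
    z = points_cam[:, 2]
    front = z > 1e-3  # keep only points in front of the camera
    pts, z = points_cam[front], z[front]
    uvw = pts @ intrinsics.T  # pinhole projection, rows are (u*z, v*z, z)
    u = np.floor(uvw[:, 0] / z).astype(int)
    v = np.floor(uvw[:, 1] / z).astype(int)
    inside = (u >= 0) & (u < width) & (v >= 0) & (v < height)
    u, v, z = u[inside], v[inside], z[inside]
    depth = np.full((height, width), np.inf)
    # When several points land on one pixel, keep the nearest one.
    np.minimum.at(depth, (v, u), z)
    depth[np.isinf(depth)] = 0.0
    return depth
```

`np.minimum.at` is used instead of plain indexed assignment because the result of assigning to repeated indices is not guaranteed by NumPy.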
step 5. Create the pkl info files by running:
python tools/create_data_bevdet.py
The folder will then be arranged as follows:
└── Path_to_SA-Occ/
    └── data
        └── nuscenes
            ├── v1.0-trainval (existing)
            ├── sweeps (existing)
            ├── samples (existing)
            ├── gts (existing)
            ├── sat (existing)
            ├── samples_point_label (existing)
            ├── bevdetv2-nuscenes_infos_train.pkl (new)
            └── bevdetv2-nuscenes_infos_val.pkl (new)
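Before training, a quick check that every entry of the final layout exists can save a failed run. The following is a minimal sketch using only the standard library. The `EXPECTED` list mirrors the tree above, and the helper name `missing_entries` is an assumption for this example.

```python
from pathlib import Path

# Entries listed in the final directory tree above, relative to data/nuscenes.
EXPECTED = [
    "v1.0-trainval",
    "sweeps",
    "samples",
    "gts",
    "sat",
    "samples_point_label",
    "bevdetv2-nuscenes_infos_train.pkl",
    "bevdetv2-nuscenes_infos_val.pkl",
]

def missing_entries(project_root, expected=EXPECTED):
    """Return the expected entries that do not exist under <project_root>/data/nuscenes."""
    base = Path(project_root) / "data" / "nuscenes"
    return [name for name in expected if not (base / name).exists()]
```

For example, `missing_entries("Path_to_SA-Occ")` returns an empty list once all preparation steps have completed.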
# single gpu
python tools/train.py $config
# multiple gpu
./tools/dist_train.sh $config num_gpu
# single gpu
python tools/test.py $config $checkpoint --eval mAP
# multiple gpu
./tools/dist_test.sh $config $checkpoint num_gpu --eval mAP
This project is made possible by the contributions of several key open-source codebases, which we acknowledge below.
Thanks for their excellent work!
If this work is helpful for your research, please consider citing the following BibTeX entry.
@article{chen2025sa,
title={SA-Occ: Satellite-Assisted 3D Occupancy Prediction in Real World},
author={Chen, Chen and Wang, Zhirui and Sheng, Taowei and Jiang, Yi and Li, Yundu and Cheng, Peirui and Zhang, Luning and Chen, Kaiqiang and Hu, Yanfeng and Yang, Xue and others},
journal={arXiv preprint arXiv:2503.16399},
year={2025}
}