
3EED: Ground Everything Everywhere in 3D


Rong Li1,*, Yuhao Dong2,*, Tianshuai Hu3,*, Ao Liang4,*, Youquan Liu5,*, Dongyue Lu4,*
Liang Pan6, Lingdong Kong4†, Junwei Liang1,3‡, Ziwei Liu2‡

1HKUST(GZ) · 2NTU · 3HKUST · 4NUS · 5FDU · 6Shanghai AI Lab

*Equal contribution   †Project lead   ‡Corresponding authors


3EED Teaser

🎯 Highlights

  • 🌍 Cross-Platform: First 3D grounding dataset spanning vehicle, drone, and quadruped platforms
  • 📊 Large-Scale: A large corpus of annotated samples covering diverse real-world scenarios
  • 🔀 Multi-Modal: Synchronized RGB, LiDAR, and language annotations
  • 🎯 Challenging: Complex outdoor environments with varying object densities and viewpoints
  • 📏 Reproducible: Unified evaluation protocols and baseline implementations

Statistics

3EED Dataset Statistics

📄 For detailed dataset statistics and analysis, please refer to our paper.

📰 News

  • [2025.10] 📦 Dataset and code are now publicly available on HuggingFace and GitHub!
  • [2025.09] 🎉 3EED has been accepted to the NeurIPS 2025 Datasets and Benchmarks Track!

📚 Table of Contents

  • ⚙️ Installation
  • 📦 Pretrained Models
  • 💾 Dataset
  • 🚀 Quick Start
  • 📖 Citation
  • 📄 License
  • 🙏 Acknowledgements

⚙️ Installation

Environment Setup

We support both CUDA 11 and CUDA 12 environments. Choose the one that matches your system:

Option 1: CUDA 11.1 Environment

  Component     Version
  -----------   -------------
  CUDA          11.1
  cuDNN         8.0.5
  PyTorch       1.9.1+cu111
  torchvision   0.10.1+cu111
  Python        3.10 / 3.11

Option 2: CUDA 12.4 Environment

  Component     Version
  -----------   -------------
  CUDA          12.4
  cuDNN         8.0.5
  PyTorch       2.5.1+cu124
  torchvision   0.20.1+cu124
  Python        3.10 / 3.11

Custom CUDA Operators

# Build the PointNet++ batch operators
cd ops/teed_pointnet/pointnet2_batch
python setup.py develop

# Build the RoI-aware 3D pooling operators
cd ../roiaware_pool3d
python setup.py develop

📦 Pretrained Models

Language Encoder

Download the RoBERTa-base checkpoint from HuggingFace and move it to data/roberta_base.
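One way to fetch and stage the checkpoint is via the transformers library (a minimal sketch; the target path data/roberta_base follows the instruction above):

# Sketch: download RoBERTa-base and save it where the code expects it.
# Requires: pip install transformers
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaModel.from_pretrained("roberta-base")

# Write config, vocabulary, and weights to the expected directory.
tokenizer.save_pretrained("data/roberta_base")
model.save_pretrained("data/roberta_base")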

💾 Dataset

Download

Download the 3EED dataset from HuggingFace:

🔗 Dataset Link: https://huggingface.co/datasets/RRRong/3EED
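For scripted setups, the dataset can also be pulled with the huggingface_hub library (a minimal sketch; local_dir is assumed to match the layout shown below):

# Sketch: download the 3EED dataset into the expected directory.
# Requires: pip install huggingface_hub
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="RRRong/3EED",
    repo_type="dataset",
    local_dir="data/3eed",
)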

Dataset Structure

After extraction, organize your dataset as follows:

data/3eed/
├── drone/                    # Drone platform data
│   ├── scene-0001/
│   │   ├── 0000_0/
│   │   │   ├── image.jpg
│   │   │   ├── lidar.bin
│   │   │   └── meta_info.json
│   │   └── ...
│   └── ...
├── quad/                     # Quadruped platform data
│   ├── scene-0001/
│   └── ...
├── waymo/                    # Vehicle platform data
│   ├── scene-0001/
│   └── ...
├── roberta_base/            # Language model weights
└── splits/                  # Train/val split files
    ├── drone_train.txt
    ├── drone_val.txt
    ├── quad_train.txt
    ├── quad_val.txt
    ├── waymo_train.txt
    └── waymo_val.txt
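A minimal sketch of reading one frame follows. The 4-channel float32 point layout is an assumption for illustration; consult meta_info.json and the dataloader for the authoritative format, and substitute a frame path that exists in your copy of the data.

# Sketch: load a single 3EED frame.
# ASSUMPTION: points are flat float32 with 4 channels (x, y, z, intensity);
# verify against meta_info.json and the dataloader before relying on this.
import json
import numpy as np
from PIL import Image

frame_dir = "data/3eed/drone/scene-0001/0000_0"  # example frame

image = Image.open(f"{frame_dir}/image.jpg")
points = np.fromfile(f"{frame_dir}/lidar.bin", dtype=np.float32).reshape(-1, 4)
with open(f"{frame_dir}/meta_info.json") as f:
    meta = json.load(f)

print(image.size, points.shape, sorted(meta.keys()))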

🚀 Quick Start

Training

Train the baseline model on different platform combinations:

# Train on all platforms (recommended for best performance)
bash scripts/train_3eed.sh

# Train on single platform
bash scripts/train_waymo.sh   # Vehicle only
bash scripts/train_drone.sh   # Drone only
bash scripts/train_quad.sh    # Quadruped only

Output:

  • Checkpoints: logs/Train_<datasets>_Val_<datasets>/<timestamp>/
  • Training logs: logs/Train_<datasets>_Val_<datasets>/<timestamp>/log.txt
  • TensorBoard logs: logs/Train_<datasets>_Val_<datasets>/<timestamp>/tensorboard/

Evaluation

Evaluate trained models on validation sets:

Quick Evaluation:

# Evaluate on all platforms
bash scripts/val_3eed.sh

# Evaluate on single platform
bash scripts/val_waymo.sh    # Vehicle
bash scripts/val_drone.sh    # Drone
bash scripts/val_quad.sh     # Quadruped

⚠️ Before running evaluation:

  1. Update --checkpoint_path in the script to point to your trained model
  2. Ensure the validation dataset is downloaded and properly structured

Output:

  • Results saved to: <checkpoint_dir>/evaluation/Val_<dataset>/<timestamp>/

Visualization

Visualize predictions with 3D bounding boxes overlaid on point clouds:

# Visualize prediction results
python utils/visualize_pred.py

Visualization Output:

  • 🟢 Ground Truth: Green bounding box
  • 🔴 Prediction: Red bounding box

Output Structure:

visualizations/
├── waymo/
│   ├── scene-0001_frame-0000/
│   │   ├── pointcloud.ply
│   │   ├── pred_bbox.ply
│   │   ├── gt_bbox.ply
│   │   └── info.txt
│   └── ...
├── drone/
└── quad/
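To inspect the exported geometry interactively, something like the following works (a sketch assuming Open3D is installed and that the boxes are saved as line sets; use read_triangle_mesh instead if they turn out to be meshes):

# Sketch: view an exported frame with Open3D.
# Requires: pip install open3d
import open3d as o3d

frame = "visualizations/waymo/scene-0001_frame-0000"  # example frame

cloud = o3d.io.read_point_cloud(f"{frame}/pointcloud.ply")
gt_box = o3d.io.read_line_set(f"{frame}/gt_bbox.ply")      # green: ground truth
pred_box = o3d.io.read_line_set(f"{frame}/pred_bbox.ply")  # red: prediction

o3d.visualization.draw_geometries([cloud, gt_box, pred_box])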

Baseline Checkpoints

Baseline model weights and their predictions are available on HuggingFace.

📖 Citation

If you find our work helpful, please consider citing:

@inproceedings{li2025_3eed,
  title     = {3EED: Ground Everything Everywhere in 3D},
  author    = {Rong Li and Yuhao Dong and Tianshuai Hu and Ao Liang and 
               Youquan Liu and Dongyue Lu and Liang Pan and Lingdong Kong and 
               Junwei Liang and Ziwei Liu},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS) 
               Datasets and Benchmarks Track},
  year      = {2025}
}

📄 License

This repository is released under the Apache 2.0 License (see LICENSE).

🙏 Acknowledgements

We sincerely thank the following projects and teams that made this work possible:

Codebase & Methods

  • BUTD-DETR - Bottom-Up Top-Down DETR for visual grounding
  • WildRefer - Referring expression comprehension in the wild

Dataset Sources

  • Waymo Open Dataset - Source of the vehicle platform data


Made with ❤️ by the 3EED Team

⬆️ Back to Top
