This repository contains the implementation for the paper "Size-Modulated Deformable Attention in Spatio-Temporal Video Grounding Pipelines".
The code builds on the GitHub repository for "Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding" (STCAT). Our work extends and modifies the concepts introduced in the STCAT paper to incorporate size-modulated deformable attention mechanisms.
- Clone this repository:
git clone https://github.com/Hans7331/stcat-code.git
cd stcat-code
- Install the required dependencies:
pip install -r requirements.txt
To run the basic experiments implemented in this repository, use one of the following commands, depending on the dataset (HC-STVG or VidSTG):
python train_net.py --config-file "experiments/HC-STVG/e2e_STCAT_R101_HCSTVG.yaml"
python train_net.py --config-file "experiments/VidSTG/e2e_STCAT_R101_VidSTG.yaml"
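If you prefer to launch training from Python, for example from a scheduling script, the two commands above can be wrapped in a small helper. This is only a sketch: the `CONFIGS` keys and the function names `build_train_command` and `run_training` are hypothetical and not part of the repository. It assumes you run it from the repository root, where `train_net.py` lives.

```python
import subprocess
import sys

# Config paths copied from the commands above. The dictionary keys are
# illustrative labels chosen for this sketch, not names used by the repository.
CONFIGS = {
    "hcstvg": "experiments/HC-STVG/e2e_STCAT_R101_HCSTVG.yaml",
    "vidstg": "experiments/VidSTG/e2e_STCAT_R101_VidSTG.yaml",
}


def build_train_command(dataset):
    """Return the argument list that trains on the given dataset."""
    try:
        config = CONFIGS[dataset]
    except KeyError:
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of {sorted(CONFIGS)}"
        )
    # sys.executable keeps the run inside the current Python environment.
    return [sys.executable, "train_net.py", "--config-file", config]


def run_training(dataset):
    """Launch train_net.py and raise CalledProcessError if it fails."""
    subprocess.run(build_train_command(dataset), check=True)
```

Calling `run_training("hcstvg")` is then equivalent to the first command above, and `run_training("vidstg")` to the second.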
The code for other experiments is currently being updated and will be added to the repository soon. Stay tuned for updates!