Xincheng Shuai1 · Henghui Ding1 · Zhenyuan Qin1 · Hao Luo2,3 · Xingjun Ma1 · Dacheng Tao4
1Fudan University · 2DAMO Academy, Alibaba Group · 3Hupan Lab · 4Nanyang Technological University, Singapore
Controlling the movements of dynamic objects and the camera within generated videos is a meaningful yet challenging task. Due to the lack of datasets with comprehensive 6D pose annotations, existing text-to-video methods cannot simultaneously control the motions of both the camera and objects in a 3D-aware manner. We therefore introduce a Synthetic Dataset for Free-Form Motion Control (SynFMC). SynFMC covers diverse object and environment categories and includes varied motion patterns generated according to specific rules, simulating common and complex real-world scenarios. Its complete 6D pose information enables models to learn to disentangle the motion effects of objects and the camera in a video. To provide precise 3D-aware motion control, we further propose Free-Form Motion Control (FMC), a method trained on SynFMC. FMC can control the 6D poses of objects and the camera independently or simultaneously, producing high-fidelity videos.
Figure 1. The rule-based video generation pipeline of the proposed Synthetic Dataset for Free-Form Motion Control (SynFMC). This example generates a synthetic video with three objects: (1) An environment asset and its matching object assets are selected as the scene elements. (2) Motion types for the objects and the camera are randomly selected for trajectory generation. (3) The center region shows the resulting 3D animation sequence used for rendering. The rendered video and its annotations are shown in the last row.
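For intuition, here is a minimal sketch of the rule-based selection steps described in the caption; the asset pools, motion types, and function names are hypothetical placeholders, not the actual SynFMC generation code.

```python
import random

# Hypothetical asset/motion pools; the real SynFMC pools are far larger.
ENVIRONMENTS = {
    "city_street": ["car", "bus", "pedestrian"],
    "ocean": ["boat", "whale", "seagull"],
}
MOTION_TYPES = ["static", "straight_line", "curve", "circle"]

def sample_scene(num_objects=3, seed=None):
    """(1) Pick an environment and matching object assets;
    (2) randomly pick a motion type for each object and the camera."""
    rng = random.Random(seed)
    env = rng.choice(sorted(ENVIRONMENTS))
    objects = rng.sample(ENVIRONMENTS[env], k=num_objects)
    motions = {name: rng.choice(MOTION_TYPES) for name in objects}
    motions["camera"] = rng.choice(MOTION_TYPES)
    return env, objects, motions

env, objects, motions = sample_scene(seed=0)
print(env, objects, motions)  # which assets move along which trajectory type
```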
Figure 2. The architecture of FMC. In the first stage, we randomly sample images from the synthetic videos and update the parameters of the injected Domain LoRA. Next, the CMC modules are learned; CMC consists of two parts, a Camera Encoder and a Camera Adapter, where the Camera Adapter is introduced into the temporal modules. Finally, we train the Object Encoder of the OMC. It receives the 6D object pose features, which are repeated over the corresponding object region. A Gaussian blur kernel centered at the object centroid removes the need for precise masks; the output is then multiplied by the coarse masks to modulate the features in the main branch.
```bash
conda env create -f environment.yaml
conda activate fmc
```

The training process of FMC consists of three stages.
In the first stage, we randomly sample images from the synthetic videos and update the parameters of the injected Domain LoRA.
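As a rough illustration of what the injected Domain LoRA looks like, here is a minimal sketch assuming standard low-rank adapters on frozen linear layers; the rank, scale, and initialization are illustrative assumptions, not FMC's actual settings. The stage-one training command follows below.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A pretrained linear layer kept frozen, plus a trainable low-rank
    update (B @ A). In stage one, only A and B would receive gradients."""
    def __init__(self, base: nn.Linear, rank: int = 8, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weights
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: starts as a no-op
        self.scale = scale

    def forward(self, x):
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)
```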
```bash
bash dist_run_lora.bash
```

Next, the CMC modules are learned. Inspired by CameraCtrl, CMC consists of two parts, a Camera Encoder and a Camera Adapter, where the Camera Adapter is introduced into the temporal modules.
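A hedged sketch of this idea, assuming per-frame camera poses are first rasterized into pose feature maps (e.g., Plücker-style ray embeddings, as in CameraCtrl) and then injected residually into the temporal modules; the channel counts and module shapes are illustrative assumptions. The stage-two command follows below.

```python
import torch
import torch.nn as nn

class CameraEncoder(nn.Module):
    """Maps per-frame camera pose maps (assumed 6-channel, e.g. Plücker-style
    ray embeddings) to features matching the temporal module width."""
    def __init__(self, in_ch: int = 6, dim: int = 320):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, dim, 3, padding=1),
            nn.SiLU(),
            nn.Conv2d(dim, dim, 3, padding=1),
        )

    def forward(self, pose_maps):              # (B*F, 6, H, W)
        return self.net(pose_maps)             # (B*F, dim, H, W)

class CameraAdapter(nn.Module):
    """Zero-initialized projection that adds camera features to a temporal
    module's hidden states, so training starts from the unmodified model."""
    def __init__(self, dim: int = 320):
        super().__init__()
        self.proj = nn.Conv2d(dim, dim, 1)
        nn.init.zeros_(self.proj.weight)
        nn.init.zeros_(self.proj.bias)

    def forward(self, hidden, cam_feat):       # both (B*F, dim, H, W)
        return hidden + self.proj(cam_feat)
```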
```bash
bash dist_run_cam.bash
```

Finally, we train the Object Encoder of the OMC. It receives the 6D object pose features, which are repeated over the corresponding object region. A Gaussian blur kernel centered at the object centroid removes the need for precise masks; the output is then multiplied by the coarse masks to modulate the features in the main branch.
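A minimal sketch of this modulation, assuming a per-object pose feature vector, a normalized centroid, and a soft Gaussian region standing in for a precise mask; the sigma value and the residual injection are assumptions for illustration. The stage-three command follows below.

```python
import torch

def gaussian_mask(h, w, centroid, sigma=0.15):
    """Coarse soft mask: a 2D Gaussian centered at the object's normalized
    centroid, standing in for a precise segmentation mask."""
    ys = torch.linspace(0, 1, h).view(h, 1)
    xs = torch.linspace(0, 1, w).view(1, w)
    cy, cx = centroid
    return torch.exp(-((ys - cy) ** 2 + (xs - cx) ** 2) / (2 * sigma ** 2))

def modulate(features, pose_feat, centroid):
    """Repeat one object's 6D-pose feature over its coarse region and use
    the masked result to modulate the main-branch features."""
    c, h, w = features.shape
    mask = gaussian_mask(h, w, centroid)        # (H, W), soft object region
    injected = pose_feat.view(c, 1, 1) * mask   # pose feature repeated, then masked
    return features + injected

feats = torch.randn(320, 32, 32)                # main-branch features for one frame
out = modulate(feats, torch.randn(320), centroid=(0.4, 0.6))
```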
```bash
bash dist_run_obj.bash
```

If you find our work useful for your research or applications, please cite it using this BibTeX:
```bibtex
@inproceedings{SynFMC,
  title={{Free-Form Motion Control}: Controlling the 6D Poses of Camera and Objects in Video Generation},
  author={Shuai, Xincheng and Ding, Henghui and Qin, Zhenyuan and Luo, Hao and Ma, Xingjun and Tao, Dacheng},
  booktitle={ICCV},
  year={2025}
}
```